# Neural Network Training System

A Neural Network Training System is a model-based training system that implements a neural network training algorithm (to solve a neural network training task which requires a trained neural network).

**AKA:** ANN Trainer.

**Context:**
- It can range from being an Unsupervised ANN Learning System to being a Supervised ANN Learning System.
- It can range from being a Single-Layer ANN Training System to being a Multi-Layer ANN Training System (such as a very-deep ANN trainer).
- It can range from being a Unidirectional Neural Network Training System to being a Bidirectional Neural Network Training System.
- It can range from being a Neural Network Classifier, to being a Neural Network Ranker, to being a Neural Network Estimator.

**Example(s):**
- a 2-Layer ANN Training System, a 3-Layer ANN Training System, ...
- a Neural Network Training Framework such as: PyTorch, TensorFlow, ConvNetJS, DeepLearning4J, MXNet, D3NER, MathWorks Neural Network Toolbox[1], or ...
- a Backprop-based Training System;
- a Python-based NNet Trainer;
- a Theano-Recurrence Training System;
- a Pytorch-based Neural Modeling System.

**Counter-Example(s):**

**See:** Deep Neural Network, Predictive Modeling System, sklearn.neural_network.

## References

### 2016

- (Zhao, 2016) ⇒ Peng Zhao (2016). "R for Deep Learning (I): Build Fully Connected Neural Network from Scratch"
- QUOTE: Training is to search the optimization parameters (weights and bias) under the given network architecture and minimize the classification error or residuals. This process includes two parts: feed forward and back propagation. Feed forward is going through the network with input data (as prediction parts) and then compute data loss in the output layer by loss function (cost function).
*"Data loss measures the compatibility between a prediction (e.g. the class scores in classification) and the ground truth label."* In our example code, we selected the cross-entropy function to evaluate the data loss. After getting the data loss, we need to minimize it by changing the weights and bias. The very popular method is to back-propagate the loss into every layer and neuron by gradient descent or stochastic gradient descent, which requires the derivatives of the data loss for each parameter (W1, W2, b1, b2). Back propagation differs for different activation functions; see Stanford CS231n for the derivative formulas and more training tips.


### 2015

- (Trask, 2015) ⇒ Andrew Trask (July 2015). "A Neural Network in 11 Lines of Python"
- QUOTE: This backpropagation is shown via a small Python implementation. Take a look at this neural network in 11 lines of Python:

```python
import numpy as np

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T
syn0 = 2 * np.random.random((3, 4)) - 1            # first weight layer
syn1 = 2 * np.random.random((4, 1)) - 1            # second weight layer
for j in range(60000):                             # xrange in the original Python 2 code
    l1 = 1 / (1 + np.exp(-np.dot(X, syn0)))        # hidden layer (sigmoid)
    l2 = 1 / (1 + np.exp(-np.dot(l1, syn1)))       # output layer (sigmoid)
    l2_delta = (y - l2) * (l2 * (1 - l2))          # output error * sigmoid derivative
    l1_delta = l2_delta.dot(syn1.T) * (l1 * (1 - l1))  # error backpropagated to hidden layer
    syn1 += l1.T.dot(l2_delta)
    syn0 += X.T.dot(l1_delta)
```

- This neural network attempts to use the input to predict the output: the programmer tries to predict the output column from the three input columns. That is all there is to the neural network in 11 lines of Python. This simple backpropagation example can serve as a starting point for more advanced multi-layer neural networks.
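One way to check that the training actually works is to rerun the loop (ported to Python 3, with `np.random.seed(1)` added here for reproducibility, as in Trask's post) and then apply the learned weights `syn0` and `syn1` in a forward pass. The rounded outputs should recover the target column:

```python
import numpy as np

np.random.seed(1)                                  # seed used in Trask's post
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T
syn0 = 2 * np.random.random((3, 4)) - 1
syn1 = 2 * np.random.random((4, 1)) - 1
for _ in range(60000):
    l1 = 1 / (1 + np.exp(-X.dot(syn0)))
    l2 = 1 / (1 + np.exp(-l1.dot(syn1)))
    l2_delta = (y - l2) * (l2 * (1 - l2))
    l1_delta = l2_delta.dot(syn1.T) * (l1 * (1 - l1))
    syn1 += l1.T.dot(l2_delta)
    syn0 += X.T.dot(l1_delta)

# Forward pass with the trained weights: predictions should be close to y.
pred = 1 / (1 + np.exp(-(1 / (1 + np.exp(-X.dot(syn0)))).dot(syn1)))
print(np.round(pred).ravel())                      # should recover [0. 1. 1. 0.]
```

Training and prediction use the same forward pass; only the weight updates are dropped at prediction time.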