Neural Network Weight Size

From GM-RKB

A Neural Network Weight Size is a count of the Artificial Neural Connections (weights) between Neural Network Layers.



References

2017

Left: A 2-layer Neural Network (one hidden layer of 4 neurons (or units) and one output layer with 2 neurons), and three inputs.

Right: A 3-layer neural network with three inputs, two hidden layers of 4 neurons each and one output layer. Notice that in both cases there are connections (synapses) between neurons across layers, but not within a layer.

Naming conventions. Notice that when we say N-layer neural network, we do not count the input layer. Therefore, a single-layer neural network describes a network with no hidden layers (input directly mapped to output). In that sense, you can sometimes hear people say that logistic regression or SVMs are simply a special case of single-layer Neural Networks. You may also hear these networks interchangeably referred to as “Artificial Neural Networks” (ANN) or “Multi-Layer Perceptrons” (MLP). Many people do not like the analogies between Neural Networks and real brains and prefer to refer to neurons as units.

Output layer. Unlike all other layers in a Neural Network, the output layer neurons most commonly do not have an activation function (or you can think of them as having a linear identity activation function). This is because the output layer is usually taken to represent the class scores (e.g. in classification), which are arbitrary real-valued numbers, or some kind of real-valued target (e.g. in regression).
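As a minimal sketch of this point, the forward pass below applies an activation only in the hidden layer and leaves the output layer as a plain affine map. It uses the layer sizes of the left network (3 inputs, 4 hidden neurons, 2 outputs). The ReLU activation, the helper names `relu`, `affine` and `forward`, and all weight values are assumptions chosen for illustration; they do not come from the quoted notes.

```python
def relu(values):
    """Element-wise ReLU; using ReLU for the hidden layer is an assumption."""
    return [max(0.0, v) for v in values]


def affine(weights, biases, inputs):
    """Compute W x + b, with `weights` given as a list of rows (one row per neuron)."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(x, W1, b1, W2, b2):
    """2-layer network: hidden layer with ReLU, output layer with no activation."""
    hidden = relu(affine(W1, b1, x))
    scores = affine(W2, b2, hidden)  # identity activation: raw real-valued scores
    return scores


# Hypothetical weights for a 3-input, 4-hidden, 2-output network.
W1 = [[0.5, -1.0, 0.0], [1.0, 1.0, 1.0], [-0.5, 0.0, 0.5], [0.0, 2.0, -1.0]]
b1 = [0.0, -1.0, 0.5, 0.0]
W2 = [[1.0, -1.0, 0.5, 0.0], [-2.0, 0.0, 1.0, 1.0]]
b2 = [0.1, -0.2]

scores = forward([1.0, 2.0, 3.0], W1, b1, W2, b2)
print(scores)
```

With these made-up weights the first score is negative and the second is positive, which shows that the outputs are unconstrained real numbers rather than values squashed into a fixed range.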

Sizing neural networks. The two metrics that people commonly use to measure the size of neural networks are the number of neurons, or more commonly the number of parameters. Working with the two example networks in the above picture:

  • The first network (left) has [math]\displaystyle{ 4 + 2 = 6 }[/math] neurons (not counting the inputs), [math]\displaystyle{ [3 \times 4] + [4 \times 2] = 20 }[/math] weights and [math]\displaystyle{ 4 + 2 = 6 }[/math] biases, for a total of 26 learnable parameters.

  • The second network (right) has [math]\displaystyle{ 4 + 4 + 1 = 9 }[/math] neurons, [math]\displaystyle{ [3 \times 4] + [4 \times 4] + [4 \times 1] = 12 + 16 + 4 = 32 }[/math] weights and [math]\displaystyle{ 4 + 4 + 1 = 9 }[/math] biases, for a total of 41 learnable parameters.

To give you some context, modern Convolutional Networks contain on the order of 100 million parameters and are usually made up of approximately 10 to 20 layers (hence deep learning).
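The arithmetic for the two example networks can be reproduced with a short sketch. It assumes fully connected layers with one bias per non-input neuron, as in the figures; the function name `count_parameters` and the list-of-sizes representation are hypothetical choices made here for illustration.

```python
def count_parameters(layer_sizes):
    """Return (neurons, weights, biases) for a fully connected network.

    `layer_sizes` lists the input size first, then each layer's neuron count.
    """
    neurons = sum(layer_sizes[1:])  # inputs are not counted as neurons
    weights = sum(
        n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )
    biases = neurons  # one bias per non-input neuron
    return neurons, weights, biases


for name, sizes in [("left", [3, 4, 2]), ("right", [3, 4, 4, 1])]:
    neurons, weights, biases = count_parameters(sizes)
    print(name, neurons, weights, biases, weights + biases)
# left 6 20 6 26
# right 9 32 9 41
```

The weight count alone (20 and 32 here) is the quantity this page calls the weight size; adding the biases gives the total number of learnable parameters.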

1998