# Neuron Activation Function

## References

### 2018c

• (CS231n, 2018) ⇒ Commonly used activation functions. In: CS231n Convolutional Neural Networks for Visual Recognition. Retrieved: 2018-01-28.
• QUOTE: Every activation function (or non-linearity) takes a single number and performs a certain fixed mathematical operation on it. There are several activation functions you may encounter in practice:
• Sigmoid. The sigmoid non-linearity has the mathematical form $\sigma(x)=1/(1+e^{-x})$ and is shown in the image above on the left. As alluded to in the previous section, it takes a real-valued number and “squashes” it into range between 0 and 1. In particular, large negative numbers become 0 and large positive numbers become 1. The sigmoid function has seen frequent use historically since it has a nice interpretation as the firing rate of a neuron: from not firing at all (0) to fully-saturated firing at an assumed maximum frequency (1) (...)
• Tanh. The tanh non-linearity is shown on the image above on the right. It squashes a real-valued number to the range [-1, 1]. Like the sigmoid neuron, its activations saturate, but unlike the sigmoid neuron its output is zero-centered. Therefore, in practice the tanh non-linearity is always preferred to the sigmoid nonlinearity. Also note that the tanh neuron is simply a scaled sigmoid neuron, in particular the following holds: $\tanh(x)=2\sigma(2x)-1$ (...)
• ReLU. The Rectified Linear Unit has become very popular in the last few years. It computes the function $f(x)=\max(0,x)$. (...)
• Leaky ReLU. Leaky ReLUs are one attempt to fix the “dying ReLU” problem. Instead of the function being zero when $x \lt 0$, a leaky ReLU will instead have a small negative slope (of 0.01, or so). That is, the function computes $f(x)=\mathbb{1}(x \lt 0)(\alpha x)+\mathbb{1}(x \ge 0)(x)$ where $\alpha$ is a small constant (...)
• Maxout. Other types of units have been proposed that do not have the functional form $f(w^Tx+b)$ where a non-linearity is applied on the dot product between the weights and the data (...)
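
The closed-form activations quoted above (sigmoid, tanh, ReLU, leaky ReLU) can be sketched in a few lines of Python. This is a minimal scalar illustration, not code from CS231n: the function names, the default `alpha=0.01`, and the overflow-safe rewrite of the sigmoid are assumptions made here. Maxout is omitted because its formula is elided in the quote.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid 1 / (1 + e^{-x}); output lies in [0, 1]."""
    # Branch on the sign so that math.exp never receives a large positive
    # argument (an implementation choice assumed here, not part of the quote).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x: float) -> float:
    """Hyperbolic tangent; zero-centered output in [-1, 1]."""
    return math.tanh(x)


def relu(x: float) -> float:
    """Rectified Linear Unit: max(0, x)."""
    return max(0.0, x)


def leaky_relu(x: float, alpha: float = 0.01) -> float:
    """Leaky ReLU: alpha * x for x < 0, and x for x >= 0."""
    return x if x >= 0 else alpha * x


if __name__ == "__main__":
    for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
        # The quoted identity tanh(x) = 2 * sigmoid(2x) - 1 should hold
        # up to floating-point rounding.
        identity_gap = abs(tanh(x) - (2.0 * sigmoid(2.0 * x) - 1.0))
        print(x, sigmoid(x), tanh(x), relu(x), leaky_relu(x), identity_gap)
```

Running the script prints each activation at a few sample inputs, along with the gap between $\tanh(x)$ and $2\sigma(2x)-1$, which should be at the level of rounding error.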

### 2005

a. linear function,
b. threshold function,
c. sigmoid function.
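
As a rough illustration of the three function types listed above, the sketch below gives one common textbook form of each. The parameter names (`slope`, `intercept`, `theta`), their default values, and the 0/1 output convention of the threshold function are assumptions made here; the entry above does not define them.

```python
import math


def linear(x: float, slope: float = 1.0, intercept: float = 0.0) -> float:
    """Linear activation: output is an affine function of the input."""
    return slope * x + intercept


def threshold(x: float, theta: float = 0.0) -> float:
    """Threshold (step) activation: 1 when x reaches theta, otherwise 0."""
    return 1.0 if x >= theta else 0.0


def sigmoid(x: float) -> float:
    """Sigmoid activation: a smooth, S-shaped alternative to the step."""
    # Sign-based branching keeps math.exp from overflowing for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, linear(x), threshold(x), sigmoid(x))
```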