Neuron Activation Function

A Neuron Activation Function is a decision function in an artificial neuron.
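
As a minimal illustrative sketch (not from the cited references), an artificial neuron applies its activation function to the weighted sum of its inputs plus a bias. The NumPy usage, the example weights and inputs, and the choice of sigmoid below are assumptions for the example:

    import numpy as np

    def sigmoid(z):
        # squash a real value into the (0, 1) range
        return 1.0 / (1.0 + np.exp(-z))

    def neuron_output(x, w, b, activation=sigmoid):
        # an artificial neuron: activation applied to the weighted input sum w.x + b
        return activation(np.dot(w, x) + b)

    # example with assumed inputs, weights, and bias
    x = np.array([0.5, -1.0, 2.0])
    w = np.array([0.4, 0.3, -0.2])
    b = 0.1
    print(neuron_output(x, w, b))  # a single value in (0, 1)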



References

2018c

  • (CS231n, 2018) ⇒ Commonly used activation functions. In: CS231n Convolutional Neural Networks for Visual Recognition. Retrieved: 2018-01-28.
    • QUOTE: Every activation function (or non-linearity) takes a single number and performs a certain fixed mathematical operation on it. There are several activation functions you may encounter in practice:
      • Sigmoid. The sigmoid non-linearity has the mathematical form [math]\sigma(x)=1/(1+e^{-x})[/math] and is shown in the image above on the left. As alluded to in the previous section, it takes a real-valued number and “squashes” it into range between 0 and 1. In particular, large negative numbers become 0 and large positive numbers become 1. The sigmoid function has seen frequent use historically since it has a nice interpretation as the firing rate of a neuron: from not firing at all (0) to fully-saturated firing at an assumed maximum frequency (1) (...)
      • Tanh. The tanh non-linearity is shown on the image above on the right. It squashes a real-valued number to the range [-1, 1]. Like the sigmoid neuron, its activations saturate, but unlike the sigmoid neuron its output is zero-centered. Therefore, in practice the tanh non-linearity is always preferred to the sigmoid nonlinearity. Also note that the tanh neuron is simply a scaled sigmoid neuron, in particular the following holds: [math]\tanh(x)=2\sigma(2x)-1[/math] (...)
      • ReLU. The Rectified Linear Unit has become very popular in the last few years. It computes the function [math]f(x)=\max(0,x)[/math]. (...)
      • Leaky ReLU. Leaky ReLUs are one attempt to fix the “dying ReLU” problem. Instead of the function being zero when [math]x \lt 0[/math], a leaky ReLU will instead have a small negative slope (of 0.01, or so). That is, the function computes [math]f(x)=\mathbb{1}(x \lt 0)(\alpha x)+\mathbb{1}(x \geq 0)(x)[/math] where [math]\alpha[/math] is a small constant (...)
      • Maxout. Other types of units have been proposed that do not have the functional form [math]f(w^Tx+b)[/math] where a non-linearity is applied on the dot product between the weights and the data (...)
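
For concreteness, the activation functions quoted above can be transcribed directly into NumPy. The sketch below is an illustrative reading of the quoted formulas; the 0.01 leaky-ReLU slope follows the quote, while the test inputs are assumed:

    import numpy as np

    def sigmoid(x):
        # sigma(x) = 1 / (1 + e^{-x}); squashes to (0, 1)
        return 1.0 / (1.0 + np.exp(-x))

    def tanh(x):
        # zero-centered squashing to [-1, 1]; tanh(x) = 2*sigmoid(2x) - 1
        return np.tanh(x)

    def relu(x):
        # f(x) = max(0, x)
        return np.maximum(0.0, x)

    def leaky_relu(x, alpha=0.01):
        # f(x) = alpha*x when x < 0, and x when x >= 0
        return np.where(x < 0, alpha * x, x)

    # evaluate each non-linearity on a small assumed input vector
    z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
    for f in (sigmoid, tanh, relu, leaky_relu):
        print(f.__name__, f(z))

A maxout unit, by contrast, outputs the maximum of several linear functions of the input (e.g. [math]\max(w_1^Tx+b_1,\ w_2^Tx+b_2)[/math]) rather than applying a fixed non-linearity to a single dot product.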

2005

a. linear function,
b. threshold function,
c. sigmoid function.
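
As an illustrative sketch of the three function types listed above (the hard threshold at 0 and the NumPy usage are assumptions for the example):

    import numpy as np

    def linear(x):
        # a. linear (identity) activation
        return x

    def threshold(x):
        # b. threshold (step) activation: 1 if x >= 0, else 0
        return np.where(x >= 0, 1.0, 0.0)

    def sigmoid(x):
        # c. sigmoid activation, smooth output between 0 and 1
        return 1.0 / (1.0 + np.exp(-x))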
