Leaky Rectified Linear Activation (LReLU) Function


A Leaky Rectified Linear Activation (LReLU) Function is a rectified-based activation function defined by the mathematical function:

[math]\displaystyle{ f(x)=\max(0,x)+\beta \cdot \min(0,x) }[/math],

where [math]\displaystyle{ \beta }[/math] is a small non-zero slope coefficient, so that negative inputs retain a small gradient.
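As a minimal illustration of this formula, the sketch below implements it with NumPy; the function name leaky_relu and the default β = 0.01 are illustrative choices here, not tied to any particular library.

import numpy as np

def leaky_relu(x, beta=0.01):
    # f(x) = max(0, x) + beta * min(0, x)
    return np.maximum(0.0, x) + beta * np.minimum(0.0, x)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(leaky_relu(x))  # negative inputs are scaled by beta: [-0.02, -0.005, 0.0, 1.0, 3.0]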



References

2018a

  • (Chainer, 2018) ⇒ http://docs.chainer.org/en/stable/reference/generated/chainer.functions.leaky_relu.html Retrieved:2018-2-18
    • QUOTE: chainer.functions.leaky_relu(x, slope=0.2)

       Leaky Rectified Linear Unit function.

      This function is expressed as

      [math]\displaystyle{ f(x) = \begin{cases} x, & \mbox{if } x \ge 0 \\ ax, & \mbox{if } x \lt 0 \end{cases} }[/math],

      where [math]\displaystyle{ a }[/math] is a configurable slope value.

      Parameters:

      • x (Variable or numpy.ndarray or cupy.ndarray) – Input variable. A [math]\displaystyle{ (s_1,s_2,\cdots,s_N) }[/math]-shaped float array.
      • slope (float) – Slope value [math]\displaystyle{ a }[/math].
Returns: Output variable. A [math]\displaystyle{ (s_1,s_2,\cdots,s_N) }[/math]-shaped float array.
Return type: Variable
Example:
>>> x = np.array([[-1, 0], [2, -3], [-2, 1]], 'f')
>>> x
array([[-1.,  0.],
       [ 2., -3.],
       [-2.,  1.]], dtype=float32)
>>> F.leaky_relu(x, slope=0.2).data
array([[-0.2,  0. ],
       [ 2. , -0.6],
       [-0.4,  1. ]], dtype=float32)


2018b

  • (PyTorch, 2018) ⇒ http://pytorch.org/docs/master/nn.html#leakyrelu Retrieved:2018-2-10.
    • QUOTE: class torch.nn.LeakyReLU(negative_slope=0.01, inplace=False)

      Applies element-wise, [math]\displaystyle{ f(x)=\max(0,x)+negative\_slope \cdot \min(0,x) }[/math]

      Parameters:

      • negative_slope – Controls the angle of the negative slope. Default: 1e-2
      • inplace – can optionally do the operation in-place. Default: False
Shape:
  • Input: (N,∗), where ∗ means any number of additional dimensions
  • Output: (N,∗), same shape as the input
Examples:

{| class="wikitable" style="margin-left: 30px;border:0px; width:60%;"

|style="font-family:monospace; font-size:10.5pt;font-weight=bold;text-align:top;"| >>> m = nn.LeakyReLU(0.1)

>>> input = autograd.Variable(torch.randn(2))

>>> print(input)

>>> print(m(input)) |}
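
In PyTorch releases after 0.4, autograd.Variable has been merged into torch.Tensor, so a minimal equivalent sketch with the current API (shown only for orientation; the printed values depend on the random input) is:

import torch
import torch.nn as nn

m = nn.LeakyReLU(0.1)   # negative_slope = 0.1
x = torch.randn(2)      # a random 2-element input tensor
print(x)
print(m(x))             # negative entries are multiplied by 0.1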

2018c

  • (Wikipedia, 2018) ⇒ https://en.wikipedia.org/wiki/Rectifier_(neural_networks)#Leaky_ReLUs Retrieved:2018-2-4.
    • Leaky ReLUs allow a small, non-zero gradient when the unit is not active.[1] : [math]\displaystyle{ f(x) = \begin{cases} x & \mbox{if } x \gt 0 \\ 0.01x & \mbox{otherwise} \end{cases} }[/math]

      Parametric ReLUs take this idea further by making the coefficient of leakage into a parameter that is learned along with the other neural network parameters.[2] : [math]\displaystyle{ f(x) = \begin{cases} x & \mbox{if } x \gt 0 \\ a x & \mbox{otherwise} \end{cases} }[/math]

      Note that for [math]\displaystyle{ a\leq1 }[/math], this is equivalent to : [math]\displaystyle{ f(x) = \max(x, ax) }[/math] and thus has a relation to "maxout" networks.
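
As a sketch of the parametric variant described above, PyTorch's nn.PReLU layer treats the leakage coefficient [math]\displaystyle{ a }[/math] as a learnable parameter; the single-parameter configuration and initial value below follow PyTorch's defaults.

import torch
import torch.nn as nn

prelu = nn.PReLU(num_parameters=1, init=0.25)  # one learnable slope a, initialised to 0.25
x = torch.tensor([-2.0, -0.5, 1.0])
print(prelu(x))                  # roughly [-0.5, -0.125, 1.0] before any training
print(list(prelu.parameters()))  # the slope a appears among the trainable parameters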

2017

  • (Mate Labs, 2017) ⇒ Mate Labs Aug 23, 2017. Secret Sauce behind the beauty of Deep Learning: Beginners guide to Activation Functions
    • QUOTE: Leaky rectified linear unit (Leaky ReLU) — Leaky ReLUs allow a small, non-zero gradient when the unit is not active. 0.01 is the small non-zero gradient here

      [math]\displaystyle{ f(x) = \begin{cases} 0.01x, & \mbox{for } x \lt 0 \\ x, & \mbox{for } x \geq 0 \end{cases} }[/math]

      Range:[math]\displaystyle{ (-\infty, +\infty) }[/math]
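
For instance, with this 0.01 slope a negative input is scaled down while a positive input passes through unchanged:

[math]\displaystyle{ f(-3) = 0.01 \cdot (-3) = -0.03, \qquad f(2) = 2 }[/math]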


  1. Maas, Andrew L.; Hannun, Awni Y.; Ng, Andrew Y. (2013). "Rectifier Nonlinearities Improve Neural Network Acoustic Models".
  2. He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian (2015). "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification". arXiv:1502.01852 [cs.CV].