Perceptron Training Algorithm

A Perceptron Training Algorithm is a supervised, model-based learning algorithm that trains a perceptron network, which is a linear binary classifier.






  • (Wikipedia, 2011) ⇒
    • Below is an example of a learning algorithm for a single-layer (no hidden layer) perceptron. For multilayer perceptrons, more complicated algorithms such as backpropagation must be used. Alternatively, methods such as the delta rule can be used if the activation function is non-linear and differentiable, although the algorithm below will work as well.

      The learning algorithm we demonstrate is the same across all the output neurons, therefore everything that follows is applied to a single neuron in isolation. We first define some variables:

      • [math]\displaystyle{ y = f(\mathbf{z}) \, }[/math] denotes the output from the perceptron for an input vector [math]\displaystyle{ \mathbf{z} }[/math].
      • [math]\displaystyle{ b \, }[/math] is the bias term, which in the example below we take to be 0.
      • [math]\displaystyle{ D = \{(\mathbf{x}_1,d_1),\dots,(\mathbf{x}_s,d_s)\} \, }[/math] is the training set of [math]\displaystyle{ s }[/math] samples, where:
        • [math]\displaystyle{ \mathbf{x}_j }[/math] is the [math]\displaystyle{ n }[/math]-dimensional input vector.
        • [math]\displaystyle{ d_j \, }[/math] is the desired output value of the perceptron for that input.
    • The values of the input nodes are denoted as follows:
      • [math]\displaystyle{ x_{j,i} \, }[/math] is the value of the [math]\displaystyle{ i }[/math]th node of the [math]\displaystyle{ j }[/math]th training input vector.
      • [math]\displaystyle{ x_{j,0} = 1 \, }[/math].
    • To represent the weights:
      • [math]\displaystyle{ w_i \, }[/math] is the [math]\displaystyle{ i }[/math]th value in the weight vector, to be multiplied by the value of the [math]\displaystyle{ i }[/math]th input node.
    • Instead of the constant node [math]\displaystyle{ x_{j,0} = 1 \, }[/math], an extra dimension with index [math]\displaystyle{ n+1 }[/math] can be added to all input vectors, with [math]\displaystyle{ x_{j,n+1}=1 \, }[/math], in which case [math]\displaystyle{ w_{n+1} \, }[/math] replaces the bias term [math]\displaystyle{ b \, }[/math].
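
The effect of the extra constant input can be checked numerically. The following is a minimal sketch, not part of the quoted text: the helper names `dot` and `augment` and the numeric values are hypothetical, chosen only to show that a weighted sum plus a bias equals a plain dot product once the input is extended by a constant 1.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def augment(x):
    """Append a constant 1 so that the last weight plays the role of the bias."""
    return list(x) + [1.0]


# Illustrative values only (assumptions, not taken from the source).
w = [0.5, -2.0]   # weights w_1 .. w_n
b = 0.25          # explicit bias term
x = [3.0, 1.0]    # one n-dimensional input vector

with_bias = dot(w, x) + b           # <w, x> + b
folded = dot(w + [b], augment(x))   # bias stored as w_{n+1}, input x_{n+1} = 1

print(with_bias, folded)
```

Folding the bias into the weight vector means the training rule only has to update one vector, which is why the algorithm can be stated without a separate bias update.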

      To show the time-dependence of [math]\displaystyle{ \mathbf{w} }[/math], we use:

      • [math]\displaystyle{ w_i(t) \, }[/math] is the weight [math]\displaystyle{ i }[/math] at time [math]\displaystyle{ t }[/math].
      • [math]\displaystyle{ \alpha \, }[/math] is the learning rate, where [math]\displaystyle{ 0 \lt \alpha \leq 1 }[/math].
    • Too high a learning rate makes the perceptron periodically oscillate around the solution unless additional steps are taken.
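
To make the role of the learning rate concrete, here is a minimal sketch of a single weight update. It assumes the commonly used rule w_i(t+1) = w_i(t) + α (d_j − y_j) x_{j,i} and a step activation with outputs 0 and 1; neither is spelled out in the passage above, and the function names `step` and `update` are hypothetical.

```python
def step(z):
    """Assumed activation: 1 if the weighted sum is non-negative, else 0."""
    return 1 if z >= 0 else 0


def update(w, x, d, alpha):
    """Return the weights after presenting one sample (x, d).

    w     -- current weights w_i(t), including the weight for x_0 = 1
    x     -- input vector with x[0] == 1 (constant input)
    d     -- desired output (0 or 1 under the assumed activation)
    alpha -- learning rate, 0 < alpha <= 1
    """
    y = step(sum(wi * xi for wi, xi in zip(w, x)))
    # Weights change only when the output is wrong, because d - y is 0 otherwise.
    return [wi + alpha * (d - y) * xi for wi, xi in zip(w, x)]


w_next = update([0.0, 0.0, 0.0], [1.0, 0.0, 1.0], d=0, alpha=0.5)
print(w_next)
```

A larger `alpha` moves the weights further on every mistake, which is the mechanism behind the oscillation mentioned above.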


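    Input: samples (x_i, t_i), with t_i in {-1, +1}
    Initialize: w_0 = 0, k = 0
    repeat
      error = false
      for each sample (x_i, t_i):
        if t_i <w_k, x_i> <= 0 then   /* error; <= so that the all-zero start also counts as a mistake */
          w_{k+1} = w_k + t_i x_i;  k = k + 1;  error = true
    until (error == false)
    return k, (w_k, b_k)   where k is the number of mistakes and b_k is the component of w_k on the constant input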
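
The mistake-driven loop above can be sketched in Python as follows. This is an illustrative version under stated assumptions: labels are in {-1, +1}, the bias is folded into the weights through a constant first input of 1, the weights start at zero, and a `max_epochs` cap (not present in the pseudocode) is added so that the loop also stops on data that is not linearly separable. The function name `train_perceptron` and the toy AND data set are hypothetical.

```python
def train_perceptron(samples, max_epochs=100):
    """Train on (x, t) pairs with t in {-1, +1}; return (mistakes, weights)."""
    n = len(samples[0][0])
    w = [0.0] * n          # w_0 = 0
    mistakes = 0           # k in the pseudocode
    for _ in range(max_epochs):
        error = False
        for x, t in samples:
            # A sample is misclassified (or on the boundary) when t * <w, x> <= 0.
            if t * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                w = [wi + t * xi for wi, xi in zip(w, x)]
                mistakes += 1
                error = True
        if not error:      # a full pass without mistakes: stop
            break
    return mistakes, w


# Toy data: logical AND with labels in {-1, +1}; x[0] = 1 is the constant input.
samples = [
    ([1.0, 0.0, 0.0], -1),
    ([1.0, 0.0, 1.0], -1),
    ([1.0, 1.0, 0.0], -1),
    ([1.0, 1.0, 1.0], +1),
]

mistakes, w = train_perceptron(samples)
print("mistakes:", mistakes, "weights:", w)
```

On linearly separable data such as this toy set, the loop ends with a pass in which every sample satisfies t * <w, x> > 0; on non-separable data it simply stops after `max_epochs` passes, and the returned weights need not classify every sample correctly.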