1999 LinearNeuralNetworks


Subject Headings: Course, Presentation Slides, Gradient Descent Algorithm.

Notes

Cited By

Quotes

  • To find the gradient G for the entire data set, we sum at each weight the contribution given by equation 6 over all the data points. We can then subtract a small proportion µ (called the learning rate) of G from the weights to perform gradient descent.
  • 1. Initialize all weights to small random values.
  • 2. REPEAT until done
    • 1. For each weight wij set Δwij := 0
    • 2. For each data point (x, t)p
      • 1. set input units to x
      • 2. compute value of output units
      • 3. For each weight wij set Δwij := Δwij + (ti − yi) xj
    • 3. For each weight wij set wij := wij + µ Δwij
  • An alternative approach is online learning, where the weights are updated immediately after seeing each data point. Since the gradient for a single data point can be considered a noisy approximation to the overall gradient G (Fig. 5), this is also called stochastic (noisy) gradient descent. (NumPy sketches of both the batch and the online update procedures are given after this list.)
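The batch procedure quoted above can be made concrete in a short NumPy sketch. The function name batch_gradient_descent, the arrays X and T, and the parameters mu, epochs, and seed are illustrative assumptions rather than anything from Orr's notes; the per-pattern contribution is taken to be the linear-network delta rule (ti − yi) xj implied by the surrounding text, so adding µ Δw to the weights is the same as subtracting µ times the gradient G.

  import numpy as np

  def batch_gradient_descent(X, T, mu=0.01, epochs=100, seed=0):
      """Batch gradient descent for a single-layer linear network y = W x.

      X: (P, n_in) input patterns; T: (P, n_out) targets; mu: learning rate.
      """
      n_in, n_out = X.shape[1], T.shape[1]
      rng = np.random.default_rng(seed)
      W = 0.01 * rng.standard_normal((n_out, n_in))   # 1. small random initial weights

      for _ in range(epochs):                         # 2. REPEAT until done
          dW = np.zeros_like(W)                       # 2.1 zero the accumulators
          for x, t in zip(X, T):                      # 2.2 for each data point (x, t)
              y = W @ x                               #     compute value of output units
              dW += np.outer(t - y, x)                #     accumulate (ti - yi) xj
          W += mu * dW                                # 2.3 i.e. subtract mu * G from the weights
      return W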
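A corresponding sketch of the online (stochastic) variant, under the same assumptions and illustrative names, updates the weights immediately after each data point instead of accumulating over the whole set.

  import numpy as np

  def online_gradient_descent(X, T, mu=0.01, epochs=100, seed=0):
      """Online (stochastic) gradient descent: update W after every data point."""
      n_in, n_out = X.shape[1], T.shape[1]
      rng = np.random.default_rng(seed)
      W = 0.01 * rng.standard_normal((n_out, n_in))   # small random initial weights

      for _ in range(epochs):
          for p in rng.permutation(len(X)):           # visit the data points in random order
              x, t = X[p], T[p]
              y = W @ x                               # compute value of output units
              W += mu * np.outer(t - y, x)            # immediate update from one noisy gradient
      return W

On the same data both sketches head toward the same least-squares solution; the online path is noisier, since each single-pattern gradient only approximates G, but the weights are moved once per data point rather than once per pass.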

References

Genevieve Orr (1999). "Linear Neural Networks." Course notes. http://www.willamette.edu/~gorr/classes/cs449/linear2.html