Stochastic Gradient Langevin Dynamics


A Stochastic Gradient Langevin Dynamics is a stochastic optimization algorithm that combines the stochastic gradient descent approach with the Langevin dynamics approach.

  • AKA: SGLD.
  • Context:
    • It updates the parameter vector, [math]\displaystyle{ \theta }[/math], as follows:

[math]\displaystyle{ \Delta\theta_t =\frac{\epsilon_t}{2}\left(\nabla \log p(\theta_t)+\frac{N}{n}\sum^n_{i=1}\nabla \log p(x_{ti}|\theta_t)\right)+\eta_t }[/math]

at each iteration [math]\displaystyle{ t }[/math], where [math]\displaystyle{ X_t = \{x_{t1},\dots , x_{tn}\} }[/math] is a minibatch of [math]\displaystyle{ n }[/math] data items drawn from the full dataset of [math]\displaystyle{ N }[/math] items, [math]\displaystyle{ p(\theta) }[/math] is the prior probability distribution over the parameters, [math]\displaystyle{ p(x_{ti}|\theta) }[/math] is the likelihood of data item [math]\displaystyle{ x_{ti} }[/math] given the parameter vector [math]\displaystyle{ \theta }[/math], and [math]\displaystyle{ \epsilon_t }[/math] is a sequence of decreasing step sizes. The last term of the equation, [math]\displaystyle{ \eta_t }[/math], is the injected noise term from the Langevin dynamics approach; it is drawn from a Gaussian distribution, [math]\displaystyle{ \eta_t \sim N(0, \epsilon_t) }[/math] (Welling & Teh, 2011).
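The update above can be implemented directly. The following is a minimal Python sketch, assuming user-supplied functions for the gradient of the log prior and of the per-item log likelihood; the toy Gaussian model, the step-size schedule, and all names are illustrative assumptions, not part of the original article.

```python
import numpy as np

def sgld_step(theta, minibatch, grad_log_prior, grad_log_lik, N, eps_t, rng):
    """One SGLD update of the parameter vector theta."""
    n = len(minibatch)
    # Stochastic estimate of the gradient of the log posterior:
    #   grad log p(theta) + (N/n) * sum_i grad log p(x_i | theta)
    grad = grad_log_prior(theta) + (N / n) * sum(
        grad_log_lik(x, theta) for x in minibatch
    )
    # Langevin noise term eta_t ~ N(0, eps_t)
    eta_t = rng.normal(0.0, np.sqrt(eps_t), size=np.shape(theta))
    return theta + 0.5 * eps_t * grad + eta_t

# Hypothetical toy example: infer the mean of a Gaussian,
# x_i ~ N(theta, 1) with prior theta ~ N(0, 10).
rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.0, size=1000)        # N = 1000 observations
N, n = len(data), 32                          # dataset size and minibatch size

grad_log_prior = lambda th: -th / 10.0        # d/dtheta log N(theta; 0, 10)
grad_log_lik = lambda x, th: (x - th)         # d/dtheta log N(x; theta, 1)

theta = np.array(0.0)
samples = []
for t in range(2000):
    eps_t = 0.5 * (t + 10) ** -0.55           # decaying step sizes
    batch = rng.choice(data, size=n, replace=False)
    theta = sgld_step(theta, batch, grad_log_prior, grad_log_lik, N, eps_t, rng)
    samples.append(float(theta))
# Later iterates approximate draws from the posterior p(theta | data).
```

In this sketch the step sizes decay polynomially so that they sum to infinity while their squares remain summable, the schedule condition under which the iterates transition from optimization to posterior sampling.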


References

2011

Welling, Max, and Yee Whye Teh. "Bayesian Learning via Stochastic Gradient Langevin Dynamics." In: Proceedings of the 28th International Conference on Machine Learning (ICML 2011).