# Restricted Boltzmann Machine (RBM)

A Restricted Boltzmann Machine (RBM) is a Boltzmann machine without visible-visible and hidden-hidden connections.

**Context:**
- It can be trained by a Restricted Boltzmann Machine Training System that implements a Restricted Boltzmann Machine Training Algorithm.
- It can represent a Generative Stochastic Neural Network (that can learn a probability distribution over its set of inputs).

**See:** Recurrent Neural Network, Generative Model, Stochastic Neural Network, Artificial Neural Network, Feature Learning, Autoencoder.

## References

### 2018

- (Fisher et al., 2018) ⇒ Charles K. Fisher, Aaron M. Smith, and Jonathan R. Walsh. (2018). “Boltzmann Encoded Adversarial Machines.” In: arXiv preprint arXiv:1804.08682.
- QUOTE: Restricted Boltzmann Machines (RBMs) are a class of generative neural network that are typically trained to maximize a log-likelihood objective function. We argue that likelihood-based training strategies may fail because the objective does not sufficiently penalize models that place a high probability in regions where the training data distribution has low probability.

### 2017a

- (Hinton, 2017) ⇒ Geoffrey E. Hinton. (2017). “Boltzmann Machines.” In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining.
- QUOTE: A restricted Boltzmann machine (Smolensky 1986) consists of a layer of visible units and a layer of hidden units with no visible-visible or hidden-hidden connections. With these restrictions, the hidden units are conditionally independent given a visible vector, so unbiased samples from [math]\langle s_i s_j\rangle_{data}[/math] can be obtained in one parallel step. To sample from [math]\langle s_is_j\rangle_{model}[/math] still requires multiple iterations that alternate between updating all the hidden units in parallel and updating all of the visible units in parallel. However, learning still works well if [math]\langle s_is_j\rangle_{model}[/math] is replaced by [math]\langle s_is_j\rangle_{reconstruction}[/math] which is obtained as follows:

- Starting with a data vector on the visible units, update all of the hidden units in parallel.
- Update all of the visible units in parallel to get a “reconstruction.”
- Update all of the hidden units again.

- This efficient learning procedure approximates gradient descent in a quantity called “contrastive divergence” and works well in practice (Hinton 2002).
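The three reconstruction steps above can be sketched in NumPy as a single contrastive-divergence (CD-1) update. This is a minimal illustration, not Hinton's reference implementation; the network sizes, batch, and learning rate below are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(W, b_v, b_h, v0, lr=0.1):
    """One CD-1 update: data -> hidden -> reconstruction -> hidden."""
    # 1. Starting with a data vector on the visible units, update all
    #    of the hidden units in parallel.
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # 2. Update all of the visible units in parallel to get a "reconstruction".
    p_v1 = sigmoid(h0 @ W.T + b_v)
    # 3. Update all of the hidden units again (probabilities suffice here).
    p_h1 = sigmoid(p_v1 @ W + b_h)
    # <s_i s_j>_data - <s_i s_j>_reconstruction, averaged over the batch.
    n = v0.shape[0]
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / n
    b_v += lr * (v0 - p_v1).mean(axis=0)
    b_h += lr * (p_h0 - p_h1).mean(axis=0)
    return ((v0 - p_v1) ** 2).mean()  # reconstruction error, for monitoring

# Toy usage: 6 visible units, 3 hidden units, a tiny batch of binary data.
W = 0.01 * rng.standard_normal((6, 3))
b_v, b_h = np.zeros(6), np.zeros(3)
data = rng.integers(0, 2, size=(4, 6)).astype(float)
errors = [cd1_step(W, b_v, b_h, data) for _ in range(100)]
```

Note that the reconstruction error is only a training monitor, not the quantity being optimized; CD-1 approximately follows the gradient of the contrastive divergence.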

### 2017b

- (Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/restricted_Boltzmann_machine Retrieved:2017-2-27.
	- QUOTE: A **restricted Boltzmann machine** (**RBM**) is a generative stochastic artificial neural network that can learn a probability distribution over its set of inputs. RBMs were initially invented under the name **Harmonium** by Paul Smolensky in 1986,^{[1]} and rose to prominence after Geoffrey Hinton and collaborators invented fast learning algorithms for them in the mid-2000s. RBMs have found applications in dimensionality reduction, classification, collaborative filtering,^{[2]} feature learning^{[3]} and topic modelling.^{[4]} They can be trained in either supervised or unsupervised ways, depending on the task. As their name implies, RBMs are a variant of Boltzmann machines, with the restriction that their neurons must form a bipartite graph: a pair of nodes from each of the two groups of units (commonly referred to as the "visible" and "hidden" units respectively) may have a symmetric connection between them; and there are no connections between nodes within a group. By contrast, "unrestricted" Boltzmann machines may have connections between hidden units. This restriction allows for more efficient training algorithms than are available for the general class of Boltzmann machines, in particular the gradient-based **contrastive divergence** algorithm.^{[5]} Restricted Boltzmann machines can also be used in deep learning networks. In particular, deep belief networks can be formed by "stacking" RBMs and optionally fine-tuning the resulting deep network with gradient descent and backpropagation.
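The "stacking" mentioned above can be sketched as greedy layer-wise training: fit one RBM (here with a minimal CD-1 loop), then use its hidden activation probabilities as the training data for the next RBM. All sizes, rates, and data below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=50, lr=0.1):
    """Train one RBM with CD-1 and return its weights and hidden biases."""
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)
    for _ in range(epochs):
        p_h0 = sigmoid(data @ W + b_h)
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
        p_v1 = sigmoid(h0 @ W.T + b_v)        # reconstruction
        p_h1 = sigmoid(p_v1 @ W + b_h)
        n = data.shape[0]
        W += lr * (data.T @ p_h0 - p_v1.T @ p_h1) / n
        b_v += lr * (data - p_v1).mean(axis=0)
        b_h += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b_h

# Greedy layer-wise stacking: each RBM is trained on the hidden
# activation probabilities of the one below it.
data = rng.integers(0, 2, size=(8, 12)).astype(float)
layers, x = [], data
for n_hidden in (8, 4):
    W, b_h = train_rbm(x, n_hidden)
    layers.append((W, b_h))
    x = sigmoid(x @ W + b_h)   # becomes the input for the next layer
```

The optional fine-tuning stage (treating the stack as a feed-forward network and backpropagating) is omitted here.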

### 2017c

- (The Asimov Institute, 2017) ⇒ http://asimovinstitute.org/neural-network-zoo/
- QUOTE: Restricted Boltzmann machines (RBM) are remarkably similar to BMs (surprise) and therefore also similar to HNs. The biggest difference between BMs and RBMs is that RBMs are more usable because they are more restricted. They don’t trigger-happily connect every neuron to every other neuron but only connect each group of neurons to every other group, so no input neurons are directly connected to other input neurons and no hidden-to-hidden connections are made either. RBMs can be trained like FFNNs with a twist: instead of passing data forward and then back-propagating, you forward pass the data and then backward pass the data (back to the first layer). After that you train with forward-and-back-propagation.

### 2014

- http://deeplearning4j.org/restrictedboltzmannmachine.html
- QUOTE: To quote Geoff Hinton, a Google researcher and university professor, a Boltzmann machine is “a network of symmetrically connected, neuron-like units that make stochastic decisions about whether to be on or off.” (Stochastic means “randomly determined.”)
A restricted Boltzmann machine “consists of a layer of visible units and a layer of hidden units with no visible-visible or hidden-hidden connections.” The “restricted” comes from limits imposed on how its nodes connect: intra-layer connections are not allowed, but each node of one layer connects to every node of the next, and that is called “symmetry.”
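The bipartite connectivity described above is what makes inference cheap: with no intra-layer connections, every unit in a layer is conditionally independent given the other layer, so an entire layer updates in one matrix product, using the same symmetric weight matrix in both directions. A minimal sketch (weights and data invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One weight matrix is shared by both directions ("symmetry"):
# W[i, j] connects visible unit i to hidden unit j.
W = 0.5 * rng.standard_normal((5, 3))
b_v, b_h = np.zeros(5), np.zeros(3)
v = rng.integers(0, 2, size=5).astype(float)

# No hidden-hidden connections: each p(h_j = 1 | v) depends only on v,
# so the whole hidden layer is computed in one parallel step.
p_h_given_v = sigmoid(v @ W + b_h)

# Likewise, the visible layer given h uses the transpose of the same W.
h = (rng.random(3) < p_h_given_v).astype(float)
p_v_given_h = sigmoid(h @ W.T + b_v)
```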

### 2012

- http://deeplearning.net/tutorial/rbm.html
- QUOTE: Boltzmann Machines (BMs) are a particular form of log-linear Markov Random Field (MRF), i.e., for which the energy function is linear in its free parameters. To make them powerful enough to represent complicated distributions (i.e., go from the limited parametric setting to a non-parametric one), we consider that some of the variables are never observed (they are called hidden). By having more hidden variables (also called hidden units), we can increase the modeling capacity of the Boltzmann Machine (BM). Restricted Boltzmann Machines further restrict BMs to those without visible-visible and hidden-hidden connections. A graphical depiction of an RBM is shown below.

### 2008

- (Larochelle & Bengio, 2008) ⇒ H. Larochelle, and Yoshua Bengio. (2008). “Classification using discriminative restricted Boltzmann machines". In: Proceedings of the 25th International Conference on Machine learning (ICML 2008). doi:10.1145/1390156.1390224

### 2006

- (Hinton & Salakhutdinov, 2006) ⇒ Geoffrey E. Hinton, and Ruslan R. Salakhutdinov. (2006). “Reducing the Dimensionality of Data with Neural Networks.” In: Science, 313(5786). doi:10.1126/science.1127647
- QUOTE: An ensemble of binary vectors (e.g., images) can be modeled using a two-layer network called a "restricted Boltzmann machine" (RBM) (5, 6) in which stochastic, binary pixels are connected to stochastic, binary feature detectors using symmetrically weighted connections. The pixels correspond to "visible" units of the RBM because their states are observed; the feature detectors correspond to "hidden" units. A joint configuration [math](\mathbf{v}, \mathbf{h})[/math] of the visible and hidden units has an energy (7) given by...
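- The energy expression elided above is the standard RBM energy function, reproduced here from the well-known formulation (not part of the quoted excerpt): [math]E(\mathbf{v}, \mathbf{h}) = -\sum_{i \in \text{pixels}} b_i v_i - \sum_{j \in \text{features}} b_j h_j - \sum_{i,j} v_i h_j w_{ij}[/math], where [math]b_i[/math] and [math]b_j[/math] are the visible and hidden biases and [math]w_{ij}[/math] is the symmetric weight between visible unit [math]i[/math] and hidden unit [math]j[/math].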

- ↑ Smolensky, Paul (1986). "Chapter 6: Information Processing in Dynamical Systems: Foundations of Harmony Theory" (PDF). In: Rumelhart, David E.; McClelland, James L. (eds.) Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1: Foundations. MIT Press. pp. 194–281. ISBN 0-262-68053-X.
- ↑ Salakhutdinov, R.; Mnih, A.; Hinton, G. (2007). "Restricted Boltzmann Machines for Collaborative Filtering." In: Proceedings of the 24th International Conference on Machine Learning (ICML 2007). p. 791. doi:10.1145/1273496.1273596. ISBN 9781595937933.
- ↑ Coates, Adam; Lee, Honglak; Ng, Andrew Y. (2011). "An Analysis of Single-Layer Networks in Unsupervised Feature Learning" (PDF). In: International Conference on Artificial Intelligence and Statistics (AISTATS).
- ↑ Salakhutdinov, Ruslan; Hinton, Geoffrey (2010). "Replicated Softmax: An Undirected Topic Model." In: Neural Information Processing Systems 23.
- ↑ Carreira-Perpiñán, Miguel Á.; Hinton, Geoffrey (2005). "On Contrastive Divergence Learning." In: Artificial Intelligence and Statistics.