Markov Chain Monte Carlo (MCMC) Sampling Algorithm

An Markov Chain Monte Carlo (MCMC) Sampling Algorithm is a stochastic approximate Bayesian inference algorithm that constructs a Markov chain with the desired probability distribution as its equilibrium distribution.

Context:
- It can (typically) approximate High-Dimensional Integrals and Intractable Summations.
- It can (typically) be a Random Walk Monte Carlo Algorithm (to walk through parameter space in such a way that you fairly sample your distribution of interest).
- It can construct a (joint) Bayesian posterior for the parameters of a population dynamics model.
- It can be implemented within an MCMC System (to solve an MCMC analysis task).
- It can be unclear when to stop.
- It can be Computationally Expensive (and so be limited to small datasets).
Example(s):
- a Sequential Monte Carlo Algorithm.
- a Gibbs Sampling Algorithm.
- a Metropolis–Hastings Algorithm.
- It can take the structure of: (Hillborn, 2004)
  - 1. Select an “initial” parameter vector (often the mode of the posterior) and compute its posterior density (the product of the likelihood and the prior).
  - 2. Generate a “new” parameter vector based on the current one (using a “jump” function) and compute its posterior density.
  - 3. Replace the current parameter vector by the new parameter vector with probability equal to the ratio of the new to the current posterior density.
  - 4. Output the current parameter vector.
  - 5. Repeat steps 2)- 4) many times! (Steps 2-4 are referred to as a cycle.).
- …
Counter-Example(s):
- a Variational Inference Algorithm.
- a Spectral Estimation Algorithm.
See: Markov Chain, BEAST System.

References

2017

(Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo Retrieved:2017-6-26.
- In statistics, Markov chain Monte Carlo (MCMC) methods are a class of algorithms for sampling from a probability distribution based on constructing a Markov chain that has the desired distribution as its equilibrium distribution. The state of the chain after a number of steps is then used as a sample of the desired distribution. The quality of the sample improves as a function of the number of steps.
  Random walk Monte Carlo methods make up a large subclass of MCMC methods.

2017b

(Wikipedia, 2017) ⇒ https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo#Application_domains Retrieved:2017-6-26.
- MCMC methods are primarily used for calculating numerical approximations of multi-dimensional integrals, for example in Bayesian statistics, computational physics, computational biology and computational linguistics. ^[1] ^[2] * In Bayesian statistics, the recent development of MCMC methods has been a key step in making it possible to compute large hierarchical models that require integrations over hundreds or even thousands of unknown parameters.
  - They are also used for generating samples that gradually populate the rare failure region in rare event sampling.

2012

http://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo
- Markov chain Monte Carlo (MCMC) methods (which include random walk Monte Carlo methods) are a class of algorithms for sampling from probability distributions based on constructing a Markov chain that has the desired distribution as its equilibrium distribution. The state of the chain after a large number of steps is then used as a sample of the desired distribution. The quality of the sample improves as a function of the number of steps.
  Usually it is not hard to construct a Markov chain with the desired properties. The more difficult problem is to determine how many steps are needed to converge to the stationary distribution within an acceptable error. A good chain will have rapid mixing — the stationary distribution is reached quickly starting from an arbitrary position — described further under Markov chain mixing time.
  Typical use of MCMC sampling can only approximate the target distribution, as there is always some residual effect of the starting position. More sophisticated MCMC-based algorithms such as coupling from the past can produce exact samples, at the cost of additional computation and an unbounded (though finite in expectation) running time.
  The most common application of these algorithms is numerically calculating multi-dimensional integrals. In these methods, an ensemble of "walkers" moves around randomly. At each point where the walker steps, the integrand value at that point is counted towards the integral. The walker then may make a number of tentative steps around the area, looking for a place with reasonably high contribution to the integral to move into next. Random walk methods are a kind of random simulation or Monte Carlo method. However, whereas the random samples of the integrand used in a conventional Monte Carlo integration are statistically independent, those used in MCMC are correlated. A Markov chain is constructed in such a way as to have the integrand as its equilibrium distribution.

2009

(Ghahramani, 2009b) =. Zoubin Ghahramani. (2009). http://learning.eng.cam.ac.uk/zoubin/mcmc.html
- QUOTE: Markov chain Monte Carlo (MCMC) methods use sampling to approximate high dimensional integrals and intractable sums. MCMC methods are widely used in many areas of science, applied mathematics and engineering. They are an indispensable approximate inference tool for Bayesian statistics and machine learning.

2006

(Bishop, 2006) ⇒ Christopher M. Bishop. (2006). “Pattern Recognition and Machine Learning." Springer, Information Science and Statistics. ISBN:0387310738
- 11.2 Markov Chain Monte Carlo

2005

(Blei & Jordan, 2005) ⇒ David M. Blei, and Michael I. Jordan (2005), "Variational Methods for Dirichlet Process Mixtures.” In: Bayesian Analysis, 1.
- QUOTE: The methodology of Monte Carlo Markov chain (MCMC) sampling has energized Bayesian statistics for more than a decade, providing a systematic approach to the computation of likelihoods and posterior distributions, and permitting the deployment of Bayesian methods in a rapidly growing number of applied problems. … MCMC methods It can be slow to converge and their convergence can be difficult to diagnose. While further research on sampling is needed, it is also important to explore alternatives, particularly in the context of large-scale problems.

2004a

(Griffiths & Steyvers, 2004) ⇒ Thomas L. Griffiths, and Mark Steyvers. (2004). “Finding Scientific Topics.” In: PNAS, 101(Suppl. 1). doi:10.1073/pnas.0307752101
- … We then present a Markov chain Monte Carlo algorithm for inference in this model. We use this algorithm to analyze abstracts from PNAS by using Bayesian model selection to establish the number of topics. ...

2004b

(Hilborn, 2004) ⇒ Ray Hilborn. (2004). “The Markov Chain Monte Carlo (MCMC) Algorithm.” Course Lecture Notes: FISH 558, Spring 2004. Advanced Analysis in Fisheries Stock Assessment. http://faculty.washington.edu/rayh/558_2004/ppt_files_lectures/MCMC.PPT
- Objectives for MCMC
  - MCMC is a way to construct a (joint) Bayesian posterior for the parameters of a population dynamics model.
  - The basic idea is to set up a (long) “chain” which starts at a pre-specified parameter vector and that then traverses the posterior distribution.
  - The sample from the posterior distribution is then every n’th element in the chain (ignoring the first “few” elements of the chain).
- Overview of the MCMC algorithm.
  - Select an “initial” parameter vector (often the mode of the posterior) and compute its posterior density (the product of the likelihood and the prior).
  - Generate a “new” parameter vector based on the current one (using a “jump” function) and compute its posterior density.
  - Replace the current parameter vector by the new parameter vector with probability equal to the ratio of the new to the current posterior density.
  - Output the current parameter vector.
  - Repeat steps 2)- 4) many times!
  - (Steps 2-4 are referred to as a cycle.)

2003

(Blei, Ng & Jordan, 2003) ⇒ David M. Blei, Andrew Y. Ng, and Michael I. Jordan. (2003). “Latent Dirichlet Allocation.” In: The Journal of Machine Learning Research, 3.
(Andrieu et al., 2003) ⇒ Christophe Andrieu, Nando De Freitas, Arnaud Doucet, and Michael I. Jordan. (2003). “An Introduction to MCMC for Machine Learning.” In: Machine Learning, 50(1). doi:10.1023/A:1020281327116

1996

(Gilks & Spiegelhalter, 1996) ⇒ Walter R. Gilks, and D. J. Spiegelhalter, editors. (1996). “Markov Chain Monte Carlo in Practice." CRC Press. ISBN:0412055511.
- Key words and phrases: Gibbs sampler, Bayes factor, MCMC, posterior distribution, Metropolis-Hastings algorithm, rejection sampling, Bayesian inference, marginal likelihood, genotypes, hyperparameters, mixture model, prior distribution, posterior probability, maximum likelihood, Bayesian Statistics, random-effects model, gibbsit, Metropolis algorithm, conditional independence, credible intervals

1995

(Gelman et al., 1995) ⇒ Andrew Gelman, Hal S. Stern, John B. Carlin, and Donald B. Rubin. (1995). “Bayesian Data Analysis.” Chapman and Hall. ISBN:0412039915

1984

(Smith, 1984) ⇒ R. L. Smith. (1984). “Efficient Monte Carlo Procedures for Generating Points Uniformly Distributed Over Bounded Regions.” In: Operations Research, 32.

↑ See Gill 2008.
↑ See Robert & Casella 2004.

[1] See Gill 2008.

[2] See Robert & Casella 2004.

[1]

[2]

Markov Chain Monte Carlo (MCMC) Sampling Algorithm

References

2017

2017b

2012

2009

2006

2005

2004a

2004b

2003

1996

1995

1984

Navigation menu

Search