Dirichlet Process

AKA: DP.
Context:
- It is a Probability Distribution over probability distributions.
  - A sample drawn from a DP is a Random Discrete Distribution.
- …
Example(s):
- a Two-Parameter Poisson-Dirichlet Process.
- a Hierarchical Dirichlet Process.
- …
Counter-Example(s):
- a Gaussian Process.
See: Dirichlet Process Prior, Dirichlet Distribution Function, Dirichlet Process Mixture Model, Nonparametric Bayesian Algorithm, Bayesian Methods, Bayesian Nonparametrics, Clustering, Density Estimation.

References

(Yee Whye Teh, 2011) ⇒ Yee Whye Teh. (2011). “Dirichlet Process.” In: (Sammut & Webb, 2011) p.280

http://en.wikipedia.org/wiki/Dirichlet_process#Related_distributions
- The Pitman–Yor distribution (also known as the 'two-parameter Poisson-Dirichlet process') is a generalisation of the Dirichlet process.
- The hierarchical Dirichlet process extends the ordinary Dirichlet process for modelling grouped data.

(Wikipedia, 2009) ⇒ http://en.wikipedia.org/wiki/Dirichlet_process
- In probability theory, a Dirichlet process over a set [math]\displaystyle{ S }[/math] is a stochastic process whose sample path is a probability distribution on S. The finite dimensional distributions are from the Dirichlet distribution: If [math]\displaystyle{ M }[/math] is a finite measure on [math]\displaystyle{ S }[/math] and [math]\displaystyle{ X }[/math] is a random distribution drawn from a Dirichlet process, written as: [math]\displaystyle{ X \sim \mathrm{DP}\left(M\right) }[/math] then for any partition of [math]\displaystyle{ S }[/math], say [math]\displaystyle{ \left\{B_i\right\}_{i=1}^{n} }[/math], we have that [math]\displaystyle{ \left(X\left(B_1\right),\dots,X\left(B_n\right)\right) \sim \mathrm{Dirichlet}\left(M\left(B_1\right),\dots,M\left(B_n\right)\right) }[/math]
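
The finite-dimensional property quoted above can be illustrated with a minimal sketch. It assumes, purely for illustration, that S = [0, 1) and that M is `alpha` times the uniform measure, so that M(B) is `alpha` times the length of an interval B. The names `dirichlet_sample`, `alpha`, and `edges` are hypothetical and do not come from the cited sources. The Dirichlet draw uses the standard construction of normalizing independent Gamma variates.

```python
import random


def dirichlet_sample(concentrations, rng):
    """Draw one Dirichlet vector by normalizing independent Gamma(a_i, 1) variates."""
    gammas = [rng.gammavariate(a, 1.0) for a in concentrations]
    total = sum(gammas)
    return [g / total for g in gammas]


# Assumed setup: S = [0, 1), M = alpha * Uniform(0, 1).
alpha = 3.0
edges = [0.0, 0.2, 0.5, 1.0]  # partition B_1, B_2, B_3 of S into intervals

# M(B_i) = alpha * length(B_i) under the assumed base measure.
masses = [alpha * (hi - lo) for lo, hi in zip(edges, edges[1:])]

rng = random.Random(1)
# (X(B_1), X(B_2), X(B_3)) ~ Dirichlet(M(B_1), M(B_2), M(B_3))
x = dirichlet_sample(masses, rng)
print(x)
```

Each entry of `x` is the random probability that X assigns to one cell of the partition, so the entries are non-negative and sum to one. This only samples the marginal over one fixed partition; it does not produce a full draw of X.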

(Teh et al., 2006) ⇒ Yee Whye Teh, Michael I. Jordan, Matthew J. Beal, and David M. Blei. (2006). “Hierarchical Dirichlet Processes.” In: Journal of the American Statistical Association, 101(476). doi:10.1198/016214506000000302
- QUOTE: Our approach to the problem of sharing clusters among multiple, related groups is a nonparametric Bayesian approach, reposing on the Dirichlet process (Ferguson 1973). The Dirichlet process [math]\displaystyle{ \operatorname{DP}(\alpha_0,G_0) }[/math] is a measure on measures. It has two parameters, a scaling parameter [math]\displaystyle{ \alpha_0 > 0 }[/math] and a base probability measure [math]\displaystyle{ G_0 }[/math]. An explicit representation of a draw from a Dirichlet process (DP) was given by Sethuraman (1994), who showed that if [math]\displaystyle{ G \sim \operatorname{DP}(\alpha_0,G_0) }[/math], then with probability one:
  :[math]\displaystyle{ G = \sum_{k=1}^{\infty} \beta_k \delta_{\phi_k} \qquad \text{(1)} }[/math]
  where the [math]\displaystyle{ \phi_k }[/math] are independent random variables distributed according to [math]\displaystyle{ G_0 }[/math], where [math]\displaystyle{ \delta_{\phi_k} }[/math] is an atom at [math]\displaystyle{ \phi_k }[/math], and where the “stick-breaking weights” [math]\displaystyle{ \beta_k }[/math] are also random and depend on the parameter [math]\displaystyle{ \alpha_0 }[/math] (the definition of the [math]\displaystyle{ \beta_k }[/math] is provided in Section 3.1).
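
The stick-breaking representation in the quote above can be sketched in code. The sketch relies on the standard definition of the weights from Sethuraman (1994), namely β'_k ~ Beta(1, α_0) and β_k = β'_k ∏_{l<k} (1 − β'_l). It makes two assumptions that are not part of the quote: the infinite sum is truncated at a finite number of atoms, and the base measure G_0 is taken to be a standard normal. The function name `sample_truncated_dp` and its parameters are hypothetical.

```python
import random


def sample_truncated_dp(alpha0, base_sampler, num_atoms, rng):
    """Truncated stick-breaking draw: returns (weights, atoms, leftover mass)."""
    weights, atoms = [], []
    remaining = 1.0  # length of the stick that has not been assigned yet
    for _ in range(num_atoms):
        frac = rng.betavariate(1.0, alpha0)  # beta'_k ~ Beta(1, alpha0)
        weights.append(remaining * frac)     # beta_k = beta'_k * prod_{l<k}(1 - beta'_l)
        atoms.append(base_sampler(rng))      # phi_k drawn independently from G_0
        remaining *= 1.0 - frac
    return weights, atoms, remaining


rng = random.Random(0)
# Assumed base measure G_0: standard normal (illustrative choice only).
weights, atoms, leftover = sample_truncated_dp(
    alpha0=2.0,
    base_sampler=lambda r: r.gauss(0.0, 1.0),
    num_atoms=200,
    rng=rng,
)
print(sum(weights), leftover)
```

The pairs `(weights[k], atoms[k])` approximate the discrete random measure G, and `leftover` is the mass that the truncation discards. A smaller `alpha0` tends to concentrate the mass on the first few atoms, while a larger `alpha0` spreads it over many atoms.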

(Rasmussen, 2000) ⇒ Carl Rasmussen. (2000). “The Infinite Gaussian Mixture Model.” In: Advances in Neural Information Processing Systems (NIPS) 12.

(Sethuraman, 1994) ⇒ J. Sethuraman. (1994). “A Constructive Definition of Dirichlet Priors.” In: Statistica Sinica, 4.

(Ferguson, 1973) ⇒ Thomas Ferguson. (1973). “Bayesian Analysis of Some Nonparametric Problems.” In: Annals of Statistics 1 (2) doi:10.1214/aos/1176342360
- ABSTRACT: The Bayesian approach to statistical problems, though fruitful in many ways, has been rather unsuccessful in treating nonparametric problems. This is due primarily to the difficulty in finding workable prior distributions on the parameter space, which in nonparametric problems is taken to be a set of probability distributions on a given sample space. There are two desirable properties of a prior distribution for nonparametric problems. (I) The support of the prior distribution should be large--with respect to some suitable topology on the space of probability distributions on the sample space. (II) Posterior distributions given a sample of observations from the true probability distribution should be manageable analytically. These properties are antagonistic in the sense that one may be obtained at the expense of the other. This paper presents a class of prior distributions, called Dirichlet process priors, broad in the sense of (I), for which (II) is realized, and for which treatment of many nonparametric statistical problems may be carried out, yielding results that are comparable to the classical theory.