# Multinomial Probability Function

(Redirected from Multinomial Mass Function)

## References

### 2014

• (Wikipedia, 2014) ⇒ http://en.wikipedia.org/wiki/Multinomial_distribution Retrieved:2014-10-29.
• In probability theory, the multinomial distribution is a generalization of the binomial distribution. For n independent trials each of which leads to a success for exactly one of k categories, with each category having a given fixed success probability, the multinomial distribution gives the probability of any particular combination of numbers of successes for the various categories.

The binomial distribution is the probability distribution of the number of successes for one of just two categories in n independent Bernoulli trials, with the same probability of success on each trial. In a multinomial distribution, the analog of the Bernoulli distribution is the categorical distribution, where each trial results in exactly one of some fixed finite number k of possible outcomes, with probabilities p1, ..., pk (so that pi ≥ 0 for i = 1, ..., k and $\displaystyle{ \sum_{i=1}^k p_i = 1 }$), and there are n independent trials. Then if the random variables Xi indicate the number of times outcome number i is observed over the n trials, the vector X = (X1, ..., Xk) follows a multinomial distribution with parameters n and p, where p = (p1, ..., pk). Note that while the trials are independent, their outcomes X are dependent because they must sum to n.

Note that, in some fields, such as natural language processing, the categorical and multinomial distributions are conflated, and it is common to speak of a "multinomial distribution" when a categorical distribution is actually meant. This stems from the fact that it is sometimes convenient to express the outcome of a categorical distribution as a "1-of-K" vector (a vector with one element containing a 1 and all other elements containing a 0) rather than as an integer in the range $\displaystyle{ 1 \dots K }$; in this form, a categorical distribution is equivalent to a multinomial distribution over a single trial.
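The correspondence above — n independent categorical trials whose outcome counts form one multinomial draw, with the n = 1 case giving a "1-of-K" vector — can be sketched in Python using only the standard library. The function name `multinomial_sample` and the example probabilities are illustrative, not from the source:

```python
import random
from collections import Counter

def multinomial_sample(n, p, rng=random):
    """Draw one multinomial vector X = (X_1, ..., X_k) by running
    n independent categorical trials and counting each outcome."""
    k = len(p)
    outcomes = rng.choices(range(k), weights=p, k=n)  # n categorical trials
    counts = Counter(outcomes)
    return [counts.get(i, 0) for i in range(k)]

random.seed(0)
x = multinomial_sample(10, [0.2, 0.3, 0.5])
assert sum(x) == 10  # outcomes are dependent: the counts must sum to n

# With n = 1 the result is a "1-of-K" vector, i.e. a categorical draw.
one_hot = multinomial_sample(1, [0.2, 0.3, 0.5])
assert sum(one_hot) == 1 and max(one_hot) == 1
```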

• parameters = $\displaystyle{ n \gt 0 }$ number of trials (integer); $\displaystyle{ p_1, \ldots, p_k }$ event probabilities ($\displaystyle{ \Sigma p_i = 1 }$)
• support = $\displaystyle{ X_i \in \{0,\dots,n\} }$ with $\displaystyle{ \Sigma X_i = n }$
• pmf = $\displaystyle{ \frac{n!}{x_1!\cdots x_k!} p_1^{x_1} \cdots p_k^{x_k} }$
• mean = $\displaystyle{ E\{X_i\} = np_i }$
• variance = $\displaystyle{ \mathrm{Var}(X_i) = n p_i (1-p_i) }$, $\displaystyle{ \mathrm{Cov}(X_i,X_j) = - n p_i p_j~~(i\neq j) }$
• mgf = $\displaystyle{ \biggl( \sum_{i=1}^k p_i e^{t_i} \biggr)^n }$
• char = $\displaystyle{ \left(\sum_{j=1}^k p_j e^{it_j}\right)^n }$ where $\displaystyle{ i^2 = -1 }$
• pgf = $\displaystyle{ \biggl( \sum_{i=1}^k p_i z_i \biggr)^n \text{ for } (z_1,\ldots,z_k)\in\mathbb{C}^k }$
• conjugate prior = Dirichlet: $\displaystyle{ \mathrm{Dir}(\alpha+\beta) }$
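The pmf and mean above can be checked numerically. The following is a minimal Python sketch (function name and the chosen parameters are illustrative) that evaluates $\frac{n!}{x_1!\cdots x_k!} p_1^{x_1} \cdots p_k^{x_k}$, verifies it sums to 1 over all count vectors with total n, and confirms $E\{X_i\} = np_i$:

```python
from math import factorial, prod
from itertools import product

def multinomial_pmf(x, p):
    """P(X_1=x_1, ..., X_k=x_k) = n!/(x_1!...x_k!) * p_1^x_1 ... p_k^x_k."""
    n = sum(x)
    coef = factorial(n)
    for xi in x:
        coef //= factorial(xi)  # multinomial coefficient is an integer
    return coef * prod(pi ** xi for pi, xi in zip(p, x))

p = [0.2, 0.3, 0.5]
n = 4

# The pmf sums to 1 over all count vectors (x_1, ..., x_k) with sum n.
support = [x for x in product(range(n + 1), repeat=len(p)) if sum(x) == n]
total = sum(multinomial_pmf(x, p) for x in support)
assert abs(total - 1.0) < 1e-12

# Mean of X_1 is n * p_1, here 4 * 0.2 = 0.8.
mean1 = sum(x[0] * multinomial_pmf(x, p) for x in support)
assert abs(mean1 - n * p[0]) < 1e-12
```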

### 2009

• http://stattrek.com/Tables/multinomial.aspx#experiment
• A multinomial distribution is a probability distribution. It refers to the probabilities associated with each of the possible outcomes in a multinomial experiment. For example, suppose we toss a pair of dice one time. This multinomial experiment has 11 possible outcomes: the sums from 2 to 12. The probabilities associated with each possible outcome are an example of a multinomial distribution.
• (Wikipedia, 2009) ⇒ http://en.wikipedia.org/wiki/Multinomial_distribution
• In probability theory, the multinomial distribution is a generalization of the binomial distribution.
• The binomial distribution is the probability distribution of the number of "successes" in n independent Bernoulli trials, with the same probability of "success" on each trial. In a multinomial distribution, the analog of the Bernoulli distribution is the categorical distribution, where each trial results in exactly one of some fixed finite number k of possible outcomes, with probabilities p1, ..., pk (so that pi ≥ 0 for i = 1, ..., k and $\displaystyle{ \sum_{i=1}^k p_i = 1 }$), and there are n independent trials. Then let the random variables Xi indicate the number of times outcome number i was observed over the n trials. The vector X = (X1, ..., Xk) follows a multinomial distribution with parameters n and p, where p = (p1, ..., pk).
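The two-dice example from the StatTrek entry above can be verified by enumeration: the 11 outcome categories are the sums 2 through 12, and their probabilities form a valid categorical (single-trial multinomial) distribution. A short Python sketch, with illustrative variable names:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Each of the 36 equally likely (die1, die2) pairs maps to a sum in 2..12.
sums = Counter(a + b for a, b in product(range(1, 7), repeat=2))
p = {s: Fraction(c, 36) for s, c in sums.items()}

assert len(p) == 11             # eleven possible outcomes: sums 2 through 12
assert p[7] == Fraction(6, 36)  # 7 is the most likely sum
assert sum(p.values()) == 1     # probabilities form a valid distribution
```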