Pointwise Mutual Information (PMI) Measure

From GM-RKB

A Pointwise Mutual Information (PMI) Measure is a measure of association between outcomes [math]x,y[/math] of two discrete random variables, based on the ratio between their co-occurrence probability [math]p(x,y)[/math] and the probability [math]p(x)p(y)[/math] of observing [math]x[/math] and [math]y[/math] together by chance under independence.
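
A minimal sketch of this definition in Python (the probability values below are made-up illustrations, not figures from this page):

```python
from math import log2

def pmi(p_xy: float, p_x: float, p_y: float) -> float:
    """Pointwise mutual information of one outcome pair (x, y), in bits:
    the log of the ratio between the co-occurrence probability p(x,y)
    and the by-chance probability p(x)p(y)."""
    return log2(p_xy / (p_x * p_y))

# If x and y co-occur twice as often as independence would predict,
# the PMI is 1 bit; co-occurring half as often gives -1 bit.
print(pmi(p_xy=0.20, p_x=0.4, p_y=0.25))  # log2(0.20 / 0.1) =  1.0
print(pmi(p_xy=0.05, p_x=0.4, p_y=0.25))  # log2(0.05 / 0.1) = -1.0
```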



References

2016

  • (Wikipedia, 2016) ⇒ http://en.wikipedia.org/wiki/Pointwise_mutual_information#Definition Retrieved:2016-2-10.
    • The PMI of a pair of outcomes x and y belonging to discrete random variables X and Y quantifies the discrepancy between the probability of their coincidence given their joint distribution and their individual distributions, assuming independence. Mathematically: :[math] \operatorname{pmi}(x;y) \equiv \log\frac{p(x,y)}{p(x)p(y)} = \log\frac{p(x|y)}{p(x)} = \log\frac{p(y|x)}{p(y)}.[/math] The mutual information (MI) of the random variables X and Y is the expected value of the PMI over all possible outcomes (with respect to the joint distribution [math]p(x,y)[/math]).

      The measure is symmetric ([math]\operatorname{pmi}(x;y)=\operatorname{pmi}(y;x)[/math]). It can take positive or negative values, but is zero if X and Y are independent. Note that even though PMI may be negative or positive, its expected value over all joint events (the MI) is non-negative. PMI is maximized when X and Y are perfectly associated (i.e. [math]p(x|y)=1[/math] or [math]p(y|x)=1[/math]), yielding the following bounds: :[math] -\infty \leq \operatorname{pmi}(x;y) \leq \min\left[ -\log p(x), -\log p(y) \right] . [/math]

      Finally, [math]\operatorname{pmi}(x;y)[/math] will increase if [math]p(x|y)[/math] is fixed but [math]p(x)[/math] decreases.

      Here is an example to illustrate:

      :[math] \begin{array}{c|cc} p(x,y) & y=0 & y=1 \\ \hline x=0 & 0.1 & 0.7 \\ x=1 & 0.15 & 0.05 \end{array} [/math]

      Using this table we can marginalize to get the following additional table for the individual distributions:

      :[math] \begin{array}{c|cc} & p(x) & p(y) \\ \hline 0 & 0.8 & 0.25 \\ 1 & 0.2 & 0.75 \end{array} [/math]

      With this example, we can compute four values for [math]\operatorname{pmi}(x;y)[/math]. Using base-2 logarithms: :[math] \begin{align} \operatorname{pmi}(x=0;y=0) &= -1 \\ \operatorname{pmi}(x=0;y=1) &\approx 0.222392 \\ \operatorname{pmi}(x=1;y=0) &\approx 1.584963 \\ \operatorname{pmi}(x=1;y=1) &\approx -1.584963 \end{align} [/math]

      (For reference, the mutual information [math]\operatorname{I}(X;Y)[/math] would then be 0.214170945.)
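
The example can be checked with a short Python sketch that takes the joint-distribution table above as input; it recomputes the marginals, the four base-2 PMI values, and the mutual information of roughly 0.214170945:

```python
from math import log2

# Joint distribution p(x, y) from the table above, keyed by (x, y).
p_xy = {(0, 0): 0.1, (0, 1): 0.7, (1, 0): 0.15, (1, 1): 0.05}

# Marginals p(x) and p(y), obtained by summing out the other variable.
p_x = {x: sum(p for (xi, _), p in p_xy.items() if xi == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yi), p in p_xy.items() if yi == y) for y in (0, 1)}

# Base-2 PMI for each of the four joint outcomes.
pmi = {xy: log2(p_xy[xy] / (p_x[xy[0]] * p_y[xy[1]])) for xy in p_xy}
for (x, y), value in sorted(pmi.items()):
    print(f"pmi(x={x}; y={y}) = {value:+.6f}")

# Mutual information I(X;Y): expectation of PMI under the joint distribution.
mi = sum(p_xy[xy] * pmi[xy] for xy in p_xy)
print(f"I(X;Y) = {mi:.9f}")   # ~0.214170945
```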


2011

  • (Wikipedia, 2011) ⇒ http://en.wikipedia.org/wiki/Pointwise_mutual_information
    • Pointwise mutual information (PMI), or specific mutual information, is a measure of association used in information theory and statistics.

      The PMI of a pair of outcomes [math]x[/math] and [math]y[/math] belonging to discrete random variables [math]X[/math] and [math]Y[/math] quantifies the discrepancy between the probability of their coincidence given their joint distribution and the probability of their coincidence given only their individual distributions, assuming independence. Mathematically:
      [math]SI(x,y) = \log\frac{p(x,y)}{p(x)p(y)}.[/math]

      The mutual information (MI) of the random variables [math]X[/math] and [math]Y[/math] is the expected value of the PMI over all possible outcomes.

      The measure is symmetric ([math]SI(x,y)=SI(y,x)[/math]). It can take on both negative and positive values but is zero if [math]X[/math] and [math]Y[/math] are independent, and equal to [math]-\log(p(x))[/math] if [math]X[/math] and [math]Y[/math] are perfectly associated. Finally, [math]SI(x,y)[/math] will increase if [math]p(x|y)[/math] is fixed but [math]p(x)[/math] decreases.
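
A quick numerical check of the perfect-association case (a Python sketch with assumed values): if [math]p(x|y)=1[/math], then [math]p(x,y)=p(y)[/math] and the PMI reduces to [math]-\log p(x)[/math].

```python
from math import log2

# Assumed values: y always co-occurs with x (perfect association).
p_x, p_y = 0.25, 0.1
p_xy = p_y                      # p(x|y) = 1  =>  p(x, y) = p(y)

pmi = log2(p_xy / (p_x * p_y))  # equals -log2(p_x)
print(pmi, -log2(p_x))          # both print 2.0
```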

2006

  • http://search.cpan.org/dist/Text-NSP/lib/Text/NSP/Measures/2D/MI/pmi.pm
    • Assume that the frequency count data associated with a bigram <word1><word2> is stored in a 2x2 contingency table: [math] \begin{array}{c|cc|c} & word_2 & \neg word_2 & \\ \hline word_1 & n_{11} & n_{12} & n_{1p} \\ \neg word_1 & n_{21} & n_{22} & n_{2p} \\ \hline & n_{p1} & n_{p2} & n_{pp} \end{array} [/math] where [math]n_{11}[/math] is the number of times <word1><word2> occur together, [math]n_{12}[/math] is the number of times <word1> occurs with some word other than <word2>, and [math]n_{1p}[/math] is the total number of times <word1> occurs as the first word in a bigram.

      The expected values for the internal cells are calculated by taking the product of their associated marginals and dividing by the sample size, for example: [math]m_{11} = \frac {n_{p1} n_{1p}}{n_{pp}}[/math]

      Pointwise Mutual Information (pmi) is defined as the log of the ratio between the observed frequency of a bigram ([math]n_{11}[/math]) and its expected frequency under independence ([math]m_{11}[/math]): :[math] PMI = \log \Bigl( \frac{n_{11}}{m_{11}} \Bigr)[/math] The Pointwise Mutual Information tends to overestimate bigrams with low observed frequency counts. To prevent this, a variation of pmi is sometimes used which increases the influence of the observed frequency: :[math] PMI = \log \Bigl( \frac{(n_{11})^{exp}}{m_{11}} \Bigr)[/math] The exponent $exp is 1 by default, so by default the measure computes the Pointwise Mutual Information for the given bigram. To use a variation of the measure, users can pass the $exp parameter using the --pmi_exp command line option in statistic.pl or by passing $exp to the initializeStatistic() method from their program.
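
The following Python sketch is an assumed re-expression of the quoted formulas, not the Perl Text::NSP code itself; it computes the bigram PMI from 2x2 contingency-table counts, including the $exp-style variant. The example counts are made up for illustration, and a base-2 logarithm is used here for consistency with the example earlier on this page.

```python
from math import log2

def bigram_pmi(n11: float, n1p: float, np1: float, npp: float,
               exp: float = 1.0) -> float:
    """PMI of a bigram <word1><word2> from 2x2 contingency-table counts.

    n11: joint count of <word1><word2>; n1p: count of word1 as first word;
    np1: count of word2 as second word; npp: total number of bigrams.
    exp > 1 mimics the $exp variant, boosting the observed count's weight.
    """
    m11 = (n1p * np1) / npp             # expected count under independence
    return log2((n11 ** exp) / m11)

# Made-up counts: the bigram occurs 100 times; word1 starts 500 bigrams,
# word2 ends 150 bigrams, out of 100,000 bigrams overall.
print(bigram_pmi(n11=100, n1p=500, np1=150, npp=100_000))            # ~7.06
print(bigram_pmi(n11=100, n1p=500, np1=150, npp=100_000, exp=1.2))   # ~8.39
```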
