Perplexity Performance (PP) Measure

A Perplexity Performance (PP) Measure is an intrinsic performance measure that quantifies how well a probability model predicts a sample; lower values indicate a better predictive fit.
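
As an illustration (not drawn from the references below), perplexity can be computed directly from the per-token probabilities a model assigns to a held-out sample; the probability values in this sketch are hypothetical:

<syntaxhighlight lang="python">
import math

def perplexity(token_probs):
    # PP = 2 ** (-(1/N) * sum_i log2 p_i): the geometric mean of the
    # inverse per-token probabilities the model assigned.
    n = len(token_probs)
    return 2 ** (-sum(math.log2(p) for p in token_probs) / n)

# Hypothetical probabilities a model assigns to a 4-token test sample.
print(perplexity([0.2, 0.1, 0.25, 0.05]))  # ~7.95
</syntaxhighlight>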



References

2017

  • https://web.stanford.edu/class/cs124/lec/languagemodeling2017.pdf
    • QUOTE: The best language model is one that best predicts an unseen test set, i.e., gives the highest P(sentence).
      • Perplexity is the inverse probability of the test set, normalized by the number of words:
        [math]\displaystyle{ \text{PP}(\bf{w}) = \it{p}(w_1, w_2, \ldots, w_N)^{-\frac{1}{N}} = \sqrt[N]{\frac{1}{\it{p}(w_1, w_2, \ldots, w_N)}} }[/math]
      • Chain rule: [math]\displaystyle{ \text{PP}(\bf{w}) = \sqrt[N]{\prod^{N}_{i=1} \frac{1}{\it{p}(w_i \mid w_1, w_2, \ldots, w_{i-1})}} }[/math]
      • For bigrams: [math]\displaystyle{ \text{PP}(\bf{w}) = \sqrt[N]{\prod^{N}_{i=1} \frac{1}{\it{p}(w_i \mid w_{i-1})}} }[/math]
    • Minimizing perplexity is the same as maximizing probability.
    • Lower perplexity = better model. Trained on 38 million words and tested on 1.5 million words of WSJ text: Unigram = 962; Bigram = 170; Trigram = 109. (A worked bigram sketch follows this section.)
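
A minimal sketch of the bigram formula above, assuming a hypothetical toy bigram table (in practice the probabilities would be estimated from corpus counts); the sum runs in log space so the product of inverse probabilities cannot underflow:

<syntaxhighlight lang="python">
import math

def bigram_perplexity(words, bigram_prob):
    # PP(w) = (prod_i 1 / p(w_i | w_{i-1})) ** (1/N), computed in
    # log space for numerical stability on long sequences.
    tokens = ["<s>"] + words
    log_sum = sum(math.log(bigram_prob[(tokens[i - 1], tokens[i])])
                  for i in range(1, len(tokens)))
    return math.exp(-log_sum / len(words))

# Hypothetical toy bigram probabilities.
bigram_prob = {
    ("<s>", "the"): 0.5,
    ("the", "cat"): 0.2,
    ("cat", "sat"): 0.3,
}
print(bigram_perplexity(["the", "cat", "sat"], bigram_prob))  # ~3.22
</syntaxhighlight>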

2016

  • https://www.slideshare.net/alopezfoo/edinburgh-mt-lecture-11-neural-language-models
    • QUOTE: Given: [math]\displaystyle{ \bf{w}, \it{p}_{\text{LM}} }[/math]; [math]\displaystyle{ \text{PPL} = 2^{-\frac{1}{|\bf{w}|} \log_2 \it{p}_{\text{LM}}(\bf{w})} }[/math]; [math]\displaystyle{ 0 \le \text{PPL} \le \infty }[/math]
    • Perplexity is a generalization of the notion of branching factor: How many choices do I have at each position?
    • State-of-the-art English LMs have a PPL of ~100 word choices per position
    • A uniform LM has a perplexity of [math]\displaystyle{ |\Sigma| }[/math] (checked numerically in the sketch below)
    • Humans do much better … and bad models can do even worse than uniform!
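
The uniform-LM bullet above can be checked numerically: if every token receives probability 1/|Σ|, the exponent collapses and the perplexity equals |Σ| regardless of sequence length. A small sketch with a hypothetical vocabulary size:

<syntaxhighlight lang="python">
import math

vocab_size = 10_000   # hypothetical |Sigma|
seq_len = 50          # any length gives the same result

# A uniform LM assigns p = 1/|Sigma| to every token, so
# PPL = 2 ** (-(1/N) * N * log2(1/|Sigma|)) = |Sigma|.
log2_sum = seq_len * math.log2(1 / vocab_size)
ppl = 2 ** (-log2_sum / seq_len)
print(ppl)  # 10000.0 (== vocab_size, up to float rounding)
</syntaxhighlight>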
