Extrinsic Performance Measure
- AKA: Real-World Impact Evaluation.
- See: Empirical Analysis.
- (Jurafsky & Martin, 2009) ⇒ Daniel Jurafsky, and James H. Martin. (2009). “Speech and Language Processing, 2nd edition.” Pearson Education. ISBN:0131873210
- The best way to evaluate the performance of a language model is to embed it in an application and measure the total performance of the application. Such end-to-end evaluation is called extrinsic evaluation, and also sometimes called in vivo evaluation (Sparck Jones and Galliers, 1996). Extrinsic evaluation is the only way to know if a particular improvement in a component is really going to help the task at hand. … Unfortunately, end-to-end evaluation is often very expensive; evaluating a large speech recognition test set, for example, takes hours or even days.
- (Spärck Jones & Galliers, 1996) ⇒ Karen Spärck Jones, and Julia Rose Galliers. (1996). “Evaluating Natural Language Processing Systems: An Analysis and Review.” Springer. ISBN:3540613099
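
The idea in the quoted passage (embed the component in an application and score the application's output) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and does not come from the cited sources: the toy corpus, the add-alpha unigram model, the candidate-reranking "application", and the hypothetical helper names `train_unigram`, `rerank`, and `extrinsic_accuracy`.

```python
from collections import Counter
import math


def train_unigram(corpus, alpha):
    """Return an add-alpha smoothed unigram log-probability function (toy component)."""
    counts = Counter(w for sent in corpus for w in sent.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 reserves probability mass for unseen words

    def logprob(sentence):
        return sum(
            math.log((counts[w] + alpha) / (total + alpha * vocab))
            for w in sentence.split()
        )

    return logprob


def rerank(logprob, candidates):
    """Toy application: return the candidate the language model scores highest."""
    return max(candidates, key=logprob)


def extrinsic_accuracy(logprob, test_set):
    """End-to-end measure: fraction of cases where the application's output is correct."""
    correct = sum(rerank(logprob, cands) == ref for cands, ref in test_set)
    return correct / len(test_set)


# Hypothetical data: each test case is (candidate outputs, reference output).
corpus = ["recognize speech", "recognize speech today", "a nice beach"]
test_set = [
    (["recognize speech", "wreck a nice beach"], "recognize speech"),
    (["a nice beach", "an ice beach"], "a nice beach"),
]

model = train_unigram(corpus, alpha=0.1)
accuracy = extrinsic_accuracy(model, test_set)
print(f"extrinsic accuracy: {accuracy:.2f}")
```

The score is computed on the application's final choice rather than on the model's probabilities, so two models can be compared by swapping the `logprob` argument and rerunning the same test set. In a real system that rerun is the expensive step the passage warns about.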