Meteor (Metric for Evaluation of Translation with Explicit ORdering) Score

From GM-RKB

A Meteor (Metric for Evaluation of Translation with Explicit ORdering) Score is an NLP task performance measure that is based on the harmonic mean of unigram precision and recall, with recall weighted higher than precision.
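
The harmonic-mean core of the measure can be illustrated with a minimal Python sketch. It is not a full METEOR implementation: it assumes whitespace tokenization and exact word matching only, and it omits stemming, synonymy matching, and any word-order component. The function name `unigram_f_mean` and the parameter `recall_weight` are hypothetical names chosen for this illustration; the default weight of 9 follows the 9:1 recall-to-precision weighting of the original METEOR formulation.

```python
from collections import Counter


def unigram_f_mean(candidate: str, reference: str, recall_weight: float = 9.0) -> float:
    """Weighted harmonic mean of unigram precision and recall (exact matches only).

    Illustrative sketch: `recall_weight` is an assumed parameter name, and the
    default of 9.0 mirrors the original METEOR recall weighting.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: each reference token can be matched at most once.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)

    # Weighted harmonic mean; a recall_weight above 1 favours recall.
    return (1 + recall_weight) * precision * recall / (recall + recall_weight * precision)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    print(unigram_f_mean("the cat", reference))                 # high precision, low recall
    print(unigram_f_mean("the cat sat on the mat", "the cat"))  # low precision, high recall
```

Under these assumptions, a candidate with perfect precision but one-third recall scores about 0.36, while a candidate with one-third precision but perfect recall scores about 0.83, which shows how the weighting rewards recall over precision.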



References

2020a

  • (Wikipedia, 2020) ⇒ https://en.wikipedia.org/wiki/METEOR Retrieved:2020-11-22.
    • METEOR (Metric for Evaluation of Translation with Explicit ORdering) is a metric for the evaluation of machine translation output. The metric is based on the harmonic mean of unigram precision and recall, with recall weighted higher than precision. It also has several features that are not found in other metrics, such as stemming and synonymy matching, along with the standard exact word matching. The metric was designed to fix some of the problems found in the more popular BLEU metric, and also produce good correlation with human judgement at the sentence or segment level. This differs from the BLEU metric in that BLEU seeks correlation at the corpus level.

      Results have been presented which give correlation of up to 0.964 with human judgement at the corpus level, compared to BLEU's achievement of 0.817 on the same data set. At the sentence level, the maximum correlation with human judgement achieved was 0.403.

2020b

2014