ROUGE-L Metric
A ROUGE-L Metric is a ROUGE metric that ...
- Context:
- ...
 
- Example(s):
- ...
 
- Counter-Example(s):
- ...
 
- See: Longest Common Subsequence (LCS).
References
2023
- GBard
- ROUGE-L (Recall-Oriented Understudy for Gisting Evaluation, Longest Common Subsequence variant) is a metric for evaluating the quality of text summarization systems. It is based on the longest common subsequence (LCS) between the generated summary and the reference summary. The LCS is the longest sequence of words that appears in both summaries in the same order, though not necessarily contiguously.
- ROUGE-L is calculated as follows:
- ROUGE-L = LCS(generated_summary, reference_summary) / length(reference_summary)
 
- where:
- LCS(generated_summary, reference_summary) is the length, in words, of the longest common subsequence between the generated summary and the reference summary.
- length(reference_summary) is the length of the reference summary in words.
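
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `tokenize` helper (lowercasing and splitting on word characters) is an assumption, and the names `lcs_length` and `rouge_l_recall` are chosen here for illustration only.

```python
import re


def tokenize(text):
    """Lowercase and split into word tokens (an assumed, simplified tokenizer)."""
    return re.findall(r"\w+", text.lower())


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # prev[j] holds the LCS length of the tokens of `a` processed so far and b[:j].
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(generated, reference):
    """LCS length divided by the reference length in words."""
    gen, ref = tokenize(generated), tokenize(reference)
    if not ref:
        return 0.0  # assumed convention for an empty reference
    return lcs_length(gen, ref) / len(ref)
```

For instance, `rouge_l_recall("the cat sat", "the cat sat down")` evaluates to 3 / 4 = 0.75, since the three generated words appear in order in the four-word reference.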
 
- In this form, ROUGE-L is a recall-oriented metric: it measures how much of the reference summary is covered, in order, by the generated summary. Because the denominator is the reference length, the score does not reflect precision, so the generated summary is not penalized for including words that are not in the reference summary. The original ROUGE formulation (Lin, 2004) also defines an LCS-based precision and an F-measure that combines the two.
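
The effect of ignoring precision can be seen with a small experiment. The sketch below is illustrative only: the sentences are made up, whitespace tokenization is assumed, and the `precision` value (LCS length over generated length) is computed purely for contrast with the recall formula given in the text.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, start=1):
        for j, y in enumerate(b, start=1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[-1][-1]


reference = "the cat sat on the mat".split()
candidates = {
    "concise": "the cat sat on the mat".split(),
    "padded": "well the cat really sat on the old mat today".split(),
}

results = {}
for name, generated in candidates.items():
    lcs = lcs_length(generated, reference)
    recall = lcs / len(reference)      # the recall-style score described above
    precision = lcs / len(generated)   # shown only for contrast
    results[name] = (recall, precision)
    print(f"{name}: recall={recall:.2f} precision={precision:.2f}")

# concise: recall=1.00 precision=1.00
# padded: recall=1.00 precision=0.60
```

Both candidates reach a recall of 1.00 even though the padded one adds four words that are absent from the reference; only the precision figure distinguishes them.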
- ROUGE-L is one of the most widely used metrics for evaluating text summarization systems. It is straightforward to compute, since the LCS can be found with standard dynamic programming, and it has been reported to correlate with human judgments of summary quality.
- Here is an example of how to calculate ROUGE-L:
- Generated summary: The cat sat on the mat.
- Reference summary: The cat sat on the mat and looked at the bird.
 
- The LCS of the two summaries is "The cat sat on the mat", which is 6 words long. The length of the reference summary is 11 words. Therefore, the ROUGE-L score is:
- ROUGE-L = 6 / 11 ≈ 0.545
 
- This means that about 54.5% of the words in the reference summary are matched, in order, by the generated summary.
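
The counts in this example can be checked by recovering the LCS itself rather than only its length. The following is a sketch under assumptions: words are lowercased and punctuation is dropped by a simple regular expression, and `lcs_tokens` is a hypothetical helper name used here for illustration.

```python
import re


def lcs_tokens(a, b):
    """Return one longest common subsequence of two token lists."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, start=1):
        for j, y in enumerate(b, start=1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Walk back from the bottom-right corner to collect the matched tokens.
    out, i, j = [], len(a), len(b)
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


generated = re.findall(r"\w+", "The cat sat on the mat.".lower())
reference = re.findall(r"\w+", "The cat sat on the mat and looked at the bird.".lower())

common = lcs_tokens(generated, reference)
score = len(common) / len(reference)
print(common)                 # ['the', 'cat', 'sat', 'on', 'the', 'mat']
print(len(common), len(reference), round(score, 3))  # 6 11 0.545
```

Under this tokenization the LCS has 6 words and the reference has 11, so the recall-style score is 6 / 11.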
- ROUGE-L scores range from 0 to 1, with higher scores indicating that more of the reference summary is covered. A score of 0 means that the generated summary shares no words with the reference summary. A score of 1 means that every word of the reference summary appears in the generated summary in the same order; an identical summary scores 1, but so does a longer summary that contains the reference as a subsequence.