DUC-2005 Summarization Task
A DUC-2005 Summarization Task is an NLP benchmark task that evaluates the performance of topic-focused multi-document text summarization systems.
- It is part of the DUC Workshop Series.
- It evaluated text summarization systems using 5 linguistic quality questions: Grammaticality (Q1), Non-redundancy (Q2), Referential clarity (Q3), Focus (Q4), and Structure and Coherence (Q5).
- It evaluated the performance of all text summarization systems included in Dang (2005).
- See: Performance Metric, Text Summarization System, Natural Language Processing System, SQuASH Project, ROUGE.
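The five linguistic quality questions listed above lend themselves to a simple per-question aggregation. The following is a minimal sketch, not the official DUC scoring procedure: the function name `average_quality_scores`, the input layout (one dict per rated summary), and the numeric rating scale are all assumptions made for illustration.

```python
from statistics import mean

# The five linguistic quality questions named in this entry.
QUALITY_QUESTIONS = {
    "Q1": "Grammaticality",
    "Q2": "Non-redundancy",
    "Q3": "Referential clarity",
    "Q4": "Focus",
    "Q5": "Structure and Coherence",
}


def average_quality_scores(ratings):
    """Average each question's score over a list of per-summary ratings.

    `ratings` is a list of dicts mapping "Q1".."Q5" to a numeric score.
    The rating scale is an assumption of this sketch; the entry does not
    state it.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    for rating in ratings:
        missing = set(QUALITY_QUESTIONS) - set(rating)
        if missing:
            raise ValueError(f"rating is missing questions: {sorted(missing)}")
    # One mean per question, keyed by question id.
    return {q: mean(r[q] for r in ratings) for q in QUALITY_QUESTIONS}
```

Reporting one mean per question, rather than a single combined score, keeps the five qualities separately visible for each system.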
- (DUC, 2020) ⇒ https://duc.nist.gov/duc2005/tasks.html Retrieved: 2020-10-11.
- QUOTE: The main goals in DUC 2005 and their associated actions are listed below.
- 1) Inclusion of user/task context information for systems and human summarizers
- 2) Evaluation of content in terms of more basic units of meaning
- 3) Better understanding of normal human variability in a summarization task and how it may affect evaluation of summarization systems
- (DUC, 2005) ⇒ http://www-nlpir.nist.gov/projects/duc/duc2005/
- QUOTE: The system task in 2005 will be to synthesize from a set of 25-50 documents a brief, well-organized, fluent answer to a need for information that cannot be met by just stating a name, date, quantity, etc. This task will model real-world complex question answering
- (Dang, 2005) ⇒ Hoa Trang Dang (2005, October). "Overview of DUC 2005". In: Proceedings of the document understanding conference (Vol. 2005, pp. 1-12).
- QUOTE: The focus of DUC 2005 was on developing new evaluation methods that take into account variation in content in human-authored summaries. Therefore, DUC 2005 had a single user-oriented, question-focused summarization task that allowed the community to put some time and effort into helping with the new evaluation framework. The summarization task was to synthesize from a set of 25-50 documents a well-organized, fluent answer to a complex question. The relatively generous allowance of 250 words for each answer reveals how difficult it is for current summarization systems to produce fluent multi-document summaries.
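The task constraints quoted above (a set of 25-50 documents per topic and an allowance of 250 words per answer) can be checked mechanically before a summary is submitted for evaluation. The sketch below is illustrative only: the helper names and the whitespace-based word count are assumptions, since the quoted sources do not state how words were counted.

```python
MAX_SUMMARY_WORDS = 250       # allowance quoted from Dang (2005)
MIN_DOCS, MAX_DOCS = 25, 50   # document-set size quoted from the task description


def count_words(text):
    # Assumption: words are whitespace-separated tokens.
    return len(text.split())


def truncate_to_limit(summary, limit=MAX_SUMMARY_WORDS):
    """Keep only the first `limit` whitespace-separated words."""
    return " ".join(summary.split()[:limit])


def check_submission(documents, summary):
    """Return a list of constraint violations (empty if none are found)."""
    problems = []
    if not MIN_DOCS <= len(documents) <= MAX_DOCS:
        problems.append(
            f"expected {MIN_DOCS}-{MAX_DOCS} documents, got {len(documents)}"
        )
    n_words = count_words(summary)
    if n_words > MAX_SUMMARY_WORDS:
        problems.append(
            f"summary has {n_words} words, limit is {MAX_SUMMARY_WORDS}"
        )
    return problems
```

For example, `check_submission(docs, truncate_to_limit(draft))` would flag only a document-count problem, because the truncated draft cannot exceed the word limit.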