2013 MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text

From GM-RKB

Subject Headings: MCTest Dataset; Reading Comprehension Dataset.

Notes

Cited By

Quotes

Abstract

We present MCTest, a freely available set of stories and associated questions intended for research on the machine comprehension of text. Previous work on machine comprehension (e.g., semantic modeling) has made great strides, but primarily focuses either on limited-domain datasets, or on solving a more restricted goal (e.g., open-domain relation extraction). In contrast, MCTest requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension. Reading comprehension can test advanced abilities such as causal reasoning and understanding the world, yet, by being multiple-choice, still provide a clear metric. By being fictional, the answer typically can be found only in the story itself. The stories and questions are also carefully limited to those a young child would understand, reducing the world knowledge that is required for the task. We present the scalable crowd-sourcing methods that allow us to cheaply construct a dataset of 500 stories and 2000 questions. By screening workers (with grammar tests) and stories (with grading), we have ensured that the data is the same quality as another set that we manually edited, but at one tenth the editing cost. By being open-domain, yet carefully restricted, we hope MCTest will serve to encourage research and provide a clear metric for advancement on the machine comprehension of text.

Reading Comprehension: A major goal for NLP is for machines to be able to understand text as well as people. Several research disciplines are focused on this problem: for example, information extraction, relation extraction, semantic role labeling, and recognizing textual entailment. Yet these techniques are necessarily evaluated individually, rather than by how much they advance us towards the end goal. On the other hand, the goal of semantic parsing is the machine comprehension of text (MCT), yet its evaluation requires adherence to a specific knowledge representation, and it is currently unclear what the best representation is for open-domain text. We believe that it is useful to directly tackle the top-level task of MCT. For this, we need a way to measure progress. One common method for evaluating someone's understanding of text is by giving them a multiple-choice reading comprehension test. This has the advantage that it is objectively gradable (vs. essays), yet may test a range of abilities such as causal or counterfactual reasoning, inference among relations, or just basic understanding of the world in which the passage is set. Therefore, we propose a multiple-choice reading comprehension task as a way to evaluate progress on MCT. We have built a reading comprehension dataset containing 500 fictional stories, with 4 multiple-choice questions per story. It was built using methods which can easily scale to at least 5000 stories, since the stories were created, and the curation was done, almost entirely using crowdsourcing, at a total of $4.00 per story. We plan to periodically update the dataset to ensure that methods are not overfitting to the existing data. The dataset is open-domain, yet restricted to concepts and words that a 7-year-old is expected to understand. This task is still beyond the capability of today's computers and algorithms.
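The evaluation the abstract describes is simple to operationalize: each story carries four multiple-choice questions, each with four candidate answers, and a system is scored by the fraction of questions it answers correctly. The sketch below illustrates this setup, with a trivial lexical-overlap baseline in the spirit of the paper's baseline systems. The field names and the `overlap_baseline` function are illustrative assumptions, not the official MCTest data format or baseline.

```python
# Minimal sketch of MCTest-style evaluation (hypothetical field names,
# not the official MCTest loader or baseline).
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Question:
    text: str
    choices: List[str]   # four candidate answers (A-D)
    answer: int          # index of the correct choice


@dataclass
class Story:
    passage: str
    questions: List[Question]  # four questions per story in MCTest


def accuracy(stories: List[Story],
             predict: Callable[[str, Question], int]) -> float:
    """Fraction of questions answered correctly by `predict`,
    a function (passage, question) -> predicted choice index."""
    correct = total = 0
    for s in stories:
        for q in s.questions:
            correct += int(predict(s.passage, q) == q.answer)
            total += 1
    return correct / total


def overlap_baseline(passage: str, q: Question) -> int:
    """Illustrative baseline: pick the candidate answer sharing
    the most word types with the passage."""
    words = set(passage.lower().split())
    scores = [len(words & set(c.lower().split())) for c in q.choices]
    return scores.index(max(scores))
```

Because the stories are fictional, a bag-of-words overlap like this is a weak but meaningful baseline: the evidence must come from the passage itself rather than from outside world knowledge.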

1. Reading Comprehension

2. Previous Work

3. Generating the Stories and Questions

4. Rating the Stories and Questions

5. Dataset Analysis

6. Baseline System and Results

7. Recognizing Textual Entailment Results

8. Making Data and Results an Ongoing Resource

9. Future Work

10. Conclusion

Acknowledgments

References

BibTeX

@inproceedings{2013_MCTestAChallengeDatasetfortheOp,
  author    = {Matthew Richardson and
               Christopher J. C. Burges and
               Erin Renshaw},
  title     = {MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension
               of Text},
  booktitle = {Proceedings of the 2013 Conference on Empirical Methods in Natural
               Language Processing (EMNLP 2013). A meeting of SIGDAT, a Special
               Interest Group of the ACL},
  date      = {18-21 October 2013},
  address   = {Grand Hyatt Seattle, Seattle, Washington, USA},
  pages     = {193--203},
  publisher = {{ACL}},
  year      = {2013},
  url       = {https://www.aclweb.org/anthology/D13-1020/},
}

