2019 Natural Questions: A Benchmark for Question Answering Research
- (Kwiatkowski et al., 2019) ⇒ Tom Kwiatkowski, Jennimaria Palomaki, Olivia Redfield, Michael Collins, Ankur P. Parikh, Chris Alberti, Danielle Epstein, Illia Polosukhin, Jacob Devlin, Kenton Lee, Kristina Toutanova, Llion Jones, Matthew Kelcey, Ming-Wei Chang, Andrew M. Dai, Jakob Uszkoreit, Quoc Le, and Slav Petrov. (2019). “Natural Questions: A Benchmark for Question Answering Research.” In: Transactions of the Association for Computational Linguistics, 7.
Subject Headings: Natural Questions Dataset; Question-Answering Dataset; GPT-2 Benchmark Task.
Notes
- Repository:
- Datasets available at: https://ai.google.com/research/NaturalQuestions.
Cited By
- Google Scholar: ~286 citations, retrieved 2021-01-03.
Quotes
Abstract
We present the Natural Questions corpus, a question answering dataset. Questions consist of real anonymized, aggregated queries issued to the Google search engine. An annotator is presented with a question along with a Wikipedia page from the top 5 search results, and annotates a long answer (typically a paragraph) and a short answer (one or more entities) if present on the page, or marks null if no long/short answer is present. The public release consists of 307,373 training examples with single annotations; 7,830 examples with 5-way annotations for development data; and a further 7,842 examples 5-way annotated sequestered as test data. We present experiments validating quality of the data. We also describe analysis of 25-way annotations on 302 examples, giving insights into human variability on the annotation task. We introduce robust metrics for the purposes of evaluating question answering systems; demonstrate high human upper bounds on these metrics; and establish baseline results using competitive methods drawn from related literature.
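The abstract describes a key property of the benchmark: each development/test question carries five annotations, any of which may be null, and system predictions are scored against that set. The sketch below is an illustrative, simplified version of this idea, not the official evaluation script; the 2-of-5 answerability threshold and the exact-string match rule are assumptions made for the example.

```python
# Simplified sketch of 5-way-annotation scoring for Natural Questions.
# Each example pairs a system prediction (a span string or None) with the
# five annotators' spans (each a string or None). An example is treated as
# answerable if at least 2 of 5 annotations are non-null (assumed threshold),
# and a non-null prediction counts as correct if it matches any annotator's span.

def score_predictions(examples):
    """examples: list of (prediction, annotations) pairs."""
    tp = fp = fn = 0
    for prediction, annotations in examples:
        non_null = [a for a in annotations if a is not None]
        has_gold = len(non_null) >= 2  # assumed answerability threshold
        if prediction is not None:
            if has_gold and prediction in non_null:
                tp += 1  # matched one annotator's span
            else:
                fp += 1  # answered an unanswerable question, or unmatched span
        elif has_gold:
            fn += 1      # abstained on an answerable question
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

if __name__ == "__main__":
    examples = [
        ("the Nile", ["the Nile", "the Nile", None, "the Nile River", None]),  # hit
        ("1969", [None, None, None, None, "1969"]),   # only 1/5 annotators answered
        (None, ["Paris", "Paris", "Paris", None, "Paris"]),  # missed answer
        (None, [None] * 5),                                  # correct abstention
    ]
    print(score_predictions(examples))  # (0.5, 0.5, 0.5)
```

The official metric in the paper is defined over byte-level long- and short-answer spans within the Wikipedia page rather than raw strings, so this sketch only conveys the aggregation logic over the five annotations.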
References
BibTeX
@article{2019_NaturalQuestionsABenchmarkforQu,
author = {Tom Kwiatkowski and
Jennimaria Palomaki and
Olivia Redfield and
Michael Collins and
Ankur P. Parikh and
Chris Alberti and
Danielle Epstein and
Illia Polosukhin and
Jacob Devlin and
Kenton Lee and
Kristina Toutanova and
Llion Jones and
Matthew Kelcey and
Ming-Wei Chang and
Andrew M. Dai and
Jakob Uszkoreit and
Quoc Le and
Slav Petrov},
title = {Natural Questions: a Benchmark for Question Answering Research},
journal = {Transactions of the Association for Computational Linguistics},
volume = {7},
pages = {452--466},
year = {2019},
url = {https://transacl.org/ojs/index.php/tacl/article/view/1455},
}