2015 GrammarAsaForeignLanguage

From GM-RKB

Subject Headings: Attention-enhanced Encoder-Decoder RNN

Notes

Cited By

Quotes

Abstract

Syntactic constituency parsing is a fundamental problem in natural language processing and has been the subject of intensive research and engineering for decades. As a result, the most accurate parsers are domain-specific, complex, and inefficient. In this paper we show that a domain-agnostic, attention-enhanced sequence-to-sequence model achieves state-of-the-art results on the most widely used syntactic constituency parsing dataset when trained on a large synthetic corpus that was annotated using existing parsers. It also matches the performance of standard parsers when trained only on a small human-annotated dataset, which shows that this model is highly data-efficient, in contrast to sequence-to-sequence models without the attention mechanism. Our parser is also fast, processing over a hundred sentences per second with an unoptimized CPU implementation.
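The approach treats parsing as sequence transduction: each constituency tree is linearized depth-first into a bracketed token sequence that the attention-enhanced encoder-decoder emits one symbol at a time. A minimal sketch of that linearization, assuming the paper's reported conventions (words are dropped from the output side, POS tags are collapsed to the symbol `XX`, and, as an assumption of this sketch, punctuation tags are kept as-is), matching the paper's example "(S (VP XX )VP . )S" for the sentence "Go .":

```python
def linearize(tree):
    """Depth-first linearization of a constituency tree into a token sequence.

    A tree is a tuple (label, child, ...); a preterminal is (pos_tag, word).
    Words are dropped and alphabetic POS tags are collapsed to "XX";
    punctuation tags are emitted unchanged (a simplifying assumption here).
    """
    label, children = tree[0], tree[1:]
    if len(children) == 1 and isinstance(children[0], str):
        # Preterminal node: drop the word, normalize the tag.
        return [label if not label[0].isalpha() else "XX"]
    tokens = ["(" + label]            # opening bracket carries the label
    for child in children:
        tokens.extend(linearize(child))
    tokens.append(")" + label)        # closing bracket repeats the label
    return tokens

# Example from the paper: the tree for "Go ." becomes "(S (VP XX )VP . )S".
tree = ("S", ("VP", ("VB", "Go")), (".", "."))
print(" ".join(linearize(tree)))
```

Labeling the closing brackets (")S", ")VP") is what lets a plain sequence decoder reproduce well-formed trees: the output vocabulary stays small while each emitted symbol unambiguously opens or closes one constituent.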


Oriol Vinyals, Lukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, and Geoffrey E. Hinton. (2015). "Grammar As a Foreign Language."