2004 FastDeepLinguisticStatisticalDepParsing


Subject Headings: Dependency Grammar, Long-Distance Relationship.

Notes

  • It describes an implemented Dependency Parser.
  • It has a freely available implementation [1], though installing and integrating it would likely take some time.
  • Pro3Gres stands for PRObability-based, PROlog-implemented Parser for RObust Grammatical Relation Extraction System. It is a fast, broad-coverage, deep-syntactic parsing system. It is a flexible and perspicuous hybrid parser that combines easily editable hand-written rules with statistical lexicalization from the Penn Treebank. Its performance is state-of-the-art or almost state-of-the-art. Its statistical model is based on the decisions that a parser (human or machine) has to make during the parsing process.
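
The hybrid idea in the last note (hand-written rules decide which relations are possible, corpus counts decide which is likely) can be illustrated with a small sketch. This is not the Pro3Gres implementation, which is written in Prolog; the rule table `RULES`, the toy observations in `OBSERVED`, and the function `relation_probs` are all hypothetical names and data invented for illustration, and the plain relative-frequency estimate is an assumed simplification of the paper's model.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Hypothetical hand-written rules: relations a (head tag, dependent tag) pair may form.
RULES: Dict[Tuple[str, str], Set[str]] = {
    ("VB", "NN"): {"subj", "obj"},
}

# Hypothetical treebank-style observations: (relation, head word, dependent word).
OBSERVED: List[Tuple[str, str, str]] = [
    ("obj", "see", "man"),
    ("obj", "see", "man"),
    ("subj", "see", "man"),
    ("subj", "see", "dog"),
]

rel_counts = Counter(OBSERVED)
pair_counts = Counter((head, dep) for _, head, dep in OBSERVED)


def relation_probs(head: str, head_tag: str, dep: str, dep_tag: str) -> Dict[str, float]:
    """Relative-frequency estimate of each rule-licensed relation for a word pair.

    Returns an empty dict when no rule licenses the tag pair or the word
    pair was never observed (no smoothing or backoff in this sketch).
    """
    licensed = RULES.get((head_tag, dep_tag), set())
    total = pair_counts[(head, dep)]
    if total == 0:
        return {}
    return {rel: rel_counts[(rel, head, dep)] / total for rel in sorted(licensed)}


probs = relation_probs("see", "VB", "man", "NN")
```

Here the rules restrict the candidate relations to `subj` and `obj`, and the counts rank `obj` above `subj` for this word pair. A real system would also need backoff for unseen word pairs, which this sketch leaves out.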

Cited By

Quotes

Abstract

  • We present and evaluate an implemented statistical minimal parsing strategy exploiting DG characteristics to permit fast, robust, deep-linguistic analysis of unrestricted text, and compare its probability model to (Collins, 1999) and an adaptation, (Dubey and Keller, 2003). We show that DG allows for the expression of the majority of English LDDs in a context-free way and offers simple yet powerful statistical models.

1 Introduction

  • We present a fast, deep-linguistic statistical parser that profits from DG characteristics and that uses a minimal parsing strategy. First, we rely on finite-state based approaches as long as possible; secondly, where parsing is necessary, we keep it context-free as long as possible. For low-level syntactic tasks, tagging and base-NP chunking are used; parsing only takes place between heads of chunks. Robust, successful parsers (Abney, 1995; Collins, 1999) have shown that this division of labour is particularly attractive for DG.
  • Deep-linguistic, Formal Grammar parsers have carefully crafted grammars written by professional linguists. But unrestricted real-world texts still pose a problem to NLP systems that are based on Formal Grammars. Few handcrafted, deep linguistic grammars achieve the coverage and robustness needed to parse large corpora (see (Riezler et al., 2002), (Burke et al., 2004) and (Hockenmaier and Steedman, 2002) for exceptions), and speed remains a serious challenge. The typical problems can be grouped as follows.
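
The division of labour described in the first Introduction quote (tagging and base-NP chunking first, parsing only between heads of chunks) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' chunker: the tag set `NP_TAGS`, the rule that a determiner opens a new chunk, and the choice of the last token as chunk head are simplifications, and the function names are hypothetical.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)

# Assumed toy set of tags that may occur inside a base NP.
NP_TAGS = {"DT", "JJ", "NN", "NNS", "NNP"}


def chunk_base_nps(tagged: List[Token]) -> List[List[Token]]:
    """Group runs of NP-internal tags into chunks; other tokens form one-token chunks."""
    chunks: List[List[Token]] = []
    current: List[Token] = []
    for word, tag in tagged:
        if tag in NP_TAGS:
            # Simplifying assumption: a determiner always opens a new base NP.
            if tag == "DT" and current:
                chunks.append(current)
                current = []
            current.append((word, tag))
        else:
            if current:
                chunks.append(current)
                current = []
            chunks.append([(word, tag)])
    if current:
        chunks.append(current)
    return chunks


def chunk_heads(chunks: List[List[Token]]) -> List[Token]:
    """Pick the last token of each chunk as its head (assumed head-final base NPs)."""
    return [chunk[-1] for chunk in chunks]


sentence = [
    ("the", "DT"), ("old", "JJ"), ("parser", "NN"),
    ("reads", "VBZ"),
    ("long", "JJ"), ("texts", "NNS"),
]
chunks = chunk_base_nps(sentence)
heads = chunk_heads(chunks)
```

With this input, a dependency parser would only have to relate the three heads `parser`, `reads` and `texts` instead of all six tokens, which is the source of the speed gain the quote points to.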



References

  • Steven P. Abney. (1995). Chunks and dependencies: Bringing processing evidence to bear on syntax. In Jennifer Cole, Georgia Green, and Jerry Morgan, editors, Computational Linguistics and the Foundations of Linguistic Theory, pages 145–164. CSLI.
  • M. Burke, A. Cahill, R. O'Donovan, J. van Genabith, and A. Way. (2004). Treebank-based acquisition of wide-coverage, probabilistic LFG resources: Project overview, results and evaluation. In The First International Joint Conference on Natural Language Processing (IJCNLP-04), Workshop "Beyond shallow analyses - Formalisms and statistical modeling for deep analyses", Sanya City, China.
  • Aoife Cahill, Michael Burke, Ruth O'Donovan, Josef van Genabith, and Andy Way. (2004). Long-distance dependency resolution in automatically acquired wide-coverage PCFG-based LFG approximations. In: Proceedings of ACL-2004, Barcelona, Spain.
  • John Carroll, Guido Minnen, and Ted Briscoe. (1999). Corpus annotation for parser evaluation. In: Proceedings of the EACL-99 Post-Conference Workshop on Linguistically Interpreted Corpora, Bergen, Norway.
  • Eugene Charniak. (2000). A maximum-entropy-inspired parser. In: Proceedings of the North American Chapter of the ACL, pages 132–139.
  • Noam Chomsky. (1995). The Minimalist Program. The MIT Press, Cambridge, Massachusetts.
  • Hoojung Chung and Hae-Chang Rim. (2003). A new probabilistic dependency parsing model for head-final, free word order languages. IEICE Transaction on Information & System, E86-D, No. 11:2490–2493.
  • Michael Collins and James Brooks. (1995). Prepositional attachment through a backed-off model. In: Proceedings of the Third Workshop on Very Large Corpora, Cambridge, MA.
  • Michael Collins. (1999). Head-Driven Statistical Models for Natural Language Parsing. Ph.D. thesis, University of Pennsylvania, Philadelphia, PA.

  • Michael A. Covington. (1994). An empirically motivated reinterpretation of Dependency Grammar. Technical Report AI1994-01, University of Georgia, Athens, Georgia.
  • Amit Dubey and Frank Keller. (2003). Probabilistic parsing for German using sister-head dependencies. In: Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, Sapporo.
  • Jason Eisner. (2000). Bilexical grammars and their cubic-time parsing algorithms. In Harry Bunt and Anton Nijholt, editors, Advances in Probabilistic and Other Parsing Technologies. Kluwer.
  • Christiane Fellbaum, editor. (1998). WordNet: An Electronic Lexical Database. MIT Press, Cambridge, MA.
  • James Henderson. (2003). Inducing history representations for broad coverage statistical parsing. In: Proceedings of HLT-NAACL 2003, Edmonton, Canada.
  • Julia Hockenmaier and Mark Steedman. (2002). Generative models for statistical parsing with combinatory categorial grammar. In: Proceedings of 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia.

  • Richard Hudson. (1984). Word Grammar. Basil Blackwell, Oxford.
  • Mark Johnson. (2002). A simple pattern-matching algorithm for recovering empty nodes and their antecedents. In: Proceedings of the 40th Meeting of the ACL, University of Pennsylvania, Philadelphia.
  • J.D. Kim, T. Ohta, Y. Tateisi, and J. Tsujii. (2003). Genia corpus - a semantically annotated corpus for bio-textmining. Bioinformatics, 19(1):i180–i182.
  • Beth C. Levin. (1993). English Verb Classes and Alternations: a Preliminary Investigation. University of Chicago Press, Chicago, IL.
  • Dekang Lin. (1995). A dependency-based method for evaluating broad-coverage parsers. In: Proceedings of IJCAI-95, Montreal.
  • Dekang Lin. (1998). Dependency-based evaluation of MINIPAR. In Workshop on the Evaluation of Parsing Systems, Granada, Spain.
  • Mitch Marcus, Beatrice Santorini, and M.A. Marcinkiewicz. (1993). Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, 19:313–330.

  • Igor Mel'čuk. (1988). Dependency Syntax: theory and practice. State University of New York Press, New York.
  • Diego Mollá, Gerold Schneider, Rolf Schwitter, and Michael Hess. (2000). Answer Extraction using a Dependency Grammar in ExtrAns. Traitement Automatique de Langues (T.A.L.), Special Issue on Dependency Grammar, 41(1):127–156.
  • Peter Neuhaus and Norbert Bröker. (1997). The complexity of recognition of linguistically adequate dependency grammars. In: Proceedings of the 35th ACL and 8th EACL, pages 337–343, Madrid, Spain.
  • Joakim Nivre. (2004). Inductive dependency parsing. In: Proceedings of Promote IT, Karlstad University.
  • Judita Preiss. (2003). Using grammatical relations to compare parsers. In: Proceedings of EACL 03, Budapest, Hungary.

  • Stefan Riezler, Tracy H. King, Ronald M. Kaplan, Richard Crouch, John T. Maxwell, and Mark Johnson. (2002). Parsing the Wall Street Journal using a Lexical-Functional Grammar and discriminative estimation techniques. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL'02), Philadelphia, PA.

  • Fabio Rinaldi, James Dowdall, Gerold Schneider, and Andreas Persidis. (2004a). Answering Questions in the Genomics Domain. In ACL 2004 Workshop on Question Answering in restricted domains, Barcelona, Spain, 21–26 July.
  • Fabio Rinaldi, Michael Hess, James Dowdall, Diego Mollá, and Rolf Schwitter. (2004b). Question answering in terminology-rich technical domains. In Mark Maybury, editor, New Directions in Question Answering. MIT/AAAI Press.
  • (Sarkar et al., 2000) ⇒ Anoop Sarkar, Fei Xia, and Aravind Joshi. (2000). “Some Experiments on Indicators of Parsing Complexity for Lexicalized Grammars.” In: Proceedings of COLING 2000.
  • Gerold Schneider. (2003). Extracting and using trace-free Functional Dependencies from the Penn Treebank to reduce parsing complexity. In: Proceedings of Treebanks and Linguistic Theories (TLT) 2003, Växjö, Sweden.
  • Wojciech Skut, Brigitte Krenn, Thorsten Brants, and Hans Uszkoreit. (1997). An annotation scheme for free word order languages. In: Proceedings of the Fifth Conference on Applied Natural Language Processing (ANLP-97), Washington, DC.

  • Pasi Tapanainen and Timo Järvinen. (1997). A non-projective dependency parser. In: Proceedings of the 5th Conference on Applied Natural Language Processing, pages 64–71. Association for Computational Linguistics.

  • Lucien Tesnière. (1959). Éléments de Syntaxe Structurale. Librairie Klincksieck, Paris.

BibTeX

@proceedings {
 AUTHOR = "Gerold Schneider and Fabio Rinaldi and James Dowdall",
 TITLE = "Fast, Deep-Linguistic Statistical Minimalist Dependency Parsing",
 JOURNAL = "COLING-2004 workshop on Recent Advances in Dependency Grammars",
 YEAR = "2004",
}


  • Author: Gerold Schneider, Fabio Rinaldi, James Dowdall
  • Title: Fast, Deep-linguistic Statistical Minimalist Dependency Parsing
  • URL: http://acl.ldc.upenn.edu/coling2004/W4/pdf/5.pdf