2003 AccurateUnlexicalizedParsing

Subject Headings: Stanford Parser, Natural Language Parser, Unlexicalized PCFG.

Notes

Cited By

~714 http://scholar.google.com/scholar?cites=9225993212759532262

Quotes

Abstract

We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down false independence assumptions latent in a vanilla treebank grammar. Indeed, its performance of 86.36% (LP/LR F1) is better than that of early lexicalized PCFG models, and surprisingly close to the current state-of-the-art. This result has potential uses beyond establishing a strong lower bound on the maximum possible accuracy of unlexicalized models: an unlexicalized PCFG is much more compact, easier to replicate, and easier to interpret than more complex lexical models, and the parsing algorithms are simpler, more widely understood, of lower asymptotic complexity, and easier to optimize.
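To make the idea of a state split concrete, the following is a minimal sketch of parent annotation, one well-known kind of split (studied in Johnson 1998, listed in the references below). It is not the authors' implementation. The tree encoding, the function names (`parent_annotate`, `rule_counts`, `mle_pcfg`), and the two-sentence toy treebank are assumptions made for illustration only. The sketch reads rule probabilities off the trees by relative frequency, first with vanilla labels and then with each phrasal label split by its parent category.

```python
from collections import Counter

# Assumed toy encoding: an internal node is (label, [children]);
# a preterminal is (tag, word). Not a format used by the paper.

def parent_annotate(tree, parent=None):
    """Split each phrasal label by its parent category, e.g. NP under S -> NP^S."""
    label, rest = tree
    if isinstance(rest, str):  # preterminal: tags are left unsplit in this sketch
        return (label, rest)
    new_label = f"{label}^{parent}" if parent else label
    return (new_label, [parent_annotate(child, label) for child in rest])

def rule_counts(tree, counts):
    """Count phrasal rewrite rules (lhs, rhs) in one tree."""
    label, rest = tree
    if isinstance(rest, str):
        return
    counts[(label, tuple(child[0] for child in rest))] += 1
    for child in rest:
        rule_counts(child, counts)

def mle_pcfg(trees):
    """Relative-frequency estimate of P(rhs | lhs) from a list of trees."""
    counts = Counter()
    for tree in trees:
        rule_counts(tree, counts)
    lhs_totals = Counter()
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in counts.items()}

# Invented toy treebank: subject NPs are pronouns, object NPs are not.
treebank = [
    ("S", [("NP", [("PRP", "she")]),
           ("VP", [("VBD", "saw"),
                   ("NP", [("DT", "the"), ("NN", "dog")])])]),
    ("S", [("NP", [("PRP", "he")]),
           ("VP", [("VBD", "heard"),
                   ("NP", [("DT", "a"), ("NN", "cat")])])]),
]

vanilla = mle_pcfg(treebank)
split = mle_pcfg([parent_annotate(t) for t in treebank])

print(vanilla[("NP", ("PRP",))])         # one NP distribution shared by all positions
print(split[("NP^S", ("PRP",))])         # subject NPs get their own distribution
print(split[("NP^VP", ("DT", "NN"))])    # object NPs get their own distribution
```

In the vanilla grammar every NP shares one expansion distribution, so the pronoun rule receives the same probability in subject and object position. After the split, the subject and object NPs are separate states with separate distributions, which is the kind of false independence assumption the abstract refers to.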

References

  • James K. Baker. (1979). Trainable grammars for speech recognition. In D. H. Klatt and J. J. Wolf, editors, Speech Communication Papers for the 97th Meeting of the Acoustical Society of America, pages 547–550.
  • Taylor L. Booth and Richard A. Thomson. (1973). Applying probability measures to abstract languages. IEEE Transactions on Computers, C-22:442–450.
  • Sharon A. Caraballo and Eugene Charniak. (1998). New figures of merit for best-first probabilistic chart parsing. Computational Linguistics, 24:275–298.
  • Eugene Charniak, Sharon Goldwater, and Mark Johnson. (1998). Edge-based best-first chart parsing. In: Proceedings of the Sixth Workshop on Very Large Corpora, pages 127–133.
  • Eugene Charniak. (1996). Tree-bank grammars. In: Proceedings of the 13th National Conference on Artificial Intelligence, pp. 1031–1036.
  • Eugene Charniak. (1997). Statistical Parsing with a Context-Free Grammar and Word Statistics. In: Proceedings of the 14th National Conference on Artificial Intelligence, pp. 598–603.
  • Eugene Charniak. (2000). A maximum-entropy-inspired parser. In NAACL 1, pages 132–139.
  • Eugene Charniak. (2001). Immediate-head parsing for language models. In ACL 39.
  • Noam Chomsky. (1965). Aspects of the Theory of Syntax. MIT Press, Cambridge, MA.
  • Michael Collins. (1996). A new statistical parser based on bigram lexical dependencies. In ACL 34, pages 184–191.
  • Michael Collins. (1999). Head-Driven Statistical Models for Natural Language Parsing. Ph.D. thesis, Univ. of Pennsylvania.
  • Jason Eisner and Giorgio Satta. (1999). Efficient parsing for bilexical context-free grammars and head-automaton grammars. In ACL 37, pages 457–464.
  • Marilyn Ford, Joan Bresnan, and Ronald M. Kaplan. (1982). A competence-based theory of syntactic closure. In Joan Bresnan, editor, The Mental Representation of Grammatical Relations, pages 727–796. MIT Press, Cambridge, MA.
  • Daniel Gildea. (2001). Corpus variation and parser performance. In 2001 Conference on Empirical Methods in Natural Language Processing (EMNLP).
  • Donald Hindle and Mats Rooth. (1993). Structural ambiguity and lexical relations. Computational Linguistics, 19(1):103–120.
  • Mark Johnson. (1998). PCFG models of linguistic tree representations. Computational Linguistics, 24:613–632.
  • Dan Klein and Christopher D. Manning. (2001). “Parsing with Treebank Grammars: Empirical bounds, theoretical models, and the structure of the Penn treebank.” In: Proceedings of ACL 39/EACL 10.
  • David M. Magerman. (1995). Statistical decision-tree models for parsing. In ACL 33, pages 276–283.
  • Andrew Radford. (1988). Transformational Grammar. Cambridge University Press, Cambridge.
  • Dana Ron, Yoram Singer, and Naftali Tishby. (1994). The power of amnesia. Advances in Neural Information Processing Systems, volume 6, pages 176–183. Morgan Kaufmann.

BibTeX

@inproceedings{DBLP:conf/acl/KleinM03,
 author    = {Dan Klein and
             Christopher D. Manning},
 title     = {Accurate Unlexicalized Parsing.},
 booktitle = {ACL},
 year      = {2003},
 pages     = {423-430},
 ee        = {http://acl.ldc.upenn.edu/acl2003/main/pdfs/Klein.pdf},
 bibsource = {DBLP, http://dblp.uni-trier.de}
}


2003 AccurateUnlexicalizedParsing
Author: Dan Klein, Christopher D. Manning
Title: Accurate Unlexicalized Parsing
Journal: Proceedings of ACL 2003
Title URL: http://nlp.stanford.edu/~manning/papers/unlexicalized-parsing.pdf
Year: 2003