2005 ThePropositionBank


Subject Headings: PropBank, Semantic Role Labeling

Notes

Experiments

  • Below are some experiments on the paper's sample sentences using the Link Tagger.

Phrase-Structure Grammar [S [NP-SBJ [NNP [Paul]]] [VP [VBD [broke]] [NP [DT [the]] [NN [window]]]] [. [.]]]

Link Grammar

    +--------------Xp---------------+
    |             +-----Os-----+    |
    +---Wd--+--Ss-+     +--Ds--+    |
    |       |     |     |      |    |
LEFT-WALL Paul broke.v the window.n .

Phrase-Structure Grammar

Link Grammar

    +------------Xp------------+
    +-----Wd------+            |
    |      +--Ds--+---Ss--+    |
    |      |      |       |    |
LEFT-WALL the window.n broke.v .

Phrase-Structure Grammar [S [NP-SBJ [DT [The]] [NN [window]]] [VP [VBZ [is]] [VP [VBN [broken]]]] [. [.]]]

Link Grammar

    +---------------Xp---------------+
    +-----Wd------+                  |
    |      +--Ds--+--Ss--+--Pv--+    |
    |      |      |      |      |    |
LEFT-WALL the window.n is.v broken.v .
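
  • The bracketed phrase-structure notation used in these notes can be read mechanically. The Python sketch below is only an illustration, not part of the paper or of any tagger: the function names parse_brackets and leaves are hypothetical, and it assumes each node is written as [LABEL child ...] with every word wrapped in its own brackets, e.g. [NNP [Paul]].

```python
def parse_brackets(text):
    """Parse '[LABEL child ...]' notation into nested (label, children) tuples.

    Assumes well-formed input; unbalanced brackets raise IndexError or ValueError.
    """
    # Pad brackets with spaces so that split() yields them as separate tokens.
    tokens = text.replace("[", " [ ").replace("]", " ] ").split()
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] != "[":
            raise ValueError("expected '[' at token %d" % pos)
        pos += 1
        label = tokens[pos]  # first token after '[' is the node label (or the word)
        pos += 1
        children = []
        while tokens[pos] != "]":
            if tokens[pos] == "[":
                children.append(node())
            else:
                children.append(tokens[pos])  # bare token, kept as a string child
                pos += 1
        pos += 1  # consume the closing ']'
        return (label, children)

    tree = node()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the root node")
    return tree


def leaves(tree):
    """Return the words of a parsed tree in left-to-right order."""
    label, children = tree
    if not children:
        return [label]  # a childless node such as [Paul] is a word
    words = []
    for child in children:
        words.extend(leaves(child) if isinstance(child, tuple) else [child])
    return words


SENTENCE = "[S [NP-SBJ [NNP [Paul]]] [VP [VBD [broke]] [NP [DT [the]] [NN [window]]]] [. [.]]]"

tree = parse_brackets(SENTENCE)
print(leaves(tree))
```

  • Calling leaves on the parsed tree recovers the token sequence, which is a quick check that a hand-typed bracketing is balanced and covers the whole sentence.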

Cited By

Quotes

Abstract

The Proposition Bank project takes a practical approach to semantic representation, adding a layer of predicate-argument information, or semantic role labels, to the syntactic structures of the Penn Treebank. The resulting resource can be thought of as shallow, in that it does not represent coreference, quantification, and many other higher-order phenomena, but also broad, in that it covers every instance of every verb in the corpus and allows representative statistics to be calculated.

We discuss the criteria used to define the sets of semantic roles used in the annotation process, and analyze the frequency of syntactic/semantic alternations in the corpus. We describe an automatic system for semantic role tagging trained on the corpus, and discuss the effect on its performance of various types of information, including a comparison of full syntactic parsing with a flat representation, and the contribution of the empty “trace” categories of the Treebank.


References

  • Abeillé, Anne, editor. (2003). Building and Using Parsed Corpora. Language and Speech series. Kluwer, Dordrecht.
  • Alshawi, Hiyan, editor. (1992). The Core Language Engine. MIT Press, Cambridge, MA.
  • Baker, Collin F., Charles J. Fillmore, and John B. Lowe. (1998). The Berkeley FrameNet project. In: Proceedings of COLING/ACL, pages 86–90, Montreal.
  • Bangalore, Srinivas and Aravind K. Joshi. (1999). Supertagging: An approach to almost parsing. Computational Linguistics, 25(2):237–265.
  • Brent, Michael R. (1993). From grammar to lexicon: Unsupervised learning of lexical syntax. Computational Linguistics, 19(2):243–262.
  • Briscoe, Ted and John Carroll. (1997). Automatic extraction of subcategorization from corpora. In Fifth Conference on Applied Natural Language Processing, pages 356–363, Washington, D.C. ACL.
  • Carreras, Xavier and Lluís Màrquez. (2004). Introduction to the CoNLL-2004 shared task: Semantic role labeling. In HLT-NAACL 2004 Workshop: Eighth Conference on Computational Natural Language Learning (CoNLL-2004), pages 89–97, Boston, MA.
  • Carroll, John, Ted Briscoe, and Antonio Sanfilippo. (1998). Parser evaluation: a survey and a new proposal. In: Proceedings of the 1st International Conference on Language Resources and Evaluation, pages 447–454, Granada, Spain.
  • Charniak, Eugene. (2000). A maximum-entropy-inspired parser. In: Proceedings of the 1st Annual Meeting of the North American Chapter of the ACL (NAACL), pages 132–139, Seattle, Washington.
  • Chen, John and Owen Rambow. (2003). Use of deep linguistic features for the recognition and labeling of semantic arguments. In Michael Collins and Mark Steedman, editors, Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, pages 41–48.
  • Collins, Michael. (2000). Discriminative reranking for natural language parsing. In: Proceedings of the International Conference on Machine Learning (ICML), Stanford, California.
  • Collins, Michael. (1999). Head-driven Statistical Models for Natural Language Parsing. Ph.D. thesis, University of Pennsylvania, Philadelphia.
  • Dang, Hoa Trang, Karin Kipper, Martha Palmer, and Joseph Rosenzweig. (1998). Investigating regular sense extensions based on intersective Levin classes. In COLING/ACL-98, pages 293–299, Montreal. ACL.
  • Dienes, Peter and Amit Dubey. (2003). Antecedent recovery: Experiments with a trace tagger. In 2003 Conference on Empirical Methods in Natural Language Processing (EMNLP), Sapporo, Japan.
  • Dorr, Bonnie J. and Douglas Jones. (2000). Acquisition of semantic lexicons: Using word sense disambiguation to improve precision. In Evelyn Viegas, editor, Breadth and Depth of Semantic Lexicons. Kluwer Academic Publishers, Norwell, MA, pages 79–98.
  • Dowty, David R. (1991). Thematic proto-roles and argument selection. Language, 67(3):547–619.
  • Fillmore, Charles J. (1976). Frame semantics and the nature of language. In Annals of the New York Academy of Sciences: Conference on the Origin and Development of Language and Speech, volume 280, pages 20–32.
  • Fillmore, Charles J. and B. T. S. Atkins. (1998). FrameNet and lexicographic relevance. In: Proceedings of the First International Conference on Language Resources and Evaluation, Granada, Spain.
  • Fillmore, Charles J. and Collin F. Baker. (2001). Frame semantics for text understanding. In: Proceedings of NAACL WordNet and Other Lexical Resources Workshop, Pittsburgh, June.
  • Gildea, Daniel and Julia Hockenmaier. (2003). Identifying semantic roles using combinatory categorial grammar. In 2003 Conference on Empirical Methods in Natural Language Processing (EMNLP), Sapporo, Japan.
  • Gildea, Daniel and Daniel Jurafsky. (2002). Automatic labeling of semantic roles. Computational Linguistics, 28(3):245–288.
  • Gildea, Daniel and Martha Palmer. (2002). The necessity of syntactic parsing for predicate argument recognition. In: Proceedings of ACL-02, pages 239–246, Philadelphia, PA.
  • Hajičová, Eva and Ivona Kučerová. (2002). Argument/Valency Structure in PropBank, LCS Database and Prague Dependency Treebank: A Comparative Pilot Study. In: Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002), pages 846–851. ELRA.
  • He, Shan and Daniel Gildea. (2004). Semantic roles labeling by maximum entropy model. Technical Report 847, University of Rochester.
  • Hobbs, Jerry R., Douglas E. Appelt, John Bear, David Israel, Megumi Kameyama, Mark E. Stickel, and Mabry Tyson. (1997). FASTUS: A cascaded finite-state transducer for extracting information from natural-language text. In Emmanuel Roche and Yves Schabes, editors, Finite-State Language Processing. MIT Press, Cambridge, MA, pages 383–406.
  • Hockenmaier, Julia and Mark Steedman. (2002). Generative models for statistical parsing with Combinatory Categorial Grammar. In: Proceedings of ACL-02, pages 335–342, Philadelphia, PA.
  • Johnson, Christopher R., Charles J. Fillmore, Miriam R. L. Petruck, Collin F. Baker, Michael Ellsworth, Josef Ruppenhofer, and Esther J. Wood. (2002). FrameNet: Theory and practice. Version 1.0, http://www.icsi.berkeley.edu/framenet/.
  • Johnson, Mark. (2002). A simple pattern-matching algorithm for recovering empty nodes and their antecedents. In: Proceedings of ACL-02, Philadelphia, PA.
  • Kipper, Karin, Hoa Trang Dang, and Martha Palmer. (2000). Class-based construction of a verb lexicon. In: Proceedings of the seventh National Conference on Artificial Intelligence (AAAI-2000), Austin, TX, July-August.
  • Kipper, Karin, Martha Palmer, and Owen Rambow. (2002). Extending PropBank with VerbNet semantic predicates. Unpublished manuscript, presented at Workshop on Applied Interlinguas, AMTA-2002, October.
  • Korhonen, Anna and Ted Briscoe. (2004). Extended lexical-semantic classification of English verbs. In: Proceedings of the HLT/NAACL Workshop on Computational Lexical Semantics, Boston, MA.
  • Korhonen, Anna, Yuval Krymolowsky, and Zvika Marx. (2003). Clustering polysemic subcategorization frame distributions semantically. In: Proceedings of the 41st Annual Conference of the Association for Computational Linguistics (ACL-03), Sapporo, Japan.
  • Levin, Beth. (1993). English Verb Classes And Alternations: A Preliminary Investigation. University of Chicago Press, Chicago.
  • Manning, Christopher D. (1993). Automatic acquisition of a large subcategorization dictionary from corpora. In: Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, pages 235–242, Ohio State University, Columbus, Ohio.
  • Marcus, Mitchell P., Beatrice Santorini, and Mary Ann Marcinkiewicz. (1993). Building a large annotated corpus of English: The Penn treebank. Computational Linguistics, 19(2):313–330, June.
  • McCarthy, Diana. (2000). Using semantic preferences to identify verbal participation in role switching alternations. In: Proceedings of the 1st Annual Meeting of the North American Chapter of the ACL (NAACL), pages 256–263, Seattle, Washington.
  • Merlo, Paola and Suzanne Stevenson. (2001). Automatic verb classification based on statistical distribution of argument structure. Computational Linguistics, 27(3):373–408, September.
  • Miller, Scott, Michael Crystal, Heidi Fox, Lance Ramshaw, Richard Schwartz, Rebecca Stone, Ralph Weischedel, and the Annotation Group. (1998). Algorithms that learn to extract information – BBN: Description of the SIFT system as used for MUC-7. In: Proceedings of the Seventh Message Understanding Conference (MUC-7), April.
  • Palmer, Martha, Olga Babko-Malaya, and Hoa Trang Dang. (2004). Different sense granularities for different applications. In 2nd Workshop on Scalable Natural Language Understanding Systems at HLT/NAACL-04, Boston, MA.
  • Palmer, Martha, Joseph Rosenzweig, and Scott Cotton. (2001). Predicate argument analysis of the Penn treebank. In: Proceedings of HLT 2001, First International Conference on Human Language Technology Research, San Diego, CA, March.
  • Pradhan, S., K. Hacioglu, W. Ward, J. Martin, and Daniel Jurafsky. (2003). Semantic role parsing: Adding semantic structure to unstructured text. In: Proceedings of the International Conference on Data Mining (ICDM-2003). Melbourne, FL.
  • Rambow, Owen, Bonnie J. Dorr, Karin Kipper, Ivona Kučerová, and Martha Palmer. (2003). Automatically deriving tectogrammatical labels from other resources: A comparison of semantic labels across frameworks. The Prague Bulletin of Mathematical Linguistics, 80.
  • Ratnaparkhi, Adwait. (1997). A linear observed time statistical parser based on maximum entropy models. In: Proceedings of the Second Conference on Empirical Methods in Natural Language Processing, pages 1–10, Providence, Rhode Island. ACL.
  • Ray, Soumya and Mark Craven. (2001). Representing sentence structure in hidden Markov model for information extraction. In Seventeenth International Joint Conference on Artificial Intelligence (IJCAI-01), Seattle, Washington.
  • Schulte im Walde, Sabine. (2000). Clustering verbs semantically according to their alternation behaviour. In: Proceedings of the 18th International Conference on Computational Linguistics (COLING-00), pages 747–753, Saarbrücken, Germany.
  • Schulte im Walde, Sabine and Chris Brew. (2002). Inducing German semantic verb classes from purely syntactic subcategorisation information. In: Proceedings of ACL-02, pages 223–230, Philadelphia, PA.
  • Siegel, Sidney and N. John Castellan, Jr. (1988). Nonparametric Statistics for the Behavioral Sciences. McGraw-Hill, New York, second edition.
  • Steedman, Mark. (2000). The Syntactic Process. The MIT Press, Cambridge Mass.
  • Surdeanu, Mihai, Sanda M. Harabagiu, John Williams, and Paul Aarseth. (2003). Using predicate-argument structures for information extraction. In: Proceedings of the 41st Annual Conference of the Association for Computational Linguistics (ACL-03), pages 8–15.
  • Tjong Kim Sang, Erik and Sabine Buchholz. (2000). Introduction to the CoNLL-2000 shared task: Chunking. In: Proceedings of CoNLL-2000 and LLL-2000.

2005 ThePropositionBank

Author: Martha Palmer, Daniel Gildea, Paul Kingsbury
Title: The Proposition Bank: An Annotated Corpus with Semantic Roles
Journal: Computational Linguistics Journal
URL: http://www.cs.rochester.edu/~gildea/palmer-propbank-cl.pdf
DOI: 10.1162/0891201053630264
Year: 2005