2006 SolvingTheProblemOfCascadingErrors

From GM-RKB

Subject Headings: Nonpipelined Algorithm, Joint Inference Algorithm.

Notes

Quotes

Abstract

Introduction

  • Almost any system for natural language understanding must recover hidden linguistic structure at many different levels: parts of speech, syntactic dependencies, named entities, etc. For example, modern semantic role labeling (SRL) systems use the parse of the sentence, and question answering requires question type classification, parsing, named entity tagging, semantic role labeling, and often other tasks, many of which are dependent on one another and must be pipelined together. Pipelined systems are ubiquitous in NLP: in addition to the above examples, commonly parsers and named entity recognizers use part of speech tags and chunking information, and also word segmentation for languages such as Chinese. Almost no NLP task is truly standalone.
  • Most current systems for higher-level, aggregate NLP tasks employ a simple 1-best feed-forward architecture: they greedily take the best output at each stage in the pipeline and pass it on to the next stage. This is the simplest architecture to build (particularly if reusing existing component systems), but errors are frequently made during this pipeline of annotations, and when a system is given incorrectly labeled input it is much harder for that system to do its task correctly. For example, when doing semantic role labeling, if no syntactic constituent of the parse actually corresponds to a given semantic role, then that semantic role will almost certainly be misidentified. It is therefore disappointing, but not surprising, that F-measures on SRL drop more than 10% when switching from gold parses to automatic parses (for instance, from 91.2 to 80.0 for the joint model of Toutanova (2005)).
  • A common improvement on this architecture is to pass k-best lists between processing stages, for example (Sutton and McCallum, 2005; Wellner et al., 2004). Passing on a k-best list gives useful improvements (e.g., in Koomen et al. (2005)), but efficiently enumerating k-best lists often requires very substantial cognitive and engineering effort, e.g., in (Huang and Chiang, 2005; Toutanova et al., 2005).
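The two architectures contrasted above can be sketched with a toy two-stage pipeline. All stage names, outputs, and probabilities below are invented for illustration; the point is only that a greedy 1-best pipeline can commit to a locally best upstream choice that propagates into a worse final output, while sampling complete pipeline runs approximates the joint distribution over final annotations.

```python
import random

# Hypothetical pipeline stages; each returns a distribution over its
# outputs (all names and probabilities are made up for illustration).
def tag_distribution(sentence):
    return {"tagsA": 0.4, "tagsB": 0.6}

def parse_distribution(sentence, tags):
    # Downstream quality depends on the upstream choice: tagsB is the
    # locally best tagging but leads to a worse parse distribution.
    if tags == "tagsA":
        return {"treeGood": 0.9, "treeBad": 0.1}
    return {"treeBad": 0.55, "treeGood": 0.45}

def greedy_pipeline(sentence):
    """1-best feed-forward: keep only the single best output per stage."""
    tag_dist = tag_distribution(sentence)
    tags = max(tag_dist, key=tag_dist.get)
    parse_dist = parse_distribution(sentence, tags)
    return max(parse_dist, key=parse_dist.get)

def sampled_pipeline(sentence, n=1000, seed=0):
    """Approximate joint inference: run the whole pipeline on samples
    drawn from each stage's distribution, then return the most frequent
    final output (the approximate marginal mode)."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n):
        tag_dist = tag_distribution(sentence)
        tags = rng.choices(list(tag_dist), weights=tag_dist.values())[0]
        parse_dist = parse_distribution(sentence, tags)
        tree = rng.choices(list(parse_dist), weights=parse_dist.values())[0]
        counts[tree] = counts.get(tree, 0) + 1
    return max(counts, key=counts.get)
```

In this toy setting the greedy pipeline commits to tagsB and outputs treeBad, while sampling recovers treeGood, whose marginal probability (0.4 × 0.9 + 0.6 × 0.45 = 0.63) is actually the larger one — the cascading-error problem in miniature.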

3.3 k-Best Lists

  • At first glance, k-best lists may seem like they should outperform sampling, because in effect they are the k best samples. However, there are several important reasons why one might prefer sampling. One reason is that the k best paths through a word lattice, or the k best derivations in a parse forest, do not necessarily correspond to the k best sentences or parse trees. In fact, when there are multiple ways to derive the same output, there are no known sub-exponential algorithms for finding the k best outputs in these models. This is not just a theoretical concern: the Stanford parser uses such a grammar, and we found that when generating a 50-best derivation list, these derivations corresponded on average to about half as many unique parse trees. Our approach circumvents this issue entirely, because the samples are generated from the actual output distribution.
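The derivation-versus-tree mismatch described above can be illustrated with a toy packed forest in which several derivations yield the same tree (all trees and probabilities here are made up): the single best derivation need not yield the most probable tree, whereas sampling derivations and keeping only the resulting tree draws from the true tree distribution.

```python
import random
from collections import Counter

# Toy forest: each entry is (tree the derivation yields, derivation
# probability). Several derivations collapse to the same tree.
derivations = [
    ("treeA", 0.20), ("treeA", 0.15), ("treeA", 0.15),  # treeA: 0.50 total
    ("treeB", 0.30),                                     # treeB: 0.30
    ("treeC", 0.20),                                     # treeC: 0.20
]

# The single best *derivation* yields treeB ...
best_derivation_tree = max(derivations, key=lambda d: d[1])[0]  # 'treeB'

# ... but the most probable *tree* is treeA, whose mass is split
# across three derivations.
tree_probs = Counter()
for tree, p in derivations:
    tree_probs[tree] += p
best_tree = tree_probs.most_common(1)[0][0]  # 'treeA'

# Sampling derivations and keeping only the tree draws from the true
# tree distribution, so the mismatch disappears.
rng = random.Random(0)
samples = rng.choices([t for t, _ in derivations],
                      weights=[p for _, p in derivations], k=1000)
sampled_best = Counter(samples).most_common(1)[0][0]
```

A 1-best (or short k-best) derivation list would rank treeB first even though treeA has the greater total probability; the sample-based estimate, by contrast, concentrates on treeA.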

References

  • Rens Bod. (1995). The problem of computing the most probable tree in data-oriented parsing and stochastic tree grammars. In: Proceedings of EACL 1995.
  • Xavier Carreras and Lluís Màrquez. (2004). Introduction to the CoNLL-2004 shared task: Semantic role labeling. In: Proceedings of CoNLL 2004.
  • Xavier Carreras and Lluís Màrquez. (2005). Introduction to the CoNLL-2005 shared task: Semantic role labeling. In: Proceedings of CoNLL 2005.
  • Eugene Charniak. (2000). A maximum-entropy-inspired parser. In: Proceedings of the 14th National Conference on Artificial Intelligence.
  • Robert G. Cowell, A. Philip Dawid, Steffen L. Lauritzen, and David J. Spiegelhalter. (2003). Probabilistic Networks and Expert Systems. Springer.
  • Richard Crouch. (2005). Packed rewriting for mapping semantics to KR. In: Proceedings of the 6th International Workshop on Computational Semantics.
  • Ido Dagan, Oren Glickman, and Bernardo Magnini. (2005). The PASCAL recognizing textual entailment challenge. In: Proceedings of the PASCAL Challenges Workshop on Recognizing Textual Entailment.
  • Stuart Geman and Mark Johnson. (2002). Dynamic programming for parsing and estimation of stochastic unification-based grammars. In: Proceedings of ACL 2002.
  • Joshua Goodman. (1998). Parsing Inside-Out. Ph.D. thesis, Harvard University.
  • Aria Haghighi, Kristina Toutanova, and Christopher D. Manning. (2005). A joint model for semantic role labeling. In: Proceedings of CoNLL 2005.
  • Liang Huang and David Chiang. (2005). Better k-best parsing. In: Proceedings of the 9th International Workshop on Parsing Technologies.
  • Lauri Karttunen. (2000). Applications of finite-state transducers in natural-language processing. In: Proceedings of the Fifth International Conference on Implementation and Application of Automata.
  • Dan Klein and Christopher D. Manning. (2003). Accurate unlexicalized parsing. In: Proceedings of ACL 2003.
  • Peter Koomen, Vasin Punyakanok, Dan Roth, and Wen tau Yih. (2005). Generalized inference with multiple semantic role labeling systems. In: Proceedings of CoNLL 2005, pages 181–184.
  • John Lafferty, Andrew McCallum, and Fernando Pereira. (2001). Conditional Random Fields: Probabilistic models for segmenting and labeling sequence data. In: Proceedings of the Eighteenth International Conference on Machine Learning, pages 282–289.
  • Bill MacCartney, Trond Grenager, Marie-Catherine de Marneffe, Daniel Cer, and Christopher D. Manning. (2006). Learning to recognize features of valid textual entailments. In: Proceedings of NAACL-HLT 2006.
  • Christopher D. Manning and Hinrich Schütze. (1999). Foundations of Statistical Natural Language Processing. The MIT Press, Cambridge, Massachusetts.
  • John T. Maxwell III and Ronald M. Kaplan. (1995). A method for disjunctive constraint satisfaction. In Mary Dalrymple, Ronald M. Kaplan, John T. Maxwell III, and Annie Zaenen, editors, Formal Issues in Lexical-Functional Grammar, number 47 in CSLI Lecture Notes Series, chapter 14, pages 381–481. CSLI Publications.
  • Andrew Y. Ng and Michael Jordan. (2001). Convergence rates of the voting Gibbs classifier, with application to Bayesian feature selection. In: Proceedings of the Eighteenth International Conference on Machine Learning.
  • Charles Sutton and Andrew McCallum. (2005). Joint parsing and semantic role labeling. In: Proceedings of CoNLL 2005, pages 225–228.
  • Kristina Toutanova, Aria Haghighi, and Christopher D. Manning. (2005). Joint learning improves semantic role labeling. In: Proceedings of ACL 2005.
  • Kristina Toutanova. (2005). Effective statistical models for syntactic and semantic disambiguation. Ph.D. thesis, Stanford University.
  • Ben Wellner, Andrew McCallum, Fuchun Peng, and Michael Hay. (2004). An integrated, conditional model of information extraction and coreference with application to citation matching. In: Proceedings of the 20th Annual Conference on Uncertainty in Artificial Intelligence.


Author: Jenny Rose Finkel, Christopher D. Manning, and Andrew Y. Ng.
Title: Solving the Problem of Cascading Errors: Approximate Bayesian Inference for Linguistic Annotation Pipelines.
URL: http://www.stanford.edu/~jrfinkel/papers/pipeline.pdf
Year: 2006.