2006 OnlineLearningOfApproxDepParsing

From GM-RKB
Jump to navigation Jump to search

Subject Headings: Dependency Parsing Algorithm, Higher Order Feature.

Notes

Quotes

Abstract

1 Introduction

  • Dependency representations of sentences (Hudson, 1984; Me´lˇcuk, 1988) model head-dependent syntactic relations as edges in a directed graph. Figure 1 displays a dependency representation for the sentence John hit the ball with the bat. This sentence is an example of a projective (or nested) tree representation, in which all edges can be drawn in the plane with none crossing. Sometimes a non-projective representations are preferred, as in the sentence in Figure 2.1 In particular, for freer-word order languages, non-projectivity is a common phenomenon since the relative positional constraints on dependents is much less rigid. The dependency structures in Figures 1 and 2 satisfy the tree constraint: they are weakly connected graphs with a unique root node, and each non-root node has a exactly one parent. Though trees are more common, some formalisms allow for words to modify multiple parents (Hudson, 1984).
  • Recently, McDonald et al. (2005c) have shown that treating dependency parsing as the search for the highest scoring maximum spanning tree (MST) in a graph yields efficient algorithms for both projective and non-projective trees. When combined with a discriminative online learning algorithm and a rich feature set, these models provide state-of-the-art performance across multiple languages. However, the parsing algorithms require that the score of a dependency tree factors as a sum of the scores of its edges. This first-order factorization is very restrictive since it only allows for features to be defined over single attachment decisions. Previous work has shown that conditioning on neighboring decisions can lead to significant improvements in accuracy (Yamada and Matsumoto, 2003; Charniak, 2000).
  • In this paper we extend the MST parsing framework to incorporate higher-order feature representations of bounded-size connected subgraphs. We also present an algorithm for acyclic dependency graphs, that is, dependency graphs in which a word may depend on multiple heads. In both cases parsing is in general intractable and we provide novel approximate algorithms to make these cases tractable. We evaluate these algorithms within an online learning framework, which has been shown to be robust with respect approximate inference, and describe experiments displaying that these new models lead to state-of-the-art accuracy for English and the best accuracy we know of for Czech and Danish.

References


,

 AuthorvolumeDate ValuetitletypejournaltitleUrldoinoteyear
2006 OnlineLearningOfApproxDepParsingRyan T. McDonald
Fernando Pereira
Online Learning of Approximate Dependency Parsing Algorithmshttp://acl.ldc.upenn.edu/eacl2006/main/papers/04 2 mcdonaldpereira 26.pdf