2007 ComplexitiesOfDependenciesInNaturalLanguage

  • (Joshi, 2007) ⇒ Aravind K. Joshi. (2007). “Complexity of Dependencies in Natural Language.” Presentation given at Simon Fraser University, Vancouver, Feb 22.

Subject Headings: Tree Adjoining Grammar, Discourse-level Analysis Task, Penn Discourse Treebank.

Notes

Cited By

Quotes

Abstract

  • Crossed (cross-serial) dependencies among complement clauses in Dutch: Jan1 Piet2 Marie3 zag1 laten2 zwemmen3 (i.e., “Jan saw Piet let Marie swim”). More generally, it is possible to obtain a wide range of complex dependencies, i.e., complex combinations of nested and crossed dependencies. Such patterns arise in word-order phenomena such as Scrambling and Clitic Movement, and also from Scope Ambiguity.
  • "TAGs (more precisely, languages of TAGs) belong to the class of languages called Mildly Context-Sensitive Languages (MCSL), characterized by: 1) polynomial parsing complexity; 2) grammars for the languages in this class can characterize a limited set of patterns of nested and crossed dependencies and their combinations; 3) languages in this class have the constant growth property (i.e., sentences, if arranged in increasing order of length, grow only by a bounded amount); 4) this class properly includes CFLs."
  • A CFG's Domain of Locality (its building blocks) consists of one-level trees, e.g., S -> NP VP; VP -> V NP.
  • A TAG's Domain of Locality can extend over more than one level, e.g., alpha1 ⇒ S (NP') (VP (likes NP')).
  • Weak Lexicalization gives the same set of strings but not necessarily the same set of trees, i.e., not the same set of structural descriptions.
  • Strong Lexicalization gives the same set of strings and the same set of trees, i.e., the same set of structural descriptions.
  • Derivation Tree
  • Example: “Who does Bill think Harry likes?” (likes (who) (think (does) (Bill)) (Harry))
  • Transition from Sentence to Discourse.
  • The Argument Structure of Discourse Connectives is similar to Predicate-Argument Structure at the sentence level.
  • Discourse Relations provide a level of description that is: 1) theoretically interesting, linking sentences (clauses) and discourse; 2) identifiable more or less reliably on a sufficiently large scale; and 3) capable of supporting a level of Inference potentially relevant to many NLP Tasks.
  • Broadly, there are two ways of specifying discourse relations: 1) abstract specification and 2) lexical grounding.
    • In abstract specification: 1) relations between two given Abstract Objects are always inferred, and declared by choosing from a pre-defined set of abstract categories; 2) Lexical elements can serve as partial, ambiguous evidence for inference.
    • In lexically grounded relations: 1) Relations can be grounded in lexical elements. 2) Relations may be inferred where lexical elements are absent.
  • Explicit connectives are the lexical elements that trigger discourse relations.
  • PDTB: the Penn Discourse Treebank.
    • Annotation text source: the Wall Street Journal, the same text as the Penn Treebank (2304 articles, ~1M words).
    • PDTB first release (PDTB-1.0) appeared in March 2006.
    • http://www.seas.upenn.edu/~pdtb
    • PDTB final release (PDTB-2.0) is planned for April 2007.
    • Collaborators: Rashmi Prasad, Alan Lee, Nikhil Dinesh, Eleni Miltsakaki, and Bonnie Webber (U. Edinburgh)
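
The distinction between nested and crossed dependencies in the notes above can be made concrete with a small sketch. It assumes the simplified setting of the indexed Dutch example, where all nouns precede all verbs and each noun is linked to the verb carrying the same index. The function names (`dependency_arcs`, `classify`) and the two extra index orders are hypothetical, chosen only for illustration; they are not from the presentation.

```python
from itertools import combinations


def dependency_arcs(nouns, verbs):
    """Link each noun to the verb with the same index.

    Positions are counted over the whole string: nouns first, then verbs.
    Returns a list of (noun_position, verb_position) arcs sorted by noun position.
    """
    offset = len(nouns)
    verb_pos = {idx: offset + i for i, idx in enumerate(verbs)}
    return sorted((i, verb_pos[idx]) for i, idx in enumerate(nouns))


def classify(nouns, verbs):
    """Return 'crossed', 'nested', 'mixed', or 'trivial' for the arc pattern."""
    arcs = dependency_arcs(nouns, verbs)
    kinds = set()
    for (a, b), (c, d) in combinations(arcs, 2):
        # Arcs are sorted by start, so a < c; every noun precedes every verb,
        # so c < b as well. The pair crosses if the first arc ends first.
        kinds.add("crossed" if b < d else "nested")
    if not kinds:
        return "trivial"
    return kinds.pop() if len(kinds) == 1 else "mixed"


# Index order from the Dutch example: Jan1 Piet2 Marie3 zag1 laten2 zwemmen3
print(classify([1, 2, 3], [1, 2, 3]))  # crossed
# Hypothetical mirror-image verb order (3 2 1), for contrast
print(classify([1, 2, 3], [3, 2, 1]))  # nested
# Hypothetical order combining both patterns
print(classify([1, 2, 3], [2, 1, 3]))  # mixed
```

The pairwise test relies on the assumption that nouns and verbs form two contiguous blocks; a general dependency structure would need the full interval comparison.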
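
The multi-level elementary tree alpha1 in the notes above can be combined with other trees by the two standard TAG operations, substitution and adjunction. The following is a toy sketch only: it ignores adjunction constraints and features, the `Node` class and helper names are hypothetical, and the auxiliary tree for "really" is an illustrative assumption that does not appear in the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    kind: str = "internal"  # "internal", "subst", "foot", or "word"


def word(w):
    return Node(w, kind="word")


def subst(label):
    return Node(label, kind="subst")


def foot(label):
    return Node(label, kind="foot")


def substitute(tree, initial):
    """Replace the leftmost substitution node matching the root label of `initial`."""
    for i, child in enumerate(tree.children):
        if child.kind == "subst" and child.label == initial.label:
            tree.children[i] = initial
            return True
        if substitute(child, initial):
            return True
    return False


def plug_foot(aux, subtree):
    """Put `subtree` in place of the foot node of the auxiliary tree `aux`."""
    for i, child in enumerate(aux.children):
        if child.kind == "foot":
            aux.children[i] = subtree
            return True
        if plug_foot(child, subtree):
            return True
    return False


def adjoin(tree, aux):
    """Adjoin `aux` at the first internal node (below the root) with a matching label."""
    for i, child in enumerate(tree.children):
        if child.kind == "internal" and child.label == aux.label:
            plug_foot(aux, child)     # the excised subtree hangs from the foot node
            tree.children[i] = aux    # the auxiliary tree takes the node's place
            return True
        if adjoin(child, aux):
            return True
    return False


def yield_of(tree):
    """Read the words off the leaves, left to right."""
    if tree.kind == "word":
        return [tree.label]
    return [w for c in tree.children for w in yield_of(c)]


# alpha1 from the notes: S (NP') (VP (likes NP')), with NP' as substitution sites
alpha1 = Node("S", [subst("NP"), Node("VP", [word("likes"), subst("NP")])])
substitute(alpha1, Node("NP", [word("Harry")]))
substitute(alpha1, Node("NP", [word("Bill")]))
# Hypothetical auxiliary tree: VP (Adv (really)) (VP*)
beta = Node("VP", [Node("Adv", [word("really")]), foot("VP")])
adjoin(alpha1, beta)
print(" ".join(yield_of(alpha1)))  # Harry really likes Bill
```

Because alpha1 spans more than one level, the verb and both of its argument slots sit in a single elementary tree, which a one-level CFG rule cannot express.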

