2002 TowardsAnswerExtraction

Subject Headings: Link Grammar, Natural Language Processing, Question Answering

Notes

Cited By

Quotes

Abstract

The shortcomings of traditional Information Retrieval are most evident when users require exact information rather than relevant documents. This practical need is pushing the research community towards systems that can exactly pinpoint those parts of documents that contain the information requested. Answer Extraction (AE) systems aim to satisfy this need. This paper presents one such system (ExtrAns) which works by transforming documents and queries into a semantic representation called Minimal Logical Form (MLF) and derives the answers by logical proof from the documents. MLFs use underspecification to overcome the problems associated with a complete semantic representation and offer the possibility of monotonic, non-destructive extension.

3. Syntactic Processing

The syntactic analysis uses the robust dependency-based parser Link Grammar (LG) [16], which is able to handle a wide range of syntactic structures [17]. Syntactically unresolvable ambiguities, such as prepositional phrase attachment or gerund and infinitive constructions, are treated with a corpus-based approach [2].

LG uses linkages to describe the syntactic structure of a sentence (see figure 2). Links connect pairs of words in such a way that the linking requirements of each word in the sentence are satisfied, that the links do not cross, and that the words form a connected graph. Despite some extensions at the lexical and syntactic level, processing the frequent occurrences of multi-word, domain-specific terminology proved problematic for LG. A new module, capable of identifying these previously detected terms, ensures that they are parsed as single syntactic units. This reduces the complexity of parsing the Aircraft Maintenance Manual (AMM) by as much as 50%. The output of LG has also been extended to include the direction of the linkages, as this information is vital for anaphora resolution and semantic analysis.
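
The two structural conditions on a linkage (no crossing links, connected graph) can be illustrated with a minimal Python sketch. It is not part of ExtrAns or of the LG parser: the function names and the representation of a link as a pair of word positions are assumptions made for illustration, and the third condition (satisfying each word's linking requirements) is not checked because it depends on the LG dictionary.

```python
from itertools import combinations


def links_cross(a, b):
    """Two links cross if exactly one endpoint of one lies strictly inside the other."""
    (i, j), (k, l) = sorted(a), sorted(b)
    return i < k < j < l or k < i < l < j


def is_planar(links):
    """True if no pair of links crosses when drawn above the sentence."""
    return not any(links_cross(a, b) for a, b in combinations(links, 2))


def is_connected(n_words, links):
    """True if every word position 0..n_words-1 is reachable from position 0."""
    if n_words == 0:
        return True
    neighbours = {w: set() for w in range(n_words)}
    for i, j in links:
        neighbours[i].add(j)
        neighbours[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        word = stack.pop()
        for other in neighbours[word] - seen:
            seen.add(other)
            stack.append(other)
    return len(seen) == n_words


def is_valid_linkage(n_words, links):
    """Check the planarity and connectivity conditions only."""
    return is_planar(links) and is_connected(n_words, links)
```

For example, `is_valid_linkage(6, [(0, 1), (1, 2), (2, 3), (2, 4), (4, 5)])` holds, whereas a linkage containing both `(0, 2)` and `(1, 3)` is rejected because those two links cross.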

As LG returns all possible parses, it is necessary to disambiguate among them [13]. The two possibilities for the prepositional phrase attachment returned in figure 2 will be reduced to (b) by the disambiguator, as this linkage correctly identifies the dependency relations. The link Wd connects the subject “coax cable” to the wall. The wall functions as a dummy word at the beginning of every sentence and has linking requirements like any other word. Ss links the transitive verb “connects” to its subject, with the subject on the left and the verbal head on the right. The transitive verb and its direct object “external antenna”, which acts as the head of a noun phrase, are connected by the Os link. MVp connects the verb to the modifying prepositional phrase. Finally, the link Js connects the preposition “to” with its object “ANT connection”.
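
As a rough illustration, linkage (b) can be written down as labelled links and read off as a verb frame. This is a sketch under stated assumptions, not the ExtrAns implementation: the word sequence is reconstructed from the link descriptions above (figure 2 is not reproduced here), multi-word terms are treated as single units, and the `Link` type and the `find` and `verb_frame` helpers are hypothetical names.

```python
from typing import NamedTuple, Optional


class Link(NamedTuple):
    label: str   # LG link type, e.g. "Ss"
    left: str    # word on the left end of the link
    right: str   # word on the right end of the link


# Linkage (b), reconstructed from the prose description (assumption).
LINKAGE_B = [
    Link("Wd", "LEFT-WALL", "coax cable"),
    Link("Ss", "coax cable", "connects"),
    Link("Os", "connects", "external antenna"),
    Link("MVp", "connects", "to"),
    Link("Js", "to", "ANT connection"),
]


def find(linkage, label) -> Optional[Link]:
    """Return the first link with the given label, or None."""
    for link in linkage:
        if link.label == label:
            return link
    return None


def verb_frame(linkage):
    """Collect subject, object and verb-attached PP from Ss, Os, MVp and Js links."""
    frame = {}
    subj, obj, mod = find(linkage, "Ss"), find(linkage, "Os"), find(linkage, "MVp")
    if subj:
        frame["verb"], frame["subject"] = subj.right, subj.left
    if obj:
        frame["object"] = obj.right
    if mod:
        # The Js link whose left end is the preposition gives the PP object.
        prep_obj = next(
            (l for l in linkage if l.label == "Js" and l.left == mod.right), None
        )
        if prep_obj:
            frame["pp"] = (mod.right, prep_obj.right)
    return frame
```

Calling `verb_frame(LINKAGE_B)` yields the verb “connects” with subject “coax cable”, object “external antenna” and the prepositional phrase (“to”, “ANT connection”) attached to the verb, which is the reading selected by the disambiguator.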

These dependency relations are used to generate the semantic representation of the sentence. LG also has a robust component for handling complex or ungrammatical structures, so that ExtrAns can still produce MLFs for such sentences, extended with special predicates that mark the unprocessed words as “keywords”.
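
The keyword fallback can be pictured with a very small sketch. The `keyword(...)` notation and the function below are hypothetical and only illustrate the idea of marking words that the parser could not integrate; the actual MLF predicate syntax is not given in this section.

```python
def keyword_predicates(tokens, linked_positions):
    """Emit a keyword predicate for every token position not covered by the linkage.

    `tokens` is the list of words (or multi-word terms) in the sentence and
    `linked_positions` is the set of positions the parser managed to link.
    """
    return [
        f"keyword({token})"
        for position, token in enumerate(tokens)
        if position not in linked_positions
    ]
```

With the illustrative input `keyword_predicates(["coax cable", "connects", "external antenna", "hereinafter"], {0, 1, 2})`, only the last token is marked as a keyword.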

Sentences that contain nominalizations are handled with a small hand-crafted resource (a lexicon of nominalizations), which covers the most important cases, e.g. “to edit a text” ↔ “editor of a text” / “text editor”. The system also includes hyponymy and synonymy relations based on the WordNet model.
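
One way to picture the nominalization lexicon is as a mapping from nominal heads to their base verbs, so that a verbal and a nominal expression reduce to the same head and argument pair. The single entry and the `normalise` helper below are assumptions for illustration; the actual format of the ExtrAns resource is not described here.

```python
# Hypothetical lexicon entry, taken from the example in the text.
NOMINALIZATION_LEXICON = {
    "editor": "edit",
}


def normalise(head, argument):
    """Map a (head, argument) pair to its verbal form when the head is a known nominalization."""
    return (NOMINALIZATION_LEXICON.get(head, head), argument)
```

Under this sketch, “to edit a text” gives `normalise("edit", "text")` and both “editor of a text” and “text editor” give `normalise("editor", "text")`; all three reduce to `("edit", "text")`, so a query phrased one way can match a document phrased the other way.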

References

  • [1] Eric Breck, John Burger, Lisa Ferro, Warren Greiff, Marc Light, Inderjeet Mani, and Jason Rennie, ‘Another sys called Qanda’, in Voorhees and Harman [21].
  • [2] Eric D. Brill and Philip Resnik, ‘A rule-based approach to prepositional phrase attachment disambiguation’, in Proceedings of COLING ’94, volume 2, pp. 998–1004, Kyoto, Japan, (1994).
  • [3] C.L.A. Clarke, G.V. Cormack, D.I.E. Kisman, and T.R. Lynam, ‘Question answering by passage selection (MultiText experiments for TREC-9)’, in Voorhees and Harman [21].
  • [4] Michael Collins, ‘A new statistical parser based on bigram lexical dependencies’, in Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, ACL-96, pp. 184–191, (1996).
  • [5] Ann Copestake, Dan Flickinger, and Ivan A. Sag, ‘Minimal recursion semantics: an introduction’, Technical report, CSLI, Stanford University, Stanford, CA, (1997).
  • [6] David Elworthy, ‘Question answering using a large NLP system’, in Voorhees and Harman [21].
  • [7] Olivier Ferret, Brigitte Grau, Martine Hurault-Plantet, and Gabriel Illouz, ‘QALC - the question-answering system of LIMSI-CNRS’, in Voorhees and Harman [21].
  • [8] Sanda M. Harabagiu, Dan Moldovan, Marius Paşca, Rada Mihalcea, Mihai Surdeanu, Razvan C. Bunescu, Roxana Gîrju, Vasile Rus, and Paul Morarescu, ‘FALCON: Boosting knowledge for answer engines’, in Voorhees and Harman [21].
  • [9] Jerry R. Hobbs, ‘Ontological promiscuity’, in Proceedings of ACL’85, pp. 61–69, University of Chicago, Association for Computational Linguistics, (1985).
  • [10] Eduard Hovy, Laurie Gerber, Ulf Hermjakob, Michael Junk, and Chin-Yew Lin, ‘Question answering in webclopedia’, in Voorhees and Harman [21].
  • [11] Shalom Lappin and Herbert J. Leass, ‘An algorithm for pronominal anaphora resolution’, Computational Linguistics, 20(4), 535–561, (1994).
  • [12] Adam Meyers, Catherine Macleod, Roman Yangarber, Ralph Grishman, Leslie Barrett, and Ruth Reeves, ‘Using NOMLEX to produce nominalization patterns for information extraction’, in Proceedings: the Computational Treatment of Nominals, Montreal, Canada, (Coling-ACL98 workshop), (August 1998).
  • [13] Diego Mollá and Michael Hess, ‘Dealing with ambiguities in an answer extraction system’, in Workshop on Representation and Treatment of Syntactic Ambiguity in Natural Language Processing, pp. 21–24, Paris, (2000). ATALA.
  • [14] Diego Mollá, Rolf Schwitter, Michael Hess, and Rachel Fournier, ‘Extrans, an answer extraction system’, T.A.L. special issue on Information Retrieval oriented Natural Language Processing, (2000).
  • [15] Fabio Rinaldi, Michael Hess, Diego Mollá, Rolf Schwitter, James Dowdall, Gerold Schneider, and Rachel Fournier, ‘Answer extraction in technical domains’, in Computational Linguistics and Intelligent Text Processing, ed., A. Gelbukh, volume 2276 of Lecture Notes in Computer Science, pp. 360–369, Springer-Verlag, (2002).
  • [16] Daniel D. Sleator and Davy Temperley, ‘Parsing English with a link grammar’, in Proceedings of Third International Workshop on Parsing Technologies, pp. 277–292, (1993).
  • [17] Richard F. E. Sutcliffe and Annette McElligott, ‘Using the link parser of Sleator and Temperley to analyse a software manual corpus’, in Industrial Parsing of Software Manuals, eds., Richard F. E. Sutcliffe, Heinz-Detlev Koch, and Annette McElligott, chapter 6, pp. 89–102, Rodopi, Amsterdam, (1996).
  • [18] Ellen M. Voorhees, ‘The TREC-8 Question Answering Track Evaluation’, in Voorhees and Harman [20].
  • [19] Ellen M. Voorhees, ‘The TREC-8 Question Answering Track Report’, in Voorhees and Harman [20].
  • [20] Ellen M. Voorhees and Donna Harman, eds., The Eighth Text REtrieval Conference (TREC-8), NIST, (2000).
  • [21] Ellen M. Voorhees and Donna Harman, eds., Proceedings of the Ninth Text REtrieval Conference (TREC-9), Gaithersburg, Maryland, November 13-16, 2000, (2001).
  • [22] W.A. Woods, Stephen Green, and Paul Martin, ‘Halfway to question answering’, in Voorhees and Harman [21].

BibTeX

@inproceedings{DBLP:conf/ecai/RinaldiDHAS02,
 author    = {Fabio Rinaldi and
              James Dowdall and
              Michael Hess and
              Diego Mollá Aliod and
              Rolf Schwitter},
 title     = {Towards Answer Extraction: An application to Technical Domains.},
 booktitle = {ECAI},
 year      = {2002},
 pages     = {460-464},
 crossref  = {DBLP:conf/ecai/2002},
 bibsource = {DBLP, http://dblp.uni-trier.de}
}

@proceedings{DBLP:conf/ecai/2002,
 editor    = {Frank van Harmelen},
 title     = {Proceedings of the 15th European Conference on Artificial
              Intelligence, ECAI'2002, Lyon, France, July 2002},
 booktitle = {ECAI},
 publisher = {IOS Press},
 year      = {2002},
 bibsource = {DBLP, http://dblp.uni-trier.de}
}


Author: Fabio Rinaldi, James Dowdall, Michael Hess, Diego Molla, Rolf Schwitter
Title: Towards Answer Extraction: An application to technical domains
URL: http://web.science.mq.edu.au/~rolfs/papers/ecai02.pdf
Year: 2002