2007 SoftPatternMatchingModelsforDef

From GM-RKB

Subject Headings: Probabilistic Lexico-Syntactic Pattern, Definitional Sentence Identification.

Notes

Cited By

2010

Quotes

Author Keywords

Abstract

We explore probabilistic lexico-syntactic pattern matching, also known as soft pattern matching, in a definitional question answering system. Most current systems use regular expression-based hard matching patterns to identify definition sentences. Such rigid surface matching often fares poorly when faced with language variations. We propose two soft matching models to address this problem: one based on bigrams and the other on the Profile Hidden Markov Model (PHMM). Both models provide a theoretically sound method to model pattern matching as a probabilistic process that generates token sequences. We demonstrate the effectiveness of the models on definition sentence retrieval for definitional question answering. We show that both models significantly outperform the state-of-the-art manually constructed hard matching patterns on recent TREC data.

A critical difference between the two models is that the PHMM has a more complex topology. We experimentally show that the PHMM can handle language variations more effectively but requires more training data to converge.

While we evaluate soft pattern models only on definitional question answering, we believe that both models are generic and can be extended to other areas where lexico-syntactic pattern matching can be applied.
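The contrast between hard and soft matching drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes a plain bigram model with add-one smoothing (a stand-in choice), and the slot token `<TERM>`, the tag `NN`, the function names, and the toy training patterns are all hypothetical. A hard matcher accepts only sequences it has seen verbatim, whereas the bigram scorer assigns a graded score to unseen variations.

```python
import math
from collections import Counter

START, END = "<s>", "</s>"

def train_bigram(sequences):
    """Count context tokens and bigrams over padded token sequences."""
    unigrams, bigrams, vocab = Counter(), Counter(), set()
    for seq in sequences:
        padded = [START] + list(seq) + [END]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            unigrams[prev] += 1          # how often prev occurs as a context
            bigrams[(prev, cur)] += 1    # how often cur follows prev
    return unigrams, bigrams, vocab

def log_score(seq, model):
    """Length-normalised log-probability of seq, with add-one smoothing."""
    unigrams, bigrams, vocab = model
    v = len(vocab) + 1  # one extra slot reserved for unseen tokens
    padded = [START] + list(seq) + [END]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + v))
    return total / (len(padded) - 1)

# Hypothetical definition patterns: <TERM> is the search-term slot, NN a noun tag.
training = [
    ["<TERM>", "is", "a", "NN"],
    ["<TERM>", ",", "a", "NN"],
    ["<TERM>", "is", "the", "NN"],
    ["<TERM>", ",", "known", "as", "NN"],
]
model = train_bigram(training)

close_variant = ["<TERM>", ",", "the", "NN"]   # unseen, but close to the patterns
unrelated = ["the", "known", "<TERM>", "NN"]   # same tokens, implausible order

hard_match = close_variant in training          # rigid matching rejects the variant
soft_close = log_score(close_variant, model)
soft_unrelated = log_score(unrelated, model)
print(hard_match, soft_close > soft_unrelated)
```

In this toy setup the exact-match test rejects the unseen variant, while the bigram score still ranks it above the scrambled sequence, which is the kind of tolerance to language variation the abstract attributes to soft pattern matching.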

References

Author: Min-Yen Kan, Hang Cui, Tat-Seng Chua
Title: Soft Pattern Matching Models for Definitional Question Answering
DOI: 10.1145/1229179.1229182
Year: 2007