2006 EspressoAutoHarvestingSemanticRelations

From GM-RKB

Subject Headings: Relation Mention Recognition Algorithm, Semi-Supervised Learning Algorithm, Espresso Algorithm.

Notes

Cited By

2007

Quotes

Abstract

In this paper, we present Espresso, a weakly-supervised, general-purpose, and accurate algorithm for harvesting semantic relations. The main contributions are: i) a method for exploiting generic patterns by filtering incorrect instances using the Web; and ii) a principled measure of pattern and instance reliability enabling the filtering algorithm. We present an empirical comparison of Espresso with various state of the art systems, on different size and genre corpora, on extracting various general and specific relations. Experimental results show that our exploitation of generic patterns substantially increases system recall with small effect on overall precision.
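The abstract names a measure of pattern and instance reliability but does not define it. The sketch below shows one way such a mutually recursive measure can be computed, assuming reliability is an average of pointwise mutual information (PMI) scores, normalized by the maximum PMI and weighted by the reliability of the other side. The function names, the toy PMI values, and the seed instances are all hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
def pattern_reliability(pattern, instances, instance_rel, pmi, max_pmi):
    """Average over instances of normalized PMI weighted by instance reliability."""
    total = sum(
        pmi.get((inst, pattern), 0.0) / max_pmi * instance_rel[inst]
        for inst in instances
    )
    return total / len(instances)


def instance_reliability(instance, patterns, pattern_rel, pmi, max_pmi):
    """Average over patterns of normalized PMI weighted by pattern reliability."""
    total = sum(
        pmi.get((instance, pat), 0.0) / max_pmi * pattern_rel[pat]
        for pat in patterns
    )
    return total / len(patterns)


# Hypothetical seed instances for a part-of relation; seeds get reliability 1.0.
seeds = [("wheel", "car"), ("leaf", "tree")]
seed_rel = {inst: 1.0 for inst in seeds}

# Made-up PMI scores keyed by (instance, pattern); missing pairs count as 0.
pmi = {
    (("wheel", "car"), "X is part of Y"): 4.0,
    (("leaf", "tree"), "X is part of Y"): 2.0,
    (("wheel", "car"), "X of Y"): 1.0,
}
max_pmi = max(pmi.values())

# Step 1: score each pattern against the seed instances.
patterns = ["X is part of Y", "X of Y"]
pattern_rel = {
    p: pattern_reliability(p, seeds, seed_rel, pmi, max_pmi) for p in patterns
}

# Step 2: score an instance against the patterns scored in step 1.
candidate_rel = instance_reliability(
    ("wheel", "car"), patterns, pattern_rel, pmi, max_pmi
)
```

In this toy setting the specific pattern "X is part of Y" scores higher than the generic pattern "X of Y", which matches the intuition that generic patterns need additional filtering before their instances are trusted.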


References

  • 1. Matthew Berland, Eugene Charniak, Finding parts in very large corpora, Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics on Computational Linguistics, p.57-64, June 20-26, 1999, College Park, Maryland doi:10.3115/1034678.1034697
  • 2. Brown, T. L.; LeMay, H. E.; Bursten, B. E.; and Burdge, J. R. (2003). Chemistry: The Central Science, Ninth Edition. Prentice Hall.
  • 3. Sharon A. Caraballo, Automatic construction of a hypernym-labeled noun hierarchy from text, Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics on Computational Linguistics, p.120-126, June 20-26, 1999, College Park, Maryland doi:10.3115/1034678.1034705
  • 4. Thomas M. Cover, Joy A. Thomas, Elements of information theory, Wiley-Interscience, New York, NY, 1991
  • 5. David Day, John Aberdeen, Lynette Hirschman, Robyn Kozierok, Patricia Robinson, Marc Vilain, Mixed-initiative development of language processing systems, Proceedings of the fifth Conference on Applied Natural Language Processing, p.348-355, March 31-April 03, 1997, Washington, DC doi:10.3115/974557.974608
  • 6. Downey, D.; Oren Etzioni; and Soderland, S. (2005). A Probabilistic model of redundancy in information extraction. In: Proceedings of IJCAI-05. pp. 1034--1041. Edinburgh, Scotland.
  • 7. Oren Etzioni, Michael J. Cafarella, Doug Downey, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, Alexander Yates, Unsupervised named-entity extraction from the web: an experimental study, Artificial Intelligence, v.165 n.1, p.91-134, June 2005 doi:10.1016/j.artint.2005.03.001
  • 8. C. Fellbaum. (1998). WordNet: An Electronic Lexical Database. MIT Press.
  • 9. Maayan Geffet, Ido Dagan, The distributional inclusion hypotheses and lexical entailment, Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, p.107-114, June 25-30, 2005, Ann Arbor, Michigan doi:10.3115/1219840.1219854
  • 10. Roxana Girju, Adriana Badulescu, Dan Moldovan, Automatic Discovery of Part-Whole Relations, Computational Linguistics, v.32 n.1, p.83-135, March 2006 doi:10.1162/coli.2006.32.1.83
  • 11. Marti A. Hearst, Automatic acquisition of hyponyms from large text corpora, Proceedings of the 14th conference on Computational linguistics, August 23-28, 1992, Nantes, France doi:10.3115/992133.992154
  • 12. Donald Hindle, Noun classification from predicate-argument structures, Proceedings of the 28th annual meeting on Association for Computational Linguistics, p.268-275, June 06-09, 1990, Pittsburgh, Pennsylvania doi:10.3115/981823.981857
  • 13. Justeson J. S. and Katz S. M. (1995). Technical Terminology: some linguistic properties and algorithms for identification in text. In: Proceedings of ICCL-95. pp.539--545. Nantes, France.
  • 14. Chin-Yew Lin, Eduard Hovy, The automated acquisition of topic signatures for text summarization, Proceedings of the 18th conference on Computational linguistics, p.495-501, July 31-August 04, 2000, Saarbrücken, Germany doi:10.3115/990820.990892
  • 15. Dekang Lin, Patrick Pantel, Concept discovery from text, Proceedings of the 19th International Conference on Computational linguistics, p.1-7, August 24-September 01, 2002, Taipei, Taiwan doi:10.3115/1072228.1072372
  • 16. Gideon S. Mann, Fine-grained proper noun ontologies for question answering, COLING-02 on SEMANET: building and using semantic networks, p.1-7, September 01, 2002 doi:10.3115/1118735.1118746
  • 17. Patrick Pantel, and Ravichandran, D. (2004). Automatically labeling semantic classes. In: Proceedings of HLT/NAACL-04. pp. 321--328. Boston, MA.
  • 18. Patrick Pantel, Deepak Ravichandran, Eduard Hovy, Towards terascale knowledge acquisition, Proceedings of the 20th International Conference on Computational Linguistics, p.771-es, August 23-27, 2004, Geneva, Switzerland doi:10.3115/1220355.1220466
  • 19. Marius Paşca, and Sanda M. Harabagiu (2001). The informative role of WordNet in Open-Domain Question Answering. In: Proceedings of NAACL-01 Workshop on WordNet and Other Lexical Resources. pp. 138--143. Pittsburgh, PA.
  • 20. Deepak Ravichandran, and Eduard Hovy. (2002). Learning surface text patterns for a Question Answering system. In: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, July 07-12, 2002, Philadelphia, Pennsylvania.
  • 21. R. Riloff, and Shepherd, J. (1997). A corpus-based approach for building semantic lexicons. In: Proceedings of EMNLP-97.
  • 22. Siegel, S. and Castellan Jr., N. J. (1988). Nonparametric Statistics for the Behavioral Sciences. McGraw-Hill.
  • 23. Szpektor, I.; Tanev, H.; I. Dagan; and Coppola, B. (2004). Scaling web-based acquisition of entailment relations. In: Proceedings of EMNLP-04. Barcelona, Spain.


Page: 2006 EspressoAutoHarvestingSemanticRelations
Author: Patrick Pantel, Marco Pennacchiotti
Title: Espresso: Leveraging Generic Patterns for Automatically Harvesting Semantic Relations
Title URL: http://acl.ldc.upenn.edu/P/P06/P06-1015.pdf