2007 UsingCorpusStatsOnEntsToImproveSemiSupRelExFromWeb


Subject Headings:

Notes

Cited By

Quotes

Abstract

Many errors produced by unsupervised and semi-supervised relation extraction (RE) systems stem from incorrect recognition of the entities that participate in the relations. This is especially true for systems that do not use separate named-entity recognition (NER) components and instead rely on general-purpose shallow parsing. Such systems have greater applicability, because they are able to extract relations that contain attributes of unknown types. However, this generality comes at a cost in accuracy. In this paper, we show how to use corpus statistics to validate and correct the arguments of extracted relation instances, improving the overall RE performance. We test the methods on SRES, a self-supervised Web relation extraction system. We also compare the performance of corpus-based methods to the performance of validation and correction methods based on supervised NER components.
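The abstract does not say which corpus statistics are used, so the following is only a minimal sketch of the general idea of correcting an argument's boundaries from corpus counts, not the authors' method. It assumes a toy whitespace-tokenized corpus and a simple frequency-ratio heuristic; the function names `ngram_counts` and `correct_argument` and the `threshold` parameter are hypothetical.

```python
from collections import Counter


def ngram_counts(sentences, max_n=4):
    """Count every token n-gram (lengths 1..max_n) in a toy corpus."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


def correct_argument(span, context, counts, threshold=0.8):
    """Extend a candidate entity span by one token if the corpus suggests
    the span was truncated.

    span:    (start, end) token indices into `context`, end exclusive
    context: list of tokens of the sentence the relation was extracted from
    Assumed heuristic: if an extended span occurs almost as often as the
    original span (ratio >= threshold), the original rarely stands alone,
    so the extension is preferred.
    """
    start, end = span
    base = counts[tuple(context[start:end])]
    if base == 0:
        return span  # no corpus evidence either way
    best, best_ratio = span, threshold
    for new_start, new_end in ((start - 1, end), (start, end + 1)):
        if new_start < 0 or new_end > len(context):
            continue
        ratio = counts[tuple(context[new_start:new_end])] / base
        if ratio >= best_ratio:
            best, best_ratio = (new_start, new_end), ratio
    return best


corpus = [
    "Oracle acquired Sun Microsystems in 2010",
    "Sun Microsystems built servers",
    "Sun Microsystems was founded in 1982",
    "Oracle sells databases",
    "Oracle hired engineers",
]
counts = ngram_counts(corpus)
context = corpus[0].split()

truncated = correct_argument((2, 3), context, counts)  # candidate "Sun"
stable = correct_argument((0, 1), context, counts)     # candidate "Oracle"
print(" ".join(context[truncated[0]:truncated[1]]))
print(" ".join(context[stable[0]:stable[1]]))
```

In this toy corpus "Sun" never appears without "Microsystems", so the truncated argument is extended, while "Oracle" is followed by different words each time and is left unchanged. A real system would draw counts from a large Web corpus and would likely need smoothing and a tuned threshold.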



Author: Ronen Feldman, Benjamin Rosenfeld
Title: Using Corpus Statistics on Entities to Improve Semi-supervised Relation Extraction from the Web
URL: http://www.aclweb.org/anthology/P/P07/P07-1076.pdf
Year: 2007