- (Hasegawa et al., 2004) ⇒ Takaaki Hasegawa, Satoshi Sekine, Ralph Grishman. (2004). “Discovering Relations among Named Entities from Large Corpora.” In: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL 2004).
- (Shinyama & Sekine, 2006) ⇒ Yusuke Shinyama, and Satoshi Sekine. (2006). “Preemptive Information Extraction Using Unrestricted Relation Discovery.” In: Proceedings of the HLT-NAACL Conference (HLT-NAACL 2006).
- QUOTE: Several existing works have tried to extract a certain type of relation by manually choosing different pairs of entities (Brin, 1998; Ravichandran and Hovy, 2002). Hasegawa et al. (2004) tried to extract multiple relations by choosing entity types.
Discovering the significant relations embedded in documents would be very useful not only for information retrieval but also for question answering and summarization. Prior methods for relation discovery, however, needed large annotated corpora which cost a great deal of time and effort. We propose an unsupervised method for relation discovery from large corpora. The key idea is clustering pairs of named entities according to the similarity of context words intervening between the named entities. Our experiments using one year of newspapers reveal not only that the relations among named entities could be detected with high recall and precision, but also that appropriate labels could be automatically provided for the relations.
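The key idea in the abstract (clustering entity pairs by the similarity of the words that occur between them) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes named-entity tagging has already been done, uses raw word counts instead of any term weighting, and the complete-linkage merge rule, the `threshold` value, the most-frequent-word label, and all entity names in the toy data are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_contexts(occurrences):
    """Accumulate the intervening words of every occurrence of an entity pair."""
    contexts = {}
    for e1, e2, words in occurrences:
        contexts.setdefault((e1, e2), Counter()).update(words)
    return contexts


def cluster_pairs(contexts, threshold=0.5):
    """Agglomerative clustering of entity pairs (complete linkage is an assumption).

    Two clusters are merged only if their LEAST similar members still reach
    the threshold; merging stops when no such cluster pair remains.
    """
    clusters = [[pair] for pair in contexts]
    while True:
        best, best_sim = None, 0.0
        for i, j in combinations(range(len(clusters)), 2):
            sim = min(cosine(contexts[p], contexts[q])
                      for p in clusters[i] for q in clusters[j])
            if sim >= threshold and (best is None or sim > best_sim):
                best, best_sim = (i, j), sim
        if best is None:
            return clusters
        i, j = best  # i < j, so popping j leaves index i valid
        clusters[i] += clusters.pop(j)


def label(cluster, contexts):
    """Label a cluster with its most frequent context word (illustrative choice)."""
    total = Counter()
    for pair in cluster:
        total.update(contexts[pair])
    return total.most_common(1)[0][0] if total else None


# Hypothetical toy data: (entity 1, entity 2, words found between them).
occurrences = [
    ("A Corp", "B Inc", ["to", "acquire"]),
    ("C Corp", "D Inc", ["acquire"]),
    ("Ms. X", "G Corp", ["president", "of"]),
    ("Mr. Y", "H Corp", ["president"]),
]

contexts = build_contexts(occurrences)
clusters = cluster_pairs(contexts, threshold=0.5)
labels = [label(c, contexts) for c in clusters]
print(list(zip(labels, clusters)))
```

On this toy input the two company-company pairs end up in one cluster labeled `acquire` and the two person-company pairs in another labeled `president`. A realistic setup would also restrict clustering to pairs of the same entity types and filter out uninformative context words.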
- Eugene Agichtein, and Luis Gravano. (2000). “Snowball: Extracting Relations from Large Plain-Text Collections.” In: Proceedings of the 5th ACM International Conference on Digital Libraries (ACM DL’00), pages 85–94.
- Sergey Brin. (1998). “Extracting Patterns and Relations from the World Wide Web.” In: Proceedings of WebDB Workshop at 6th International Conference on Extending Database Technology (WebDB’98), pages 172–183.
- Defense Advanced Research Projects Agency. (1995). Proceedings of the Sixth Message Understanding Conference (MUC-6). Morgan Kaufmann Publishers, Inc.
- (Lin & Pantel, 2001) ⇒ Dekang Lin, and Patrick Pantel. (2001). “DIRT - Discovery of Inference Rules from Text.” In: Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2001).
- National Institute of Standards and Technology. (2000). Automatic Content Extraction. http://www.nist.gov/speech/tests/ace/index.htm.
- Deepak Ravichandran, and Eduard Hovy. (2002). “Learning Surface Text Patterns for a Question Answering System.” In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL-2002), pages 41–47.
- Satoshi Sekine, Kiyoshi Sudo, and Chikashi Nobata. (2002). “Extended Named Entity Hierarchy.” In: Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002).
- Satoshi Sekine. (2001). “OAK System (English Sentence Analyzer).” http://nlp.cs.nyu.edu/oak/.
- Dmitry Zelenko, Chinatsu Aone, and Anthony Richardella. (2002). “Kernel Methods for Relation Extraction.” In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP-2002), pages 71–78.
|Author||Takaaki Hasegawa, Satoshi Sekine and Ralph Grishman|
|journal||Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics|
|title||Discovering Relations among Named Entities from Large Corpora|