2009 NamedEntityDisambigByLevWikipedia


Subject Headings: Named Entity Disambiguation Task.

Notes

Quotes

Abstract

Name ambiguity problem has raised an urgent demand for efficient, high-quality named entity disambiguation methods. The key problem of named entity disambiguation is to measure the similarity between occurrences of names. The traditional methods measure the similarity using the bag of words (BOW) model. The BOW, however, ignores all the semantic relations such as social relatedness between named entities, associative relatedness between concepts, polysemy and synonymy between key terms. So the BOW cannot reflect the actual similarity. Some research has investigated social networks as background knowledge for disambiguation. Social networks, however, can only capture the social relatedness between named entities, and often suffer the limited coverage problem.
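The limitation of the bag-of-words model described above can be illustrated with a minimal Python sketch. This is not code from the paper: the whitespace tokeniser, the function names `bow_vector` and `cosine_similarity`, and the three example contexts are all assumptions made for illustration. The contexts stand for text surrounding three occurrences of the same ambiguous name, with the name itself left out.

```python
import math
from collections import Counter


def bow_vector(text):
    """Build a term-count vector with a deliberately naive whitespace tokeniser."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical contexts around three occurrences of one ambiguous person name.
ctx_ml = bow_vector("researches machine learning at berkeley")
ctx_stats = bow_vector("publishes work on statistical inference")
ctx_sport = bow_vector("won championships with chicago bulls")

# No terms are shared, so BOW scores both pairs identically, even though
# the first two contexts are topically related and the third is not.
sim_related = cosine_similarity(ctx_ml, ctx_stats)
sim_unrelated = cosine_similarity(ctx_ml, ctx_sport)
```

Because the first two contexts share no surface terms, the BOW score cannot separate them from the unrelated third context. This is the gap that background knowledge about relatedness between concepts is meant to close.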

To overcome the previous methods' deficiencies, this paper proposes to use Wikipedia as the background knowledge for disambiguation, which surpasses other knowledge bases by the coverage of concepts, rich semantic information and up-to-date content. By leveraging Wikipedia's semantic knowledge like social relatedness between named entities and associative relatedness between concepts, we can measure the similarity between occurrences of names more accurately. In particular, we construct a large-scale semantic network from Wikipedia, in order that the semantic knowledge can be used efficiently and effectively. Based on the constructed semantic network, a novel similarity measure is proposed to leverage Wikipedia semantic knowledge for disambiguation. The proposed method has been tested on the standard WePS data sets. Empirical results show that the disambiguation performance of our method gets 10.7% improvement over the traditional BOW based methods and 16.7% improvement over the traditional social network based methods.
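The abstract does not give the formula of the proposed similarity measure, so the sketch below shows only one known way to derive relatedness between concepts from Wikipedia: the link-based measure of Milne and Witten (2008), which scores two articles by the overlap of the articles that link to them. It is an illustration of the general idea, not the authors' method. The function name `link_relatedness`, the toy in-link sets, and the article count of 1000 are hypothetical.

```python
import math


def link_relatedness(inlinks_a, inlinks_b, total_articles):
    """Milne-Witten style relatedness from the sets of articles linking to a and b.

    Returns a value in [0, 1]; 0.0 when the two in-link sets do not overlap.
    """
    common = inlinks_a & inlinks_b
    if not common:
        return 0.0
    larger = max(len(inlinks_a), len(inlinks_b))
    smaller = min(len(inlinks_a), len(inlinks_b))
    if total_articles <= smaller:
        raise ValueError("total_articles must exceed the size of each in-link set")
    distance = (math.log(larger) - math.log(len(common))) / (
        math.log(total_articles) - math.log(smaller)
    )
    return max(0.0, 1.0 - distance)


# Hypothetical in-link sets: integers stand for IDs of linking articles.
inlinks = {
    "Machine learning": {1, 2, 3, 4, 5},
    "Statistical inference": {3, 4, 5, 6},
    "Chicago Bulls": {7, 8, 9},
}
TOTAL_ARTICLES = 1000  # assumed size of the toy encyclopedia

rel_close = link_relatedness(
    inlinks["Machine learning"], inlinks["Statistical inference"], TOTAL_ARTICLES
)
rel_far = link_relatedness(
    inlinks["Machine learning"], inlinks["Chicago Bulls"], TOTAL_ARTICLES
)
```

In this toy data the two statistics-related concepts receive a high relatedness score while the sports concept receives zero, which is the kind of signal a term-overlap measure cannot provide when contexts share no words.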


2009 NamedEntityDisambigByLevWikipedia

Author: Xianpei Han, Jun Zhao
Title: Named Entity Disambiguation by Leveraging Wikipedia Semantic Knowledge
Venue: Proceedings of the Eighteenth Conference on Information and Knowledge Management
DOI: 10.1145/1645953.1645983
Year: 2009