2010 SupervisedIdentCMentionsAndLinkingToOntology


Subject Headings: SDOI Algorithm, Supervised Ontology-based Concept Mention Identification, Supervised Concept Mention to Ontology Linking.


We propose a purely supervised learning approach to the task of identifying concept mentions within a document and of linking these mentions to their corresponding concepts in a given ontology. Concept mention identification is performed with a trained CRF sequential model. Each mention is associated with a set of candidate ontology concepts, and binary training feature vectors are generated for these pairings. We formalize the feature space to expand on those proposed in the literature, and also propose the inclusion of features derived from the training corpus. Iterative classification is proposed as a method of handling collective decisions in a supervised manner. The approach, named SCMILO, is validated on its ability to identify the concept mentions within the 139 KDD-2009 conference paper abstracts, and to link these mentions to a domain-specific ontology for the field of data mining.
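The iterative classification step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: a hand-written scoring function stands in for the trained binary classifier over (mention, concept) feature vectors, and the toy ontology `RELATED`, the concept names, and the functions `score` and `link_mentions` are all hypothetical. The sketch only shows how a link decision for one mention can be revised once the links chosen for the other mentions become available as collective evidence.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy ontology: concept name -> names of related concepts.
RELATED: Dict[str, Set[str]] = {
    "Tree Pruning Algorithm": {"Decision Tree Model"},
    "Decision Tree Model": {"Tree Pruning Algorithm"},
}


def score(mention: str, concept: str, context: Set[str]) -> int:
    """Stand-in for a trained classifier (an assumption of this sketch).

    Combines a local feature (token overlap between the mention and the
    concept name) with a collective feature (how many concepts currently
    linked to other mentions are related to this candidate).
    """
    local = len(set(mention.lower().split()) & set(concept.lower().split()))
    collective = len(RELATED.get(concept, set()) & context)
    return local + collective


def link_mentions(
    candidates: Dict[str, List[str]], max_rounds: int = 5
) -> Dict[str, Optional[str]]:
    """Relabel every mention in rounds until no link changes."""
    labels: Dict[str, Optional[str]] = {m: None for m in candidates}
    for _ in range(max_rounds):
        changed = False
        for mention, options in candidates.items():
            # Concepts currently assigned to the other mentions.
            context = {
                c for m, c in labels.items() if m != mention and c is not None
            }
            best = max(
                options, key=lambda c: score(mention, c, context), default=None
            )
            if best != labels[mention]:
                labels[mention] = best
                changed = True
        if not changed:
            break
    return labels


# Each mention is paired with its candidate ontology concepts.
candidates = {
    "pruning": ["Garden Pruning Task", "Tree Pruning Algorithm"],
    "decision tree": ["Decision Tree Model"],
}
links = link_mentions(candidates)
print(links)
```

In this toy run the mention "pruning" is ambiguous on local evidence alone, so the first round picks the first candidate; once "decision tree" has been linked, the second round revises the choice to the related concept.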

1. Introduction


  • 1. Satanjeev Banerjee, Ted Pedersen, An Adapted Lesk Algorithm for Word Sense Disambiguation Using WordNet, Proceedings of the Third International Conference on Computational Linguistics and Intelligent Text Processing, p.136-145, February 17-23, 2002
  • 2. Rudi L. Cilibrasi, Paul M. B. Vitanyi, The Google Similarity Distance, IEEE Transactions on Knowledge and Data Engineering, v.19 n.3, p.370-383, March 2007 doi:10.1109/TKDE.2007.48
  • 3. Eugene Charniak, A Maximum-entropy-inspired Parser, Proceedings of the 1st North American Chapter of the Association for Computational Linguistics Conference, p.132-139, April 29-May 04, 2000, Seattle, Washington
  • 4. Sayali Kulkarni, Amit Singh, Ganesh Ramakrishnan, Soumen Chakrabarti, Collective Annotation of Wikipedia Entities in Web Text, Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, June 28-July 01, 2009, Paris, France doi:10.1145/1557019.1557073
  • 5. Andrew McCallum, Wei Li, Early Results for Named Entity Recognition with Conditional Random Fields, Feature Induction and Web-enhanced Lexicons, Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003, p.188-191, May 31, 2003, Edmonton, Canada doi:10.3115/1119176.1119206
  • 6. Gabor Melli. (2010a). “Concept Mentions Within KDD-2009 Abstracts (kdd09cma1) Linked to a KDD Ontology (kddo1).” In: Proceedings of LREC 2010.
  • 7. Gabor Melli. (2010b). Supervised Document to Ontology Interlinking. PhD Thesis, Simon Fraser University.
  • 8. Rada Mihalcea, Andras Csomai, Wikify!: Linking Documents to Encyclopedic Knowledge, Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, November 06-10, 2007, Lisbon, Portugal doi:10.1145/1321440.1321475
  • 9. David Milne, Ian H. Witten, Learning to Link with Wikipedia, Proceedings of the 17th ACM Conference on Information and Knowledge Management, October 26-30, 2008, Napa Valley, California, USA doi:10.1145/1458082.1458150
  • 10. Roberto Navigli, Paola Velardi, Aldo Gangemi, Ontology Learning and Its Application to Automated Terminology Translation, IEEE Intelligent Systems, v.18 n.1, p.22-31, January 2003 doi:10.1109/MIS.2003.1179190
  • 11. Jennifer Neville, and David Jensen. (2000). Iterative Classification in Relational Data. In: Proceedings of the Workshop on Statistical Relational Learning.
  • 12. Francesco Sclano, and Paola Velardi. (2007). TermExtractor: A Web Application to Learn the Common Terminology of Interest Groups and Research Communities. In: Proc. of the 9th Conference on Terminology and AI (TIA 2007).
  • 13. Fei Sha, Fernando Pereira, Shallow Parsing with Conditional Random Fields, Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, p.134-141, May 27-June 01, 2003, Edmonton, Canada doi:10.3115/1073445.1073473
  • 14. Pierre Zweigenbaum, Dina Demner-Fushman, Hong Yu, and Kevin B. Cohen. (2007). Frontiers of Biomedical Text Mining: Current Progress. In: Briefings in Bioinformatics 2007, 8(5). Oxford Univ Press.


Page: 2010 SupervisedIdentCMentionsAndLinkingToOntology
Author: Martin Ester, Gabor Melli
title: Supervised Identification and Linking of Concept Mentions to a Domain-Specific Ontology
titleUrl: http://dl.acm.org/authorize?399852
doi: 10.1145/1871437.1871712