2008 AFrameworkForIdentityResolutionAndMerging


Subject Headings: Entity Mention Normalization Task, Ontology-based Information Extraction, OntoText Lab.

Notes

that is based on OWLIM (Kiryakov et al., 2005) and Sesame.

Cited By

Quotes

Abstract

In the context of ontology-based information extraction, identity resolution is the process of deciding whether an instance extracted from text refers to a known entity in the target domain (e.g. the ontology). We present an ontology-based framework for identity resolution which can be customised to different application domains and extraction tasks. Rules for identity resolution, which compute similarities between target and source entities based on class information and instance properties and values, can be defined for each class in the ontology. We present a case study of the application of the framework to the problem of multi-source job vacancy extraction.
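
The idea of per-class rules that score a source instance against known target entities can be illustrated with a minimal Python sketch. It is not the framework described in the paper: the class names, property names, weights, the 0.8 threshold, and the use of `difflib.SequenceMatcher` as the string similarity are all assumptions made for illustration, and class information is reduced to an exact class match.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Instance:
    cls: str                                   # ontology class name (hypothetical)
    props: dict = field(default_factory=dict)  # property name -> string value


# Hypothetical per-class rules: property name -> weight.
RULES = {
    "Organization": {"name": 0.7, "location": 0.3},
    "JobVacancy": {"title": 0.5, "employer": 0.3, "location": 0.2},
}


def string_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (an assumed metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similarity(source: Instance, target: Instance) -> float:
    """Weighted similarity over the properties named in the class rule."""
    if source.cls != target.cls:
        return 0.0  # simplification: classes must match exactly
    score = 0.0
    total = 0.0
    for prop, weight in RULES.get(source.cls, {}).items():
        # Only properties present on both instances contribute.
        if prop in source.props and prop in target.props:
            score += weight * string_similarity(source.props[prop], target.props[prop])
            total += weight
    return score / total if total else 0.0


def resolve(source: Instance, known: list, threshold: float = 0.8):
    """Return the best-matching known entity, or None if none is close enough."""
    best = max(known, key=lambda t: similarity(source, t), default=None)
    if best is not None and similarity(source, best) >= threshold:
        return best
    return None
```

Under these assumptions, `resolve` returns an existing entity when an extracted instance is similar enough to it, and `None` when the instance should be treated as new.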


References

  • Niraj Aswani, Kalina Bontcheva, and Hamish Cunningham. (2006). Mining information for instance unification. In 5th International Semantic Web Conference (ISWC2006), Athens, Georgia.
  • A. Bagga and B. Baldwin. (1998). Entity-based Cross-Document Coreferencing Using the Vector Space Model. In: Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and the 17th International Conference on Computational Linguistics (COLING-ACL’98), pages 79–85.
  • A. Bagga and A. W. Biermann. (2000). A methodology for cross-document coreference. In: Proceedings of the Fifth Joint Conference on Information Sciences (JCIS 2000), pages 207–210.
  • Mikhail Bilenko and Raymond Mooney. (2003). Employing trainable string similarity metrics for information integration. In IJCAI-2003, Mexico.
  • Ahmed K. Elmagarmid, Panagiotis G. Ipeirotis, and Vassilios S. Verykios. (2007). Duplicate record detection: A survey. Technical report, TKDE, January.
  • Adam Funk, Diana Maynard, Horacio Saggion, and Kalina Bontcheva. (2007). Ontological integration of information extraction from multiple sources. In International Workshop on Multi-source, Multi-lingual Information Extraction and Summarisation.
  • F. Giunchiglia, P. Shvaiko, and M. Yatskevich. (2004). S-Match: an algorithm and an implementation of semantic matching. In ESWS, pages 61–75.
  • Ralph Grishman. (1997). Information Extraction: Techniques and Challenges. In Information Extraction: a Multidisciplinary Approach to an Emerging Information Technology.
  • Atanas Kiryakov, Damyan Ognyanov, and Dimitar Manov. (2005). OWLIM: a pragmatic semantic repository for OWL. In SSWS 2005, WISE, USA.
  • Michal C.A. Klein, Peter Mika, and Stefan Schlobach. (2007). Approximate instance unification using roughowl. In Workshop on Uncertainty Reasoning for the Semantic Web (URSW).
  • G. S. Mann and David Yarowsky. (2003). Unsupervised personal name disambiguation. In W. Daelemans and M. Osborne, editors, Proceedings of the 7th Conference on Natural Language Learning (CoNLL-2003), pages 33–40. Edmonton, Canada, May.
  • George A. Miller. (1994). WordNet: a lexical database for English. In HLT ’94, USA.
  • X.-H. Phan, L.-M. Nguyen, and S. Horiguchi. (2006). Personal name resolution crossover documents by a semantics-based approach. IEICE Trans. Inf. & Syst., Feb 2006.
  • Borislav Popov, Atanas Kiryakov, Damyan Ognyanoff, Dimitar Manov, and Angel Kirilov. (2004). KIM: a semantic platform for information extraction and retrieval. In Journal of Natural Language Engineering. Cambridge University Press.
  • H. Saggion. (2008). Experiments on semantic-based clustering for cross-document coreference. In International Joint Conference on Natural Language Processing, Hyderabad, India, January. AFNLP.
  • Ivan Terziev, Atanas Kiryakov, and Dimitar Manov. (2005). Base upper-level ontology (BULO) guidance. Technical Report Deliverable 1.8.1, SEKT project, UK, July.


2008 AFrameworkForIdentityResolutionAndMerging
Author: Hamish Cunningham, Milena Yankova, Horacio Saggion
Title: A Framework for Identity Resolution and Merging for Multi-source Information Extraction
Venue: Proceedings of LREC Conference
URL: http://www.lrec-conf.org/proceedings/lrec2008/pdf/347 paper.pdf
Year: 2008