An SDOI Project is a research project by Gabor Melli whose goal is to advance our ability to interlink documents and ontologies by supervised means.
- See: KDD-2009 Annotated Abstracts Dataset, KDD-2009 Abstracts Analysis, RKB Research Project, Term Mention Recognition, Term Mention Linking, Ontology, Research Paper, Data Mining Discipline, Concept Mention Identification and Linking Task, Supervised Learning Task.
- We explore the automated semantic annotation of concept mentions within a document to their corresponding page in a semantic wiki, if such a page exists, by supervised means. Unlike the related task of identifying concept mentions in a document that can be linked to a Wikipedia page, our task also requires the identification of concept mentions not yet found in the knowledge base. Our approach creates feature vectors for all candidate text spans based on both local and global information. We propose a novel set of expanded features, including information available in the other documents in the training corpus. The challenge of identifying previously unseen mentions is handled with a trained CRF sequential model. The addition of iterative classification to the process is explored. Experiments against a corpus based on annotated KDD 2009 abstracts and a data analysis semantic wiki show a lift in F-measure against baseline algorithms. We analyze the feature space to understand which features carry the most predictive power on this task, and which ones are correlated and redundant.
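
The idea of building a feature vector for every candidate text span from local and global information can be illustrated with a minimal Python sketch. It is not the project's actual feature set or code: the function names (`candidate_spans`, `span_features`), the four features, and the inputs `concept_names` and `corpus_mention_counts` are all hypothetical, chosen only to show one local signal (span length, capitalization), one knowledge-base signal (exact name match), and one corpus-wide signal (how often the same text was annotated in other training documents).

```python
def candidate_spans(tokens, max_len=3):
    """Enumerate every token n-gram of up to max_len tokens as (start, end), end exclusive."""
    spans = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


def span_features(tokens, span, concept_names, corpus_mention_counts):
    """Build an illustrative feature dict for one candidate span.

    concept_names: set of lowercased concept names in the knowledge base (assumed input).
    corpus_mention_counts: mapping from lowercased span text to the number of times
        it was annotated as a mention in other training documents (assumed input).
    """
    start, end = span
    text = " ".join(tokens[start:end]).lower()
    return {
        # Local information: properties of the span itself.
        "n_tokens": end - start,
        "starts_capitalized": tokens[start][:1].isupper(),
        # Knowledge-base information: does a page with this exact name exist?
        "in_knowledge_base": text in concept_names,
        # Corpus-wide information: evidence from the rest of the training corpus.
        "corpus_mention_count": corpus_mention_counts.get(text, 0),
    }
```

For example, with `tokens = "We train a CRF model".split()`, the span `(3, 5)` covers "CRF model"; given a knowledge base containing "crf model", its `in_knowledge_base` feature is true. A span that is frequently annotated elsewhere in the corpus but has no knowledge-base match is the kind of case the abstract describes as a concept mention not yet found in the knowledge base.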