Subject Headings: Supervised IE; Document to Ontology Interlinking; SDOI Algorithm; Ontology-based IE from Text.


Concept Mention; Relation Mention; Ontology; Semantic Annotation; Reference Resolution; Supervised Classification.


The value of the growing availability of online documents and ontologies will increase significantly once these two resources become deeply interlinked at the semantic level. We focus our investigation on the automated identification and linking of concepts and relations mentioned in a document that are (or should be) in a domain-specific ontology. Such semantic information can allow for improved navigation of the information space: users can more quickly retrieve documents that mention the relations sought; ontology engineers can enhance concepts with relations extracted from the literature; and more advanced natural language-based applications such as text summarization, textual entailment, and machine reading become ever more possible.

In this thesis, we present the task of supervised semantic interlinking of documents to an ontology. We also propose a supervised algorithm that identifies and links concept mentions that are (or should be) in the ontology, and that also identifies mentions of binary relations that are (or should be) in the ontology. The resulting system, SDOI, is tested on a novel corpus and ontology from the data mining field, using intrinsic measures such as accuracy and extrinsic measures such as the time saved by the annotator in the annotation process.

One day many high-value documents and ontologies will be interlinked to each other. This thesis presents a principled step towards that outcome.


7.3 Text Graph Representation

Figure 7 - A sample of the text graph representation (for a highly summarized document) that SDOIRMI would use to create feature vectors for the task of relation mention identification.
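To make the idea of deriving feature vectors from a text graph more concrete, the following is a minimal sketch only. This page does not specify the node types, edge types, or features that the thesis uses, so everything here is an assumption made for illustration: tokens are nodes, adjacent tokens within a sentence are linked, the first tokens of consecutive sentences are linked, and the hypothetical function `pair_features` computes a few simple graph-based features for a candidate pair of concept mentions.

```python
from collections import deque


def build_text_graph(sentences):
    """Build an undirected graph over tokens (assumed design, not the thesis's).

    Nodes are (sentence_index, token_index) pairs. Edges link adjacent
    tokens within a sentence and the first tokens of consecutive sentences.
    """
    graph = {}

    def add_edge(a, b):
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    for s, tokens in enumerate(sentences):
        for t in range(len(tokens)):
            graph.setdefault((s, t), set())
            if t > 0:
                add_edge((s, t - 1), (s, t))
        # Link this sentence to the previous one through their first tokens.
        if s > 0 and tokens and sentences[s - 1]:
            add_edge((s - 1, 0), (s, 0))
    return graph


def shortest_path_length(graph, start, goal):
    """Breadth-first search; returns the edge count, or None if unreachable."""
    if start not in graph or goal not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def pair_features(graph, mention_a, mention_b):
    """Hypothetical features for one candidate relation mention (a mention pair)."""
    return {
        "path_length": shortest_path_length(graph, mention_a, mention_b),
        "same_sentence": mention_a[0] == mention_b[0],
        "sentence_distance": abs(mention_a[0] - mention_b[0]),
    }


# Toy two-sentence document; each mention is identified by its head token node.
sentences = [
    ["SDOI", "links", "mentions"],
    ["Mentions", "map", "to", "concepts"],
]
graph = build_text_graph(sentences)
features = pair_features(graph, (0, 2), (1, 3))
print(features)
```

A supervised classifier would then be trained on such feature dictionaries, one per candidate pair of concept mentions, with a label indicating whether the pair expresses a relation. The choice of features here is illustrative and should not be read as the feature set used by SDOI.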


