2008 FromGlossariesToOntologies


Subject Headings: Ontology Learning from Text, Semantic Relation Learning, Glossary Formalization.


Cited By


Author Keywords

Ontology Learning, Semantic Relation Learning, Glossary Formalization.


Abstract

Learning ontologies requires the acquisition of relevant domain concepts and of taxonomic, as well as non-taxonomic, relations. In this chapter, we present a methodology for automatic ontology enrichment and document annotation with concepts and relations of an existing domain core ontology. Natural language definitions from available glossaries in a given domain are processed, and regular expressions are applied to identify general-purpose and domain-specific relations. We evaluate the methodology's performance in extracting hypernymy and non-taxonomic relations. To this end, we annotated and formalized a relevant fragment of the glossary of Art and Architecture (AAT) with a set of 10 relations (plus the hypernymy relation) defined in the CRM CIDOC cultural heritage core ontology, a recent W3C standard. Finally, we assessed the generality of the approach on a set of web pages from the domains of history and biography.
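The idea of applying regular expressions to glossary definitions can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the relation labels (`is_a`, `made_of`, `used_for`), the patterns, and the `extract_relations` helper are hypothetical and are not the patterns or the CRM CIDOC relations used in the chapter.

```python
import re

# Hypothetical patterns for illustration only (not those of the chapter).
# Each relation label maps to a regex with a named group "target".
PATTERNS = {
    # Genus phrase of a definition: the words after the article, up to a
    # cue word that typically introduces the differentia.
    "is_a": re.compile(
        r"^(?:an?|the)\s+(?P<target>[a-z][a-z\- ]*?)\s+(?:that|which|used|made)\b",
        re.IGNORECASE,
    ),
    # Material: "made of X" / "made from X".
    "made_of": re.compile(r"\bmade (?:of|from)\s+(?P<target>[a-z\-]+)", re.IGNORECASE),
    # Purpose: "used for X" / "used to X".
    "used_for": re.compile(r"\bused (?:for|to)\s+(?P<target>[a-z\-]+)", re.IGNORECASE),
}


def extract_relations(term, definition):
    """Return (term, relation, target) triples matched in one definition."""
    triples = []
    for relation, pattern in PATTERNS.items():
        match = pattern.search(definition)
        if match:
            triples.append((term, relation, match.group("target").lower()))
    return triples


if __name__ == "__main__":
    definition = "A ceramic vessel made of clay and used for storing wine."
    for triple in extract_relations("amphora", definition):
        print(triple)
```

A real system would also need to map each matched string to a concept of the core ontology; the sketch stops at surface triples and only shows how a definition's genus and a few cue phrases can be captured by patterns.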


Introduction

The Semantic Web [1], i.e. the vision of a next-generation web where content is conceptually indexed, requires applications to process and exploit the semantics implicitly encoded in on-line and off-line resources. The large-scale, automatic semantic annotation of web documents based on well-established domain ontologies would allow Semantic Web applications to emerge and gain acceptance. Wide-coverage ontologies are indeed available for general applications (e.g. WordNet, CYC, SUMO); however, semantic annotation in unconstrained areas still seems out of reach for state-of-the-art systems. Domain-specific ontologies are preferable, since they would limit the semantic coverage needed and make the applications feasible.


References

  • [1] T. Berners-Lee, J. Hendler, and O. Lassila, The Semantic Web, Scientific American, 284(5) (2001).
  • [2] M. S. Fox, M. Barbuceanu, M. Gruninger, and J. Lin, An Organisation Ontology for Enterprise Modeling, In Simulating Organizations: Computational Models of Institutions and Groups, M. Prietula, K. Carley and L. Gasser (Eds), Menlo Park CA: AAAI/MIT Press (1997), 131–152.
  • [3] M. Uschold, M. King, S. Moralee and Y. Zorgios, The Enterprise Ontology, In The Knowledge Engineering Review, 13 (1998).
  • [4] M. Doerr, The CIDOC Conceptual Reference Module: An Ontological Approach to Semantic Interoperability of Metadata, AI Magazine, 24(3) (2003).
  • [5] R. Navigli and P. Velardi, Structural Semantic Interconnections: a knowledge-based approach to word sense disambiguation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(7) (2005), 1063–1074.
  • [6] G. A. Miller, WordNet: a lexical database for English, Communications of the ACM, 38(11) (1995), 39–41.
  • [7] J. E. F. Friedl, Mastering Regular Expressions, O’Reilly (1997).
  • [8] P. Velardi, R. Navigli and M. Pétit, Semantic Indexing of a Competence Map to support Scientific Collaboration in a Research Community, In: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Hyderabad, India (2007).
  • [9] N. Ide, J. Véronis, Refining Taxonomies Extracted from Machine-Readable Dictionaries. In Hockey, S., Ide, N. Research in Humanities Computing II, Oxford University Press (2003), 145–59.
  • [10] E. Charniak, Statistical Techniques for Natural Language Parsing, AI Magazine, 18(4) (1997), 33–44.
  • [11] M. Thelen and E. Riloff, A Bootstrapping Method for Learning Semantic Lexicons using Extraction Pattern Contexts, In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (2002).
  • [12] M. E. Califf and R. J. Mooney, Bottom-up relational learning of pattern matching rules for information extraction, Journal of Machine Learning Research, 4(2) (2004), 177–210.
  • [13] P. Cimiano, G. Ladwig and S. Staab, Gimme the context: context-driven automatic semantic annotation with C-PANKOW, In: Proceedings of the 14th International WWW Conference, Chiba, Japan (2005).
  • [14] A. G. Valarakos, G. Paliouras, V. Karkaletsis and G. Vouros, Enhancing Ontological Knowledge through Ontology Population and Enrichment, In: Proceedings of the 14th EKAW conference, Springer-Verlag (2004), 144–156.
  • [15] R. Snow, D. Jurafsky, and A. Y. Ng, Learning syntactic patterns for automatic hypernym discovery, In NIPS (2005).
  • [16] E. Morin and C. Jacquemin, Automatic acquisition and expansion of hypernym links, Computers and the Humanities, 38 (2004), 363–396.
  • [17] V. Kashyap, C. Ramakrishnan, T. Rindflesch, Toward (Semi)-Automatic Generation of Bio-medical Ontologies, In: Proceedings of American Medical Informatics Association (2003).
  • [18] S. A. Caraballo, Automatic construction of a hypernym-labeled noun hierarchy from text, In: Proceedings of 37th Annual Meeting of the Association for Computational Linguistics (1999), 120–126.
  • [19] A. Maedche, V. Pekar and S. Staab, Ontology learning part One: On Discovering Taxonomic Relations from the Web, In Web Intelligence, Chapter 1, Springer, 2002.
  • [20] D. Widdows, Unsupervised methods for developing taxonomies by combining syntactic and statistical information, In: Proceedings of HLT-NAACL, Edmonton, Canada (2003), 197–204.
  • [21] L. Reeve and H. Han, Survey of Semantic Annotation Platforms, In: Proceedings of the 20th Annual ACM Symposium on Applied Computing (2005).
  • [22] R. Navigli and P. Velardi, Automatic Acquisition of a Thesaurus of Interoperability Terms, In: Proceedings of 16th IFAC World Congress, Praha, Czech Republic (2005).


2008 FromGlossariesToOntologies
Author: Roberto Navigli, Paola Velardi
Title: From Glossaries to Ontologies: Extracting Semantic Structure from Textual Definitions
Published in: Ontology Learning and Population: Bridging the Gap between Text and Knowledge
URL: http://www.dsi.uniroma1.it/~navigli/pubs/Navigli Velardi IOS 2008.pdf
Year: 2008