A Thesaurus is a lexical database restricted to thesaurus records (representing synonym sets).



A hierarchy of subject headings — canonic titles of themes and topics, the titles serving as search keys.




  • (ANSI Z39.19, 2005) ⇒ ANSI. (2005). “ANSI/NISO Z39.19 - Guidelines for the Construction, Format, and Management of Monolingual Controlled Vocabularies.” ANSI.
    • QUOTE: thesaurus (plural: thesauruses, thesauri)
      • A controlled vocabulary arranged in a known order and structured so that the various relationships among terms are displayed clearly and identified by standardized relationship indicators. Relationship indicators should be employed reciprocally.
      • Its purpose is to promote consistency in the indexing of content objects, especially for postcoordinated information storage and retrieval systems, and to facilitate browsing and searching by linking entry terms with terms. Thesauri may also facilitate the retrieval of content objects in free text searching.
      • NOTES: The term “Thesaurus” is the Latin form of the Greek word thesauros, originally meaning “treasure store.” In the 16th century, it began to be used as a synonym for “dictionary” (a treasure store of words), but later it fell into disuse. Peter Mark Roget resurrected the term in 1852 for the title of his dictionary of synonyms. The purpose of that work is to give the user a choice among similar terms when the one first thought of does not quite seem to fit. A hundred years later, in the early 1950s, the word “thesaurus” began to be employed again as the name for a word list, but one with the exactly opposite aim: to prescribe the use of only one term for a concept that may have synonyms. A similarity between Roget’s Controlled Thesaurus and thesauri for indexing and information retrieval is that both list terms that are related hierarchically or associatively to terms, in addition to synonyms.
  • (Woodley, 2005b) ⇒ Mary S. Woodley, Gail Clement, and Pete Winn. (2005). “DCMI Glossary.” Dublin Core Metadata Initiative.
    • thesaurus
      • A structured vocabulary made up of names, words, and other information, typically including synonyms and/or hierarchical relationships for the purpose of cross-referencing in order to organize a collection of concepts for reference and retrieval. See the ANSI/NISO Standard for thesaurus construction Z39.19-2003 (R1998; ISO 2788). A controlled vocabulary of terms or concepts that are structured hierarchically (parent/child relationships) or as equivalences (synonyms), and related terms (associative). See also Subject headings and glossary. A thesaurus is a taxonomy.
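
The definitions above describe three kinds of relationships (hierarchical, equivalence, and associative) and ask that relationship indicators be employed reciprocally. A minimal Python sketch of that idea follows. The class name `Thesaurus`, its methods, and the indicator labels BT/NT (broader/narrower term), RT (related term), and USE/UF (use/used for) are assumptions chosen for illustration from common thesaurus practice; they are not taken from the quoted sources.

```python
from collections import defaultdict

# Reciprocal pairs of relationship indicators. The labels are an assumption
# for illustration: BT/NT = broader/narrower term, RT = related term,
# USE/UF = use / used for (entry term pointing to a preferred term).
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT", "USE": "UF", "UF": "USE"}


class Thesaurus:
    """Hypothetical in-memory thesaurus keyed by term."""

    def __init__(self):
        # term -> indicator -> set of related terms
        self.relations = defaultdict(lambda: defaultdict(set))

    def relate(self, term, indicator, other):
        """Record a relationship together with its reciprocal."""
        if indicator not in RECIPROCAL:
            raise ValueError(f"unknown relationship indicator: {indicator}")
        self.relations[term][indicator].add(other)
        self.relations[other][RECIPROCAL[indicator]].add(term)

    def preferred(self, term):
        """Return the preferred term for an entry term, or the term itself."""
        targets = self.relations[term]["USE"]
        return next(iter(targets)) if targets else term


thesaurus = Thesaurus()
thesaurus.relate("cats", "BT", "mammals")              # hierarchical
thesaurus.relate("felines", "USE", "cats")             # equivalence
thesaurus.relate("cats", "RT", "veterinary medicine")  # associative
```

Because `relate` always stores both directions, looking up `mammals` shows `cats` as a narrower term without a second call, which is one simple way to keep the indicators reciprocal.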


  • (Fellbaum, 2002) ⇒ Christiane Fellbaum. (2002). “On the Semantics of Troponymy.” In: The Semantics of Relationships: An Interdisciplinary Perspective. R. Green, C. Bean, and S. Myaeng (eds.). Dordrecht, Holland: Kluwer.
    • A third type of knowledge, world or encyclopedic knowledge, cannot be expressed in terms of relations, as it lists information about the concept behind the word in language not bound to formulas such as definiendum-definitions.
    • A thesaurus lists words in semantically related groups. It is intended for users who have a certain concept in mind, and are looking either for alternative words to express this concept or for words that express similar concepts. Because its purpose is to suggest words that may be substitutable for each other, a thesaurus is necessarily organized paradigmatically. But the semantic relations between the members of a word group are not made explicit, nor are all words within a group related in the same way.
    • The lexical database WordNet (George A. Miller, 1990; Fellbaum, 1998) resembles a thesaurus in that it represents word meanings primarily in terms of conceptual-semantic and lexical relations. Relations among groups of cognitively synonymous words are given straightforwardly, without being woven in the definitions, as in conventional lexicography. But unlike a standard thesaurus, the relations are transparent and explicitly labeled; moreover, they have been deliberately limited in number. The resultant structure is a large semantic network for nouns, verbs, adjectives, and adverbs.
    • The bulk of WordNet, as indeed of any lexicon of English, is comprised of nouns; there are far more distinct noun forms than verbs or adjectives in the language. In constructing a semantic net of nouns, hyponymy and meronymy relations could be applied in a straightforward manner (but see George A. Miller, 1998, for details). By contrast, adjectives in many cases denote values of attributes, and can be interrelated via antonymy, as in the case of pairs hot-cold and long-short, where the antonyms express values of heat and length, respectively. Most adjectives, like icy and elongated, have no salient antonyms; they are linked to core adjectives like cold and long through a relation of semantic similarity (K. J. Miller, 1998).
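
The contrast drawn above, between unlabeled word groups in a standard thesaurus and a network whose relations are explicitly labeled and limited in number, can be sketched in Python. This is an illustrative toy, not WordNet's actual data model or API: the class `SemanticNetwork`, the relation labels, and the helper `indirect_antonyms` are all hypothetical names introduced here.

```python
from collections import defaultdict

# A small, closed set of relation labels (assumed for illustration).
RELATIONS = {"hyponym_of", "part_of", "antonym_of", "similar_to"}
# Relations treated as symmetric in this sketch.
SYMMETRIC = {"antonym_of", "similar_to"}


class SemanticNetwork:
    """Hypothetical network of words joined by explicitly labeled edges."""

    def __init__(self):
        # (source word, relation label) -> set of target words
        self.edges = defaultdict(set)

    def add(self, source, label, target):
        """Add a labeled edge; symmetric relations are stored both ways."""
        if label not in RELATIONS:
            raise ValueError(f"unknown relation: {label}")
        self.edges[(source, label)].add(target)
        if label in SYMMETRIC:
            self.edges[(target, label)].add(source)

    def related(self, word, label):
        """Return the words linked to `word` by the given relation."""
        return set(self.edges.get((word, label), ()))

    def indirect_antonyms(self, word):
        """Antonyms reached through a semantically similar core adjective."""
        found = set()
        for core in self.related(word, "similar_to"):
            found |= self.related(core, "antonym_of")
        return found


net = SemanticNetwork()
net.add("hot", "antonym_of", "cold")    # core adjectives linked by antonymy
net.add("icy", "similar_to", "cold")    # peripheral adjective linked by similarity
net.add("poodle", "hyponym_of", "dog")  # noun hierarchy
```

With these edges, `net.indirect_antonyms("icy")` follows the similarity link to `cold` and then the antonymy link to `hot`, which mirrors how an adjective with no salient antonym of its own is tied to a core adjective pair.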