A lexical item is a language terminal within a natural language syntax.



  • (Wikipedia, 2015) ⇒ http://en.wikipedia.org/wiki/lexical_item Retrieved:2015-1-30.
    • A lexical item (or lexical unit, lexical entry) is a single word, a part of a word, or a chain of words (=catena) that forms the basic elements of a language's lexicon (≈vocabulary). Examples are cat, traffic light, take care of, by the way, and it's raining cats and dogs. Lexical items can be generally understood to convey a single meaning, much as a lexeme, but are not limited to single words. Lexical items are like semes in that they are "natural units" translating between languages, or in learning a new language. In this last sense, it is sometimes said that language consists of grammaticalized lexis, and not lexicalized grammar. The entire store of lexical items in a language is called its lexis.

      Lexical items composed of more than one word are also sometimes called lexical chunks, gambits, lexical phrases, lexical units, lexicalized stems, or speech formulae. The term polyword listemes is also sometimes used.


  • http://en.wikipedia.org/wiki/Word_form#Lexemes_and_word_forms
    • The distinction between these two senses of "word" is arguably the most important one in morphology. The first sense of "word", the one in which dog and dogs are "the same word", is called a lexeme. The second sense is called word form. We thus say that dog and dogs are different forms of the same lexeme. Dog and dog catcher, on the other hand, are different lexemes, as they refer to two different kinds of entities. The form of a word that is chosen conventionally to represent the canonical form of a word is called a lemma, or citation form.


  • http://en.wiktionary.org/wiki/Word
    • A distinct unit of language (sounds in speech or written letters) with a particular meaning, composed of one or more morphemes, and also of one ...
  • (Wikipedia, 2009) ⇒ http://en.wikipedia.org/wiki/Word
    • A word is a unit of language that carries meaning and consists of one or more morphemes which are linked more or less tightly together, and has a phonetical value. Typically a word will consist of a root or stem and zero or more affixes. ...


  • http://en.wiktionary.org/wiki/lexical_item
    • (semantics) A term — word or a sequence of words — that acts as a unit of meaning, including words, phrases, phrasal verbs and proverbs, exemplified by "cat", "traffic light", "take care of", "by-the-way", and "don't count your chickens before they hatch".

  • (Wikipedia, 2009) ⇒ http://en.wikipedia.org/wiki/Inflection
    • Overt inflection typically distinguishes lexical items (such as lexemes) from functional ones (such as affixes, clitics, particles and morphemes in general) and has functional items acting as markers on lexical ones.
  • (WordNet, 2009) ⇒ http://wordnetweb.princeton.edu/perl/webwn?s=word
    • a unit of language that native speakers can identify; "words are the blocks from which sentences are made"; "he hardly said ten words all morning"
    • a brief statement; "he didn't say a word about it"
    • news: information about recent and important events; "they awaited news of the outcome"
    • a verbal command for action; "when I give the word, charge!"
  • http://folk.uio.no/hhasselg/terms.html
    • word (ord): the smallest linguistic unit that can have a syntactic function. A word has an expression side (combination of sounds, or of letters) and a content side (an independent meaning).
  • http://www.cse.unsw.edu.au/~billw/nlpdict.html#word
    • Words are units of language. They are built of morphemes and are used to build phrases (which are in turn used to build sentences).
    • See also lexeme
    • See also terminal symbol
  • (Jurafsky & Martin, 2009) ⇒ Daniel Jurafsky, and James H. Martin. (2009). “Speech and Language Processing, 2nd edition." Pearson Education.
    • QUOTE: For the purposes of lexical semantics, particularly for dictionaries and thesauruses, we represent a lexeme by a lemma. A lemma or citation form is the grammatical form that is used to represent a lexeme; thus, carpet is the lemma for carpets. The lemma or citation form for sing, sang, sung is sing. In many languages the infinitive form is used as the lemma for the verb; thus in Spanish dormir "to sleep" is the lemma for the verb duermes "you sleep". The specific forms sung or carpets or sing or duermes are called wordforms.

      The process of mapping from a wordform to a lemma is called lemmatization. Lemmatization is not always deterministic, since it may depend on the context. For example, the wordform found can map to the lemma find (meaning 'to locate') or the lemma found ('to create an institution').
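The non-determinism described in the quote above can be sketched with a small lookup table. This is a minimal illustration, not how any particular lemmatizer works: the table `LEMMA_FORMS` and the helper `candidate_lemmas` are hypothetical names, and the entries are a hand-picked toy sample rather than a real lexicon.

```python
from collections import defaultdict

# Hypothetical toy table of (lemma, wordform) pairs; not taken from a real lexicon.
LEMMA_FORMS = [
    ("carpet", "carpet"), ("carpet", "carpets"),
    ("sing", "sing"), ("sing", "sang"), ("sing", "sung"),
    ("find", "find"), ("find", "found"),
    ("found", "found"), ("found", "founded"),
]


def build_index(pairs):
    """Invert the (lemma, wordform) pairs into a wordform -> set of lemmas index."""
    index = defaultdict(set)
    for lemma, form in pairs:
        index[form].add(lemma)
    return index


WORDFORM_INDEX = build_index(LEMMA_FORMS)


def candidate_lemmas(wordform):
    """Return the sorted lemmas a wordform may map to (empty list if unknown)."""
    return sorted(WORDFORM_INDEX.get(wordform.lower(), set()))


print(candidate_lemmas("carpets"))  # one candidate
print(candidate_lemmas("found"))    # two candidates: context must decide
```

A wordform with more than one candidate lemma is exactly the case where a context-free lookup cannot decide; choosing between the candidates would need information from the surrounding sentence, which this sketch does not attempt.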


  • (Masse et al., 2008) ⇒ Blondin Masse, A, G. Chicoisne, Y. Gargouri, Stevan Harnad, O. Picard, and O. Marcotte. (2008). “How Is Meaning Grounded in Dictionary Definitions?.” In: TextGraphs-3 Workshop, 22nd International Conference on Computational Linguistics (Coling 2008).
    • QUOTE: We know from the 19th century philosopher-mathematician Frege that the referent and the meaning (or “sense”) of a word (or phrase) are not the same thing: two different words or phrases can refer to the very same object without having the same meaning (Frege, 1948): “George W. Bush” and “the current president of the United States of America” have the same referent but a different meaning. So do “human females” and “daughters”. And “things that are bigger than a breadbox” and “things that are not the size of a breadbox or smaller”.

      A word’s “extension” is the set of things to which it refers, and its “intension” is the rule for defining what things fall within its extension. A word’s meaning is hence something closer to a rule for picking out its referent. Is the dictionary definition of a word, then, its meaning?

      Clearly, if we do not know the meaning of a word, we look up its definition in a dictionary. But what if we do not know the meaning of any of the words in its dictionary definition? And what if we don’t know the meanings of the words in the definitions of the words defining those words, and so on? This is a problem of infinite regress, called the “symbol grounding problem” (Harnad, 1990; Harnad, 2003).
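The regress described in the quote above can be made concrete with a toy dictionary in which each word maps to the words used in its definition. This is only a sketch under stated assumptions: `TOY_DICTIONARY`, `words_reached`, and `is_circular` are hypothetical names, the entries are invented, and the code is not the method used by Masse et al.

```python
# Hypothetical toy dictionary: each word maps to the words used in its definition.
TOY_DICTIONARY = {
    "big": ["large"],
    "large": ["big"],
    "breadbox": ["box", "bread"],
    "box": ["container"],
    "container": ["box"],
    "bread": ["food"],
    "food": ["bread"],
}


def words_reached(word, dictionary):
    """Return every word encountered by repeatedly looking up definitions."""
    seen = set()
    stack = list(dictionary.get(word, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already looked up; avoids looping forever
        seen.add(current)
        stack.extend(dictionary.get(current, []))
    return seen


def is_circular(word, dictionary):
    """True if following definitions from `word` eventually leads back to `word`."""
    return word in words_reached(word, dictionary)


print(is_circular("big", TOY_DICTIONARY))
print(sorted(words_reached("breadbox", TOY_DICTIONARY)))
```

In this toy data every lookup chain ends in a loop between words that define each other, so looking up more definitions never supplies a meaning from outside the dictionary.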


  • (Kakkonen, 2007) ⇒ Tuomo Kakkonen. (2007). “Framework and Resources for Natural Language Evaluation." Academic Dissertation. University of Joensuu.
    • Definition 3-1. Symbol, terminal and alphabet.
      • A symbol is a distinguishable character, such as “a”, “b” or “c”.
      • Any permissible sequence of symbols is called a terminal (also referred to as a word).
      • A finite, nonempty set Σ of terminals is called an alphabet.
      • A lexicon is a structure that defines the terminals in a language.
      • A grammar G consists of a lexicon and rules.
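Definition 3-1 above can be read as a set of nested data structures. The sketch below is one possible encoding, not Kakkonen's own formalization: the names `is_terminal`, `Lexicon`, and `Grammar` are hypothetical, and "permissible sequence" is assumed here to mean a non-empty string built only from the permitted symbols.

```python
from dataclasses import dataclass


def is_terminal(candidate: str, symbols: frozenset) -> bool:
    """Assumed reading of 'permissible': non-empty and made only of permitted symbols."""
    return len(candidate) > 0 and all(ch in symbols for ch in candidate)


@dataclass(frozen=True)
class Lexicon:
    symbols: frozenset    # distinguishable characters, e.g. "a", "b", "c"
    terminals: frozenset  # the alphabet: a finite, non-empty set of terminals

    def __post_init__(self):
        if not self.terminals:
            raise ValueError("the alphabet must be non-empty")
        bad = sorted(t for t in self.terminals if not is_terminal(t, self.symbols))
        if bad:
            raise ValueError(f"not built from permitted symbols: {bad}")


@dataclass(frozen=True)
class Grammar:
    lexicon: Lexicon
    rules: tuple  # each rule is (left-hand side, tuple of right-hand side items)


lexicon = Lexicon(symbols=frozenset("abc"), terminals=frozenset({"cab", "abba"}))
grammar = Grammar(lexicon=lexicon, rules=(("S", ("cab", "abba")),))
print(sorted(grammar.lexicon.terminals))
```

Validating the terminals when the lexicon is constructed keeps the definition's constraints (non-empty alphabet, terminals built from symbols) in one place; the rule format is left deliberately loose because the definition does not specify it.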



  • (Mikheev, 2003) ⇒ Andrei Mikheev. (2003). “Text Segmentation.” In: (Mitkov, 2003).
    • The first step in the majority of text processing applications is to segment text into words. The term 'word', however, is ambiguous: a word from a language's vocabulary can occur many times in the text but it is still a single individual word of the language. So there is a distinction between words of vocabulary or word types and multiple occurrences of these words in the text which are called word tokens. This is why the process of segmenting word tokens in text is called tokenization. Although the distinction between word types and word tokens is important it is usual to refer to both as 'words' whenever the context unambiguously implies the interpretation.
  • (Mitkov, 2003) ⇒ Ruslan Mitkov, editor. (2003). “The Oxford Handbook of Computational Linguistics." Oxford University Press. ISBN:019927634X
    • word-type: A word in a language vocabulary, as opposed to its specific occurrence in text. Compare word-token.
  • (Mitkov, 2003) ⇒ Ruslan Mitkov, editor. (2003). “The Oxford Handbook of Computational Linguistics." Oxford University Press. ISBN:019927634X
    • lexical entry: A word or phrase in a lexicon used as a peg on which to hang information about part of speech, subcategorization, meaning, pronunciation, links to related terms, and/or any of various other kinds of information.
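The distinction between word types and word tokens drawn in the Mikheev and Mitkov entries above can be sketched in a few lines. The regular expression is a deliberately naive tokenizer assumed for illustration (lower-cased runs of ASCII letters with an optional apostrophe part); real tokenization has to handle many more cases. The function names are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lower-cased word tokens using a naive, assumed pattern."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def type_token_counts(text):
    """Return (number of tokens, number of types, per-type frequency counter)."""
    tokens = tokenize(text)
    types = Counter(tokens)
    return len(tokens), len(types), types


n_tokens, n_types, frequencies = type_token_counts("The cat saw the other cat.")
print(n_tokens, n_types)   # six tokens, four types
print(frequencies["cat"])  # the type "cat" occurs as two tokens
```

Each element of the token list is one occurrence in the text, while each key of the counter is one vocabulary item, which is why the two counts differ as soon as any word repeats.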


  • (Bauer, 2000) ⇒ Laurie Bauer. (2000). “Word.” In: "Morphology.", edited by Geert Booij, Christian Lehmann, and Joachim Mugdan. ISBN:9783110111286
    • A word-form, on the other hand, is the orthographic or phonological form which represents a lexeme. The terms appear to have been used first by Matthews (1972: 41), although the notion was current much earlier. The usual notation is to mark word-forms in italics, and this will be followed here. Although Lyons himself uses at least three different notations for lexemes, that most frequently adopted in other works is the notation introduced in 1968, by which lexemes are indicated by the use of small capitals ...


  • (Carter, 1998) ⇒ Ronald Carter. (1998). “Vocabulary: Applied Linguistic Perspectives; 2nd edition." Routledge.
    • QUOTE: One theoretical notion which may help us to resolve some of the above problems is that of the lexeme. A lexeme is the abstract unit which underlies some of the variants we have observed in connection with 'words'. Thus BRING is the lexeme which underlies different grammatical variants: 'bring', 'brought', 'brings', 'bringing' which we can refer to as word-forms (note a lexeme is conventionally represented by upper-case letters and that quotation marks are used for its word-forms). Lexemes are the basic, contrasting units of vocabulary in a language. When we look up words in a dictionary we are looking up lexemes rather than words. That is, 'brought' and 'bringing' will be found under an entry for BRING. The lexeme BRING is an abstraction. It does not actually occur itself in texts. Instead, it realizes different word-forms. Thus, the word-form 'bring' is realized by the lexeme BRING; the lexeme GO realizes the word-form 'went'. In a dictionary each lexeme merits a separate entry or sub-entry.

      The term lexeme also embraces items which consist of more than one word-form. Into this category come lexical items such as multi-word verbs (to catch up on), phrasal verbs (to drop in) and idioms (kick the bucket). Here, KICK THE BUCKET is a lexeme and would appear as such in a single dictionary entry even though it is a three-word form. …

      An important question which also arises here concerns our own metalanguage in this book. Should we talk of words or word-forms or lexemes or lexical items? It is clear that the uses of the words word or vocabulary have a general common-sense validity and are serviceable when there is no real need to be precise. They will continue to be used for general reference. The terms lexeme and the word-forms of a lexeme are valuable theoretical concepts and will be used when theoretical distinctions are necessary. Lexical item(s) (or sometimes vocabulary items or simply items) is a useful and fairly neutral hold-all term which captures and, to some extent, helps to overcome instability in the term word, especially when it becomes limited by orthography.

      In this chapter there is a distinct shift from examining lexical items at the level of the orthographic ‘word’ or in the patterns which occur in fixed expressions towards a consideration of lexis in larger units of language organization.
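Carter's point that a lexical item such as KICK THE BUCKET is one lexeme even though it spans three orthographic words can be sketched as a lookup that tries multi-word forms before single words. The mini-lexicon `LEXEMES` and the function `lexical_items` are hypothetical, and greedy longest-match is an assumption made for this illustration; it cannot tell an idiomatic use from a literal one.

```python
# Hypothetical mini-lexicon: each lexeme maps to its word-forms,
# some of which span several orthographic words.
LEXEMES = {
    "BRING": ["bring", "brings", "brought", "bringing"],
    "GO": ["go", "goes", "went", "gone", "going"],
    "KICK THE BUCKET": ["kick the bucket", "kicks the bucket", "kicked the bucket"],
    "THE": ["the"],
    "BUCKET": ["bucket", "buckets"],
}

# Index word-forms by their tuple of orthographic words.
FORM_TO_LEXEME = {
    tuple(form.split()): lexeme
    for lexeme, forms in LEXEMES.items()
    for form in forms
}
MAX_LEN = max(len(key) for key in FORM_TO_LEXEME)


def lexical_items(words):
    """Greedily map a list of words to lexemes, preferring the longest match.

    Words not covered by the mini-lexicon are passed through unchanged.
    """
    items = []
    i = 0
    while i < len(words):
        for n in range(min(MAX_LEN, len(words) - i), 0, -1):
            lexeme = FORM_TO_LEXEME.get(tuple(words[i:i + n]))
            if lexeme is not None:
                items.append(lexeme)
                i += n
                break
        else:
            items.append(words[i])  # unknown word: keep the surface form
            i += 1
    return items


print(lexical_items("he kicked the bucket".split()))
print(lexical_items("she brought the bucket".split()))
```

In the first sentence three orthographic words collapse into a single lexical item, while in the second each word maps to its own lexeme, which is the instability in the term "word" that the neutral term "lexical item" is meant to sidestep.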