- AKA: Type Written Electronic Item, Unstructured Text.
- It can (like any typical linguistic item) contain linguistic expressions.
- It can (typically) contain Text Item Components, such as words, phrases, authors.
- It can range from being a Raw Text Item to being a Tokenized Text Item.
- It can have a Text Item Location within some larger Text Item.
- It can range from being an Unannotated Text Item to being an Annotated Text Item (such as a Labeled Text Item).
- It can range from being a Short Text to being a Long Text.
- It can be, depending on its Text Item Type:
- It can be within a Text Dataset.
- See: Texting, Linguistic Artifact.
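The properties listed above (raw versus tokenized, unannotated versus annotated, location within a larger item) can be sketched as a small data structure. This is only an illustration: the `TextItem` class, its field names, and the `sub_item` helper are hypothetical names introduced here, not part of any cited source.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class TextItem:
    """Hypothetical container for a text item (illustrative only)."""
    raw: str
    # (start, end) character offsets within a larger text item, if any
    location: Optional[Tuple[int, int]] = None
    # None means the item is still a raw (untokenized) text item
    tokens: Optional[List[str]] = None
    # An empty dict means the item is unannotated
    labels: Dict[str, str] = field(default_factory=dict)

    @property
    def is_tokenized(self) -> bool:
        return self.tokens is not None

    @property
    def is_annotated(self) -> bool:
        return bool(self.labels)


def sub_item(parent: TextItem, start: int, end: int) -> TextItem:
    """Extract a smaller text item and record where it sits in its parent."""
    return TextItem(raw=parent.raw[start:end], location=(start, end))


doc = TextItem(raw="Text is data. Data is not always text.")
sentence = sub_item(doc, 0, 13)          # the first sentence of doc
sentence.tokens = sentence.raw.split()   # naive whitespace tokenization
sentence.labels["type"] = "sentence"     # a simple label annotation
```

Here `doc` stays a raw, unannotated text item, while `sentence` is a tokenized, labeled text item located at characters 0 to 13 of `doc`.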
- (WordNet, 2009) ⇒ http://wordnetweb.princeton.edu/perl/webwn?s=text
- S: (n) text, textual matter (the words of something written) "there were more than a thousand words of text"; "they handed out the printed text of the mayor's speech"; "he wants to reconstruct the original text"
- 4. (computing) Data which can be interpreted as human-readable text (often contrasted with binary data).
- (Hirst, 2006) ⇒ Graeme Hirst. (2006). "Views of text-meaning in computational linguistics: Past, present, and future." In: Computing, Philosophy, and Cognitive Science; Edited by G. Dodig-Crnkovic and S. Stuart.
- In this paper, I’ll use the word text to denote any complete utterance, short or long. In a computational context, a text could be a non-interactive document, such as a news article, a legal statute, or a memorandum, that a writer or author has produced for other people and which is to undergo some kind of processing by a computer. Or a text could be a natural-language utterance by a user in a spoken or typewritten interactive dialogue with another person or a computer: a turn or set of turns in a conversation. The term text-meaning, then, as opposed to mere word-meaning or sentence-meaning, denotes the complete in-context meaning or message of such texts at all levels of interpretation including subtext.
- (Wall et al, 1996) ⇒ Larry Wall, Tom Christiansen, and Randal L. Schwartz. (1996). "Programming Perl, 2nd edition." O'Reilly. ISBN 1565921496
- text: Normally, a string or file containing primarily printable characters. The word has been usurped in some UNIX circles to mean the portion of your process that contains machine code to be executed.
- (Sproat et al, 1996) ⇒ Richard Sproat, William A. Gale, Chilin Shih, and Nancy Chang. (1996). "A Stochastic Finite-state Word-Segmentation Algorithm for Chinese." In: Computational Linguistics, 22(3).
- Any NLP application that presumes as input unrestricted text requires an initial phase of text analysis; such applications involve problems as diverse as machine translation, information retrieval, and text-to-speech synthesis (TTS). An initial step of any text analysis task is the tokenization of the input into words.
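The tokenization step mentioned in the last quotation can be illustrated with a minimal regular-expression tokenizer. This is a sketch under simplifying assumptions, not the stochastic finite-state method of the cited paper: the `TOKEN_PATTERN` expression and the `tokenize` function are hypothetical names chosen here, and the pattern assumes that words are delimited by whitespace or punctuation.

```python
import re
from typing import List, Tuple

# Assumed pattern: a word (optionally joined by hyphens or apostrophes),
# or a single non-space, non-word character such as punctuation.
TOKEN_PATTERN = re.compile(r"\w+(?:[-']\w+)*|[^\w\s]")


def tokenize(text: str) -> List[Tuple[str, int, int]]:
    """Return (token, start, end) triples, keeping character offsets."""
    return [(m.group(), m.start(), m.end())
            for m in TOKEN_PATTERN.finditer(text)]


tokens = tokenize("Text-to-speech isn't trivial.")
words = [token for token, _, _ in tokens]
# words == ["Text-to-speech", "isn't", "trivial", "."]
```

Keeping the offsets lets each token be traced back to its location in the raw text. The whitespace assumption does not hold for every language: an unbroken run of Chinese characters comes back from this sketch as a single token, which is why word segmentation for such text needs a dedicated method.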