Text Corpus

From GM-RKB

A text corpus is a corpus composed of text items.
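As a minimal sketch of this definition, assume a corpus is represented as a plain list of strings, one string per text item. The toy corpus and the helper names `tokenize` and `term_counts` are hypothetical, chosen only for illustration. The example shows the kind of occurrence checking a text corpus supports:

```python
import re
from collections import Counter

# Hypothetical toy corpus: each string is one text item.
corpus = [
    "The cat sat on the mat.",
    "The dog chased the cat.",
    "A bird sang.",
]


def tokenize(text):
    """Lowercase a text item and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def term_counts(items):
    """Count how often each token occurs across all text items."""
    counts = Counter()
    for item in items:
        counts.update(tokenize(item))
    return counts


counts = term_counts(corpus)
print(counts.most_common(3))
```

Real corpora are usually far larger and carry structure (metadata, annotation layers) that a flat list of strings does not capture.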



References

2015

  • (Wikipedia, 2015) ⇒ http://en.wikipedia.org/wiki/text_corpus Retrieved:2015-4-13.
    • In linguistics, a corpus (plural corpora) or text corpus is a large and structured set of texts (nowadays usually electronically stored and processed). They are used to do statistical analysis and hypothesis testing, checking occurrences or validating linguistic rules within a specific language territory.


  • (Wikipedia, 2015) ⇒ http://en.wikipedia.org/wiki/list_of_text_corpora Retrieved:2015-4-13.
    • Following is a list of text corpora in various languages. "Text corpora" is the plural of "text corpus". A text corpus is a large and structured set of texts (nowadays usually electronically stored and processed). Text corpora are used to do statistical analysis and hypothesis testing, checking occurrences or validating linguistic rules within a specific language territory.


  1. Professor Mark Davies at BYU created an online tool to search Google's English language corpus, drawn from Google Books, at http://googlebooks.byu.edu/x.asp.

2009