Large Text Document Corpus
A Large Text Document Corpus is a corpus that is a large dataset (one that requires significant resources to be processed by a machine but can still fit in large memory banks).
- Context:
- It can fit into the computer memory of a very large computer.
- It can range from being a Relatively Large Corpus to being a Very Large Corpus.
 
- Example(s):
- a Large Text Corpus, such as the Genia Corpus, the TREC Corpus, or the KDD-2009 Abstracts Corpus.
- …
 
- Counter-Example(s):
- a Small Corpus, such as the kdd09cma1 Corpus.
- any Large Corpora, such as a Web Snapshot.
 
- See: Information Extraction Task, PubMed Corpus.