- AKA: Unstructured Textual Data.
- See: Written Message.
- (Jijkoun et al., 2008) ⇒ Valentin Jijkoun, Mahboob Alam Khalid, Maarten Marx, and Maarten de Rijke. (2008). “Named Entity Normalization in User Generated Content.” In: Proceedings of the Second Workshop on Analytics for Noisy Unstructured Text Data (AND 2008). doi:10.1145/1390749.1390755
- QUOTE: We consider the NEN (named entity normalization) task within the setting of user generated content (UGC), such as blogs, discussion forums, or comments left behind by readers of online documents. For this type of textual data, the NEN task is particularly important within the settings of media and reputation analysis (which motivated the work reported here) and of intelligence gathering.
- (Sarawagi, 2008) ⇒ Sunita Sarawagi. (2008). “Information Extraction.” In: Foundations and Trends in Databases, 1(3). doi:10.1561/1900000003
- QUOTE: The automatic extraction of information from unstructured sources has opened up new avenues for querying, organizing, and analyzing data by drawing upon the clean semantics of structured databases and the abundance of unstructured data. ... As society became more data oriented with easy online access to both structured and unstructured data, new applications of structure extraction came around.