Text Annotation Task
- AKA: Linguistic Annotation Task.
- input: Text Item.
- output: Annotated Text Item.
- Task Performance: Text Annotator Bias Measure.
- It can support a Corpus Annotation Task.
- It can range from being a Manual Text Annotation Task to being an Automated Text Annotation Task (aided by a text annotation system).
- It can range from being a Syntactic Text Annotation Task to being a Semantic Text Annotation Task.
- It can range from being a Document Annotation Task to being a Sentence Annotation Task.
- See: Text Clustering Task, Text Sequence Token Classification Task.
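- The input/output pairing above (Text Item in, Annotated Text Item out) can be sketched as minimal span-based annotation; this is an illustrative sketch only — the `Annotation` and `AnnotatedTextItem` classes and the toy lexicon-lookup annotator are assumptions for demonstration, not a standard annotation API.

```python
from dataclasses import dataclass, field

@dataclass
class Annotation:
    start: int   # character offset where the annotated span begins
    end: int     # character offset just past the span
    label: str   # e.g. a part-of-speech tag or named-entity type

@dataclass
class AnnotatedTextItem:
    text: str
    annotations: list = field(default_factory=list)

def annotate(text: str, lexicon: dict) -> AnnotatedTextItem:
    """Toy automated annotator: label the first occurrence of each lexicon entry."""
    item = AnnotatedTextItem(text)
    for word, label in lexicon.items():
        start = text.find(word)
        if start != -1:
            item.annotations.append(Annotation(start, start + len(word), label))
    return item

item = annotate("Paris is in France.", {"Paris": "LOCATION", "France": "LOCATION"})
for a in item.annotations:
    print(item.text[a.start:a.end], a.label)
```

A manual annotation task would produce the same output structure, with a human rather than the `annotate` function supplying the spans and labels.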
- (Wilcock, 2009) ⇒ Graham Wilcock. (2009). “Introduction to Linguistic Annotation and Text Analytics.” In: Synthesis Lectures on Human Language Technologies. Morgan & Claypool. doi:10.2200/S00194ED1V01Y200905HLT003 ISBN:1598297384
- (Palmer, Moon & Baldridge, 2009) ⇒ Alexis Palmer, Taesun Moon, and Jason Baldridge. (2009). “Evaluating Automation Strategies in Language Documentation.” In: Proceedings of the NAACL HLT 2009 Workshop on Active Learning for Natural Language Processing (HLT 2009).
- QUOTE: This paper presents pilot work integrating machine labeling and active learning with human annotation of data for the language documentation task of creating interlinearized gloss text (IGT) for the Mayan language Uspanteko. The practical goal is to produce a totally annotated corpus that is as accurate as possible given limited time for manual annotation. We describe ongoing pilot studies which examine the influence of three main factors on reducing the time spent to annotate IGT: suggestions from a machine labeler, sample selection methods, and annotator expertise.
- This wiki describes tools and formats for creating and managing linguistic annotations. "Linguistic annotation" covers any descriptive or analytic notations applied to raw language data. The basic data may be in the form of time functions -- audio, video and/or physiological recordings -- or it may be textual. The added notations may include transcriptions of all sorts (from phonetic features to discourse structures), part-of-speech and sense tagging, syntactic analysis, "named entity" identification, co-reference annotation, and so on. The focus is on tools which have been widely used for constructing annotated linguistic databases, and on the formats commonly adopted by such tools and databases.
- (Snow et al., 2008) ⇒ Rion Snow, Brendan O'Connor, Daniel Jurafsky, and Andrew Y. Ng. (2008). “Cheap and Fast - But is it Good?: Evaluating non-expert annotations for natural language tasks.” In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2008).