Data-Item Annotation Task
A Data-Item Annotation Task is a data processing task that adds annotation items to data items to create annotated data items.
- AKA: Data Annotation Task, Digital Item Annotation Task, Data Labeling Task.
- Context:
- Task Input: Data Items.
- Task Output: Annotated Data Items.
- Task Performance Measure: Data-Item Annotation Accuracy, Data-Item Annotation Precision, Data-Item Annotation Recall, Inter-Annotator Agreement, Annotation Throughput.
- It can typically add Semantic Annotations to data item content.
- It can typically apply Annotation Labels based on data-item annotation guidelines.
- It can typically create Metadata Layers for data item interpretation.
- It can typically support Machine Learning Training Data Creation through labeled data item generation.
- It can typically enable Data Item Searchability through annotation-based indexing.
- ...
- It can often integrate Domain Knowledge into data item representations.
- It can often facilitate Data Item Classification through categorical annotations.
- It can often enhance Data Item Accessibility through descriptive annotations.
- It can often support Quality Control Processes through annotation validation.
- ...
- It can range from being a Simple Data-Item Annotation Task to being a Complex Data-Item Annotation Task, depending on its data-item annotation complexity.
- It can range from being a Manual Data-Item Annotation Task to being an Automated Data-Item Annotation Task, depending on its data-item annotation automation level.
- It can range from being a Single-Label Data-Item Annotation Task to being a Multi-Label Data-Item Annotation Task, depending on its data-item annotation label count.
- It can range from being a Structured Data-Item Annotation Task to being an Unstructured Data-Item Annotation Task, depending on its data-item annotation schema formality.
- It can range from being a Domain-Specific Data-Item Annotation Task to being a Domain-Agnostic Data-Item Annotation Task, depending on its data-item annotation domain scope.
- ...
- It can be performed by a Data-Item Annotator using data-item annotation tools.
- It can be supported by a Data-Item Annotation System implementing data-item annotation algorithms.
- It can be managed by a Data-Item Annotation Manager following data-item annotation protocols.
- It can be part of a Data-Item Annotation Pipeline within data processing workflows.
- ...
- Example(s):
- Data Type-Based Data-Item Annotation Tasks, such as:
- Text Data-Item Annotation Tasks, such as:
- Image Data-Item Annotation Tasks, such as:
- Audio Data-Item Annotation Tasks, such as:
- Video Data-Item Annotation Tasks, such as:
- Purpose-Based Data-Item Annotation Tasks, such as:
- Domain-Specific Data-Item Annotation Tasks, such as:
- ...
- Counter-Example(s):
- Data Generation Task, which creates new data items rather than annotating existing ones.
- Data Transformation Task, which modifies data content rather than adding annotations.
- Content Moderation Task, which filters or removes content rather than annotating it.
- Data Validation Task, which verifies data quality without adding persistent annotations.
- Physical Item Labeling Task, which annotates tangible objects rather than digital data items.
- See: Data Processing Task, Annotation System, Machine Learning Data Preparation, Metadata Management, Human-in-the-Loop System.
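The task input, task output, and two of the performance measures listed above can be illustrated with a minimal sketch. All names here (`DataItem`, `AnnotatedDataItem`, `annotate`, `keyword_labeler`, `cohen_kappa`) and the sample data are hypothetical and chosen only for illustration. Cohen's kappa is used as one common way to quantify Inter-Annotator Agreement for two annotators; it is an assumption, not a measure prescribed by this page.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class DataItem:
    """Task input: a raw data item (hypothetical structure)."""
    item_id: str
    content: str


@dataclass
class AnnotatedDataItem:
    """Task output: the original data item plus its annotation labels."""
    item: DataItem
    labels: List[str] = field(default_factory=list)


def annotate(items: Iterable[DataItem],
             labeler: Callable[[DataItem], Iterable[str]]) -> List[AnnotatedDataItem]:
    """Apply a labeling function (manual or automated) to each data item."""
    return [AnnotatedDataItem(item, list(labeler(item))) for item in items]


def accuracy(predicted: List[str], gold: List[str]) -> float:
    """Fraction of items whose predicted label equals the gold label."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def cohen_kappa(labels_a: List[str], labels_b: List[str]) -> float:
    """Cohen's kappa for two annotators who each gave one label per item."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both annotators pick the same label at random.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def keyword_labeler(item: DataItem) -> List[str]:
    """Toy automated annotator: a single-label sentiment rule (assumption)."""
    return ["positive"] if "great" in item.content else ["negative"]


items = [
    DataItem("d1", "great product"),
    DataItem("d2", "terrible service"),
    DataItem("d3", "okay overall"),
    DataItem("d4", "great value"),
]
annotated = annotate(items, keyword_labeler)

# Labels from two hypothetical human annotators for the same four items.
annotator_a = ["positive", "negative", "negative", "positive"]
annotator_b = ["positive", "negative", "positive", "positive"]

kappa = cohen_kappa(annotator_a, annotator_b)
system_accuracy = accuracy([a.labels[0] for a in annotated], annotator_a)
```

In this sketch the two annotators agree on three of four items, and kappa discounts the agreement expected by chance. A multi-label task would need a different agreement measure, since this version assumes exactly one label per item.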
References
2024
- https://www.theguardian.com/technology/article/2024/jul/06/mercy-anita-african-workers-ai-artificial-intelligence-exploitation-feeding-machine
- NOTES:
- Data annotation involves reviewing and labeling large volumes of data under strict performance targets and deadlines.
- Content moderators face exposure to disturbing and graphic content, resulting in severe psychological impacts.
- There is intense supervision and surveillance, with limited support for mental health and well-being.
2011
- (Wikipedia - Annotation, 2009) ⇒ http://en.wikipedia.org/wiki/Annotation
- For DNA annotation, a previously unknown sequence representation of genetic material is enriched with information relating genomic position to intron-exon boundaries, regulatory sequences, repeats, gene names and protein products. This annotation is stored in genomic databases such as Mouse Genome Informatics, FlyBase, and WormBase. Educational materials on some aspects of biological annotation from this year's Gene Ontology annotation camp and similar events are available at the Gene Ontology website. The National Center for Biomedical Ontology (www.bioontology.org) develops tools for automated annotation of database records based on the textual descriptions of those records.
In the digital imaging community the term annotation is commonly used for visible metadata superimposed on an image without changing the underlying master image, such as sticky notes, virtual laser pointers, circles, arrows, and black-outs (cf. redaction).
… legal publishers such as Thomson West and Lexis Nexis publish annotated versions of statutes, providing information about court cases that have interpreted the statutes. Both the federal United States Code and state statutes are subject to interpretation by the courts, and the annotated statutes are valuable tools in legal research.
In linguistics, annotations include comments and metadata; these non-transcriptional annotations are also non-linguistic. A collection of texts with linguistic annotations is known as a corpus (plural corpora). The Linguistic Annotation Wiki describes tools and formats for creating and managing linguistic annotations.
2009
- (WordNet, 2009) ⇒ http://wordnetweb.princeton.edu/perl/webwn?s=annotation
- S: (n) annotation, annotating (the act of adding notes)
- …
2009
- http://en.wiktionary.org/wiki/annotate
- To add annotation
2009
- http://en.wiktionary.org/wiki/annotation#Noun
- the process of writing such comment or commentary
- a critical or explanatory commentary or analysis.
- a comment added to a text
2006
- (Burkhardt et al., 2006) ⇒ Kyle Burkhardt, Bohdan Schneider, Jeramia Ory. (2006). “A Biocurator Perspective: Annotation at the Research Collaboratory for Structural Bioinformatics Protein Data Bank.” In: PLoS Computational Biology, 2(10). doi:10.1371/journal.pcbi.0020099
- QUOTE: The goal of annotation is to make each entry not only self-consistent but also consistent with the rest of the archive. To this end, annotators help authors represent their data in the best possible way. Annotators routinely review the incoming data and perform many standard inspections (see Box 1).
Box 1. Annotators Work to Represent PDB Data in the Best Possible Way by:
- Reviewing entry for self-consistency
- Matching given title to structure
- Correcting format errors in data and coordinates
- Checking sequence using BLAST [13]
- Inserting sequence database reference
- Providing protein name and synonyms
- Checking scientific name of the source organism
- Confirming chemical consistency between ligand name and the 3-D coordinates
- Adding information describing the biological assembly
- Checking entry visually
- Generating validation reports
- Finding citation references with PubMed [14]