Manual Annotation Task

From GM-RKB

A Manual Annotation Task is an annotation task that is performed by a human annotator.



References

2014

  • (Sabou et al., 2014) ⇒ Marta Sabou, Kalina Bontcheva, Leon Derczynski, and Arno Scharl. (2014). “Corpus Annotation through Crowdsourcing: Towards Best Practice Guidelines.” In: Proc. LREC.
    • QUOTE: Crowdsourcing is an emerging collaborative approach that can be used for the acquisition of annotated corpora and a wide range of other linguistic resources. Although the use of this approach is intensifying in all its key genres (paid-for crowdsourcing, games with a purpose, volunteering-based approaches), the community still lacks a set of best-practice guidelines similar to the annotation best practices for traditional, expert-based corpus acquisition. In this paper we focus on the use of crowdsourcing methods for corpus acquisition and propose a set of best practice guidelines based in our own experiences in this area and an overview of related literature. We also introduce GATE Crowd, a plugin of the GATE platform that relies on these guidelines and offers tool support for using crowdsourcing in a more principled and efficient manner.

      Over the past ten years, Natural Language Processing (NLP) research has been driven forward by a growing volume of annotated corpora, produced by evaluation initiatives such as ACE (ACE, 2004), TAC,[1] SemEval and Senseval, [2] and large annotation projects such as OntoNotes (Hovy et al., 2006). These corpora have been essential for training and domain adaptation of NLP algorithms and their quantitative evaluation, as well as for enabling algorithm comparison and repeatable experimentation. Thanks to these efforts, there are now well-understood best practices in how to create annotations of consistently high quality, by employing, training, and managing groups of linguistic and/or domain experts. This process is referred to as “the science of annotation” (Hovy, 2010).

      More recently, the emergence of crowdsourcing platforms (e.g. paid-for marketplaces such as Amazon Mechanical Turk (AMT) and CrowdFlower (CF); games with a purpose; and volunteer-based platforms such as crowdcrafting), coupled with growth in internet connectivity, motivated NLP researchers to experiment with crowdsourcing as a novel, collaborative approach for obtaining linguistically annotated corpora. The advantages of crowdsourcing over expert-based annotation have already been discussed elsewhere (Fort et al., 2011; Wang et al., 2012), but in a nutshell, crowdsourcing tends to be cheaper and faster. ...


  1. www.nist.gov/tac
  2. www.senseval.org