2015 DebiasingCrowdsourcedBatches
- (Zhuang et al., 2015) ⇒ Honglei Zhuang, Aditya Parameswaran, Dan Roth, and Jiawei Han. (2015). “Debiasing Crowdsourced Batches.” In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2015). ISBN:978-1-4503-3664-2 doi:10.1145/2783258.2783316
Subject Headings:
Notes
Cited By
- http://scholar.google.com/scholar?q=%222015%22+Debiasing+Crowdsourced+Batches
- http://dl.acm.org/citation.cfm?id=2783258.2783316&preflayout=flat#citedby
Quotes
Author Keywords
Abstract
Crowdsourcing is the de-facto standard for gathering annotated data. While, in theory, data annotation tasks are assumed to be attempted by workers independently, in practice, data annotation tasks are often grouped into batches to be presented and annotated by workers together, in order to save on the time or cost overhead of providing instructions or necessary background. Thus, even though independence is usually assumed between annotations on data items within the same batch, in most cases, a worker's judgment on a data item can still be affected by other data items within the batch, leading to additional errors in collected labels. In this paper, we study the data annotation bias when data items are presented as batches to be judged by workers simultaneously. We propose a novel worker model to characterize the annotating behavior on data batches, and present how to train the worker model on annotation data sets. We also present a debiasing technique to remove the effect of such annotation bias from adversely affecting the accuracy of labels obtained. Our experimental results on synthetic and real-world data sets demonstrate that our proposed method can achieve up to +57% improvement in F1-score compared to the standard majority voting baseline.
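The abstract compares against a majority voting baseline scored by F1. As a rough illustration of what such a baseline computes, here is a minimal sketch, assuming binary labels and toy data. It is not the paper's worker model or debiasing technique; the function names `majority_vote` and `f1_score` and the sample data are hypothetical.

```python
def majority_vote(labels_by_item):
    """Aggregate binary worker labels per item by simple majority.

    Assumption: labels are 0 or 1, and ties resolve to 0.
    """
    result = {}
    for item, labels in labels_by_item.items():
        positives = sum(labels)
        result[item] = 1 if positives * 2 > len(labels) else 0
    return result


def f1_score(truth, predicted):
    """Compute F1 for the positive class (label 1) over the items in `truth`."""
    tp = sum(1 for k in truth if truth[k] == 1 and predicted[k] == 1)
    fp = sum(1 for k in truth if truth[k] == 0 and predicted[k] == 1)
    fn = sum(1 for k in truth if truth[k] == 1 and predicted[k] == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): three worker labels per item.
worker_labels = {"a": [1, 1, 0], "b": [0, 0, 1], "c": [1, 0, 0], "d": [1, 1, 1]}
ground_truth = {"a": 1, "b": 0, "c": 1, "d": 1}

aggregated = majority_vote(worker_labels)
score = f1_score(ground_truth, aggregated)
```

Majority voting treats every label as independent of the other items a worker saw in the same batch, which is the independence assumption the abstract questions.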
References
Page | Author | Title | DOI | Year |
---|---|---|---|---|
2015 DebiasingCrowdsourcedBatches | Honglei Zhuang, Aditya Parameswaran, Dan Roth, Jiawei Han | Debiasing Crowdsourced Batches | 10.1145/2783258.2783316 | 2015 |