2009 QuantificationandSemiSupervised


Subject Headings:


Cited By


Author Keywords

Semi-supervised Learning, Quantification, Classification, Concept Drift, Class Distribution


In realistic settings the prevalence of a class may change after a classifier is induced and this will degrade the performance of the classifier. Further complicating this scenario is the fact that labeled data is often scarce and expensive. In this paper we address the problem where the class distribution changes and only unlabeled examples are available from the new distribution. We design and evaluate a number of methods for coping with this problem and compare the performance of these methods. Our quantification-based methods estimate the class distribution of the unlabeled data from the changed distribution and adjust the original classifier accordingly, while our semi-supervised methods build a new classifier using the examples from the new (unlabeled) distribution which are supplemented with predicted class values. We also introduce a hybrid method that utilizes both quantification and semi-supervised learning. All methods are evaluated using accuracy and F-measure on a set of benchmark data sets. Our results demonstrate that our methods yield substantial improvements in accuracy and F-measure.
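The quantification-based idea in the abstract (estimate the new class distribution from unlabeled data, then adjust the original classifier) can be illustrated with a minimal sketch. The abstract does not specify the paper's exact estimators, so this sketch assumes two commonly used techniques: the "adjusted count" prevalence estimate and a prior-ratio rescaling of posterior probabilities. The function names `adjusted_count` and `reweight_posterior`, and the assumption that the true and false positive rates were measured on held-out labeled data from the original distribution, are illustrative only.

```python
def adjusted_count(predictions, tpr, fpr):
    """Estimate positive-class prevalence in unlabeled data.

    predictions: hard labels (1 = positive, 0 = negative) produced by the
                 original classifier on the new, unlabeled examples.
    tpr, fpr:    true/false positive rates of that classifier, assumed to
                 have been measured on labeled data from the old distribution.
    """
    observed = sum(predictions) / len(predictions)
    if tpr <= fpr:
        # The correction is undefined for a classifier no better than chance;
        # fall back to the raw fraction of positive predictions.
        return observed
    estimate = (observed - fpr) / (tpr - fpr)
    # Clip to a valid proportion.
    return min(1.0, max(0.0, estimate))


def reweight_posterior(p_pos, train_prior, new_prior):
    """Rescale a positive-class probability for a changed class prior.

    Assumes only the class priors changed, not the class-conditional
    feature distributions.
    """
    pos = p_pos * new_prior / train_prior
    neg = (1.0 - p_pos) * (1.0 - new_prior) / (1.0 - train_prior)
    return pos / (pos + neg)


if __name__ == "__main__":
    # Hypothetical numbers: 40% of new examples are predicted positive by a
    # classifier with tpr = 0.8 and fpr = 0.2 on the original distribution.
    preds = [1] * 40 + [0] * 60
    new_prior = adjusted_count(preds, tpr=0.8, fpr=0.2)
    print("estimated positive prevalence:", round(new_prior, 3))
    print("adjusted score:", round(reweight_posterior(0.5, 0.5, new_prior), 3))
```

In this sketch the estimated prevalence feeds directly into the posterior rescaling, so a classifier trained under one class distribution can shift its scores toward the estimated new one without any newly labeled examples.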



Author: Jack Chongjie Xue and Gary M. Weiss
Title: Quantification and Semi-supervised Classification Methods for Handling Changes in Class Distribution
Journal: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2009 Proceedings)
DOI: 10.1145/1557019.1557117
Year: 2009