2013 QueryingDiscriminativeandRepres

From GM-RKB
Jump to navigation Jump to search

Subject Headings:

Notes

Cited By

Quotes

Author Keywords

Abstract

Empirical risk minimization (ERM) provides a useful guideline for many machine learning and data mining algorithms. Under the ERM principle, one minimizes an upper bound of the true risk, which is approximated by the summation of empirical risk and the complexity of the candidate classifier class. To guarantee a satisfactory learning performance, ERM requires that the training data are i.i.d. sampled from the unknown source distribution. However, this may not be the case in active learning, where one selects the most informative samples to label and these data may not follow the source distribution. In this paper, we generalize the empirical risk minimization principle to the active learning setting. We derive a novel form of upper bound for the true risk in the active learning setting; by minimizing this upper bound we develop a practical batch mode active learning method. The proposed formulation involves a non-convex integer programming optimization problem. We solve it efficiently by an alternating optimization method. Our method is shown to query the most informative samples while preserving the source distribution as much as possible, thus identifying the most uncertain and representative queries. Experiments on benchmark data sets and real-world applications demonstrate the superior performance of our proposed method in comparison with the state-of-the-art methods.

References

;

 AuthorvolumeDate ValuetitletypejournaltitleUrldoinoteyear
2013 QueryingDiscriminativeandRepresJieping Ye
Zheng Wang
Querying Discriminative and Representative Samples for Batch Mode Active Learning10.1145/2487575.24876432013