2007 SemiSupervisedLearning


Subject headings: Semi-Supervised Learning Algorithm, Self-Training Algorithm, Semi-Supervised Generative Model Algorithm, Semi-Supervised S3VM Algorithm, Semi-Supervised Graph-based Algorithm, Semi-Supervised Multiview Algorithm.





Why can we learn from unlabeled data for supervised learning tasks? Do unlabeled data always help? What are the popular semi-supervised learning methods, and how do they work? How do they relate to each other? What are the research trends? In this tutorial we address these questions. We will examine state-of-the-art methods, including generative models, multiview learning (e.g., co-training), graph-based learning (e.g., manifold regularization), transductive SVMs and so on. We also offer some advice for practitioners. Finally we discuss the connection between semi-supervised machine learning and natural learning. The emphasis of the tutorial is on the intuition behind each method, and the assumptions they need.


Semi-supervised learning uses both labeled and unlabeled data to build better learners than either kind of data would yield alone.


  • input instance x, label y
  • learner f : X -> Y
  • labeled data (X_l, Y_l) = {(x_{1:l}, y_{1:l})}
  • unlabeled data X_u = {x_{l+1:n}}, available during training
  • usually l << n
  • test data X_test = {x_{n+1:}}, not available during training

Semi-supervised vs. transductive learning

  • Semi-supervised learning is ultimately applied to the test data (inductive).
  • Transductive learning is only concerned with the unlabeled data.
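The distinction can be illustrated by what each kind of learner returns. The sketch below is an assumption-laden toy, not a method from the tutorial: it uses 1-D points and nearest-neighbor labeling, and the names `fit_inductive` and `transduce` are hypothetical. The inductive learner returns a function f that can be applied to unseen test points; the transductive one returns labels only for the given unlabeled pool.

```python
def fit_inductive(labeled):
    """Return a function f usable on any x, including unseen test points."""
    def f(x):
        # 1-nearest-neighbor over the labeled pairs (illustrative choice)
        return min(labeled, key=lambda pair: abs(pair[0] - x))[1]
    return f


def transduce(labeled, unlabeled):
    """Label only the given unlabeled points; no reusable f is produced."""
    known = dict(labeled)      # x -> y for every point labeled so far
    pending = list(unlabeled)
    while pending:
        # Pick the pending point closest to any already-labeled point
        # and copy that neighbor's label.
        x, nearest = min(
            ((u, k) for u in pending for k in known),
            key=lambda pair: abs(pair[0] - pair[1]),
        )
        known[x] = known[nearest]
        pending.remove(x)
    return {x: known[x] for x in unlabeled}


labeled = [(0.0, "a"), (10.0, "b")]
f = fit_inductive(labeled)
print(f(4.0))                                     # a new test point
print(transduce(labeled, [1.0, 2.0, 6.0, 9.0]))   # labels for X_u only
```

The output of `transduce` is a fixed mapping over X_u, whereas `f` remains available for any later X_test.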

Self-training algorithm

  • Assumption: one’s own high-confidence predictions are correct.
  • Self-training algorithm:
    • 1 Train f from (X_l, Y_l)
    • 2 Predict on x ∈ X_u
    • 3 Add (x, f(x)) to labeled data
    • 4 Repeat
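The four steps above can be sketched in plain Python. This is a minimal illustration, not the tutorial's own code. The base learner (a 1-D nearest-centroid classifier), the confidence score (the gap between the distances to the two nearest centroids), and the choice to add only the `per_round` most confident predictions in each iteration are all assumptions made for the example. The names `train`, `predict`, and `self_train` are hypothetical.

```python
from statistics import mean


def train(labeled):
    """Fit a nearest-centroid classifier: one mean per class."""
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_class.items()}


def predict(centroids, x):
    """Return (label, confidence) for x.

    Confidence is the gap between the distances to the two nearest
    centroids (an assumed, illustrative score).
    """
    dists = sorted((abs(x - c), y) for y, c in centroids.items())
    label = dists[0][1]
    confidence = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return label, confidence


def self_train(labeled, unlabeled, per_round=1):
    """Self-training wrapper around train/predict."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    while unlabeled:                                          # 4. repeat
        model = train(labeled)                                # 1. train f
        scored = [(predict(model, x), x) for x in unlabeled]  # 2. predict on X_u
        scored.sort(key=lambda item: item[0][1], reverse=True)
        for (label, _), x in scored[:per_round]:              # 3. add (x, f(x))
            labeled.append((x, label))
            unlabeled.remove(x)
    return train(labeled)


model = self_train(
    [(0.0, "a"), (10.0, "b")],
    [1.0, 2.0, 3.0, 7.0, 8.0, 9.0],
    per_round=2,
)
print(model)  # {'a': 1.5, 'b': 8.5}
```

Because `self_train` only calls `train` and `predict`, any base learner that exposes a confidence score could be substituted, which is what makes self-training a wrapper method.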

Variations in self-training

Advantages of self-training

  • The simplest semi-supervised learning method.
  • A wrapper method: it applies to existing (complex) classifiers.
  • Often used in real tasks like natural language processing.

Disadvantages of self-training

  • Early mistakes could reinforce themselves.
    • Heuristic mitigations exist, e.g. “un-label” an instance if its confidence falls below a threshold.
  • Little can be said in general about convergence.
    • But there are special cases when self-training is equivalent to the Expectation-Maximization (EM) algorithm.
    • There are also special cases (e.g., linear functions) when the closed-form solution is known.
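The “un-label” heuristic mentioned in this list might look like the sketch below. It is one possible reading, under stated assumptions: after each retraining, every self-labeled pair is re-scored, and pairs whose confidence has dropped below a threshold go back to the unlabeled pool. The function name `unlabel_low_confidence`, the margin-based `confidence` function, and the centroid values are all hypothetical.

```python
def unlabel_low_confidence(pseudo_labeled, confidence, threshold):
    """Split self-labeled pairs into (kept pairs, points returned to X_u)."""
    kept, returned = [], []
    for x, y in pseudo_labeled:
        if confidence(x, y) >= threshold:
            kept.append((x, y))
        else:
            returned.append(x)   # drop the label; x becomes unlabeled again
    return kept, returned


# Illustrative confidence: how much closer x is to its own class centroid
# than to the nearest other centroid (assumed 1-D centroids).
centroids = {"a": 1.0, "b": 9.0}


def margin(x, y):
    own = abs(x - centroids[y])
    other = min(abs(x - c) for label, c in centroids.items() if label != y)
    return other - own


pseudo = [(2.0, "a"), (5.5, "a"), (8.0, "b")]
kept, returned = unlabel_low_confidence(pseudo, margin, threshold=2.0)
print(kept)      # [(2.0, 'a'), (8.0, 'b')]
print(returned)  # [5.5]
```

Here the point 5.5 was self-labeled "a" earlier but now sits closer to the "b" centroid, so its label is withdrawn rather than allowed to reinforce a possible early mistake.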





  • Author: Xiaojin Zhu
  • Title: Semi-Supervised Learning
  • URL: http://pages.cs.wisc.edu/~jerryzhu/pub/sslicml07.pdf
  • Year: 2007