2006 SemiSupervisedLearning

From GM-RKB
Jump to navigation Jump to search

Subject Headings: Semi-Supervised Learning Task, Semi-Supervised Learning Algorithm.

Notes

Quotes

Book Overview

In the field of machine learning, semi-supervised learning (SSL) occupies the middle ground, between supervised learning (in which all training examples are labeled) and unsupervised learning (in which no label data are given). Interest in SSL has increased in recent years, particularly because of application domains in which unlabeled data are plentiful, such as images, text, and bioinformatics. This first comprehensive overview of SSL presents state-of-the-art algorithms, a taxonomy of the field, selected applications, benchmark experiments, and perspectives on ongoing and future research.

Semi-Supervised Learning first presents the key assumptions and ideas underlying the field: smoothness, cluster or low-density separation, manifold structure, and transduction. The core of the book is the presentation of SSL methods, organized according to algorithmic strategies. After an examination of generative models, the book describes algorithms that implement the low-density separation assumption, graph-based methods, and algorithms that perform two-step learning. The book then discusses SSL applications and offers guidelines for SSL practitioners by analyzing the results of extensive benchmark experiments. Finally, the book looks at interesting directions for SSL research. The book closes with a discussion of the relationship between semi-supervised learning and transduction.

1. Introduction to Semi-Supervised Learning

1.1	Supervised, Unsupervised, and Semi-Supervised Learning	1
1.2	When Can Semi-Supervised Learning Work?	4
1.3	Classes of Algorithms and Organization of This Book	8

I. Generative Models 13

2. A Taxonomy for Semi-Supervised Learning Methods

Matthias W. Seeger	15
2.1	The Semi-Supervised Learning Problem	15
2.2	Paradigms for Semi-Supervised Learning	17
2.3	Examples	22
2.4	Conclusions	31

3. Semi-Supervised Text Classification Using EM

N. C. Nigam, Andrew McCallum and Tom Mitchell	33
3.1	Introduction	33
3.2	A Generative Model for Text	35
3.3	Experminental Results with Basic EM	41
3.4	Using a More Expressive Generative Model	43
3.5	Overcoming the Challenges of Local Maxima	49
3.6	Conclusions and Summary	54

4. Risks of Semi-Supervised Learning

Fabio Cozman and Ira Cohen	57
4.1	Do Unlabled Data Improve or Degrade Classification Performance?	57
4.2	Understanding Unlabeled Data: Asymptotic Bias	59
4.3	The Asymptotic Analysis of Generative Smei-Supervised Learning	63
4.4	The Value of Labeled and Unlabeled Data	67
4.5	Finite Sample Effects	69
4.6	Model Search and Robustness	70
4.7	Conclusion	71

5. Probabilistic Semi-Supervised Cluster with Constraints

Sugato Basu, Mikhail Bilenko, Arindam Banerjee and Raymond Mooney	73
5.1	Introduction	74
5.2	HMRF Model for Semi-Supervised Clustering	75
5.3	HMRF-KMeans Algorithm	81
5.4	Active Learning for Constraint Acquistion	93
5.5	Experimental Results	96
5.6	Related Work	100
5.7	Conclusions	101

II. Low-Density Separation 103

6. Transductive Support Vector Machines

Thorsten Joachims	105
6.1	Introduction	105
6.2	Transductive Support Vector Machines	108
6.3	Why Use Margin on the Test Set?	111
6.4	Experiments and Applications of the TSVMs	112
6.5	Solving the TSVM Optimization Problem	114
6.6	Connection to Related Approaches	116
6.7	Summary and Conclusions	116

7. Semi-Supervised Learning Using Semi-Definite Programming

Tijl De Bie and Nello Cristianini	119
7.1	Relaxing SVM transduction	119
7.2	An Approximation for Speedup	126
7.3	General Semi-Supervised Learning Settings	128
7.4	Empirical Results	129
7.5	Summary and Outlook	133

Appendix:

The Extended Schur Complement Lemma 134

8.	Gaussian Processes and the Null-Category Noise Model
Neil D. Lawrence and Michael I. Jordan	137
8.1	Introduction	137
8.2	The Noise Model	141
8.3	Process Model and the Effect of the Null-Category	143
8.4	Posterior Inference and Prediction	145
8.5	Results	147
8.6	Discussion	149

9. Entropy Regularization

Yves Grandvalet and Yoshua Bengio	151
9.1	Introduction	151
9.2	Derivation of the Criterion	152
9.3	Optimization Algorithms	155
9.4	Related Methods	158
9.5	Experiments	160
9.6	Conclusion	166

Appendix

Proof of Theorem 9.1	166

10. Data-Dependent Regularization

Adrian Corduneanu and Tommi S. Jaakkola	169
10.1	Introduction	169
10.2	Information Regularization on Metric Spaces	174
10.3	Information Regularization and Relational Data	182
10.4	Discussion	189

III. Graph-based Models 191

11. Label Propogation and Quadratic Criterion

Yoshua Bengio, Olivier Delalleau and Nicolas Le Roux	193
11.1	Introduction	193
11.2	Label Propogation on a Similarity Graph	194
11.3	Quadratic Cost Criterion	198
11.4	From Transduction to Induction	205
11.5	Incorporating Class Prior Knowledge	205
11.6	Curse of Dimensionality for Semi-Supervised Learning	206
11.7	Discussion	215

12. The Geometric Basis of Semi-Supervised Learning

Vikas Sindhwani, Misha Belkin and Partha Niyogi	217
12.1	Introduction	217
12.2	Incorporating Geometry in Regularization	220
12.3	Algorithms	224
12.4	Data-Dependent Kernels for Semi-Supervised Learning	229
12.5	Linear Methods for Large-Scale Semi-Supervised Learning	231
12.6	Connections to Other Algorithms and Related Work	232
12.7	Future Directions	234

13. Discrete Regularization

Dengyong Zhou and Bernhard Schölkopf	237
13.1	Introduction	237
13.2	Discrete Analysis	239
13.3	Discrete Regularization	245
13.4	Conclusion	249

14. Semi-Supervised Learning with Conditional Harmonic Mixing

Christopher J.  C. Burges and John C. Platt	251
14.1	Introduction	251
14.2	Conditional Harmonic Mixing	255
14.3	Learning in CHM Models	256
14.4	Incorporating Prior Knowledge	261
14.5	Learning the Conditionals	261
14.6	Model Averaging	262
14.7	Experiments	263
14.8	Conclusions	273

IV. Change of Representation 275

15. Graph Kernels by Spectral Transforms

Xiaojin Zhu, Jaz Kandola, John Lafferty and Zoubin Ghahramani	277
15.1	The Graph Laplacian	278
15.2	Kernels by Spectral Transforms	280
15.3	Kernel Alignment	281
15.4	Optimizing Alignment Using QCQP for Semi-Supervised Learning	282
15.5	Semi-Supervised Kernels with Order Restraints	283
15.6	Experimental Results	285
15.7	Conclusion	289

16. Spectral Methods for Dimensionality Reduction

Lawrence K. Saul, Kilian Weinberger, Fei Sha and Jihun Ham	293
16.1	Introduction	293
16.2	Linear Methods	295
16.3	Graph-based Methods	297
16.4	Kernel Methods	303
16.5	Discussion	306

17. Modifying Distances

Alon Orlitsky and Sajama	309
17.1	Introduction	309
17.2	Estimating DBD Metrics	312
17.3	Computing DBD Metrics	321
17.4	Semi-Supervised Learning Using Density-based Metrics	327
17.5	Conclusions and Future Work	329

V. Semi-Supervised Learning in Practice 331

18. Large-Scale Algorithms

Olivier Delalleau, Yoshua Bengio and Nicolas Le Roux	333
18.1	Introduction	333
18.2	Cost Approximations	334
18.3	Subset Selection	337
18.4	Discussion	340

19. Semi-Supervised Protein Classification Using Cluster Kernels

Jason Weston, Christina Leslie, Eugene Ie and William Stafford Noble	343
19.1	Introduction	343
19.2	Representation and Kernels for Protein Sequences	345
19.3	Semi-Supervised Kernels for Protein Sequences	348
19.4	Experiments	352
19.5	Discussion	358

20. Prediction of Protein Function from Networks

Hyunjung Shin and Koji Tsuda	361
20.1	Introduction	361
20.2	Graph-based Semi-Supervised Learning	364
20.3	Combining Multiple Graphs	366
20.4	Experiments on Function Prediction of Proteins	369
20.5	Conclusion and Outlook	374
21.	Analysis of Benchmarks	377
21.1	The Benchmark	377
21.2	Application of SSL Methods	383
21.3	Results and Discussion	390

VI. Perspectives 395

22. An Augmented PAC Model for Semi-Supervised Learning

Maria-Florina Balcan and Avrim Blum	397
22.1	Introduction	398
22.2	A Formal Framework	400
22.3	Sample Complexity Results	403
22.4	Algorithmic Results	412
22.5	Related Models and Discussion	416

23. Metric-based Approaches for Semi-Supervised Regression and Classification

Dale Schuurmans, Finnegan Southey, Dana Wilkinson and Yuhong Guo	421
23.1	Introduction	421
23.2	Metric Structure of Supervised Learning	423
23.3	Model Selection	426
23.4	Regularization	436
23.5	Classification	445
23.6	Conclusion	449

24. Transductive Inference and Semi-Supervised Learning

Vladimir Vapnik	453
24.1	Problem Settings	453
24.2	Problem of Generalization in Inductive and Transductive Inference	455
24.3	Structure of the VC Bounds and Transductive Inference	457
24.4	The Symmetrization Lemma and Transductive Inference	458
24.5	Bounds for Transductive Inference	459
24.6	The Structural Risk Minimization Principle for Induction and Transduction	460
24.7	Combinatorics in Transductive Inference	462
24.8	Measures of Size of Equivalence Classes	463
24.9	Algorithms for Inductive and Transductive SVMs	465
24.10	Semi-Supervised Learning	470
24.11	Conclusion:

Transductive Inference and the New Problems of Inference 470

24.12	Beyond Transduction: Selective Inference	471
25.	A Discussion of Semi-Supervised Learning and Transduction,


 AuthorvolumeDate ValuetitletypejournaltitleUrldoinoteyear
2006 SemiSupervisedLearningBernhard Schölkopf
Olivier Chapelle
Alexander Zien
Semi-Supervised Learning2006