# Metric-based Model Selection Algorithm

A Metric-based Model Selection Algorithm is a Semi-Supervised Learning Algorithm that uses unlabeled data to detect Inconsistent Hypotheses: it imposes a metric structure on hypotheses by measuring the discrepancy between their predictions across the distribution of unlabeled data.

**See:** Distance-based Learning.
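
The definition above can be illustrated with a minimal sketch. It assumes polynomial regression hypotheses and a root-mean-square discrepancy, neither of which is fixed by the definition, and all function names (`rms_distance`, `training_error`, `is_triangle_consistent`) are hypothetical. The consistency check compares the distance between two hypotheses on the unlabeled set with the sum of their training errors on the labeled set, in the spirit of the triangle-inequality idea behind metric-based methods; it is not a reproduction of any published implementation.

```python
import numpy as np


def rms_distance(f, g, x):
    """Root-mean-square gap between the predictions of f and g on inputs x."""
    return float(np.sqrt(np.mean((f(x) - g(x)) ** 2)))


def training_error(f, x, y):
    """Root-mean-square error of f on the labeled pairs (x, y)."""
    return float(np.sqrt(np.mean((f(x) - y) ** 2)))


def is_triangle_consistent(f, g, x_l, y_l, x_u):
    """True if the f-g distance on U does not exceed the sum of their errors on L."""
    gap_on_u = rms_distance(f, g, x_u)
    error_sum = training_error(f, x_l, y_l) + training_error(g, x_l, y_l)
    return gap_on_u <= error_sum


rng = np.random.default_rng(0)

# Small labeled set L (assumed toy data: a noisy sine curve).
x_l = np.linspace(-1.0, 1.0, 6)
y_l = np.sin(3.0 * x_l) + rng.normal(0.0, 0.1, size=x_l.shape)

# Much larger unlabeled set U drawn from the same input range.
x_u = rng.uniform(-1.0, 1.0, size=2000)

# Two candidate hypotheses of different complexity, both fit on L only.
simple = np.poly1d(np.polyfit(x_l, y_l, 1))
flexible = np.poly1d(np.polyfit(x_l, y_l, 5))  # interpolates the 6 labeled points

print("distance on U:", rms_distance(simple, flexible, x_u))
print("training error (simple):", training_error(simple, x_l, y_l))
print("training error (flexible):", training_error(flexible, x_l, y_l))

if not is_triangle_consistent(simple, flexible, x_l, y_l, x_u):
    # Under Occam's razor one would reject the more complex hypothesis.
    print("hypotheses disagree on U more than their training errors allow")
```

When both hypotheses have zero training error, the right-hand side of the check is zero, so any positive distance on the unlabeled set marks the pair as inconsistent.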

## References

### 2008

- (Zhu, 2008) ⇒ Xiaojin Zhu. (2008). “Semi-Supervised Learning Literature Survey (revised edition).” Technical Report 1530, Department of Computer Sciences, University of Wisconsin, Madison.
- Metric-based model selection (Schuurmans & Southey, 2001) is a method to detect hypotheses inconsistency with unlabeled data. We may have two hypotheses which are consistent on *L*, for example they all have zero training set error. However they may be inconsistent on the much larger *U*. If so we should reject at least one of them, e.g. the more complex one if we employ Occam’s razor.

### 2004

- (Bilenko et al., 2004) ⇒ Mikhail Bilenko, Sugato Basu, and Raymond Mooney. (2004). “Integrating Constraints and Metric Learning in Semi-Supervised Clustering.” In: Proceedings of the Twenty-First International Conference on Machine Learning. doi:10.1145/1015330.1015360

### 2001

- (Schuurmans & Southey, 2001) ⇒ Dale Schuurmans, and Finnegan Southey. (2001). “Metric-based methods for adaptive model selection and regularization.” In: Machine Learning, Special Issue on New Methods for Model Selection and Model Combination, 48. doi:10.1023/A:1013947519741
- We present a general approach to model selection and regularization that exploits unlabeled data to adaptively control hypothesis complexity in supervised learning tasks. The idea is to impose a metric structure on hypotheses by determining the discrepancy between their predictions across the distribution of unlabeled data.