1996 BiasPlusVarianceDecompForZeroOneLossF


Subject Headings: Real-Valued Random Variable, Bias-Variance Trade-off.


Cited By



We present a bias-variance decomposition of expected misclassification rate, the most commonly used loss function in supervised classification learning. The bias-variance decomposition for quadratic loss functions is well known and serves as an important tool for analyzing learning algorithms, yet no decomposition was offered for the more commonly used zero-one (misclassification) loss functions until the recent work of Kong & Dietterich (1995) and Breiman (1996). Their decomposition suffers from some major shortcomings though (e.g., potentially negative variance), which our decomposition avoids. We show that, in practice, the naive frequency-based estimation of the decomposition terms is by itself biased and show how to correct for this bias. We illustrate the decomposition on various algorithms and datasets from the UCI repository.

The cost, [math]C[/math], is a real-valued random variable defined as the loss over the random variables [math]Y_F[/math] and [math]Y_H[/math]. So the expected cost is: E(C) = ...

For zero-one loss, the cost is usually referred to as misclassification rate and is derived as follows: ...
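As a rough illustration of the idea that the expected zero-one cost is a misclassification rate, the sketch below averages a loss over a joint distribution of two label-valued random variables. It is not the paper's derivation, which is elided above. The reading of the two variables as the target's output and the hypothesis's output is an assumption, and the function names and the joint distribution are hypothetical values chosen only for the example.

```python
import math

def expected_cost(joint, loss):
    """Average a loss over a joint distribution.

    `joint` maps (y_f, y_h) pairs to probabilities (hypothetical layout,
    not taken from the paper).
    """
    return sum(p * loss(y_f, y_h) for (y_f, y_h), p in joint.items())

def zero_one_loss(y_f, y_h):
    """Return 0 when the two labels agree and 1 when they differ."""
    return 0.0 if y_f == y_h else 1.0

# Made-up joint distribution over (target label, predicted label).
joint = {
    ("a", "a"): 0.5,
    ("a", "b"): 0.1,
    ("b", "a"): 0.2,
    ("b", "b"): 0.2,
}

# Expected zero-one cost: total probability mass where the labels differ.
rate = expected_cost(joint, zero_one_loss)

# Equivalent view: one minus the probability that the labels agree.
agreement = sum(p for (y_f, y_h), p in joint.items() if y_f == y_h)

assert math.isclose(rate, 1.0 - agreement)
```

Under zero-one loss, every disagreeing pair contributes its full probability and every agreeing pair contributes nothing, so the expected cost equals the probability of a misclassification.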



Author: Ron Kohavi, David H. Wolpert
title: Bias Plus Variance Decomposition for Zero-One Loss Functions
journal: Proceedings of the 13th International Conference on Machine Learning
titleUrl: http://robotics.stanford.edu/~ronnyk/biasVar.pdf
year: 1996