1996 BaggingPredictors

From GM-RKB

Subject Headings: Bagging, Bagging Predictors Algorithm.

Notes

Cited By

Quotes

Author Keywords

Abstract

Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy.
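The procedure in the abstract — bootstrap replicates of the learning set, one predictor per replicate, averaging for numerical outcomes and plurality voting for classes — can be sketched in a few lines. This is an illustrative sketch, not code from the paper; the toy 1-D learning set and the 1-nearest-neighbour base learner are assumed here purely for demonstration.

```python
import random
from collections import Counter

def bootstrap_replicate(learning_set, rng):
    """Draw N observations with replacement (a bootstrap replicate of L)."""
    n = len(learning_set)
    return [learning_set[rng.randrange(n)] for _ in range(n)]

def bag(learning_set, fit, n_replicates=25, seed=0):
    """Fit one predictor per bootstrap replicate and return the ensemble."""
    rng = random.Random(seed)
    return [fit(bootstrap_replicate(learning_set, rng))
            for _ in range(n_replicates)]

def predict_class(ensemble, x):
    """Plurality vote over the ensemble's class predictions."""
    return Counter(phi(x) for phi in ensemble).most_common(1)[0][0]

def predict_numeric(ensemble, x):
    """Average of the ensemble's numerical predictions."""
    return sum(phi(x) for phi in ensemble) / len(ensemble)

# Toy learning set of (y, x) pairs and a 1-nearest-neighbour base
# learner -- both illustrative choices, not from the paper.
L = [(0, 1.0), (0, 1.2), (1, 3.0), (1, 3.3)]

def fit_1nn(replicate):
    return lambda x: min(replicate, key=lambda p: abs(p[1] - x))[0]

ensemble = bag(L, fit_1nn)
print(predict_class(ensemble, 3.1))
```

Note that the base learner here is stable, so the gain is small; the abstract's point is that bagging pays off most when the base procedure is unstable, i.e. when perturbing the learning set changes the fitted predictor substantially.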

1. Introduction

A learning set [math]\displaystyle{ L }[/math] consists of data [math]\displaystyle{ \{(y_n, \mathbf{x}_n),\ n = 1, \ldots, N\} }[/math] where the [math]\displaystyle{ y }[/math]'s are either class labels or a numerical response. We have a procedure for using this learning set to form a predictor [math]\displaystyle{ \varphi (\mathbf{x}, L) }[/math] — if the input is [math]\displaystyle{ \mathbf{x} }[/math] we predict [math]\displaystyle{ y }[/math] by [math]\displaystyle{ \varphi (\mathbf{x}, L) }[/math]. Now, suppose we are given a sequence of learning sets [math]\displaystyle{ \{L_k\} }[/math] each consisting of [math]\displaystyle{ N }[/math] independent observations from the same underlying distribution as [math]\displaystyle{ L }[/math]. Our mission is to use the [math]\displaystyle{ \{L_k\} }[/math] to get a better predictor than the single learning set predictor [math]\displaystyle{ \varphi (\mathbf{x}, L) }[/math]. The restriction is that all we are allowed to work with is the sequence of predictors [math]\displaystyle{ \{\varphi (\mathbf{x}, L_k)\} }[/math].

If [math]\displaystyle{ y }[/math] is numerical, an obvious procedure is to replace [math]\displaystyle{ \varphi (\mathbf{x}, L) }[/math] by the average of [math]\displaystyle{ \varphi (\mathbf{x}, L_k) }[/math] over [math]\displaystyle{ k }[/math], i.e., by [math]\displaystyle{ \varphi_A(\mathbf{x}) = E_L \varphi (\mathbf{x}, L) }[/math] where [math]\displaystyle{ E_L }[/math] denotes the expectation over [math]\displaystyle{ L }[/math], and the subscript [math]\displaystyle{ A }[/math] in [math]\displaystyle{ \varphi_A }[/math] denotes aggregation. (…)
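The averaging step described above — approximating [math]\displaystyle{ \varphi_A(\mathbf{x}) = E_L \varphi (\mathbf{x}, L) }[/math] by averaging over independent learning sets — can be simulated directly. This is a toy sketch under assumptions of my own (observations drawn as Normal(5, 1), the sample mean as the base predictor); it shows that aggregation leaves the prediction centered in the same place while shrinking its variance.

```python
import random
import statistics

def draw_learning_set(rng, n=10):
    """N independent observations; y ~ Normal(5, 1) is an assumed toy model."""
    return [rng.gauss(5.0, 1.0) for _ in range(n)]

def phi(learning_set):
    """A deliberately simple predictor: the sample mean of the y's."""
    return sum(learning_set) / len(learning_set)

def phi_A(rng, k=50):
    """Average phi over k independent learning sets (approximates E_L phi)."""
    return sum(phi(draw_learning_set(rng)) for _ in range(k)) / k

rng = random.Random(1)
single = [phi(draw_learning_set(rng)) for _ in range(200)]
aggregated = [phi_A(rng) for _ in range(200)]
print(statistics.variance(single), statistics.variance(aggregated))
```

In practice the independent learning sets [math]\displaystyle{ \{L_k\} }[/math] are not available, which is where the bootstrap replicates of the single learning set come in.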

References



Leo Breiman (1996). "Bagging Predictors". In: Machine Learning (ML). doi:10.1023/A:1018054314350. http://www.public.asu.edu/~jye02/CLASSES/Fall-2005/PAPERS/breiman96bagging.pdf