1996 BaggingPredictors

From GM-RKB

Subject Headings: Bagging, Bagging Predictors Algorithm.

Notes

Cited By

Quotes

Author Keywords

Abstract

Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy.
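The procedure in the abstract — bootstrap replicates of the learning set, one predictor per replicate, averaging for numerical outcomes and plurality voting for classes — can be sketched in a few lines. This is an illustrative sketch, not code from the paper; the toy 1-D learning set and the 1-nearest-neighbour base learner are assumed here purely for demonstration.

```python
import random
from collections import Counter

def bootstrap_replicate(learning_set, rng):
    """Draw N observations with replacement (a bootstrap replicate of L)."""
    n = len(learning_set)
    return [learning_set[rng.randrange(n)] for _ in range(n)]

def bag(learning_set, fit, n_replicates=25, seed=0):
    """Fit one predictor per bootstrap replicate and return the ensemble."""
    rng = random.Random(seed)
    return [fit(bootstrap_replicate(learning_set, rng))
            for _ in range(n_replicates)]

def predict_class(ensemble, x):
    """Plurality vote over the ensemble's class predictions."""
    return Counter(phi(x) for phi in ensemble).most_common(1)[0][0]

def predict_numeric(ensemble, x):
    """Average of the ensemble's numerical predictions."""
    return sum(phi(x) for phi in ensemble) / len(ensemble)

# Toy learning set of (y, x) pairs and a 1-nearest-neighbour base
# learner -- both illustrative choices, not from the paper.
L = [(0, 1.0), (0, 1.2), (1, 3.0), (1, 3.3)]

def fit_1nn(replicate):
    return lambda x: min(replicate, key=lambda p: abs(p[1] - x))[0]

ensemble = bag(L, fit_1nn)
print(predict_class(ensemble, 3.1))
```

Note that the base learner here is stable, so the gain is small; the abstract's point is that bagging pays off most when the base procedure is unstable, i.e. when perturbing the learning set changes the fitted predictor substantially.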

1. Introduction

A learning set [math]\displaystyle{ L }[/math] consists of data [math]\displaystyle{ \{(y_n, \mathbf{x}_n),\ n = 1, \ldots, N\} }[/math] where the [math]\displaystyle{ y }[/math]'s are either class labels or a numerical response. We have a procedure for using this learning set to form a predictor [math]\displaystyle{ \varphi (\mathbf{x}, L) }[/math] — if the input is [math]\displaystyle{ \mathbf{x} }[/math] we predict [math]\displaystyle{ y }[/math] by [math]\displaystyle{ \varphi (\mathbf{x}, L) }[/math]. Now, suppose we are given a sequence of learning sets [math]\displaystyle{ \{L_k\} }[/math] each consisting of [math]\displaystyle{ N }[/math] independent observations from the same underlying distribution as [math]\displaystyle{ L }[/math]. Our mission is to use the [math]\displaystyle{ \{L_k\} }[/math] to get a better predictor than the single learning set predictor [math]\displaystyle{ \varphi (\mathbf{x}, L) }[/math]. The restriction is that all we are allowed to work with is the sequence of predictors [math]\displaystyle{ \{\varphi (\mathbf{x}, L_k)\} }[/math].

If [math]\displaystyle{ y }[/math] is numerical, an obvious procedure is to replace [math]\displaystyle{ \varphi (\mathbf{x}, L) }[/math] by the average of [math]\displaystyle{ \varphi (\mathbf{x}, L_k) }[/math] over [math]\displaystyle{ k }[/math], i.e., by [math]\displaystyle{ \varphi_A(\mathbf{x}) = E_L \varphi (\mathbf{x}, L) }[/math] where [math]\displaystyle{ E_L }[/math] denotes the expectation over [math]\displaystyle{ L }[/math], and the subscript [math]\displaystyle{ A }[/math] in [math]\displaystyle{ \varphi_A }[/math] denotes aggregation. (…)
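The averaging step described above — approximating [math]\displaystyle{ \varphi_A(\mathbf{x}) = E_L \varphi (\mathbf{x}, L) }[/math] by averaging over independent learning sets — can be simulated directly. This is a toy sketch under assumptions of my own (observations drawn as Normal(5, 1), the sample mean as the base predictor); it shows that aggregation leaves the prediction centered in the same place while shrinking its variance.

```python
import random
import statistics

def draw_learning_set(rng, n=10):
    """N independent observations; y ~ Normal(5, 1) is an assumed toy model."""
    return [rng.gauss(5.0, 1.0) for _ in range(n)]

def phi(learning_set):
    """A deliberately simple predictor: the sample mean of the y's."""
    return sum(learning_set) / len(learning_set)

def phi_A(rng, k=50):
    """Average phi over k independent learning sets (approximates E_L phi)."""
    return sum(phi(draw_learning_set(rng)) for _ in range(k)) / k

rng = random.Random(1)
single = [phi(draw_learning_set(rng)) for _ in range(200)]
aggregated = [phi_A(rng) for _ in range(200)]
print(statistics.variance(single), statistics.variance(aggregated))
```

In practice the independent learning sets [math]\displaystyle{ \{L_k\} }[/math] are not available, which is where the bootstrap replicates of the single learning set come in.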

References



Leo Breiman (1996). "Bagging Predictors". In: Machine Learning (ML). doi:10.1023/A:1018054314350. http://www.public.asu.edu/~jye02/CLASSES/Fall-2005/PAPERS/breiman96bagging.pdf