# Bagged Trees Algorithm

A Bagged Trees Algorithm is a bagging algorithm that uses a decision tree learning algorithm.

## References

### 2015

- http://en.wikipedia.org/wiki/Random_forest#Tree_bagging
- QUOTE: The training algorithm for random forests applies the general technique of bootstrap aggregating, or bagging, to tree learners. Given a training set [math]X = x_1, \ldots, x_n[/math] with responses [math]Y = y_1, \ldots, y_n[/math], bagging repeatedly selects a random sample with replacement of the training set and fits trees to these samples … After training, predictions for unseen samples [math]x'[/math] can be made by averaging the predictions from all the individual regression trees on [math]x'[/math]: [math]\hat{f} = \frac{1}{B} \sum_{b=1}^B \hat{f}_b(x')[/math] or by taking the majority vote in the case of classification trees. This bootstrapping procedure leads to better model performance because it decreases the variance of the model without increasing the bias. This means that while the predictions of a single tree are highly sensitive to noise in its training set, the average of many trees is not, as long as the trees are not correlated. Simply training many trees on a single training set would give strongly correlated trees (or even the same tree many times, if the training algorithm is deterministic); bootstrap sampling is a way of de-correlating the trees by showing them different training sets.
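The quoted procedure can be sketched by hand for the regression case (this is an illustrative assumption of this page, using scikit-learn trees and synthetic data): draw [math]B[/math] bootstrap samples, fit one tree per sample, and average the [math]B[/math] predictions as in the formula above.

```python
# Sketch of bagged regression trees, implementing f_hat(x') = (1/B) * sum_b f_b(x').
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 200
X = rng.uniform(-3, 3, size=(n, 1))
Y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=n)  # noisy sine as a toy target

B = 25
trees = []
for _ in range(B):
    idx = rng.integers(0, n, size=n)  # bootstrap sample: draw n indices with replacement
    trees.append(DecisionTreeRegressor(random_state=0).fit(X[idx], Y[idx]))

# Prediction for an unseen x': average the B individual tree predictions.
x_new = np.array([[1.0]])
f_hat = np.mean([tree.predict(x_new)[0] for tree in trees])
print(f_hat)
```

Each fully grown tree overfits its own bootstrap sample, but the average is much less sensitive to the noise, which is exactly the variance-reduction argument made in the quote.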

### 2006

- (Caruana & Niculescu-Mizil, 2006) ⇒ Rich Caruana, and Alexandru Niculescu-Mizil. (2006). “An Empirical Comparison of Supervised Learning Algorithms.” In: Proceedings of the 23rd International Conference on Machine learning. ISBN:1-59593-383-2 doi:10.1145/1143844.1143865
- QUOTE: A number of supervised learning methods have been introduced in the last decade. Unfortunately, the last comprehensive empirical evaluation of supervised learning was the Statlog Project in the early 90's. We present a large-scale empirical comparison between ten supervised learning methods: SVMs, neural nets, logistic regression, naive bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps. We also examine the effect that calibrating the models via Platt Scaling and Isotonic Regression has on their performance.