sklearn.ensemble.GradientBoostingClassifier

From GM-RKB

A sklearn.ensemble.GradientBoostingClassifier is a Gradient Boosting Classification System within the sklearn.ensemble module.

1) Import the Gradient Tree Boosting Classification System from scikit-learn: from sklearn.ensemble import GradientBoostingClassifier
2) Create design matrix X and response vector Y
3) Create a Gradient Tree Boosting Classifier object: BC = GradientBoostingClassifier(loss='deviance', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, ...)
4) Choose method(s) (a worked sketch follows this list):
  • apply(X), applies trees in the ensemble to X, returning leaf indices.
  • decision_function(X), computes the decision function of X.
  • fit(X, y[, sample_weight, monitor]), fits the gradient boosting model.
  • get_params([deep]), gets parameters for this estimator.
  • predict(X), predicts class for X.
  • predict_log_proba(X), predicts class log-probabilities for X.
  • predict_proba(X), predicts class probabilities for X.
  • score(X, y[, sample_weight]), returns the mean accuracy on the given test data and labels.
  • set_params(**params), sets the parameters of this estimator.
  • staged_decision_function(X), computes decision function of X for each iteration.
  • staged_predict(X), predicts class at each stage for X.
  • staged_predict_proba(X), predicts class probabilities at each stage for X.
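
A minimal end-to-end sketch of the recipe above. The synthetic dataset (via make_classification) and the train/test split are illustrative assumptions, not part of the original recipe:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import GradientBoostingClassifier

    # 2) Design matrix X and response vector Y (synthetic data for illustration)
    X, Y = make_classification(n_samples=1000, n_features=20, random_state=0)
    X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

    # 3) Create the Gradient Tree Boosting Classifier object
    BC = GradientBoostingClassifier(loss='deviance', learning_rate=0.1,
                                    n_estimators=100, max_depth=3)

    # 4) Fit the model, then apply some of the methods listed above
    BC.fit(X_train, Y_train)
    print(BC.predict(X_test[:5]))        # predicted classes
    print(BC.predict_proba(X_test[:5]))  # class probabilities
    print(BC.score(X_test, Y_test))      # mean accuracy on held-out data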


References

2017a

  • (Scikit Learn, 2017) ⇒ http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html Retrieved:2017-10-22.
    • QUOTE: class sklearn.ensemble.GradientBoostingClassifier(loss='deviance', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, min_impurity_decrease=0.0, min_impurity_split=None, init=None, random_state=None, max_features=None, verbose=0, max_leaf_nodes=None, warm_start=False, presort='auto')

       Gradient Boosting for classification.

       GB builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage n_classes_ regression trees are fit on the negative gradient of the binomial or multinomial deviance loss function. Binary classification is a special case where only a single regression tree is induced. Read more in the User Guide.
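The stage-wise construction described above can be observed with the staged_* methods, which replay the additive model one boosting iteration at a time. A short sketch, reusing BC, X_test, and Y_test from the example above (accuracy_score is an assumed choice for scoring each stage):

    from sklearn.metrics import accuracy_score

    # staged_predict yields predictions after 1, 2, ..., n_estimators stages
    staged_acc = [accuracy_score(Y_test, y_pred)
                  for y_pred in BC.staged_predict(X_test)]
    print(len(staged_acc))                # n_estimators entries (100)
    print(staged_acc[0], staged_acc[-1])  # first-stage vs. final-stage accuracy

    # Binary classification: only a single regression tree is induced per stage
    print(BC.estimators_.shape)           # (100, 1)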

2017b

  • (Scikit Learn, 2017) ⇒ http://scikit-learn.org/stable/modules/ensemble.html Retrieved:2017-10-22.
    • QUOTE: The module sklearn.ensemble provides methods for both classification and regression via gradient boosted regression trees. (…) The disadvantages of GBRT are:
      • Scalability, due to the sequential nature of boosting it can hardly be parallelized.
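Because each stage is fit on the negative gradient of the loss given all previous stages, the n_estimators boosting iterations cannot run in parallel. One way to see (and work with) this sequential structure is warm_start=True, which keeps the already-fitted trees and appends further stages instead of retraining from scratch. A hedged sketch, reusing X_train and Y_train from the first example:

    from sklearn.ensemble import GradientBoostingClassifier

    # Fit 50 stages first
    clf = GradientBoostingClassifier(n_estimators=50, warm_start=True,
                                     random_state=0)
    clf.fit(X_train, Y_train)
    print(len(clf.estimators_))  # 50 fitted stages

    # Request 50 more stages; fit() continues from stage 51 rather than restarting
    clf.n_estimators = 100
    clf.fit(X_train, Y_train)
    print(len(clf.estimators_))  # 100 fitted stages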