sklearn.ensemble.VotingClassifier

A sklearn.ensemble.VotingClassifier is a Classification System (based on Majority Voting Rules or Average Predicted Probabilities) within the sklearn.ensemble module.

1) Import the Classification System from scikit-learn: from sklearn.ensemble import VotingClassifier
2) Generate training data or load a dataset: X, y
3) Create the classifier: clf = VotingClassifier(estimators, voting='hard', weights=None, n_jobs=1, flatten_transform=None), where estimators is a list of (name, estimator) tuples
4) Choose method(s), as shown in the end-to-end sketch after this list:
  • fit(X, y[, sample_weight]), fits the estimators.
  • fit_transform(X[, y]), fits to data, then transforms it.
  • get_params([deep]), retrieves the parameters of the VotingClassifier.
  • predict(X), predicts class labels for X.
  • score(X, y[, sample_weight]), returns the mean accuracy on the given test data and labels.
  • set_params(**params), sets the parameters of the VotingClassifier.
  • transform(X), returns class labels or probabilities for X for each estimator.
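
The following is a minimal end-to-end sketch of the steps above, using the Iris dataset and three standard scikit-learn base estimators (the particular base estimators chosen here are illustrative assumptions, not prescribed by this page):

from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# 2) Load a dataset.
X, y = load_iris(return_X_y=True)

# 3) Create the voting classifier from a list of (name, estimator) tuples.
clf = VotingClassifier(
    estimators=[('lr', LogisticRegression()),
                ('dt', DecisionTreeClassifier()),
                ('gnb', GaussianNB())],
    voting='hard')

# 4) Fit the ensemble, predict class labels, and score it.
clf.fit(X, y)
print(clf.predict(X[:5]))   # majority-vote labels for the first five samples
print(clf.score(X, y))      # mean accuracy on the given data and labels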


References

2017b

  • (Scikit Learn, 2017b) ⇒ http://scikit-learn.org/stable/modules/ensemble.html#voting-classifier Retrieved: 2017-10-29
    • QUOTE: The idea behind the VotingClassifier is to combine conceptually different machine learning classifiers and use a majority vote or the average predicted probabilities (soft vote) to predict the class labels. Such a classifier can be useful for a set of equally well performing models in order to balance out their individual weaknesses.

      (...)

      In majority voting, the predicted class label for a particular sample is the class label that represents the majority (mode) of the class labels predicted by each individual classifier.

      (...)

      In contrast to majority voting (hard voting), soft voting returns the class label as argmax of the sum of predicted probabilities.

      Specific weights can be assigned to each classifier via the weights parameter. When weights are provided, the predicted class probabilities for each classifier are collected, multiplied by the classifier weight, and averaged. The final class label is then derived from the class label with the highest average probability.
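
The weighted soft-voting rule just quoted, and the hard-voting (mode) rule before it, can be sketched numerically as follows (a minimal illustration with made-up predictions, probabilities, and weights; VotingClassifier performs the equivalent computation internally):

import numpy as np

# Hard (majority) voting: the predicted label is the mode of the labels
# predicted by the individual classifiers.
votes = np.array([1, 1, 2])            # labels predicted by 3 classifiers
print(np.bincount(votes).argmax())     # 1 -> the majority class label

# Soft voting: argmax of the weighted average of predicted probabilities.
probas = np.array([[0.9, 0.1],         # predict_proba of classifier 1
                   [0.4, 0.6],         # classifier 2
                   [0.3, 0.7]])        # classifier 3
weights = np.array([2, 1, 1])          # the weights parameter

avg = np.average(probas, axis=0, weights=weights)
print(avg)                             # [0.625 0.375]
print(avg.argmax())                    # 0 -> index of the winning class label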