Huber Regression System

A Huber Regression System is a Robust Regression System that implements a Huber Regression Algorithm to solve a Huber Regression Task.

AKA: Huber Regressor, Huber Regression Estimator.
- …
Example(s):
- sklearn.linear_model.HuberRegressor [1]:
  - HuberRegressor vs Ridge on dataset with strong outliers
Counter-Example(s):
See: Regression Analysis Task, Random Variable, L2-norm.

References

2017

(Scikit Learn, 2017) ⇒ http://scikit-learn.org/stable/modules/linear_model.html#huber-regression Retrieved:2017-09-17
- QUOTE: The HuberRegressor is different to Ridge because it applies a linear loss to samples that are classified as outliers. A sample is classified as an inlier if the absolute error of that sample is lesser than a certain threshold. It differs from TheilSenRegressor and RANSACRegressor because it does not ignore the effect of the outliers but gives a lesser weight to them.
  The loss function that HuberRegressor minimizes is given by
  [math]\displaystyle{ \underset{w, \sigma}{\min\,} {\sum_{i=1}^n\left(\sigma + H_m\left(\frac{X_{i}w - y_{i}}{\sigma}\right)\sigma\right) + \alpha {||w||_2}^2} }[/math]
  where
  [math]\displaystyle{ H_m(z) = \begin{cases} z^2, & \text {if } |z| \lt \epsilon, \\ 2\epsilon|z| - \epsilon^2, & \text{otherwise} \end{cases} }[/math]
  (...)
  It is advised to set the parameter epsilon to 1.35 to achieve 95% statistical efficiency.
  The HuberRegressor differs from using SGDRegressor with loss set to huber in the following ways.
  - HuberRegressor is scaling invariant. Once epsilon is set, scaling X and y down or up by different values would produce the same robustness to outliers as before, whereas with SGDRegressor epsilon has to be set again when X and y are scaled.
  - HuberRegressor should be more efficient to use on data with a small number of samples, while SGDRegressor needs a number of passes on the training data to produce the same robustness.
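
The quoted behaviour can be illustrated with a minimal sketch. It first evaluates the piecewise function H_m from the quote and then fits HuberRegressor and Ridge to the same contaminated data. The synthetic dataset, the random seed, the outlier placement, the alpha and max_iter values, and the helper name huber_h are all assumptions made for illustration; none of them come from the scikit-learn documentation.

```python
import numpy as np
from sklearn.linear_model import HuberRegressor, Ridge


def huber_h(z, epsilon=1.35):
    """H_m(z) as quoted: z^2 if |z| < epsilon, else 2*epsilon*|z| - epsilon^2.

    `huber_h` is a hypothetical helper name, not part of scikit-learn.
    """
    z = np.asarray(z, dtype=float)
    return np.where(np.abs(z) < epsilon,
                    z ** 2,
                    2 * epsilon * np.abs(z) - epsilon ** 2)


# Synthetic data (assumed for illustration): y = 3x + 1 plus small Gaussian noise.
rng = np.random.RandomState(0)
X = rng.uniform(-5, 5, size=(200, 1))
y = 3.0 * X[:, 0] + 1.0 + rng.normal(scale=0.5, size=200)

# Contaminate the 10 samples with the largest x by shifting their targets down.
outlier_idx = np.argsort(X[:, 0])[-10:]
y[outlier_idx] -= 50.0

# epsilon=1.35 follows the advice in the quote; alpha and max_iter are assumed.
huber = HuberRegressor(epsilon=1.35, alpha=1e-4, max_iter=1000).fit(X, y)
ridge = Ridge(alpha=1e-4).fit(X, y)

print("true slope:  3.0")
print("Huber slope:", huber.coef_[0])
print("Ridge slope:", ridge.coef_[0])
# Boolean mask of the samples that HuberRegressor itself flagged as outliers.
print("flagged outliers:", int(huber.outliers_.sum()))
```

In this setup the squared loss of Ridge lets the ten shifted targets pull the fitted slope well away from 3, while the linear tail of H_m bounds their influence on the Huber fit.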
