Empirical Risk Minimization (ERM) Algorithm


An Empirical Risk Minimization (ERM) Algorithm is a Supervised Learning Algorithm that selects a hypothesis by minimizing the average loss (the empirical risk) on the training set; the ERM principle is also used to derive theoretical bounds on a learning algorithm's performance.
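For reference (this restates the standard definition that is elided in the excerpt below), the true risk of a hypothesis [math]\displaystyle{ h }[/math] is its expected loss under the unknown data distribution [math]\displaystyle{ P(x, y) }[/math]:

[math]\displaystyle{ R(h) = \mathbb{E}_{(x,y) \sim P}\left[ L(h(x), y) \right] }[/math].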



References

2019

  • (Wikipedia, 2019) ⇒ https://en.wikipedia.org/wiki/Empirical_risk_minimization Retrieved:2019-5-27.
    • Empirical risk minimization (ERM) is a principle in statistical learning theory which defines a family of learning algorithms and is used to give theoretical bounds on their performance (...)

      In general, the risk [math]\displaystyle{ R(h) }[/math] cannot be computed because the distribution [math]\displaystyle{ P(x, y) }[/math] is unknown to the learning algorithm (this situation is referred to as agnostic learning). However, we can compute an approximation, called empirical risk, by averaging the loss function on the training set:

      [math]\displaystyle{ \! R_\text{emp}(h) = \frac{1}{n} \sum_{i=1}^n L(h(x_i), y_i) }[/math].

      The empirical risk minimization principle [1] states that the learning algorithm should choose a hypothesis [math]\displaystyle{ \hat{h} }[/math] which minimizes the empirical risk:

      [math]\displaystyle{ \hat{h} = \underset{h \in \mathcal{H}}{\arg \min} R_{\text{emp}}(h) }[/math].

      Thus the learning algorithm defined by the ERM principle consists in solving the above optimization problem.
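      The following is a minimal Python sketch of the ERM principle over a small finite hypothesis class; it is illustrative only, and the names (empirical_risk, erm, zero_one_loss) and the toy threshold classifiers are assumptions of this sketch, not part of the cited source.

import numpy as np

def empirical_risk(h, loss, X, y):
    # Average the loss of hypothesis h over the n training examples:
    # R_emp(h) = (1/n) * sum_i L(h(x_i), y_i)
    return float(np.mean([loss(h(x_i), y_i) for x_i, y_i in zip(X, y)]))

def erm(hypotheses, loss, X, y):
    # ERM principle: return the hypothesis in the (finite) class H
    # with the smallest empirical risk on the training set.
    risks = [empirical_risk(h, loss, X, y) for h in hypotheses]
    return hypotheses[int(np.argmin(risks))]

# Toy usage: 1-D binary classification with 0-1 loss and a finite class
# of threshold classifiers h_c(x) = 1 if x >= c else 0.
X = np.array([0.10, 0.35, 0.40, 0.70, 0.80, 0.90])
y = np.array([0, 0, 0, 1, 1, 1])
zero_one_loss = lambda pred, true: float(pred != true)
hypotheses = [lambda x, c=c: int(x >= c) for c in np.linspace(0.0, 1.0, 21)]
h_hat = erm(hypotheses, zero_one_loss, X, y)  # empirical risk minimizer over the class

      Note that on this toy data every threshold strictly between 0.40 and 0.70 attains zero empirical risk, so the empirical risk minimizer need not be unique; the argmin simply returns the first such hypothesis.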
