Empirical Risk Minimization Principle


See: Generalization Bounds, Gradient Boosted Supervised Learning, Statistical Learning Theory, Expected Risk, Expected Risk Minimization, Empirical Risk.



References

2011

  • (Zhang, 2011b) ⇒ Xinhua Zhang. (2011). “Empirical Risk Minimization.” In: (Sammut & Webb, 2011) p.312
    • QUOTE: The goal of learning is usually to find a model which delivers good generalization performance over an underlying distribution of the data. Consider an input space $\mathcal{X}$ and an output space $\mathcal{Y}$. Assume the pairs $(X, Y) \in \mathcal{X} \times \mathcal{Y}$ are random variables whose (unknown) joint distribution is $P_{XY}$. It is our goal to find a predictor $f : \mathcal{X} \mapsto \mathcal{Y}$ which minimizes the expected risk: $$P(f(X) \neq Y) = \mathbb{E}_{(X,Y) \sim P_{XY}}\left[\delta(f(X) \neq Y)\right],$$ where $\delta(z) = 1$ if $z$ is true, and 0 otherwise.

      However, in practice we only have $n$ pairs of training examples $(X_i, Y_i)$ drawn identically and independently from $P_{XY}$. Since $P_{XY}$ is unknown, we often use the risk on the training set (called empirical risk) as a surrogate of the expected risk on the underlying distribution: …
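
The empirical risk referred to above is, in the standard formulation, the average 0-1 loss over the training sample, $\frac{1}{n}\sum_{i=1}^{n}\delta(f(X_i) \neq Y_i)$. The following is a minimal Python sketch of the idea (not from the quoted source): it assumes a toy data-generating distribution $P_{XY}$ and a hypothetical family of threshold predictors, selects the candidate with the lowest empirical risk on $n$ training pairs, and uses a large held-out sample to approximate the (normally unknown) expected risk.

```python
import numpy as np

rng = np.random.default_rng(0)

def zero_one_loss(y_pred, y_true):
    """delta(f(X) != Y): 1 where the prediction is wrong, 0 otherwise."""
    return (y_pred != y_true).astype(float)

def empirical_risk(f, X, Y):
    """Average 0-1 loss of predictor f over the n pairs (X_i, Y_i)."""
    return zero_one_loss(f(X), Y).mean()

# Hypothetical data-generating distribution P_XY: X ~ Uniform[0, 1],
# Y = 1 if X > 0.5, with each label flipped with probability 0.1.
def sample_pxy(n):
    X = rng.uniform(0.0, 1.0, size=n)
    Y = (X > 0.5).astype(int)
    flip = rng.uniform(size=n) < 0.1
    Y[flip] = 1 - Y[flip]
    return X, Y

# Hypothetical predictor family: thresholding rules f_t(x) = 1[x > t].
def make_threshold_predictor(t):
    return lambda X: (X > t).astype(int)

# Empirical risk minimization over a grid of candidate thresholds:
# pick the predictor with the lowest risk on the n training examples.
X_train, Y_train = sample_pxy(n=100)
candidates = np.linspace(0.0, 1.0, 101)
risks = [empirical_risk(make_threshold_predictor(t), X_train, Y_train)
         for t in candidates]
best_t = candidates[int(np.argmin(risks))]
f_hat = make_threshold_predictor(best_t)

# The expected risk is unknown in practice; a large held-out sample
# approximates it here only because we control P_XY in this toy example.
X_test, Y_test = sample_pxy(n=100_000)
print(f"chosen threshold: {best_t:.2f}")
print(f"empirical risk (train, n=100): {empirical_risk(f_hat, X_train, Y_train):.3f}")
print(f"approx. expected risk (large held-out sample): {empirical_risk(f_hat, X_test, Y_test):.3f}")
```

On a small training set the empirical risk of the selected predictor typically understates its expected risk, which is the gap that generalization bounds in statistical learning theory aim to control.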