Statistical Learning Framework

From GM-RKB

A Statistical Learning Framework is a Machine Learning Framework that draws on tools from statistics and functional analysis.



References

2020a

  • (Hastie et al., 2009) ⇒ Trevor Hastie, Robert Tibshirani, and Jerome Friedman. (2009). "The Elements of Statistical Learning." Springer-Verlag.

2020b

  • (Wikipedia, 2020b) ⇒ https://en.wikipedia.org/wiki/Statistical_learning_theory#Formal_description Retrieved:2020-2-1.
    • Take [math]\displaystyle{ X }[/math] to be the vector space of all possible inputs, and [math]\displaystyle{ Y }[/math] to be the vector space of all possible outputs. Statistical learning theory takes the perspective that there is some unknown probability distribution over the product space [math]\displaystyle{ Z = X \times Y }[/math], i.e. there exists some unknown [math]\displaystyle{ p(z) = p(\vec{x},y) }[/math]. The training set is made up of [math]\displaystyle{ n }[/math] samples drawn from this probability distribution, and is denoted:

      [math]\displaystyle{ S = \{(\vec{x}_1,y_1), \dots ,(\vec{x}_n,y_n)\} = \{\vec{z}_1, \dots ,\vec{z}_n\} }[/math]

      Every [math]\displaystyle{ \vec{x}_i }[/math] is an input vector from the training data, and [math]\displaystyle{ y_i }[/math] is the output that corresponds to it.

      In this formalism, the inference problem consists of finding a function [math]\displaystyle{ f: X \to Y }[/math] such that [math]\displaystyle{ f(\vec{x}) \sim y }[/math]. Let [math]\displaystyle{ \mathcal{H} }[/math] be a space of functions [math]\displaystyle{ f: X \to Y }[/math] called the hypothesis space: the space of functions the algorithm will search through. Let [math]\displaystyle{ V(f(\vec{x}),y) }[/math] be the loss function, a measure of the difference between the predicted value [math]\displaystyle{ f(\vec{x}) }[/math] and the actual value [math]\displaystyle{ y }[/math]. The expected risk is defined as:

      [math]\displaystyle{ I[f] = \displaystyle \int_{X \times Y} V(f(\vec{x}),y)\, p(\vec{x},y) \,d\vec{x} \,dy }[/math]

      The target function, the best possible function [math]\displaystyle{ f }[/math] that can be chosen, is the [math]\displaystyle{ f }[/math] that satisfies:

      [math]\displaystyle{ f = \underset{h \in \mathcal{H}}{\arg\inf}\, I[h] }[/math]

      Because the probability distribution [math]\displaystyle{ p(\vec{x},y) }[/math] is unknown, a proxy measure for the expected risk must be used. This measure is based on the training set, a sample from this unknown probability distribution. It is called the empirical risk:

      [math]\displaystyle{ I_S[f] = \frac{1}{n} \displaystyle \sum_{i=1}^n V( f(\vec{x}_i),y_i) }[/math]

      A learning algorithm that chooses the function [math]\displaystyle{ f_S }[/math] that minimizes the empirical risk is said to perform empirical risk minimization.
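The empirical-risk and ERM definitions above can be illustrated with a minimal Python sketch, assuming a squared loss [math]\displaystyle{ V(f(\vec{x}),y) = (f(\vec{x})-y)^2 }[/math] and a toy hypothesis space of scalar linear functions [math]\displaystyle{ f_w(x) = w x }[/math]; the simulated data and all names here are illustrative, not part of the formalism:

```python
import numpy as np

rng = np.random.default_rng(0)

# Training set S = {(x_i, y_i)}: n samples from the (in practice unknown)
# distribution p(x, y). Here we simulate one, with true relation y = 2x + noise.
n = 100
x = rng.uniform(-1.0, 1.0, size=n)
y = 2.0 * x + rng.normal(scale=0.1, size=n)

def empirical_risk(w, x, y):
    """I_S[f_w] = (1/n) * sum_i V(f_w(x_i), y_i), with squared loss."""
    return np.mean((w * x - y) ** 2)

# ERM: search a (finite, discretized) hypothesis space H = {f_w} for the
# function f_S that minimizes the empirical risk.
candidates = np.linspace(-5.0, 5.0, 1001)
risks = [empirical_risk(w, x, y) for w in candidates]
w_star = candidates[int(np.argmin(risks))]

# w_star should land near the true slope 2.0; the minimized empirical risk
# approximates the (here irreducible) noise variance.
print(f"ERM solution: f_S(x) = {w_star:.2f} * x")
print(f"empirical risk I_S[f_S] = {min(risks):.4f}")
```

Since the empirical risk is an average over samples from [math]\displaystyle{ p(\vec{x},y) }[/math], it is a Monte Carlo proxy for the expected-risk integral [math]\displaystyle{ I[f] }[/math], which is exactly why it is usable when the distribution itself is unknown.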

2007

  • (Berkeley University, 2007) ⇒ http://www.stat.berkeley.edu/~statlearning/
    • Statistical machine learning merges statistics with the computational sciences: computer science, systems science, and optimization. Much of the agenda in statistical machine learning is driven by applied problems in science and technology, where data streams are increasingly large-scale, dynamical and heterogeneous, and where mathematical and algorithmic creativity are required to bring statistical methodology to bear. Fields such as bioinformatics, artificial intelligence, signal processing, communications, networking, information management, finance, game theory and control theory are all being heavily influenced by developments in statistical machine learning.
    • The field of statistical machine learning also poses some of the most challenging theoretical problems in modern statistics, chief among them being the general problem of understanding the link between inference and computation.