sklearn.neural network Module

An sklearn.neural network Module is a neural network framework that is an sklearn module (which contains a collection of neural network algorithm implementations).

== References ==

=== 2017a ===
* (Scikit Learn, 2017) ⇒ http://scikit-learn.org/stable/modules/classes.html#module-sklearn.neural_network Retrieved: 2017-12-17
** QUOTE: The [[sklearn.neural_network module]] includes models based on [[neural network]]s.        <P>        User guide: See the [http://scikit-learn.org/stable/modules/neural_networks_supervised.html#neural-networks-supervised Neural network models (supervised)] and [http://scikit-learn.org/stable/modules/neural_networks_unsupervised.html#neural-networks-unsupervised Neural network models (unsupervised)] sections for further details.
*** <code>neural_network.BernoulliRBM([n_components, …])</code> [[Bernoulli Restricted Boltzmann Machine (RBM)]].
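For orientation, here is a minimal sketch of importing and instantiating the module's estimators. It assumes a [[scikit-learn]] version (0.18 or later) in which the module exposes <code>BernoulliRBM</code>, <code>MLPClassifier</code>, and <code>MLPRegressor</code>; the hyper-parameter values are illustrative placeholders.
<pre>
# Minimal sketch: the estimators exposed by sklearn.neural_network.
from sklearn.neural_network import BernoulliRBM, MLPClassifier, MLPRegressor

rbm = BernoulliRBM(n_components=16)           # unsupervised RBM feature learner
clf = MLPClassifier(hidden_layer_sizes=(8,))  # supervised MLP classifier
reg = MLPRegressor(hidden_layer_sizes=(8,))   # supervised MLP regressor

# Each follows the standard scikit-learn estimator API:
# rbm.fit(X) then rbm.transform(X); clf.fit(X, y) then clf.predict(X); etc.
</pre>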

=== 2017b ===
* (sklearn, 2017) ⇒ http://scikit-learn.org/stable/modules/neural_networks_supervised.html#multi-layer-perceptron Retrieved: 2017-12-03
** QUOTE: [[Multi-layer Perceptron (MLP)]] is a [[supervised learning algorithm]] that learns a function <math>f(\cdot): R^m \rightarrow R^o</math> by [[training]] on a [[dataset]], where <math>m</math> is the number of [[dimension]]s for [[input]] and <math>o</math> is the number of [[dimension]]s for [[output]]. Given a [[set of features]] <math>X = \{x_1, x_2, \cdots, x_m\}</math> and a target <math>y</math>, it can learn a [[non-linear function approximator]] for either [[classification]] or [[regression]]. It is different from [[logistic regression]], in that between the [[input]] and the [[output]] [[layer]], there can be one or more [[non-linear layer]]s, called [[hidden layer]]s. [http://scikit-learn.org/stable/_images/multilayerperceptron_network.png Figure 1] shows a one [[hidden layer]] [[MLP]] with [[scalar output]].        <P>        The leftmost layer, known as the [[input layer]], consists of a set of [[neuron]]s <math>\{x_i | x_1, x_2, \cdots, x_m\}</math> representing the input [[feature]]s. Each [[neuron]] in the [[hidden layer]] transforms the values from the [[previous layer]] with a [[weighted linear summation]] <math>w_1x_1 + w_2x_2 + \cdots + w_mx_m</math>, followed by a [[non-linear activation function]] <math>g(\cdot): R \rightarrow R</math>, like the [[hyperbolic tan function]]. The [[output layer]] receives the values from the last [[hidden layer]] and transforms them into [[output value]]s.        <P>        The [[module]] contains the public attributes <code>coefs_</code> and <code>intercepts_</code>. <code>coefs_</code> is a list of [[weight matrix|weight matrices]], where the [[weight matrix]] at index <math>i</math> represents the [[weight]]s between [[layer]] <math>i</math> and layer <math>i+1</math>. <code>intercepts_</code> is a list of [[bias vector]]s, where the vector at index <math>i</math> represents the [[bias]] values added to [[layer]] <math>i+1</math>.        <P>        The advantages of [[Multi-layer Perceptron]] are:
*** Capability to learn [[non-linear model]]s.
** The disadvantages of [[Multi-layer Perceptron (MLP)]] include:
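To make the <code>coefs_</code> / <code>intercepts_</code> description above concrete, here is a minimal sketch; the toy XOR data, layer size, and hyper-parameter values are illustrative assumptions, not taken from the quoted documentation.
<pre>
# Minimal sketch: inspect coefs_ and intercepts_ on a tiny fitted MLP.
import numpy as np
from sklearn.neural_network import MLPClassifier

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # m = 2 input dimensions
y = np.array([0, 1, 1, 0])                              # binary target (XOR)

clf = MLPClassifier(hidden_layer_sizes=(5,), activation="tanh",
                    max_iter=2000, random_state=0)
clf.fit(X, y)

# coefs_[i] holds the weights between layer i and layer i+1;
# intercepts_[i] holds the biases added to layer i+1.
print([w.shape for w in clf.coefs_])       # [(2, 5), (5, 1)]
print([b.shape for b in clf.intercepts_])  # [(5,), (1,)]
</pre>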

=== 2017c ===
* (sklearn, 2017) ⇒ http://scikit-learn.org/stable/modules/neural_networks_unsupervised.html#neural-networks-unsupervised Retrieved: 2017-12-17
** QUOTE: [[Restricted Boltzmann machines (RBM)]] are [[unsupervised nonlinear feature learner]]s based on a [[probabilistic model]]. The [[feature]]s extracted by an [[RBM]] or a hierarchy of [[RBM]]s often give good results when fed into a [[linear classifier]] such as a [[linear SVM]] or a [[perceptron]].        <P>        The model makes assumptions regarding the [[distribution]] of [[input]]s. At the moment, [[scikit-learn]] only provides [[BernoulliRBM]], which assumes the inputs are either [[binary value]]s or values between 0 and 1, each encoding the [[probability]] that the specific [[feature]] would be turned on.        <P>        The [[RBM]] tries to [[maximize]] the [[likelihood]] of the [[data]] using a particular [[graphical model]]. The [[parameter]] [[learning algorithm]] used ([[Stochastic Maximum Likelihood]]) prevents the [[representation]]s from straying far from the [[input data]], which makes them capture interesting regularities, but makes the model less useful for [[small dataset]]s, and usually not useful for [[density estimation]].        <P>        The method gained popularity for initializing [[deep neural network]]s with the weights of independent [[RBM]]s. This method is known as [[unsupervised pre-training]].
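A minimal sketch of the pattern described above, feeding [[BernoulliRBM]] features into a [[linear classifier]]; the toy binary data, pipeline step names, and hyper-parameter values are illustrative assumptions.
<pre>
# Minimal sketch: RBM feature extraction followed by a linear classifier.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

rng = np.random.RandomState(0)
X = rng.randint(2, size=(60, 16)).astype(float)  # inputs must be in [0, 1]
y = (X.sum(axis=1) > 8).astype(int)              # illustrative labels

model = Pipeline([
    ("rbm", BernoulliRBM(n_components=8, learning_rate=0.05,
                         n_iter=20, random_state=0)),
    ("clf", LogisticRegression()),
])
model.fit(X, y)           # RBM trained unsupervised, then LR fit on its features
print(model.score(X, y))  # training accuracy on the toy data
</pre>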


