sklearn.neural network Module

An sklearn.neural network Module is a neural network framework that is an sklearn module (which contains a collection of neural network algorithm implementations).

== References ==

=== 2017a ===
* (Scikit Learn, 2017) ⇒ http://scikit-learn.org/stable/modules/classes.html#module-sklearn.neural_network Retrieved: 2017-12-17
** QUOTE: The [[sklearn.neural_network module]] includes models based on [[neural network]]s.        <P>        User guide: See the [http://scikit-learn.org/stable/modules/neural_networks_supervised.html#neural-networks-supervised Neural network models (supervised)] and [http://scikit-learn.org/stable/modules/neural_networks_unsupervised.html#neural-networks-unsupervised Neural network models (unsupervised)] sections for further details.
*** <code>neural_network.BernoulliRBM([n_components, …])</code> [[Bernoulli Restricted Boltzmann Machine (RBM)]].
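For orientation, here is a minimal sketch of importing and instantiating the module's estimators. It assumes a [[scikit-learn]] version (0.18 or later) in which the module exposes <code>BernoulliRBM</code>, <code>MLPClassifier</code>, and <code>MLPRegressor</code>; the hyper-parameter values are illustrative placeholders.
<pre>
# Minimal sketch: the estimators exposed by sklearn.neural_network.
from sklearn.neural_network import BernoulliRBM, MLPClassifier, MLPRegressor

rbm = BernoulliRBM(n_components=16)           # unsupervised RBM feature learner
clf = MLPClassifier(hidden_layer_sizes=(8,))  # supervised MLP classifier
reg = MLPRegressor(hidden_layer_sizes=(8,))   # supervised MLP regressor

# Each follows the standard scikit-learn estimator API:
# rbm.fit(X) then rbm.transform(X); clf.fit(X, y) then clf.predict(X); etc.
</pre>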

=== 2017b ===
* (sklearn, 2017) ⇒ http://scikit-learn.org/stable/modules/neural_networks_supervised.html#multi-layer-perceptron Retrieved: 2017-12-03
** QUOTE: [[Multi-layer Perceptron (MLP)]] is a [[supervised learning algorithm]] that learns a function <math>f(\cdot): R^m \rightarrow R^o</math> by [[training]] on a [[dataset]], where <math>m</math> is the number of [[dimension]]s for [[input]] and <math>o</math> is the number of [[dimension]]s for [[output]]. Given a [[set of features]] <math>X = \{x_1, x_2, \cdots, x_m\}</math> and a target <math>y</math>, it can learn a [[non-linear function approximator]] for either [[classification]] or [[regression]]. It is different from [[logistic regression]], in that between the [[input]] and the [[output]] [[layer]], there can be one or more [[non-linear layer]]s, called [[hidden layer]]s. [http://scikit-learn.org/stable/_images/multilayerperceptron_network.png Figure 1] shows a one [[hidden layer]] [[MLP]] with [[scalar output]].        <P>        The leftmost layer, known as the [[input layer]], consists of a set of [[neuron]]s <math>\{x_i | x_1, x_2, \cdots, x_m\}</math> representing the input [[feature]]s. Each [[neuron]] in the [[hidden layer]] transforms the values from the [[previous layer]] with a [[weighted linear summation]] <math>w_1x_1 + w_2x_2 + \cdots + w_mx_m</math>, followed by a [[non-linear activation function]] <math>g(\cdot): R \rightarrow R</math>, like the [[hyperbolic tan function]]. The [[output layer]] receives the values from the last [[hidden layer]] and transforms them into [[output value]]s.        <P>        The [[module]] contains the public attributes <code>coefs_</code> and <code>intercepts_</code>. <code>coefs_</code> is a list of [[weight matrix|weight matrices]], where the [[weight matrix]] at index <math>i</math> represents the [[weight]]s between [[layer]] <math>i</math> and layer <math>i+1</math>. <code>intercepts_</code> is a list of [[bias vector]]s, where the vector at index <math>i</math> represents the [[bias]] values added to [[layer]] <math>i+1</math>.        <P>        The advantages of [[Multi-layer Perceptron]] are:
*** Capability to learn [[non-linear model]]s.
** The disadvantages of [[Multi-layer Perceptron (MLP)]] include:
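To make the <code>coefs_</code> / <code>intercepts_</code> description above concrete, here is a minimal sketch; the toy XOR data, layer size, and hyper-parameter values are illustrative assumptions, not taken from the quoted documentation.
<pre>
# Minimal sketch: inspect coefs_ and intercepts_ on a tiny fitted MLP.
import numpy as np
from sklearn.neural_network import MLPClassifier

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # m = 2 input dimensions
y = np.array([0, 1, 1, 0])                              # binary target (XOR)

clf = MLPClassifier(hidden_layer_sizes=(5,), activation="tanh",
                    max_iter=2000, random_state=0)
clf.fit(X, y)

# coefs_[i] holds the weights between layer i and layer i+1;
# intercepts_[i] holds the biases added to layer i+1.
print([w.shape for w in clf.coefs_])       # [(2, 5), (5, 1)]
print([b.shape for b in clf.intercepts_])  # [(5,), (1,)]
</pre>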

=== 2017c ===
* (sklearn, 2017) ⇒ http://scikit-learn.org/stable/modules/neural_networks_unsupervised.html#neural-networks-unsupervised Retrieved: 2017-12-17
** QUOTE: [[Restricted Boltzmann machines (RBM)]] are [[unsupervised nonlinear feature learner]]s based on a [[probabilistic model]]. The [[feature]]s extracted by an [[RBM]] or a hierarchy of [[RBM]]s often give good results when fed into a [[linear classifier]] such as a [[linear SVM]] or a [[perceptron]].        <P>        The model makes assumptions regarding the [[distribution]] of [[input]]s. At the moment, [[scikit-learn]] only provides [[BernoulliRBM]], which assumes the inputs are either [[binary value]]s or values between 0 and 1, each encoding the [[probability]] that the specific [[feature]] would be turned on.        <P>        The [[RBM]] tries to [[maximize]] the [[likelihood]] of the [[data]] using a particular [[graphical model]]. The [[parameter]] [[learning algorithm]] used ([[Stochastic Maximum Likelihood]]) prevents the [[representation]]s from straying far from the [[input data]], which makes them capture interesting regularities, but makes the model less useful for [[small dataset]]s, and usually not useful for [[density estimation]].        <P>        The method gained popularity for initializing [[deep neural network]]s with the weights of independent [[RBM]]s. This method is known as [[unsupervised pre-training]].
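A minimal sketch of the pattern described above, feeding [[BernoulliRBM]] features into a [[linear classifier]]; the toy binary data, pipeline step names, and hyper-parameter values are illustrative assumptions.
<pre>
# Minimal sketch: RBM feature extraction followed by a linear classifier.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

rng = np.random.RandomState(0)
X = rng.randint(2, size=(60, 16)).astype(float)  # inputs must be in [0, 1]
y = (X.sum(axis=1) > 8).astype(int)              # illustrative labels

model = Pipeline([
    ("rbm", BernoulliRBM(n_components=8, learning_rate=0.05,
                         n_iter=20, random_state=0)),
    ("clf", LogisticRegression()),
])
model.fit(X, y)           # RBM trained unsupervised, then LR fit on its features
print(model.score(X, y))  # training accuracy on the toy data
</pre>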


