sklearn.neural_network.MLPClassifier
A sklearn.neural_network.MLPClassifier is a Multi-layer Perceptron Classification System within sklearn.neural_network.
- Context
  - Usage (a full workflow sketch follows this list):
    - 1) Import the MLP Classification System from scikit-learn: from sklearn.neural_network import MLPClassifier
    - 2) Create design matrix X and response vector Y
    - 3) Create the Classifier object: clf = MLPClassifier(hidden_layer_sizes=(100,), activation='relu', solver='adam', alpha=0.0001, batch_size='auto', learning_rate='constant', learning_rate_init=0.001, ...)
    - 4) Choose method(s):
      - fit(X, y), fits the classification model to data matrix X and target(s) y.
      - get_params([deep]), retrieves the parameters of this estimator.
      - predict(X), predicts using the multi-layer perceptron classifier.
      - predict_log_proba(X), returns the log of probability estimates.
      - predict_proba(X), returns probability estimates.
      - score(X, y[, sample_weight]), returns the mean accuracy on the given test data and labels.
      - set_params(**params), sets the parameters of this estimator.
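A minimal end-to-end sketch of the usage steps above; the toy data and the printed calls are illustrative assumptions, not taken from the scikit-learn documentation:

    # Sketch: steps 1-4 on a two-sample toy dataset.
    from sklearn.neural_network import MLPClassifier

    # 2) Design matrix X with shape (n_samples, n_features) and response vector y.
    X = [[0., 0.], [1., 1.]]
    y = [0, 1]

    # 3) Create the classifier; lbfgs is a reasonable choice for tiny datasets.
    clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                        hidden_layer_sizes=(5, 2), random_state=1)

    # 4) Fit, then apply the other estimator methods.
    clf.fit(X, y)
    print(clf.predict([[2., 2.]]))        # predicted class label
    print(clf.predict_proba([[2., 2.]]))  # per-class probability estimates
    print(clf.score(X, y))                # mean accuracy on (X, y)
    clf.set_params(alpha=1e-4)            # update an estimator parameter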
- Example(s):
  - Varying regularization in Multi-layer Perceptron (http://scikit-learn.org/stable/auto_examples/neural_networks/plot_mlp_alpha.html)
  - Compare Stochastic learning strategies for MLPClassifier (http://scikit-learn.org/stable/auto_examples/neural_networks/plot_mlp_training_curves.html)
  - Visualization of MLP weights on MNIST (http://scikit-learn.org/stable/auto_examples/neural_networks/plot_mnist_filters.html)
- Counter-Example(s):
  - sklearn.neural_network.MLPRegressor
  - sklearn.neural_network.BernoulliRBM
- See: Classification System, Regularization Task, Ridge Regression Task.
References
2017a
- (Scikit-Learn, 2017a) ⇒ http://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html Retrieved: 2017-12-17.
- QUOTE:
class sklearn.neural_network.MLPClassifier(hidden_layer_sizes=(100,), activation='relu', solver='adam', alpha=0.0001, batch_size='auto', learning_rate='constant', learning_rate_init=0.001, power_t=0.5, max_iter=200, shuffle=True, random_state=None, tol=0.0001, verbose=False, warm_start=False, momentum=0.9, nesterovs_momentum=True, early_stopping=False, validation_fraction=0.1, beta_1=0.9, beta_2=0.999, epsilon=1e-08)
Multi-layer Perceptron classifier.
This model optimizes the log-loss function using LBFGS or stochastic gradient descent.
(...)
Notes
MLPClassifier trains iteratively since at each time step the partial derivatives of the loss function with respect to the model parameters are computed to update the parameters. It can also have a regularization term added to the loss function that shrinks model parameters to prevent overfitting. This implementation works with data represented as dense numpy arrays or sparse scipy arrays of floating point values.
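The sparse-input support and iterative training described in the notes above can be exercised as in this minimal sketch (the toy data and the use of partial_fit are illustrative assumptions, not part of the quoted documentation):

    # Sketch: MLPClassifier accepts scipy sparse matrices and supports
    # incremental training via partial_fit (sgd/adam solvers).
    import numpy as np
    from scipy import sparse
    from sklearn.neural_network import MLPClassifier

    X = sparse.csr_matrix(np.array([[0., 0.], [1., 1.], [1., 0.], [0., 1.]]))
    y = [0, 1, 1, 0]

    clf = MLPClassifier(solver='adam', random_state=1)
    clf.partial_fit(X, y, classes=[0, 1])  # first call must list all classes
    clf.partial_fit(X, y)                  # later calls keep refining the parameters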
2017b
- (Scikit-Learn, 2017b) ⇒ http://scikit-learn.org/stable/modules/neural_networks_supervised.html#classification Retrieved: 2017-12-17.
- QUOTE: Class MLPClassifier implements a multi-layer perceptron (MLP) algorithm that trains using Backpropagation. MLP trains on two arrays: array X of size (n_samples, n_features), which holds the training samples represented as floating point feature vectors; and array y of size (n_samples,), which holds the target values (class labels) for the training samples:
>>> from sklearn.neural_network import MLPClassifier
>>> X = [[0., 0.], [1., 1.]]
>>> y = [0, 1]
>>> clf = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(5, 2), random_state=1)
>>> clf.fit(X, y)

- After fitting (training), the model can predict labels for new samples:

>>> clf.predict([[2., 2.], [-1., -2.]])
- MLP can fit a non-linear model to the training data. clf.coefs_ contains the weight matrices that constitute the model parameters:
>>> [coef.shape for coef in clf.coefs_]
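For the hidden_layer_sizes=(5, 2) model fit above on 2-feature input with one binary output unit, one would expect the shapes (2, 5), (5, 2), and (2, 1). A minimal sketch that inspects the fitted parameters (the loop and printout are illustrative, not from the quoted documentation):

    # Sketch: walk the fitted weight matrices and bias vectors layer by layer.
    from sklearn.neural_network import MLPClassifier

    clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                        hidden_layer_sizes=(5, 2), random_state=1)
    clf.fit([[0., 0.], [1., 1.]], [0, 1])

    for i, (W, b) in enumerate(zip(clf.coefs_, clf.intercepts_)):
        # W maps layer i activations to layer i+1; b is layer i+1's bias vector.
        print(f"layer {i}: weights {W.shape}, biases {b.shape}")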
- Currently, MLPClassifier supports only the Cross-Entropy loss function, which allows probability estimates by running the predict_proba method. MLP trains using Backpropagation. More precisely, it trains using some form of gradient descent and the gradients are calculated using Backpropagation. For classification, it minimizes the Cross-Entropy loss function, giving a vector of probability estimates P(y|x) per sample x:
>>> clf.predict_proba([[2., 2.], [1., 2.]])
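As a hedged illustration of these probability estimates (the assertions below encode the expectation that each row of predict_proba sums to 1 and that predict_log_proba is its logarithm; they are assumptions for illustration, not quoted output):

    # Sketch: relate predict_proba and predict_log_proba on the fitted model.
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                        hidden_layer_sizes=(5, 2), random_state=1)
    clf.fit([[0., 0.], [1., 1.]], [0, 1])

    proba = clf.predict_proba([[2., 2.], [1., 2.]])
    assert np.allclose(proba.sum(axis=1), 1.0)  # one P(y|x) per class, summing to 1
    assert np.allclose(clf.predict_log_proba([[2., 2.], [1., 2.]]), np.log(proba))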
- MLPClassifier supports multi-class classification by applying Softmax as the output function. Further, the model supports multi-label classification, in which a sample can belong to more than one class. For each class, the raw output passes through the logistic function. Values larger than or equal to 0.5 are rounded to 1, otherwise to 0. For the predicted output of a sample, the indices where the value is 1 represent the assigned classes of that sample:
>>> X = [[0., 0.], [1., 1.]]
>>> y = [[0, 1], [1, 1]]
>>> clf = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(15,), random_state=1)
>>> clf.fit(X, y)
>>> clf.predict([[1., 2.]])
>>> clf.predict([[0., 0.]])
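A hedged sketch of the 0.5 thresholding rule described above, reproducing multi-label predict by rounding the per-class logistic outputs (treating predict_proba as those raw per-class outputs is an assumption for illustration):

    # Sketch: multi-label predict() should match thresholding the per-class
    # logistic outputs at 0.5, as described in the quoted passage.
    from sklearn.neural_network import MLPClassifier

    X = [[0., 0.], [1., 1.]]
    y = [[0, 1], [1, 1]]
    clf = MLPClassifier(solver='lbfgs', alpha=1e-5,
                        hidden_layer_sizes=(15,), random_state=1)
    clf.fit(X, y)

    proba = clf.predict_proba([[1., 2.], [0., 0.]])  # per-class logistic outputs
    manual = (proba >= 0.5).astype(int)              # round at 0.5 -> class indicators
    print(manual)
    print(clf.predict([[1., 2.], [0., 0.]]))         # expected to agree with `manual`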