sklearn.linear_model.ElasticNet

From GM-RKB

A sklearn.linear_model.ElasticNet is an ElasticNet System within the sklearn.linear_model module.

1) Import ElasticNet Regression model from scikit-learn : from sklearn.linear_model import ElasticNet
2) Create design matrix X and response vector Y
3) Create ElasticNet object: ENreg=ElasticNet([alpha=1.0, l1_ratio=0.5, fit_intercept=True, precompute=False, max_iter=1000, copy_X=True, tol=0.0001, warm_start=False, positive=False, random_state=None, selection='cyclic'])
4) Choose method(s):
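The four steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the synthetic data from make_regression and the variable names (X, y, y_pred, r2, params) are assumptions introduced here, and fit, predict, score and get_params are shown only as a few commonly used methods.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet

# Step 2: synthetic design matrix X and response vector y (illustrative data only)
X, y = make_regression(n_samples=100, n_features=5, noise=1.0, random_state=0)

# Step 3: create the ElasticNet object (the values shown are the library defaults)
ENreg = ElasticNet(alpha=1.0, l1_ratio=0.5)

# Step 4: a few commonly used methods
ENreg.fit(X, y)              # estimate coefficients and intercept
y_pred = ENreg.predict(X)    # predictions for the training samples
r2 = ENreg.score(X, y)       # coefficient of determination R^2
params = ENreg.get_params()  # constructor parameters as a dict
```

After fitting, the estimated coefficients are available as ENreg.coef_ and the intercept as ENreg.intercept_.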
Input (code) and Output (figure and printed results):
#Importing modules
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import cross_val_predict
from sklearn.datasets import load_boston # removed in scikit-learn 1.2; requires an older release
from sklearn.metrics import explained_variance_score, mean_squared_error
import numpy as np
import matplotlib.pyplot as pl
boston = load_boston() # Loading the Boston housing dataset
x = boston.data # Creating Regression Design Matrix
y = boston.target # Creating target vector
ENreg = ElasticNet() # Create EN regression object
ENreg.fit(x, y) # Fit the model on the full dataset
yp = ENreg.predict(x) # Predicted values on the full dataset

#Calculation of RMSE and Explained Variances

yp_cv = cross_val_predict(ENreg, x, y, cv=10) # 10-fold CV predictions
Evariance = explained_variance_score(y, yp)
Evariance_cv = explained_variance_score(y, yp_cv)
RMSE = np.sqrt(mean_squared_error(y, yp))
RMSECV = np.sqrt(mean_squared_error(y, yp_cv))

# Printing Results

print('Method: ElasticNet Regression')
print('RMSE on the dataset: %.4f' %RMSE)
print('RMSE on 10-fold CV: %.4f' %RMSECV)
print('Explained Variance Regression Score on the dataset: %.4f' %Evariance)
print('Explained Variance Regression 10-fold CV: %.4f' %Evariance_cv)

#plotting real vs predicted data

pl.figure(1)
pl.plot(yp, y, 'ro', label='full dataset')
pl.plot(yp_cv, y, 'bo', alpha=0.25, label='10-fold CV')
pl.xlabel('predicted')
pl.ylabel('real')
pl.title('EN Regression')
pl.legend() # show the labels given to the two series
pl.grid(True)
pl.show()
[Figure: EN boston10fold.png, real vs. predicted values; blue dots correspond to 10-fold CV]


Method: ElasticNet Regression
RMSE on the dataset: 5.1478
RMSE on 10-fold CV: 5.5801
Explained Variance Regression Score on the dataset: 0.6861
Explained Variance Regression 10-fold CV: 0.6315


References

2017

Linear regression with combined L1 and L2 priors as regularizer.
Minimizes the objective function:
   1 / (2 * n_samples) * ||y - Xw||^2_2
   + alpha * l1_ratio * ||w||_1
   + 0.5 * alpha * (1 - l1_ratio) * ||w||^2_2
If you are interested in controlling the L1 and L2 penalty separately, keep in mind that this is equivalent to:
a * L1 + b * L2
where:
alpha = a + b and l1_ratio = a / (a + b)
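The equivalence above can be checked numerically. The sketch below is illustrative only: the helper names enet_objective and to_alpha_l1_ratio are hypothetical (they are not part of scikit-learn), the data are random, and it assumes that "L2" in the expression above stands for 0.5 * ||w||^2_2, matching the factor in the objective function.

```python
import numpy as np

def enet_objective(X, y, w, alpha, l1_ratio):
    """Objective in the (alpha, l1_ratio) parameterization (hypothetical helper)."""
    n_samples = X.shape[0]
    residual = y - X @ w
    return (residual @ residual / (2 * n_samples)
            + alpha * l1_ratio * np.abs(w).sum()
            + 0.5 * alpha * (1 - l1_ratio) * (w @ w))

def to_alpha_l1_ratio(a, b):
    """Map separate penalty weights (a, b) to (alpha, l1_ratio) (hypothetical helper)."""
    return a + b, a / (a + b)

# Random illustrative data and an arbitrary coefficient vector
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
w = np.array([1.0, -2.0, 0.5])
y = X @ w + rng.normal(scale=0.1, size=20)

a, b = 0.3, 0.1
alpha, l1_ratio = to_alpha_l1_ratio(a, b)

# Same objective with separate weights; assumes L2 means 0.5 * ||w||^2_2
n_samples = X.shape[0]
separate = (np.sum((y - X @ w) ** 2) / (2 * n_samples)
            + a * np.abs(w).sum() + 0.5 * b * (w @ w))
combined = enet_objective(X, y, w, alpha, l1_ratio)
```

Under these assumptions, separate and combined agree up to floating-point error, because alpha * l1_ratio = a and alpha * (1 - l1_ratio) = b.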
The parameter l1_ratio corresponds to alpha in the glmnet R package while alpha corresponds to the lambda parameter in glmnet. Specifically, l1_ratio = 1 is the lasso penalty. Currently, l1_ratio <= 0.01 is not reliable, unless you supply your own sequence of alpha.
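The statement that l1_ratio = 1 gives the lasso penalty can be illustrated by comparing ElasticNet with sklearn.linear_model.Lasso at the same alpha. This is a small sketch on synthetic data; the dataset, the value alpha=0.5, and the variable names are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso

# Synthetic regression problem (illustrative data only)
X, y = make_regression(n_samples=100, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

# With l1_ratio=1.0 the L2 term vanishes, leaving only the L1 (lasso) penalty
enet = ElasticNet(alpha=0.5, l1_ratio=1.0).fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

# Compare the fitted coefficient vectors of the two models
same_coefficients = np.allclose(enet.coef_, lasso.coef_)
```

The two estimators are expected to produce matching coefficients here, since both minimize the same objective when l1_ratio is 1.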