sklearn.linear_model.LinearRegression


A sklearn.linear_model.LinearRegression is a linear least-squares regression system within the sklearn.linear_model module.
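For reference, the least-squares criterion behind this definition can be written as below. The notation is introduced here for illustration and is not taken from the page: x_i is the i-th row of the design matrix, y_i the i-th response, w the coefficient vector and b the intercept. After fitting with fit_intercept=True, w corresponds to the coef_ attribute and b to intercept_.

```latex
\hat{w},\ \hat{b} \;=\; \arg\min_{w,\,b} \sum_{i=1}^{n} \left( y_i - b - x_i^{\top} w \right)^2
```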

1) Import the Linear Regression model from scikit-learn: from sklearn.linear_model import LinearRegression
2) Create a design matrix X and response vector Y
3) Create a Linear Regression object: model=LinearRegression([fit_intercept=True, normalize=False, copy_X=True, n_jobs=1])
4) Choose method(s):
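The steps above can be sketched as follows. This is a minimal illustration rather than the page's own example: the synthetic data and the names rng, X, y and model are assumptions, and the normalize argument is left out because it was removed in scikit-learn 1.2. The methods shown (fit, predict, score) are the ones most commonly chosen in step 4.

```python
import numpy as np
from sklearn.linear_model import LinearRegression  # step 1

# Step 2: synthetic data (an assumption for illustration):
# y = 1 + 2*x0 - 3*x1 + small Gaussian noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))  # design matrix, shape (n_samples, n_features)
y = 1.0 + X @ np.array([2.0, -3.0]) + rng.normal(scale=0.1, size=100)

# Step 3: create the estimator
model = LinearRegression(fit_intercept=True)

# Step 4: choose methods
model.fit(X, y)             # estimate coefficients by least squares
y_pred = model.predict(X)   # predictions for the rows of X
r2 = model.score(X, y)      # coefficient of determination R^2

print("coef_:", model.coef_)
print("intercept_:", model.intercept_)
print("R^2:", r2)
```

The fitted coefficients should land close to the values used to generate the data, which is a quick sanity check that X and y were passed in the expected orientation.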
Input:
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict
from sklearn.datasets import load_boston # removed in scikit-learn 1.2; requires an older release
from sklearn.metrics import explained_variance_score, mean_squared_error
import numpy as np
import pylab as pl
boston = load_boston() # Loading the Boston dataset
x = boston.data # Creating regression design matrix (assumed: the dataset's feature array)
y = boston.target # Creating target dataset (assumed: the dataset's target array)
linreg = LinearRegression() # Create linear regression object
linreg.fit(x, y) # Fit linear regression
yp = linreg.predict(x) # Predicted values
yp_cv = cross_val_predict(linreg, x, y, cv=10) # Calculation of 10-fold CV predictions

# Calculation of RMSE and explained variances
RMSE = np.sqrt(mean_squared_error(y, yp))
RMSECV = np.sqrt(mean_squared_error(y, yp_cv))

Output:
[Figure: linear boston10fold.png (blue dots correspond to 10-fold CV)]
Method: Linear Regression
RMSE on the dataset: 4.6795
RMSE on 10-fold CV: 5.8819
Explained Variance Regression Score on the dataset : 0.7406
Explained Variance Regression 10-fold CV: 0.5902
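Because load_boston is no longer shipped with recent scikit-learn releases, the same evaluation can be sketched on a bundled dataset. The version below is an illustration under assumptions: it swaps in load_diabetes, so its numbers will not match the Boston figures reported above, and the variable names rmse, rmse_cv, ev and ev_cv are introduced here. It also shows the explained_variance_score calls that produce the last two reported quantities.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import explained_variance_score, mean_squared_error
from sklearn.model_selection import cross_val_predict

# Assumption: the diabetes dataset stands in for the Boston data
X, y = load_diabetes(return_X_y=True)

linreg = LinearRegression()
linreg.fit(X, y)
yp = linreg.predict(X)                           # in-sample predictions
yp_cv = cross_val_predict(linreg, X, y, cv=10)   # out-of-fold predictions, 10-fold CV

rmse = np.sqrt(mean_squared_error(y, yp))
rmse_cv = np.sqrt(mean_squared_error(y, yp_cv))
ev = explained_variance_score(y, yp)
ev_cv = explained_variance_score(y, yp_cv)

print("Method: Linear Regression")
print("RMSE on the dataset: %.4f" % rmse)
print("RMSE on 10-fold CV: %.4f" % rmse_cv)
print("Explained Variance Regression Score on the dataset: %.4f" % ev)
print("Explained Variance Regression 10-fold CV: %.4f" % ev_cv)
```

The cross-validated RMSE is the more honest estimate of error on unseen data, since each prediction comes from a model that did not see that sample during fitting.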




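# Split the targets into training/testing sets
# (diabetes and diabetes_X_train are assumed to be defined earlier: the loaded dataset and the training features)
diabetes_y_train = diabetes.target[:-20]
diabetes_y_test = diabetes.target[-20:]

# Create linear regression object
regr = linear_model.LinearRegression()
# Train the model using the training sets
regr.fit(diabetes_X_train, diabetes_y_train)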
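The fragment above omits the dataset loading and the feature split. A self-contained version might look like the sketch below. The names diabetes_X_train, diabetes_X_test and diabetes_y_pred, and the choice to use all ten features, are assumptions made here; the fragment itself only shows how the targets are split.

```python
from sklearn import datasets, linear_model

# Load the diabetes dataset bundled with scikit-learn
diabetes = datasets.load_diabetes()

# Split the features into training/testing sets (last 20 samples held out)
diabetes_X_train = diabetes.data[:-20]
diabetes_X_test = diabetes.data[-20:]

# Split the targets into training/testing sets
diabetes_y_train = diabetes.target[:-20]
diabetes_y_test = diabetes.target[-20:]

# Create linear regression object
regr = linear_model.LinearRegression()

# Train the model using the training sets
regr.fit(diabetes_X_train, diabetes_y_train)

# Predict the held-out samples
diabetes_y_pred = regr.predict(diabetes_X_test)

print("Coefficients:", regr.coef_)
```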


2017 e.

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y) # X, y: design matrix and response vector (arguments lost in the source; names assumed)

from sklearn.linear_model import LinearRegression

clf = LinearRegression()

clf.fit(X_train, y_train)

predicted = clf.predict(X_test)

expected = y_test

print("RMS: %s" % np.sqrt(np.mean((predicted - expected) ** 2)))

We can plot the error: predicted as a function of expected:
plt.scatter(expected, predicted)
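The train/test evaluation above can be run end to end as in the following sketch. The dataset is not identified in the surviving text, so synthetic data from make_regression is used here as a hypothetical stand-in; the test_size and random_state arguments and the variable rms are likewise assumptions added for reproducibility. Plotting is left out so that the block depends only on NumPy and scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data standing in for the unnamed dataset
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Hold out 25% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

clf = LinearRegression()
clf.fit(X_train, y_train)

predicted = clf.predict(X_test)
expected = y_test

# Root-mean-square error computed by hand, as in the text above
rms = np.sqrt(np.mean((predicted - expected) ** 2))
print("RMS: %s" % rms)

# The same quantity via scikit-learn's metric
print("RMS (via mean_squared_error): %s" % np.sqrt(mean_squared_error(expected, predicted)))
```

In a scatter of predicted against expected values, points close to the diagonal indicate small errors; the RMS summarizes the typical vertical distance from that diagonal.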