Simple linear regression:
There is only one continuous independent variable x, and the assumed relation between the independent variable and the dependent variable y is
y = a + bx.
Simple linear regression is an approach for predicting a response using a single feature. It is assumed that the two variables are linearly related, so we try to find a linear function that predicts the response value (y) as accurately as possible as a function of the feature, or independent variable (x).
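Before turning to the library, the parameters a and b can be computed in closed form by ordinary least squares: b = cov(x, y) / var(x) and a = mean(y) - b * mean(x). A minimal sketch (the toy x and y values here are illustrative only, not taken from the salary dataset):

```python
import numpy as np

# toy data that roughly follows y = 2 + 3x with a little noise
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 4.9, 8.2, 10.8, 14.1])

# least-squares estimates: slope b, then intercept a
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()

print(a, b)  # a is close to 2, b is close to 3
```

This is exactly what LinearRegression does under the hood for one feature; the fitted a and b appear later as `intercept_` and `coef_`.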
Let us start experimenting with simple linear regression.
Download the dataset:
Code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

## load the dataset
dataset = pd.read_csv("C:\\Users\\SR Laptop\\Desktop\\Linear regression\\Salary_dataset.csv")
X = dataset.iloc[:, 0:1].values
y = dataset.iloc[:, -1].values

## split the dataset into training and testing sets
from sklearn.model_selection import train_test_split
# test_size=0.7 holds out 70% of the rows for testing; a smaller test
# fraction (e.g. 0.2) is more common in practice
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.7, random_state=0)
print("features YOE: ", X_train)
print(" ")
print("Label Salary: ", y_train)
print(" ")
print("features YOE: ", X_test)
print(" ")
print("Label Salary: ", y_test)
print(" ")

## train the simple linear regression model on the training set
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)

## predict the test set results
y_pred = regressor.predict(X_test)

## visualizing the training set results
plt.scatter(X_train, y_train, color='r')
plt.plot(X_train, regressor.predict(X_train), color='b')
plt.title('salary vs experience (training set)')
plt.xlabel('years of experience')
plt.ylabel('salary')
plt.show()

## visualizing the test set results
plt.scatter(X_test, y_test, color='r')
plt.plot(X_train, regressor.predict(X_train), color='b')
plt.title('salary vs experience (testing set)')
plt.xlabel('years of experience')
plt.ylabel('salary')
plt.show()

## visualization with predicted values
plt.scatter(X_test, y_test, color='r')
plt.plot(X_test, y_pred, color='b')
plt.title('salary vs experience (predicted test values)')
plt.xlabel('years of experience')
plt.ylabel('salary')
plt.show()

## make a single prediction
single_prediction = regressor.predict([[12]])
print(single_prediction)

## print the model parameters
coefficient = regressor.coef_
print("coefficient is:", coefficient)
intercept = regressor.intercept_
print("intercept is:", intercept)

## manual calculation of a salary prediction (extra part)
# salary = intercept + coefficient * years of experience
manual_prediction = 33603.2285041225 + 2835.78327444 * 15
print("manual prediction by using coefficient and intercept: ", manual_prediction)

# auto prediction
auto_prediction = regressor.predict([[15]])
print("Auto regression is: ", auto_prediction)
Output:
features YOE: [[19]
[ 9]
[ 7]
[25]
[ 3]
[ 0]
[21]
[15]
[12]]
Label Salary: [ 93941.  57190.  54446. 105583.  43526.  39344.  98274.  67939.  56958.]
features YOE: [[ 2]
[28]
[13]
[10]
[26]
[24]
[27]
[11]
[17]
[22]
[ 5]
[16]
[ 8]
[14]
[23]
[20]
[ 1]
[29]
[ 6]
[ 4]
[18]]
Label Salary: [ 37732. 122392.  57082.  63219. 116970. 109432. 112636.  55795.  83089. 101303.  56643.  66030.  64446.  61112. 113813.  91739.  46206. 121873.  60151.  39892.  81364.]
[67632.62779741]
coefficient is: [2835.78327444]
intercept is: 33603.2285041225
manual prediction by using coefficient and intercept:  76139.9776207225
Auto regression is: [76139.97762073]
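The manual and automatic predictions agree because, for LinearRegression, predict is simply intercept_ + coef_ * x. This can be sanity-checked on synthetic data; the salary formula below (30000 base plus 3000 per year of experience) is made up purely for the check:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# noise-free synthetic data: salary = 30000 + 3000 * years of experience
X = np.arange(10).reshape(-1, 1)
y = 30000 + 3000 * X.ravel()

model = LinearRegression().fit(X, y)

# manual prediction from the fitted parameters
manual = model.intercept_ + model.coef_[0] * 15
# automatic prediction from the model
auto = model.predict([[15]])[0]

print(manual, auto)  # both close to 75000
```

With noise-free data the fit recovers the generating line almost exactly, so both routes give the same answer up to floating-point precision.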