Simple linear regression:
There is only one continuous independent variable x, and the assumed relation between the independent variable and the dependent variable y is
y = a + bx,
where a is the intercept and b is the slope (the coefficient of x).
Simple linear regression is an approach for predicting a response using a single feature. It is assumed that the two variables are linearly related. Hence, we try to find a linear function that predicts the response value (y) as accurately as possible as a function of the feature or independent variable (x).
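"As accurately as possible" is usually taken to mean ordinary least squares: a and b are chosen to minimise the sum of squared differences between the observed and predicted y. A minimal sketch of the closed-form solution, assuming NumPy is available; `fit_line` is a hypothetical helper name and the four data points are made up for illustration:

```python
import numpy as np

def fit_line(x, y):
    """Return (a, b) that minimise the squared error of y = a + b*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # slope: co-variation of x and y divided by the variation of x
    b = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # intercept: the fitted line passes through the point of means
    a = y_mean - b * x_mean
    return float(a), float(b)

# made-up points that lie exactly on y = 1 + 2x
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print("intercept a:", a)
print("slope b:", b)
```

Because the sample points lie exactly on a line, the sketch recovers a = 1 and b = 2. With real data the points scatter around the line and a and b give the best compromise.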
Let us start to experiment with simple linear regression:
Download the dataset:
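Code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
## load the dataset
dataset = pd.read_csv("C:\\Users\\SR Laptop\\Desktop\\Linear regression\\Salary_dataset.csv")
## feature: first column, kept 2-D as scikit-learn expects
## (if the CSV begins with an index column, years of experience is column 1, i.e. iloc[:, 1:2])
X = dataset.iloc[:, 0:1].values
## label: last column (salary)
y = dataset.iloc[:, -1].values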
## splitting the dataset into training and test sets
from sklearn.model_selection import train_test_split
## test_size=0.7 keeps 70% of the rows for testing and 30% for training
## (more commonly the larger share is used for training)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.7, random_state=0)
print("features YOE:  ", X_train)
print("         ")
print("Label Salary:  ", y_train)
print("         ")
print("features YOE:  ", X_test)
print("         ")
print("Label Salary:  ", y_test)
print("         ")
## train the simple linear regression model on the training set
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)
print("features YOE:  ", X_train)
print("         ")
print("Label Salary:  ", y_train)
print("         ")
print("features YOE:  ", X_test)
print("         ")
print("Label Salary:  ", y_test)
print("         ")
## predict the test set results
y_pred = regressor.predict(X_test)
## visualizing the training set results
plt.scatter(X_train, y_train, color='r')
plt.plot(X_train, regressor.predict(X_train), color='b')
plt.title('Salary vs experience (training set)')
plt.xlabel('Years of experience')
plt.ylabel('Salary')
plt.show()
## visualizing the test set results (the line is the one fitted on the training set)
plt.scatter(X_test, y_test, color='r')
plt.plot(X_train, regressor.predict(X_train), color='b')
plt.title('Salary vs experience (test set)')
plt.xlabel('Years of experience')
plt.ylabel('Salary')
plt.show()
## visualization with the predicted values
plt.scatter(X_test, y_test, color='r')
plt.plot(X_test, y_pred, color='b')
plt.title('Salary vs experience (predicted test values)')
plt.xlabel('Years of experience')
plt.ylabel('Salary')
plt.show()
## make a single prediction
single_prediction = regressor.predict([[12]])
print(single_prediction)
## print the model parameters
coefficient = regressor.coef_
print("coefficient is:", coefficient)
intercept = regressor.intercept_
print("intercept is:", intercept)
## manual calculation of the salary prediction (extra part): intercept + coefficient * x
manual_prediction = intercept + coefficient * 15
print("manual prediction by using coefficient and intercept: ", manual_prediction)
## auto prediction
auto_prediction = regressor.predict([[15]])
print("Auto regression is: ", auto_prediction)
Output:
features YOE:   [[19]
 [ 9]
 [ 7]
 [25]
 [ 3]
 [ 0]
 [21]
 [15]
 [12]]

Label Salary:   [ 93941.  57190.  54446. 105583.  43526.  39344.  98274.  67939.  56958.]

features YOE:   [[ 2]
 [28]
 [13]
 [10]
 [26]
 [24]
 [27]
 [11]
 [17]
 [22]
 [ 5]
 [16]
 [ 8]
 [14]
 [23]
 [20]
 [ 1]
 [29]
 [ 6]
 [ 4]
 [18]]

Label Salary:   [ 37732. 122392.  57082.  63219. 116970. 109432. 112636.  55795.  83089.
 101303.  56643.  66030.  64446.  61112. 113813.  91739.  46206. 121873.
  60151.  39892.  81364.]

features YOE:   [[19]
 [ 9]
 [ 7]
 [25]
 [ 3]
 [ 0]
 [21]
 [15]
 [12]]

Label Salary:   [ 93941.  57190.  54446. 105583.  43526.  39344.  98274.  67939.  56958.]

features YOE:   [[ 2]
 [28]
 [13]
 [10]
 [26]
 [24]
 [27]
 [11]
 [17]
 [22]
 [ 5]
 [16]
 [ 8]
 [14]
 [23]
 [20]
 [ 1]
 [29]
 [ 6]
 [ 4]
 [18]]

Label Salary:   [ 37732. 122392.  57082.  63219. 116970. 109432. 112636.  55795.  83089.
 101303.  56643.  66030.  64446.  61112. 113813.  91739.  46206. 121873.
  60151.  39892.  81364.]

[67632.62779741]
coefficient is: [2835.78327444]
intercept is: 33603.2285041225
manual prediction by using coefficient and intercept:  [76139.97762073]
Auto regression is:  [76139.97762073]
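The fitted line can also be checked by hand: a prediction is the intercept plus the coefficient times x, so the intercept is added once and only the coefficient is multiplied by x. A minimal sketch, with the two values copied from the output above; `predict_salary` is a hypothetical helper name and is not part of scikit-learn:

```python
# values copied from the printed output (the coefficient is rounded to 8 decimals there)
INTERCEPT = 33603.2285041225
COEFFICIENT = 2835.78327444

def predict_salary(x):
    """Evaluate y = a + b*x with a = INTERCEPT and b = COEFFICIENT."""
    return INTERCEPT + COEFFICIENT * x

print(round(predict_salary(12), 2))  # 67632.63
print(round(predict_salary(15), 2))  # 76139.98
```

Both values agree with what regressor.predict returned for 12 and 15, up to the rounding of the printed coefficient.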
 



 
 