Multiple
Linear Regressions:-
Multiple linear regression (MLR), also known simply
as multiple regression, is a statistical technique that uses several
explanatory variables to predict the outcome of a response variable. The goal
of multiple linear regression is to model the linear relationship between the
explanatory (independent) variables and response (dependent) variables. In
essence, multiple regression is the extension of ordinary least-squares (OLS)
regression because it involves more than one explanatory variable.
Multiple linear regression (MLR), also known simply
as multiple regression, is a statistical technique that uses several
explanatory variables to predict the outcome of a response variable.
Multiple regression is an extension of linear (OLS)
regression that uses just one explanatory variable.
What Multiple Linear Regression Can Tell
You
Simple linear
regression is a function that allows an analyst or statistician to make
predictions about one variable based on the information that is known about
another variable. Linear regression can only be used when one has two
continuous variables an independent
variable and a dependent variable. The independent variable is the parameter
that is used to calculate the dependent variable or outcome. A multiple
regression model extends to several explanatory variables.
The multiple regression model is based on the
following assumptions:
There is a linear relationship between the dependent variables and the independent variables The independent variables are not too highly correlated with each other yi observations are selected independently and randomly from the population.
code:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
"""## Importing the dataset"""
## dataset = pd.read_csv(r'D:\Regression\50_Startups.csv')
dataset = pd.read_csv("C:/Users/SR Laptop/Desktop/multiple linear regression/50_Startups.csv")
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values
print(X)
"""## Encoding categorical data"""
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [3])],
remainder='passthrough')
X = np.array(ct.fit_transform(X))
print(X)
"""## Splitting the dataset into the Training set and Test set"""
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2,
random_state = 0)
"""## Training the Multiple Linear Regression model on the Training set"""
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)
"""## Predicting the Test set results"""
y_pred = regressor.predict(X_test)
np.set_printoptions(precision=2)
print(np.concatenate((y_pred.reshape(len(y_pred),1),
y_test.reshape(len(y_test),1)),1))
"""## Predicting the Test set results in other way"""
y_pred = regressor.predict(X_test)
np.set_printoptions(precision=2)
# Print header
print("Predicted Profit\tActual Profit")
print("----------------------------------")
# Loop to print each pair
for pred, actual in zip(y_pred, y_test):
print(f"{pred:,.2f}\t\t{actual:,.2f}")
Output:
[[165349.2 136897.8 471784.1 'New York'] [162597.7 151377.59 443898.53 'California'] [153441.51 101145.55 407934.54 'Florida'] [144372.41 118671.85 383199.62 'New York'] [142107.34 91391.77 366168.42 'Florida'] [131876.9 99814.71 362861.36 'New York'] [134615.46 147198.87 127716.82 'California'] [130298.13 145530.06 323876.68 'Florida'] [120542.52 148718.95 311613.29 'New York'] [123334.88 108679.17 304981.62 'California'] [101913.08 110594.11 229160.95 'Florida'] [100671.96 91790.61 249744.55 'California'] [93863.75 127320.38 249839.44 'Florida'] [91992.39 135495.07 252664.93 'California'] [119943.24 156547.42 256512.92 'Florida'] [114523.61 122616.84 261776.23 'New York'] [78013.11 121597.55 264346.06 'California'] [94657.16 145077.58 282574.31 'New York'] [91749.16 114175.79 294919.57 'Florida'] [86419.7 153514.11 0.0 'New York'] [76253.86 113867.3 298664.47 'California'] [78389.47 153773.43 299737.29 'New York'] [73994.56 122782.75 303319.26 'Florida'] [67532.53 105751.03 304768.73 'Florida'] [77044.01 99281.34 140574.81 'New York'] [64664.71 139553.16 137962.62 'California'] [75328.87 144135.98 134050.07 'Florida'] [72107.6 127864.55 353183.81 'New York'] [66051.52 182645.56 118148.2 'Florida'] [65605.48 153032.06 107138.38 'New York'] [61994.48 115641.28 91131.24 'Florida'] [61136.38 152701.92 88218.23 'New York'] [63408.86 129219.61 46085.25 'California'] [55493.95 103057.49 214634.81 'Florida'] [46426.07 157693.92 210797.67 'California'] [46014.02 85047.44 205517.64 'New York'] [28663.76 127056.21 201126.82 'Florida'] [44069.95 51283.14 197029.42 'California'] [20229.59 65947.93 185265.1 'New York'] [38558.51 82982.09 174999.3 'California'] [28754.33 118546.05 172795.67 'California'] [27892.92 84710.77 164470.71 'Florida'] [23640.93 96189.63 148001.11 'California'] [15505.73 127382.3 35534.17 'New York'] [22177.74 154806.14 28334.72 'California'] [1000.23 124153.04 1903.93 'New York'] [1315.46 115816.21 297114.46 'Florida'] [0.0 135426.92 0.0 'California'] [542.05 51743.15 0.0 'New York'] [0.0 116983.8 45173.06 'California']] [[0.0 0.0 1.0 165349.2 136897.8 471784.1] [1.0 0.0 0.0 162597.7 151377.59 443898.53] [0.0 1.0 0.0 153441.51 101145.55 407934.54] [0.0 0.0 1.0 144372.41 118671.85 383199.62] [0.0 1.0 0.0 142107.34 91391.77 366168.42] [0.0 0.0 1.0 131876.9 99814.71 362861.36] [1.0 0.0 0.0 134615.46 147198.87 127716.82] [0.0 1.0 0.0 130298.13 145530.06 323876.68] [0.0 0.0 1.0 120542.52 148718.95 311613.29] [1.0 0.0 0.0 123334.88 108679.17 304981.62] [0.0 1.0 0.0 101913.08 110594.11 229160.95] [1.0 0.0 0.0 100671.96 91790.61 249744.55] [0.0 1.0 0.0 93863.75 127320.38 249839.44] [1.0 0.0 0.0 91992.39 135495.07 252664.93] [0.0 1.0 0.0 119943.24 156547.42 256512.92] [0.0 0.0 1.0 114523.61 122616.84 261776.23] [1.0 0.0 0.0 78013.11 121597.55 264346.06] [0.0 0.0 1.0 94657.16 145077.58 282574.31] [0.0 1.0 0.0 91749.16 114175.79 294919.57] [0.0 0.0 1.0 86419.7 153514.11 0.0] [1.0 0.0 0.0 76253.86 113867.3 298664.47] [0.0 0.0 1.0 78389.47 153773.43 299737.29] [0.0 1.0 0.0 73994.56 122782.75 303319.26] [0.0 1.0 0.0 67532.53 105751.03 304768.73] [0.0 0.0 1.0 77044.01 99281.34 140574.81] [1.0 0.0 0.0 64664.71 139553.16 137962.62] [0.0 1.0 0.0 75328.87 144135.98 134050.07] [0.0 0.0 1.0 72107.6 127864.55 353183.81] [0.0 1.0 0.0 66051.52 182645.56 118148.2] [0.0 0.0 1.0 65605.48 153032.06 107138.38] [0.0 1.0 0.0 61994.48 115641.28 91131.24] [0.0 0.0 1.0 61136.38 152701.92 88218.23] [1.0 0.0 0.0 63408.86 129219.61 46085.25] [0.0 1.0 0.0 55493.95 103057.49 214634.81] [1.0 0.0 0.0 46426.07 157693.92 210797.67] [0.0 0.0 1.0 46014.02 85047.44 205517.64] [0.0 1.0 0.0 28663.76 127056.21 201126.82] [1.0 0.0 0.0 44069.95 51283.14 197029.42] [0.0 0.0 1.0 20229.59 65947.93 185265.1] [1.0 0.0 0.0 38558.51 82982.09 174999.3] [1.0 0.0 0.0 28754.33 118546.05 172795.67] [0.0 1.0 0.0 27892.92 84710.77 164470.71] [1.0 0.0 0.0 23640.93 96189.63 148001.11] [0.0 0.0 1.0 15505.73 127382.3 35534.17] [1.0 0.0 0.0 22177.74 154806.14 28334.72] [0.0 0.0 1.0 1000.23 124153.04 1903.93] [0.0 1.0 0.0 1315.46 115816.21 297114.46] [1.0 0.0 0.0 0.0 135426.92 0.0] [0.0 0.0 1.0 542.05 51743.15 0.0] [1.0 0.0 0.0 0.0 116983.8 45173.06]] [[103015.2 103282.38] [132582.28 144259.4 ] [132447.74 146121.95] [ 71976.1 77798.83] [178537.48 191050.39] [116161.24 105008.31] [ 67851.69 81229.06] [ 98791.73 97483.56] [113969.44 110352.25] [167921.07 166187.94]] Predicted Profit Actual Profit ---------------------------------- 103,015.20 103,282.38 132,582.28 144,259.40 132,447.74 146,121.95 71,976.10 77,798.83 178,537.48 191,050.39 116,161.24 105,008.31 67,851.69 81,229.06 98,791.73 97,483.56 113,969.44 110,352.25 167,921.07 166,187.94
0 Comments