Previously, linear regression was briefly introduced using the traditional method in order to understand the mathematics behind it. However, Python has machine learning libraries (installed separately, not built in) that make the code easier and shorter. In this second part on linear regression, you will learn how to use one of them.
In the previous tutorial, you learned how to build a linear regression using matrix multiplication (see Python Machine Learning: Linear Regression (I)). In this tutorial, a Python machine learning library called scikit-learn will be used for the same purpose.
Once we have imported the data from the text file, let's set our x- and y-values.
#Importing library
import numpy as np
#Importing text file
data = np.loadtxt('points.txt', skiprows=2, dtype=float)
print(data)
#Setting x values
x = data[:,0]
print(x)
#Setting y values
y = data[:,1]
print(y)
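From the figure above (an extract of the whole data), we can see that $x$ and $y$ are 1D arrays. scikit-learn expects the input features as a 2D array with one row per sample and one column per feature, so the 1D array of x-values must be converted into a 2D column. The target $y$ may stay 1D, but here it is reshaped as well for consistency. For this, the NumPy method reshape(-1,1) is used.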
#Reshaping the array into a vector-column
x2 = data[:,0].reshape(-1,1)
print(x2)
#Reshaping the array into a vector-column
y2 = data[:,1].reshape(-1,1)
print(y2)
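To see what reshape(-1,1) does, here is a minimal sketch on a small made-up array (the values and the names x_demo and x_demo_2d are illustrative, not taken from points.txt). The -1 tells NumPy to infer the number of rows from the length of the array.

```python
import numpy as np

# Hypothetical 1D array standing in for data[:,0]
x_demo = np.array([1.0, 2.0, 3.0, 4.0])
print(x_demo.shape)     # shape of the 1D array

# -1 asks NumPy to infer the number of rows; 1 fixes a single column
x_demo_2d = x_demo.reshape(-1, 1)
print(x_demo_2d.shape)  # shape of the 2D column
```

The values are unchanged; only the shape goes from (4,) to (4, 1).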
Now, we are able to build our linear regression model using the LinearRegression class from the scikit-learn library. Do not forget to import it.
#Importing library
from sklearn.linear_model import LinearRegression
#Building the linear regression model
linear_regression = LinearRegression()
linear_model = linear_regression.fit(x2,y2)
As explained in the previous tutorial, the linear relationship can be written as \[y = c_{0} + c_{1} x\], where $c_0$ is the intercept with the y-axis and $c_1$ is the slope of the line. These two coefficients can be found more easily and quickly thanks to the method LinearRegression().fit(). To get both coefficients, the attributes intercept_ and coef_ of the fitted model are needed.
#Getting the intercept with y-axis
intercept_yaxis = linear_model.intercept_
print(intercept_yaxis)
#Getting the coefficient
slope = linear_model.coef_
print(slope)
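Because the target was reshaped into a 2D column, intercept_ and coef_ come back as small arrays rather than plain numbers. The following sketch, which uses made-up points lying exactly on $y = 1 + 2x$ (not the tutorial's data) and hypothetical names such as model_demo, shows their shapes and one way to pull out plain floats.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up points lying exactly on y = 1 + 2x (not the tutorial's data)
x_demo = np.array([0.0, 1.0, 2.0, 3.0]).reshape(-1, 1)
y_demo = (1.0 + 2.0 * x_demo[:, 0]).reshape(-1, 1)

model_demo = LinearRegression().fit(x_demo, y_demo)

# With a 2D target, intercept_ has shape (1,) and coef_ has shape (1, 1)
print(model_demo.intercept_.shape, model_demo.coef_.shape)

# Extract plain Python floats for reporting
c0 = float(model_demo.intercept_[0])
c1 = float(model_demo.coef_[0, 0])
print(round(c0, 6), round(c1, 6))
```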
In contrast to the matrix multiplication approach, where the coefficient matrix is an array of two elements, the two values are now returned in two separate arrays of one element each. Comparing both approaches, intercept and slope should agree (up to floating-point rounding). The coefficient matrix from the previous tutorial was the following:
As seen from both pictures, both coefficients (intercept and slope) are the same. This means we did a great job making the linear regression! Finally, let's establish the linear relationship and plot it.
#Importing library
import matplotlib.pyplot as plt
#Establishing the linear relationship
y_lineal2 = slope*x2 + intercept_yaxis
print(y_lineal2)
#Plotting
#Initially given x- and y-points
plt.scatter(x,y)
#Linear regression points
plt.plot(x2, y_lineal2, color='red')
#Naming the graph, x- and y-axis
plt.title('scikit-learn library')
plt.xlabel('x')
plt.ylabel('y')
plt.show()
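Instead of computing the line by hand from the slope and intercept, the fitted model's predict() method can be used. The sketch below uses made-up points and hypothetical variable names (not the tutorial's data) and checks that both routes give the same values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up, roughly linear points (not the tutorial's data)
x_demo = np.array([0.0, 1.0, 2.0, 3.0, 4.0]).reshape(-1, 1)
y_demo = np.array([1.1, 2.9, 5.2, 7.1, 8.8]).reshape(-1, 1)

model_demo = LinearRegression().fit(x_demo, y_demo)

# Manual line, as in the tutorial: slope * x + intercept
y_manual = model_demo.coef_ * x_demo + model_demo.intercept_

# Same line obtained from the model itself
y_predicted = model_demo.predict(x_demo)

print(np.allclose(y_manual, y_predicted))
```

The result of predict() can be passed to plt.plot() in the same way as y_lineal2.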
The plot we got in the previous tutorial was the following:
As seen from both plots, they are the same! The final Python code looks like the following:
#Importing libraries
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
#Importing text file
data = np.loadtxt('points.txt', skiprows=2, dtype=float)
print(data)
#Setting x values
x = data[:,0]
print(x)
#Setting y values
y = data[:,1]
print(y)
#Reshaping the array into a vector-column
x2 = data[:,0].reshape(-1,1)
print(x2)
#Reshaping the array into a vector-column
y2 = data[:,1].reshape(-1,1)
print(y2)
#Building the linear regression model
linear_regression = LinearRegression()
linear_model = linear_regression.fit(x2,y2)
#Getting the intercept with y-axis
intercept_yaxis = linear_model.intercept_
print(intercept_yaxis)
#Getting the coefficient
slope = linear_model.coef_
print(slope)
#Establishing the linear relationship
y_lineal2 = slope*x2 + intercept_yaxis
print(y_lineal2)
#Plotting
#Initially given x- and y-points
plt.scatter(x,y)
#Linear regression points
plt.plot(x2, y_lineal2, color='red')
#Naming the graph, x- and y-axis
plt.title('scikit-learn library')
plt.xlabel('x')
plt.ylabel('y')
plt.show()
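As a cross-check of the claim that both approaches give the same coefficients, here is a minimal sketch that assumes the matrix approach of the previous tutorial amounts to solving the normal equations $(A^T A)c = A^T y$. The data and all names (x_demo, A, coeffs, and so on) are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up, roughly linear points (not the tutorial's data)
x_demo = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_demo = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Design matrix: a column of ones (intercept) next to the x-values
A = np.column_stack([np.ones_like(x_demo), x_demo])

# Solve the normal equations (A^T A) c = A^T y for c = [c0, c1]
coeffs = np.linalg.solve(A.T @ A, A.T @ y_demo)

# Same fit with scikit-learn (1D target, so intercept_ is a scalar)
model_demo = LinearRegression().fit(x_demo.reshape(-1, 1), y_demo)
sk_coeffs = np.array([model_demo.intercept_, model_demo.coef_[0]])

print(np.allclose(coeffs, sk_coeffs))
```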
Congratulations! You just made your first Machine Learning regression. In the next tutorial, polynomial regression will be explained. To download the complete code and the text file containing the data used in this tutorial, please click here.