Master Linear Regression with NumPy: Step-by-Step Guide to Building and Optimizing Your First Model!


Linear regression is a simple yet powerful method in machine learning used to model the relationship between a dependent variable (the target) and one or more independent variables (the predictors). In this article, we will implement simple linear regression using NumPy, the fundamental library for scientific computing in Python. We will cover the key equations needed for this implementation: the model, the cost function, the gradient, and gradient descent.

1. Linear Regression Model
The linear regression model can be represented by the following equation:

y = Xθ

where:

  • X is the matrix of predictors (one row per training example).
  • θ is the vector of parameters (coefficients).
  • y is the vector of predicted values.
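
To make the matrix form concrete, here is a tiny sketch with made-up numbers: three data points, one feature, and a bias column of 1s for the intercept (all values are illustrative only).

import numpy as np

# Toy example: 3 data points, 1 feature, plus a bias column of 1s
X = np.array([[1.0, 1.0],
              [2.0, 1.0],
              [3.0, 1.0]])   # column 0: feature values, column 1: bias
theta = np.array([[3.0],     # slope
                  [4.0]])    # intercept

print(X.dot(theta))          # y = Xθ -> [[7.], [10.], [13.]]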

2. Cost Function
The cost function in linear regression is typically the mean squared error: the average of the squared differences between the values predicted by the model and the actual values, conventionally scaled by 1/2 so that its gradient comes out cleaner.

J(θ) = 1/(2m) · Σ (Xθ − y)²

where m is the number of training examples and the sum runs over all of them.
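
Continuing the toy sketch from above (with targets chosen by hand purely for illustration), the cost is a few lines of NumPy:

# Hypothetical targets: this data actually follows y = 3x + 3, so our θ,
# with its intercept of 4, overshoots by 1 everywhere
y = np.array([[6.0], [9.0], [12.0]])

m = len(y)                               # number of examples
errors = X.dot(theta) - y                # Xθ − y -> [[1.], [1.], [1.]]
print(1 / (2 * m) * np.sum(errors**2))   # 0.5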

3. Gradient
The gradient of the cost function with respect to the parameters θ is needed to minimize the cost function with gradient descent. Differentiating J(θ) with respect to θ gives:

∇J(θ) = (1/m) · Xᵀ(Xθ − y)
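
On the same toy data as above (still illustrative numbers), the gradient is a single matrix product:

gradient = 1 / m * X.T.dot(X.dot(theta) - y)   # (1/m) Xᵀ(Xθ − y)
print(gradient)                                # [[2.], [1.]]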

4. Gradient Descent
Gradient descent is an iterative optimization method used to minimize the cost function. The parameter update equation is:

θ := θ − α · ∇J(θ)

where α is the learning rate, which controls the size of each step.
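
Applying a single update step to the toy example (with an arbitrarily chosen learning rate) nudges θ toward the optimum and lowers the cost:

learning_rate = 0.1                        # arbitrary illustrative value
theta = theta - learning_rate * gradient
print(theta)                               # [[2.8], [3.9]]
# The cost at this new θ drops from 0.5 to about 0.138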

5. Implementation with NumPy
Importing libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Generate example data

We draw m = 100 points from y = 4 + 3x + Gaussian noise, so a well-fitted model should recover a slope close to 3 and an intercept close to 4.

np.random.seed(42)
m = 100
x = 2 * np.random.rand(m, 1)
y = 4 + 3 * x + np.random.randn(m, 1)
plt.figure(figsize=(12, 8))
plt.scatter(x, y)

[Figure: scatter plot of the generated data]

Add a bias term (column of 1s) to X

X = np.hstack((x, np.ones(x.shape)))
X.shape

(100, 2)
Initialize the parameters θ to random values

theta = np.random.randn(2, 1)

Model function

def model(X, theta):
    return X.dot(theta)

plt.scatter(x[:, 0], y)
plt.scatter(x[:, 0], model(X, theta), c='r')

[Figure: the data with the initial (random) model's predictions in red]
Cost function

def cost_function(X, y, theta):
    m = len(y)
    return 1/(2*m) * np.sum((model(X, theta) - y)**2)

cost_function(X, y, theta)

16.069293038191518

Gradient and gradient descent

def grad(X, y, theta):
    m = len(y)
    return 1/m * X.T.dot(model(X, theta) - y)

def gradient_descent(X, y, theta, learning_rate, n_iter):
    cost_history = np.zeros(n_iter)
    for i in range(0, n_iter):
        theta = theta - learning_rate * grad(X, y, theta)
        cost_history[i] = cost_function(X, y, theta)

    return theta, cost_history
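
# Optional sanity check (not part of the original walkthrough): the analytic
# gradient should match a finite-difference approximation of the cost function.
def numerical_grad(X, y, theta, eps=1e-6):
    approx = np.zeros_like(theta)
    for j in range(theta.shape[0]):
        theta_plus, theta_minus = theta.copy(), theta.copy()
        theta_plus[j] += eps
        theta_minus[j] -= eps
        approx[j] = (cost_function(X, y, theta_plus)
                     - cost_function(X, y, theta_minus)) / (2 * eps)
    return approx

print(np.allclose(grad(X, y, theta), numerical_grad(X, y, theta)))  # True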
n_iterations = 1000
learning_rate = 0.01

final_theta, cost_history = gradient_descent(X, y, theta, learning_rate, n_iterations)

final_theta

array([[2.79981142],
       [4.18146098]])
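
As an extra check (not in the original post), gradient descent should land close to the closed-form least-squares solution given by the normal equation θ = (XᵀX)⁻¹Xᵀy:

# Closed-form solution via the normal equation, for comparison
theta_exact = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)
print(theta_exact)   # should be close to final_theta

If the two differ noticeably, increasing n_iterations or the learning rate closes the gap.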

predict = model(X, final_theta)

plt.scatter(x[:, 0], y)
plt.scatter(x[:, 0], predict, c='r')

[Figure: the data with the trained model's predictions in red]

Learning curve

plt.plot(range(n_iterations), cost_history)

[Figure: the cost decreasing over the iterations]

This walkthrough demonstrates how the fundamental concepts of linear regression, namely the model, the cost function, the gradient, and gradient descent, can be implemented with NumPy. This basic understanding is essential for advancing to more complex machine learning models.
Feel free to explore further by adjusting hyperparameters, adding more features, or trying other optimization techniques to improve your linear regression model.
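
For example, here is a minimal sketch (learning-rate values picked arbitrarily) for comparing how different learning rates affect convergence:

# Sketch: plot the learning curve for a few learning rates
for lr in (0.001, 0.01, 0.1):
    _, history = gradient_descent(X, y, theta, lr, n_iterations)
    plt.plot(range(n_iterations), history, label=f"lr = {lr}")
plt.xlabel("Iteration")
plt.ylabel("Cost")
plt.legend()
plt.show()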
