📌

For more content check out the Circuit of Knowledge.

Introduction:

The Gauss-Newton method is an iterative optimization algorithm used to solve non-linear least squares problems. It is particularly useful for curve fitting, where the goal is to find the best-fit parameters of a model function to a given set of data points. In this tutorial, we will explore the Gauss-Newton method, its mathematical derivation, and its implementation in Python with and without using optimization libraries.

Problem Statement:

Let's consider a real-life problem of predicting the temperature of a city based on the day of the year. We have collected temperature data for various days throughout the year and want to fit a curve to this data to make predictions. The temperature model can be represented as a sine function:

$T(d) = a \sin(b (d - c)) + e$

Where:

$T(d)$ is the temperature on day $d$
$a$ is the amplitude of the sine function
$b$ is the frequency of the sine function
$c$ is the phase shift of the sine function
$e$ is the vertical shift of the sine function

Our goal is to find the optimal values of the parameters $a$ , $b$ , $c$ , and $e$ that minimize the sum of squared residuals between the observed temperatures and the predicted temperatures.

Mathematical Derivation:

The Gauss-Newton method is based on the idea of linearizing the non-linear model function around the current estimate of parameters and iteratively updating the parameters to minimize the sum of squared residuals.

Let's denote the residual function as:

$r_i = T_i - T(d_i, \mathbf{p})$

where $T_i$ is the observed temperature on day $d_i$ , $T(d_i, \mathbf{p})$ is the predicted temperature using the current parameter vector $\mathbf{p} = [a, b, c, e]^T$ , and $r_i$ is the residual.

The sum of squared residuals is given by:

$S(\mathbf{p}) = \sum_{i=1}^{n} r_i^2 = \sum_{i=1}^{n} (T_i - T(d_i, \mathbf{p}))^2$

To minimize $S(\mathbf{p})$ , the Gauss-Newton method linearizes the model function using a first-order Taylor expansion around the current parameter estimate $\mathbf{p}_k$ :

$T(d_i, \mathbf{p}) \approx T(d_i, \mathbf{p}_k) + \mathbf{J}_i(\mathbf{p}_k)(\mathbf{p} - \mathbf{p}_k)$

where $\mathbf{J}_i(\mathbf{p}_k)$ is the Jacobian matrix evaluated at $\mathbf{p}_k$ .

The Jacobian matrix for the sine function model is:

$\mathbf{J}_i(\mathbf{p}_k) = \begin{bmatrix} \sin(b_k(d_i - c_k)) & a_k(d_i - c_k)\cos(b_k(d_i - c_k)) & -a_kb_k\cos(b_k(d_i - c_k)) & 1 \end{bmatrix}$

By substituting the linearized model into the sum of squared residuals and setting the derivative with respect to $\mathbf{p}$ to zero, we obtain the normal equations:

$\mathbf{J}^T\mathbf{J}\Delta\mathbf{p} = \mathbf{J}^T\mathbf{r}$

where $\mathbf{J}$ is the Jacobian matrix stacked vertically for all data points, $\mathbf{r}$ is the vector of residuals, and $\Delta\mathbf{p}$ is the update to the parameter vector.

The parameter update is then calculated as:

$\Delta\mathbf{p} = (\mathbf{J}^T\mathbf{J})^{-1}\mathbf{J}^T\mathbf{r}$

The Gauss-Newton method iteratively updates the parameters using:

$\mathbf{p}_{k+1} = \mathbf{p}_k + \Delta\mathbf{p}$

until convergence or a maximum number of iterations is reached.

Python Implementation without Libraries:

Here's the Python code that implements the Gauss-Newton method from scratch:

import numpy as np
import matplotlib.pyplot as plt

# Define the sine function
def sine_func(params, d):
    a, b, c, e = params
    return a * np.sin(b * (d - c)) + e

# Define the Jacobian matrix
def jacobian(params, d):
    a, b, c, e = params
    J = np.zeros((len(d), 4))
    J[:, 0] = np.sin(b * (d - c))
    J[:, 1] = a * (d - c) * np.cos(b * (d - c))
    J[:, 2] = -a * b * np.cos(b * (d - c))
    J[:, 3] = 1
    return J

# Gauss-Newton method
def gauss_newton(params, d, temp, max_iter=100, tol=1e-6):
    for _ in range(max_iter):
        residuals = temp - sine_func(params, d)
        J = jacobian(params, d)
        delta = np.linalg.pinv(J.T @ J) @ J.T @ residuals
        params += delta
        if np.linalg.norm(delta) < tol:
            break
    return params

# Example data
days = np.array([0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300, 330])
temperatures = np.array([5, 10, 20, 25, 30, 35, 40, 35, 25, 20, 10, 5])

# Initial guess for parameters
initial_params = np.array([20, 0.02, 90, 20])

# Perform Gauss-Newton optimization
optimized_params = gauss_newton(initial_params, days, temperatures)

# Generate points for the optimized curve
days_plot = np.linspace(0, 365, 365)
temperatures_plot = sine_func(optimized_params, days_plot)

# Plot the data points and optimized curve
plt.figure(figsize=(10, 6))
plt.scatter(days, temperatures, color='red', label='Data Points')
plt.plot(days_plot, temperatures_plot, color='blue', label='Optimized Curve')
plt.xlabel('Day of the Year')
plt.ylabel('Temperature')
plt.legend()
plt.show()

print("Optimized Parameters:")
print("a =", optimized_params[0])
print("b =", optimized_params[1])
print("c =", optimized_params[2])
print("e =", optimized_params[3])

Optimized Parameters:
a = 17.214381454450546
b = 0.01595915001087644
c = 69.05945422111704
e = 20.060318747383874

Python Implementation using Libraries:

Here's the Python code that utilizes the scipy.optimize library to perform the Gauss-Newton optimization:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import least_squares

# Define the sine function
def sine_func(params, d):
    a, b, c, e = params
    return a * np.sin(b * (d - c)) + e

# Define the residual function
def residuals(params, d, temp):
    return temp - sine_func(params, d)

# Example data
days = np.array([0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300, 330])
temperatures = np.array([5, 10, 20, 25, 30, 35, 40, 35, 25, 20, 10, 5])

# Initial guess for parameters
initial_params = np.array([20, 0.02, 90, 20])

# Perform Gauss-Newton optimization using least_squares (with Levenberg Marquardt Regularization)
result = least_squares(residuals, initial_params, args=(days, temperatures), method='lm')

# Extract the optimized parameters
optimized_params = result.x

# Generate points for the optimized curve
days_plot = np.linspace(0, 365, 365)
temperatures_plot = sine_func(optimized_params, days_plot)

# Plot the data points and optimized curve
plt.figure(figsize=(10, 6))
plt.scatter(days, temperatures, color='red', label='Data Points')
plt.plot(days_plot, temperatures_plot, color='blue', label='Optimized Curve')
plt.xlabel('Day of the Year')
plt.ylabel('Temperature')
plt.legend()
plt.show()

print("Optimized Parameters:")
print("a =", optimized_params[0])
print("b =", optimized_params[1])
print("c =", optimized_params[2])
print("e =", optimized_params[3])

Optimized Parameters:
a = 17.214372262870874
b = 0.015959161056508728
c = 69.0595361694195
e = 20.060332299073693

The least_squares function from scipy.optimize internally uses the Levenberg-Marquardt algorithm, which is a variant of the Gauss-Newton method. It automatically handles the calculation of the Jacobian matrix and the parameter updates.

Another example:

In this example, we will use the Gauss-Newton method to fit a nonlinear function to a given dataset.

Problem Statement:

Suppose we have a dataset containing the population of a city at different years. The relationship between the population and time is known to follow a non-linear logistic growth model:

$P(t) = \frac{K}{1 + Ae^{-rt}}$

where $P(t)$ is the population at time $t$ , $K$ is the carrying capacity (maximum population), $A$ is a constant related to the initial population, and $r$ is the growth rate. Our goal is to estimate the parameters $K$ , $A$ , and $r$ that best fit the given data using the Gauss-Newton method.

Mathematical Derivation:

The Gauss-Newton method aims to minimize the sum of squared residuals between the observed population values and the predicted population values using the non-linear model. The residual for the i-th data point is given by:

$r_i = P_i - P(t_i, \mathbf{p})$

where $P_i$ is the observed population at time $t_i$ , $P(t_i, \mathbf{p})$ is the predicted population using the current parameter vector $\mathbf{p} = [K, A, r]^T$ , and $r_i$ is the residual.

The sum of squared residuals is given by:

$S(\mathbf{p}) = \sum_{i=1}^{n} r_i^2 = \sum_{i=1}^{n} (P_i - P(t_i, \mathbf{p}))^2$

To minimize $S(\mathbf{p})$ using the Gauss-Newton method, we calculate the Jacobian matrix $\mathbf{J}$ of the residuals with respect to the parameters. The Jacobian matrix for the logistic growth model is:

$\mathbf{J}_i(\mathbf{p}) = \begin{bmatrix} \frac{1}{(1 + Ae^{-rt_i})^2} & -\frac{Ke^{-rt_i}}{(1 + Ae^{-rt_i})^2} & \frac{KAte^{-rt_i}}{(1 + Ae^{-rt_i})^2} \end{bmatrix}$

The Gauss-Newton update step is then calculated as:

$\Delta\mathbf{p} = (\mathbf{J}^T\mathbf{J})^{-1}\mathbf{J}^T\mathbf{r}$

The parameters are updated iteratively using:

$\mathbf{p}_{k+1} = \mathbf{p}_k + \Delta\mathbf{p}$ until convergence or a maximum number of iterations is reached.

Python Implementation:

Here's the Python code that implements the Gauss-Newton method for the non-linear least squares problem:

import numpy as np
import matplotlib.pyplot as plt

# Logistic growth model
def logistic_growth(params, t):
    K, A, r = params
    return K / (1 + A * np.exp(-r * t))

# Jacobian matrix of the logistic growth model
def jacobian(params, t):
    K, A, r = params
    J = np.zeros((len(t), 3))
    J[:, 0] = 1 / (1 + A * np.exp(-r * t))**2
    J[:, 1] = -K * np.exp(-r * t) / (1 + A * np.exp(-r * t))**2
    J[:, 2] = K * A * t * np.exp(-r * t) / (1 + A * np.exp(-r * t))**2
    return J

# Gauss-Newton method for non-linear least squares
def gauss_newton(params, t, P, max_iter=100, tol=1e-6):
    for _ in range(max_iter):
        residuals = P - logistic_growth(params, t)
        J = jacobian(params, t)
        delta = np.linalg.pinv(J.T @ J) @ J.T @ residuals
        params += delta
        if np.linalg.norm(delta) < tol:
            break
    return params

# Example data
years = np.array([1900, 1920, 1940, 1960, 1980, 2000])
population = np.array([10000, 15000, 30000, 60000, 90000, 120000])

# Initial guess for parameters
initial_params = np.array([150000, 10, 0.02])

# Perform non-linear least squares using Gauss-Newton method
optimized_params = gauss_newton(initial_params, years - 1900, population)

# Generate points for the optimized curve
years_plot = np.linspace(0, 100, 100) + 1900
population_plot = logistic_growth(optimized_params, years_plot - 1900)

# Plot the data points and optimized curve
plt.figure(figsize=(8, 6))
plt.scatter(years, population, label='Data Points', color='red')
plt.plot(years_plot, population_plot, label='Optimized Curve', color='blue')
plt.xlabel('Year')
plt.ylabel('Population')
plt.title('Non-linear Least Squares Fit')
plt.legend()
plt.show()

print("Optimized Parameters:")
print("K =", optimized_params[0])
print("A =", optimized_params[1])
print("r =", optimized_params[2])

Optimized Parameters:
K = 160463.49627790874
A = 20.848212649774148
r = 0.041219431657605395

Gauss-Newton Algorithm, fitting a logistic population growth

In this example, we have a dataset containing the population of a city at different years. The relationship between the population and time follows a non-linear logistic growth model.

The code defines the logistic growth model function logistic_growth and the corresponding Jacobian matrix function jacobian. The gauss_newton function performs the parameter estimation using the Gauss-Newton method.

After obtaining the optimized parameters, we generate the fitted curve using the logistic growth model and plot it along with the data points to visualize the non-linear least squares fit.

This example demonstrates how the Gauss-Newton method can be applied to a non-linear least squares problem in a data science context, specifically for fitting a logistic growth model to a population dataset.

I apologize for the previous confusion and hope this corrected example accurately illustrates the application of the Gauss-Newton method for non-linear least squares problems in data science. Let me know if you have any further questions!

Conclusion:

In this tutorial, we explored the Gauss-Newton method for curve fitting. We derived the mathematical formulation of the method and implemented it in Python, both from scratch and using the scipy.optimize library. The Gauss-Newton method is a powerful tool for solving non-linear least squares problems and can be applied to various curve fitting and parameter estimation tasks in real-life scenarios.

I hope this tutorial helps you understand the Gauss-Newton method and its implementation in Python! Let me know if you have any further questions.

Tutorial: The Gauss-Newton Method (with Python Examples)

Introduction:

Problem Statement:

Mathematical Derivation:

Python Implementation without Libraries:

Python Implementation using Libraries:

Another example:

Problem Statement:

Mathematical Derivation:

Python Implementation:

Conclusion: