Tutorial: The Gauss-Newton Method (with MATLAB Examples)

📌 For more content check out the Circuit of Knowledge.

Introduction:

The Gauss-Newton method is an iterative optimization algorithm used to solve non-linear least squares problems. It is particularly useful for curve fitting, where the goal is to find the best-fit parameters of a model function to a given set of data points. In this tutorial, we will explore the Gauss-Newton method, its mathematical derivation, and its implementation in MATLAB.

Problem Statement:

Let's consider a real-life problem of predicting the temperature of a city based on the day of the year. We have collected temperature data for various days throughout the year and want to fit a curve to this data to make predictions. The temperature model can be represented as a sine function:

T(d)=asinā”(b(dāˆ’c))+eT(d) = a \sin(b (d - c)) + e

Where:

  • T(d) is the temperature on day d
  • a is the amplitude of the sine function
  • b is the angular frequency of the sine function
  • c is the phase shift of the sine function
  • e is the vertical shift of the sine function

Our goal is to find the optimal values of the parameters a, b, c, and e that minimize the sum of squared residuals between the observed temperatures and the predicted temperatures.

Mathematical Derivation:

The Gauss-Newton method is based on the idea of linearizing the non-linear model function around the current estimate of parameters and iteratively updating the parameters to minimize the sum of squared residuals.

Let's denote the residual function as:

r_i = T_i - T(d_i, \mathbf{p})

where T_i is the observed temperature on day d_i, T(d_i, \mathbf{p}) is the predicted temperature using the current parameter vector \mathbf{p} = [a, b, c, e]^T, and r_i is the residual.

The sum of squared residuals is given by:

S(\mathbf{p}) = \sum_{i=1}^{n} r_i^2 = \sum_{i=1}^{n} (T_i - T(d_i, \mathbf{p}))^2

To minimize S(\mathbf{p}), the Gauss-Newton method linearizes the model function using a first-order Taylor expansion around the current parameter estimate \mathbf{p}_k:

T(d_i, \mathbf{p}) \approx T(d_i, \mathbf{p}_k) + \mathbf{J}_i(\mathbf{p}_k)(\mathbf{p} - \mathbf{p}_k)

where \mathbf{J}_i(\mathbf{p}_k) is the i-th row of the Jacobian matrix (the gradient of the model with respect to the parameters), evaluated at \mathbf{p}_k.

For the sine model, the i-th row of the Jacobian is:

\mathbf{J}_i(\mathbf{p}_k) = \begin{bmatrix} \sin(b_k(d_i - c_k)) & a_k(d_i - c_k)\cos(b_k(d_i - c_k)) & -a_k b_k\cos(b_k(d_i - c_k)) & 1 \end{bmatrix}
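
Each entry of this row is the partial derivative of the model T(d_i, \mathbf{p}) with respect to one of the parameters:

```latex
\frac{\partial T}{\partial a} = \sin(b(d_i - c)), \qquad
\frac{\partial T}{\partial b} = a(d_i - c)\cos(b(d_i - c)), \qquad
\frac{\partial T}{\partial c} = -ab\cos(b(d_i - c)), \qquad
\frac{\partial T}{\partial e} = 1
```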

By substituting the linearized model into the sum of squared residuals and setting the derivative with respect to p\mathbf{p} to zero, we obtain the normal equations:

\mathbf{J}^T\mathbf{J}\,\Delta\mathbf{p} = \mathbf{J}^T\mathbf{r}

where \mathbf{J} is the Jacobian matrix with the rows \mathbf{J}_i stacked vertically for all data points, \mathbf{r} is the vector of residuals, and \Delta\mathbf{p} is the update to the parameter vector.
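
To see where the normal equations come from, substitute the linearized model into S(\mathbf{p}) and minimize over the step \Delta\mathbf{p} = \mathbf{p} - \mathbf{p}_k:

```latex
S(\mathbf{p}_k + \Delta\mathbf{p}) \approx \lVert \mathbf{r} - \mathbf{J}\,\Delta\mathbf{p} \rVert^2,
\qquad
\nabla_{\Delta\mathbf{p}} S = -2\,\mathbf{J}^T\!\left(\mathbf{r} - \mathbf{J}\,\Delta\mathbf{p}\right) = \mathbf{0}
\;\Longrightarrow\;
\mathbf{J}^T\mathbf{J}\,\Delta\mathbf{p} = \mathbf{J}^T\mathbf{r}
```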

The parameter update is then calculated as:

\Delta\mathbf{p} = (\mathbf{J}^T\mathbf{J})^{-1}\mathbf{J}^T\mathbf{r}

In practice, the linear system is solved directly (e.g., with MATLAB's backslash operator, as in the code below) rather than by explicitly forming the inverse. The Gauss-Newton method then iteratively updates the parameters using:

\mathbf{p}_{k+1} = \mathbf{p}_k + \Delta\mathbf{p}

until convergence or a maximum number of iterations is reached.
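
The full iteration is compact enough to sanity-check outside MATLAB. The following NumPy sketch (an aside; the function and variable names and the synthetic parameter values are my own, not part of the MATLAB code below) fits the same sine model to noise-free synthetic data and recovers the generating parameters:

```python
import numpy as np

def sine_model(p, d):
    # T(d) = a * sin(b * (d - c)) + e
    a, b, c, e = p
    return a * np.sin(b * (d - c)) + e

def sine_jacobian(p, d):
    # One row per data point; columns are dT/da, dT/db, dT/dc, dT/de
    a, b, c, _ = p
    phase = b * (d - c)
    return np.column_stack([
        np.sin(phase),
        a * (d - c) * np.cos(phase),
        -a * b * np.cos(phase),
        np.ones_like(d),
    ])

def gauss_newton(p, d, T, max_iter=100, tol=1e-10):
    for _ in range(max_iter):
        r = T - sine_model(p, d)                    # residuals
        J = sine_jacobian(p, d)                     # model Jacobian
        delta = np.linalg.solve(J.T @ J, J.T @ r)   # normal equations
        p = p + delta
        if np.linalg.norm(delta) < tol:
            break
    return p

# Noise-free synthetic data from known parameters; starting from a
# nearby guess, the iteration should recover them.
true_p = np.array([17.0, 0.017, 80.0, 21.0])
d = np.arange(0.0, 360.0, 30.0)
T = sine_model(true_p, d)
p_hat = gauss_newton(np.array([16.0, 0.016, 75.0, 20.0]), d, T)
```

On noise-free data the residuals at the optimum are exactly zero, so Gauss-Newton converges essentially to machine precision; with real, noisy data the method stops at the least squares minimizer instead.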

MATLAB Implementation:

Here's the MATLAB code that implements the Gauss-Newton method:

% Example data
days = [0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300, 330].';
temperatures = [5, 10, 20, 25, 30, 35, 40, 35, 25, 20, 10, 5].';

% Initial guess for parameters
initial_params = [20, 0.02, 90, 20].';

% Perform Gauss-Newton optimization
max_iter = 100;
tol = 1e-6;
optimized_params = gauss_newton(initial_params, days, temperatures, max_iter, tol);

% Generate points for the optimized curve
days_plot = linspace(1, 365, 365);
temperatures_plot = sine_func(optimized_params, days_plot);

% Plot the data points and optimized curve
figure;
hold on;
scatter(days, temperatures, 'r', 'filled');
plot(days_plot, temperatures_plot, 'b');
xlabel('Day of the Year');
ylabel('Temperature');
legend('Data Points', 'Optimized Curve');
hold off;

fprintf('Optimized Parameters:\n');
fprintf('a = %.4f\n', optimized_params(1));
fprintf('b = %.4f\n', optimized_params(2));
fprintf('c = %.4f\n', optimized_params(3));
fprintf('e = %.4f\n', optimized_params(4));

% Define the sine function
function T = sine_func(params, d)
a = params(1);
b = params(2);
c = params(3);
e = params(4);
T = a * sin(b * (d - c)) + e;
end

% Define the Jacobian matrix
function J = jacobian(params, d)
a = params(1);
b = params(2);
c = params(3);
J = zeros(length(d), 4);
J(:, 1) = sin(b * (d - c));
J(:, 2) = a * (d - c) .* cos(b * (d - c));
J(:, 3) = -a * b .* cos(b * (d - c));
J(:, 4) = 1;
end

% Gauss-Newton method
function params = gauss_newton(params, d, temp, max_iter, tol)
for i = 1:max_iter
    residuals = temp - sine_func(params, d); 
    J = jacobian(params, d);
    delta = (J.' * J) \ (J.' * residuals);
    params = params + delta;
    if norm(delta) < tol
        break;
    end
end
end

The code defines the sine_func function that represents the sine model and the jacobian function that calculates the Jacobian matrix. The gauss_newton function performs the Gauss-Newton optimization by iteratively updating the parameters until convergence or a maximum number of iterations is reached.

The example data and initial guess for parameters are defined, and the Gauss-Newton optimization is performed using the gauss_newton function. The optimized parameters are then used to generate points for the optimized curve, and the data points and optimized curve are plotted using MATLAB's plotting functions.

Output:

Optimized Parameters:
a = 17.2144
b = 0.0160
c = 69.0595
e = 20.0603

The plot shows the data points and the optimized curve obtained using the Gauss-Newton method.

MATLAB Implementation Using the Optimization Toolbox:

Here's the MATLAB code that solves the same non-linear least squares problem using MATLAB's Optimization Toolbox:

% Example data
days = [0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300, 330].';
temperatures = [5, 10, 20, 25, 30, 35, 40, 35, 25, 20, 10, 5].';

% Initial guess for parameters
initial_params = [20, 0.02, 90, 20].';

% Perform Gauss-Newton optimization using lsqnonlin
options = optimoptions('lsqnonlin', 'Algorithm', 'levenberg-marquardt', ...
                       'MaxIterations', 100, 'FunctionTolerance', 1e-6);
optimized_params = lsqnonlin(@(params) residual_func(params, days, temperatures), ...
                             initial_params, [], [], options);

% Generate points for the optimized curve
days_plot = linspace(0, 365, 365);
temperatures_plot = sine_func(optimized_params, days_plot);

% Plot the data points and optimized curve
figure;
hold on;
scatter(days, temperatures, 'r', 'filled');
plot(days_plot, temperatures_plot, 'b');
xlabel('Day of the Year');
ylabel('Temperature');
legend('Data Points', 'Optimized Curve');
hold off;

fprintf('Optimized Parameters:\n');
fprintf('a = %.4f\n', optimized_params(1));
fprintf('b = %.4f\n', optimized_params(2));
fprintf('c = %.4f\n', optimized_params(3));
fprintf('e = %.4f\n', optimized_params(4));

% Define the sine function
function T = sine_func(params, d)
    a = params(1);
    b = params(2);
    c = params(3);
    e = params(4);
    T = a * sin(b * (d - c)) + e;
end

% Define the residual function
function residuals = residual_func(params, d, temp)
    residuals = temp - sine_func(params, d);
end

In this implementation, we use the lsqnonlin function from MATLAB's Optimization Toolbox, which is specifically designed for solving non-linear least squares problems. The Levenberg-Marquardt algorithm selected here is a damped variant of Gauss-Newton: it adds a regularization term to the normal equations, which makes it more robust when \mathbf{J}^T\mathbf{J} is ill-conditioned or the initial guess is far from the solution.

We define the sine_func function as before, which represents the sine model. Additionally, we define a residual_func function that calculates the residuals between the observed temperatures and the predicted temperatures using the current parameter values.

The lsqnonlin function takes the following arguments:

  • The residual function (residual_func)
  • The initial guess for parameters (initial_params)
  • Empty arrays [] for the lower and upper bounds (since we don't have any bounds in this example)
  • Options for the optimization algorithm (options)

We set the optimization options using optimoptions to specify the Levenberg-Marquardt algorithm, maximum number of iterations, and tolerance for the function value.

The lsqnonlin function returns the optimized parameters, which we store in optimized_params.

The rest of the code remains the same as before, where we generate points for the optimized curve, plot the data points and optimized curve, and print the optimized parameter values.

By using MATLAB's Optimization Toolbox, we can leverage the built-in lsqnonlin function to solve the non-linear least squares problem without manually implementing the Gauss-Newton algorithm.

Note: Make sure you have the Optimization Toolbox installed in your MATLAB environment to use the lsqnonlin function.

Another Example:

In this example, we will use the Gauss-Newton method to fit a nonlinear function to a given dataset.

Problem Statement:

Suppose we have a dataset containing the population of a city at different years. The relationship between the population and time is known to follow a non-linear logistic growth model:

P(t) = \frac{K}{1 + Ae^{-rt}}

where P(t) is the population at time t, K is the carrying capacity (maximum population), A is a constant related to the initial population, and r is the growth rate. Our goal is to estimate the parameters K, A, and r that best fit the given data using the Gauss-Newton method.

Mathematical Derivation:

The Gauss-Newton method aims to minimize the sum of squared residuals between the observed population values and the predicted population values using the non-linear model. The residual for the i-th data point is given by:

r_i = P_i - P(t_i, \mathbf{p})

where P_i is the observed population at time t_i, P(t_i, \mathbf{p}) is the predicted population using the current parameter vector \mathbf{p} = [K, A, r]^T, and r_i is the residual.

The sum of squared residuals is given by:

S(\mathbf{p}) = \sum_{i=1}^{n} r_i^2 = \sum_{i=1}^{n} (P_i - P(t_i, \mathbf{p}))^2

To minimize S(\mathbf{p}) using the Gauss-Newton method, we calculate the Jacobian matrix \mathbf{J} of the model with respect to the parameters (consistent with the first example, where the residual is observation minus model). For the logistic growth model, the i-th row of the Jacobian is:

\mathbf{J}_i(\mathbf{p}) = \begin{bmatrix} \dfrac{1}{1 + Ae^{-rt_i}} & -\dfrac{Ke^{-rt_i}}{(1 + Ae^{-rt_i})^2} & \dfrac{KAt_i e^{-rt_i}}{(1 + Ae^{-rt_i})^2} \end{bmatrix}
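
Hand-derived Jacobians are a common source of bugs, so it is worth verifying them numerically. This NumPy sketch (an aside outside the MATLAB code; the names and test values are my own) compares the analytic logistic Jacobian against central finite differences at a representative parameter vector:

```python
import numpy as np

def logistic(p, t):
    # P(t) = K / (1 + A * exp(-r * t))
    K, A, r = p
    return K / (1 + A * np.exp(-r * t))

def logistic_jacobian(p, t):
    # Columns: dP/dK, dP/dA, dP/dr
    K, A, r = p
    E = np.exp(-r * t)
    denom = 1 + A * E
    return np.column_stack([
        1 / denom,
        -K * E / denom**2,
        K * A * t * E / denom**2,
    ])

p = np.array([150000.0, 10.0, 0.02])
t = np.array([0.0, 20.0, 40.0, 60.0, 80.0, 100.0])

# Central finite differences, perturbing one parameter at a time
J_num = np.zeros((len(t), len(p)))
for j in range(len(p)):
    step = 1e-6 * max(1.0, abs(p[j]))
    dp = np.zeros(len(p))
    dp[j] = step
    J_num[:, j] = (logistic(p + dp, t) - logistic(p - dp, t)) / (2 * step)

J_ana = logistic_jacobian(p, t)
```

If the analytic and numerical Jacobians disagree beyond finite-difference noise, one of the partial derivatives is wrong.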

The Gauss-Newton update step is then calculated as:

\Delta\mathbf{p} = (\mathbf{J}^T\mathbf{J})^{-1}\mathbf{J}^T\mathbf{r}

The parameters are updated iteratively using:

\mathbf{p}_{k+1} = \mathbf{p}_k + \Delta\mathbf{p}

until convergence or a maximum number of iterations is reached.

% Example data
years = [1900, 1920, 1940, 1960, 1980, 2000].';
population = [10000, 15000, 30000, 60000, 90000, 120000].';

% Initial guess for parameters
initial_params = [150000, 10, 0.02].';

% Perform non-linear least squares using Gauss-Newton method
max_iter = 100;
tol = 1e-6;
optimized_params = gauss_newton(initial_params, years - 1900, population, max_iter, tol);

% Generate points for the optimized curve
years_plot = linspace(0, 100, 100) + 1900;
population_plot = logistic_growth(optimized_params, years_plot - 1900);

% Plot the data points and optimized curve
figure;
hold on;
scatter(years, population, 'r', 'filled');
plot(years_plot, population_plot, 'b');
xlabel('Year');
ylabel('Population');
title('Non-linear Least Squares Fit');
legend('Data Points', 'Optimized Curve');
hold off;

fprintf('Optimized Parameters:\n');
fprintf('K = %.4f\n', optimized_params(1));
fprintf('A = %.4f\n', optimized_params(2));
fprintf('r = %.4f\n', optimized_params(3));



% Logistic growth model
function P = logistic_growth(params, t)
    K = params(1);
    A = params(2);
    r = params(3);
    P = K ./ (1 + A * exp(-r * t));
end

% Jacobian matrix of the logistic growth model
function J = jacobian(params, t)
    K = params(1);
    A = params(2);
    r = params(3);
    J = zeros(length(t), 3);
    J(:, 1) = 1 ./ (1 + A * exp(-r * t));
    J(:, 2) = -K * exp(-r * t) ./ (1 + A * exp(-r * t)).^2;
    J(:, 3) = K * A * t .* exp(-r * t) ./ (1 + A * exp(-r * t)).^2;
end

% Gauss-Newton method for non-linear least squares
function params = gauss_newton(params, t, P, max_iter, tol)
    for i = 1:max_iter
        residuals = P - logistic_growth(params, t); 
        J = jacobian(params, t);
        delta = (J' * J) \ (J' * residuals);
        params = params + delta;
        if norm(delta) < tol
            break;
        end
    end
end


Output:

Optimized Parameters:
K = 160463.4963
A = 20.8482
r = 0.0412

The plot shows the data points and the optimized curve obtained using the Gauss-Newton method for the non-linear logistic growth model.
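
As a final end-to-end check, the logistic fit can be reproduced in a short NumPy sketch (an aside; the names and the synthetic parameter values are my own, not the city data above): fitting noise-free data generated from known parameters should recover those parameters essentially exactly.

```python
import numpy as np

def logistic(p, t):
    # P(t) = K / (1 + A * exp(-r * t))
    K, A, r = p
    return K / (1 + A * np.exp(-r * t))

def jacobian(p, t):
    # Columns: dP/dK, dP/dA, dP/dr
    K, A, r = p
    E = np.exp(-r * t)
    denom = 1 + A * E
    return np.column_stack([1 / denom,
                            -K * E / denom**2,
                            K * A * t * E / denom**2])

def gauss_newton(p, t, P, max_iter=100, tol=1e-10):
    for _ in range(max_iter):
        resid = P - logistic(p, t)
        J = jacobian(p, t)
        delta = np.linalg.solve(J.T @ J, J.T @ resid)
        p = p + delta
        if np.linalg.norm(delta / p) < tol:  # relative step size
            break
    return p

# Noise-free data from known parameters, with a nearby starting guess
true_p = np.array([160000.0, 20.0, 0.04])
t = np.array([0.0, 20.0, 40.0, 60.0, 80.0, 100.0])
P = logistic(true_p, t)
p_hat = gauss_newton(np.array([155000.0, 19.0, 0.038]), t, P)
```

Because K is several orders of magnitude larger than r, a relative step-size test is used for convergence here; with parameters of very different scales, an absolute tolerance on \|\Delta\mathbf{p}\| would be dominated by the largest parameter.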

Conclusion:

In this tutorial, we explored the Gauss-Newton method for curve fitting using MATLAB. We derived the mathematical formulation of the method and implemented it in MATLAB. The Gauss-Newton method is a powerful tool for solving non-linear least squares problems and can be applied to various curve fitting and parameter estimation tasks in real-life scenarios.

I hope this tutorial helps you understand the Gauss-Newton method and its implementation in MATLAB! Let me know if you have any further questions.