In pervious post, we shows the Gauss-Newton method for fitting non-linear function. The disadvantage of that method is that the inverse matrix could be ill-defined. This makes the method unstable.

Back to the basic, we want to minimize the sum of square of residual (SSR). The SSR is,

The derivative on ,

Many literatures denote , which is the Jacobian. The second derivative of is Hessian matrix .

The Gradient Descent method is that ,

where is a step size. The gradient descent changes the SSR using the steepest path. The step size has to be adjusted. The simplest way to adjust is testing the . If , the increases, else decreases. This method is slow but stable. It is slow because of finding the . It is stable because the method is always computable.

Thus, we have 2 methods, Gradient Descent is stable and slow, Gauss-Newton method is unstable but fast. Levenberg-Marquardt Algorithm combined this 2 methods so that it is stable and fast by solving,

where is an adjustable parameter. When , the is neglectable and the method becomes Gradient Descent with small . When the , the method becomes Gauss-Newton method.

Usually, the is small. The Gauss-Newton method is very good near the minimum of SSR, while Gradient Descent is better far away.

When the , , else . I don’t know the exact reason for this setting. In fact if you set oppositely, the method is still work in most cases.

The method add on the , the inverse is always well-define. Therefore, this method is stable.