
Nonlinear L.S. conjugate-direction template

Nonlinear optimization arises from two causes:
1. Nonlinear physics: the operator depends upon the solution being attained.
2. Nonlinear statistics: we need robust estimators like the $\ell^1$ norm.
The computing template below is useful in both cases. It is almost the same as the template for weighted linear least squares, except that the residual is recomputed at each iteration: starting from the usual weighted least-squares template, we simply move the iteration statement earlier.

iterate {
    $\bold r \quad\longleftarrow\quad \bold F \bold m - \bold d$
    $\bold W \quad\longleftarrow\quad {\bf diag}[w(\bold r)]$
    $\bold r \quad\longleftarrow\quad \bold W \bold r$
    $\Delta\bold m \quad\longleftarrow\quad \bold F' \bold W' \bold r$
    $\Delta\bold r \quad\longleftarrow\quad \bold W \bold F \ \Delta\bold m$
    $(\bold m, \bold r) \quad\longleftarrow\quad {\rm cgstep}(\bold m, \bold r, \Delta\bold m, \Delta\bold r)$
}
where ${\bf diag}[w(\bold r)]$ is a diagonal matrix whose diagonal holds whatever weighting function we choose.
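The template translates nearly line for line into code. The following is a minimal NumPy sketch, not the book's solver: it assumes $\bold F$ is available as an explicit matrix, takes the weighting function wght as an argument, and uses a small cgstep() that remembers one previous step (reducing to steepest descent on the first call). All names here (irls, cgstep, wght, niter) are illustrative.

    import numpy as np

    def cgstep(m, r, dm, dr, state):
        """One conjugate-direction step: pick alpha (new direction) and beta
        (previous step) minimizing |r + alpha*dr + beta*sr|^2, then update."""
        if state is None:                       # first call: steepest descent
            alpha = -np.dot(dr, r) / np.dot(dr, dr)
            sm, sr = alpha * dm, alpha * dr
        else:
            sm, sr = state
            A = np.array([[np.dot(dr, dr), np.dot(dr, sr)],
                          [np.dot(sr, dr), np.dot(sr, sr)]])
            b = -np.array([np.dot(dr, r), np.dot(sr, r)])
            alpha, beta = np.linalg.solve(A, b)
            sm, sr = alpha * dm + beta * sm, alpha * dr + beta * sr
        return m + sm, r + sr, (sm, sr)

    def irls(F, d, wght, niter=10):
        """Nonlinear (iteratively reweighted) least squares, following the
        template: residual and weights are recomputed at each iteration."""
        m = np.zeros(F.shape[1])
        state = None
        for _ in range(niter):
            r = F @ m - d                # r  <- F m - d   (recomputed each pass)
            w = wght(r)                  # diagonal of W
            r = w * r                    # r  <- W r
            dm = F.T @ (w * r)           # dm <- F' W' r
            dr = w * (F @ dm)            # dr <- W F dm
            m, r, state = cgstep(m, r, dm, dr, state)
        return m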

Now let us see how the weighting functions relate to robust estimation. Notice in the code template that $\bold W$ is applied twice in the definition of $\Delta\bold m$. Thus $\bold W$ is the square root of the diagonal operator in equation (13).
\begin{displaymath}
\bold W \eq {\bf diag}\left( \frac{1}{\sqrt{\sqrt{1+r_i^2/\bar r^2}}} \right)
\end{displaymath} (14)
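In code, the diagonal of $\bold W$ in equation (14) is a one-line weighting function. The sketch below (hypothetical name) takes $\bar r$ to be the RMS of the current residual; the text does not fix this choice here, so treat it as an assumption.

    import numpy as np

    def cauchy_weight(r):
        """Diagonal of W from equation (14): w_i = (1 + r_i^2/rbar^2)^(-1/4),
        so that W'W applies 1/sqrt(1 + r_i^2/rbar^2) as in equation (13)."""
        rbar = np.sqrt(np.mean(r * r))   # assumed choice: RMS residual
        return (1.0 + (r / rbar) ** 2) ** -0.25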

Module solver_irls [*] implements the computational template above. In addition to the usual set of arguments of the solver() subroutine [*], it accepts a user-defined function (parameter wght) for computing residual weights. Parameters nmem and nfreq control the restarting schedule of the iterative scheme.
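The text does not spell out the schedule itself, so the following variant of the irls() sketch above is only one plausible reading: recompute the weights every nfreq iterations, and discard the conjugate-direction memory (restarting from steepest descent) every nmem iterations. Treat these semantics as an assumption, not as the actual behavior of solver_irls.

    def irls_restart(F, d, wght, niter=20, nmem=4, nfreq=2):
        """Variant of irls() with a restarting schedule (assumed semantics:
        nfreq = iterations between weight updates, nmem = iterations
        between conjugate-direction restarts)."""
        m, state = np.zeros(F.shape[1]), None
        w = np.ones(len(d))
        for it in range(niter):
            r = F @ m - d
            if it % nfreq == 0:
                w = wght(r)              # refresh the weights only periodically
            if it % nmem == 0:
                state = None             # drop cgstep memory: steepest descent
            r = w * r
            dm = F.T @ (w * r)
            dr = w * (F @ dm)
            m, r, state = cgstep(m, r, dm, dr, state)
        return m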

We can ask whether cgstep(), which was not designed with nonlinear least-squares in mind, is doing the right thing with the weighting function. First, we know we are doing weighted linear least-squares correctly. Then we recall that on the first iteration, the conjugate-directions technique reduces to steepest descent, which amounts to a calculation of the scale factor $\alpha$ with  
\begin{displaymath}
\alpha \eq -\, \frac{\Delta\bold r \cdot \bold r}{\Delta\bold r \cdot \Delta\bold r}
\end{displaymath} (15)
Of course, cgstep() knows nothing about the weighting function, but notice that the iteration loop above nicely inserts the weighting function both in $\bold r$ and in $\Delta\bold r$, as required by (15).
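This is easy to check numerically with the sketches above, reusing cauchy_weight() and cgstep(): on its first call, cgstep() computes exactly the $\alpha$ of equation (15), and it sees $\bold r$ and $\Delta\bold r$ with the weights already folded in. The random test problem is purely illustrative.

    rng = np.random.default_rng(0)
    F = rng.normal(size=(20, 5))
    d = rng.normal(size=20)

    m = np.zeros(5)
    r = F @ m - d
    w = cauchy_weight(r)                      # weights enter r and dr ...
    r = w * r
    dm = F.T @ (w * r)
    dr = w * (F @ dm)

    alpha = -np.dot(dr, r) / np.dot(dr, dr)   # ... so equation (15) sees them
    m1, _, _ = cgstep(m, r, dm, dr, None)     # first call: steepest descent
    assert np.allclose(m1, m + alpha * dm)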

Experience shows that difficulties arise when the weighting function varies rapidly from one iteration to the next. Naturally, the conjugate-direction method, which remembers the previous iteration, will have an inappropriate memory if the weighting function changes too rapidly. A practical approach is to make sure the weighting function changes slowly.
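One simple way to keep the weights slowly variable, offered here as an illustration rather than the book's prescription, is to blend each freshly computed set of weights with the previous one:

    def damped_weights(w_old, r, wght, lam=0.3):
        """Illustrative damping (an assumption, not the text's method):
        move only a fraction lam of the way toward the freshly computed
        weights, so the diagonal of W drifts rather than jumps."""
        return (1.0 - lam) * w_old + lam * wght(r)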

