
Nonlinear L.S. conjugate-direction template

Nonlinear optimization arises from two causes:
1. Nonlinear physics: the operator depends upon the solution being attained.
2. Nonlinear statistics: we need robust estimators like the $\ell^1$ norm.
The computing template below is useful in both cases. It is almost the same as the template for weighted linear least squares, except that the residual is recomputed at each iteration: starting from the usual weighted least-squares template, we simply move the iteration statement earlier.

iterate {
    $\bold r \quad\longleftarrow\quad \bold F \bold m - \bold d$
    $\bold W \quad\longleftarrow\quad {\bf diag}[w(\bold r)]$
    $\bold r \quad\longleftarrow\quad \bold W \bold r$
    $\Delta\bold m \quad\longleftarrow\quad \bold F' \bold W' \bold r$
    $\Delta\bold r \quad\longleftarrow\quad \bold W \bold F \ \Delta\bold m$
    $(\bold m, \bold r) \quad\longleftarrow\quad {\rm cgstep}(\bold m, \bold r, \Delta\bold m, \Delta\bold r)$
}
where ${\bf diag}[w(\bold r)]$ is a diagonal matrix whose diagonal holds whatever weighting function we choose.
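The template translates nearly line for line into code. The following is a minimal NumPy sketch, not the book's solver: it assumes $\bold F$ is available as an explicit matrix, takes the weighting function wght as an argument, and uses a small cgstep() that remembers one previous step (reducing to steepest descent on the first call). All names here (irls, cgstep, wght, niter) are illustrative.

    import numpy as np

    def cgstep(m, r, dm, dr, state):
        """One conjugate-direction step: pick alpha (new direction) and beta
        (previous step) minimizing |r + alpha*dr + beta*sr|^2, then update."""
        if state is None:                       # first call: steepest descent
            alpha = -np.dot(dr, r) / np.dot(dr, dr)
            sm, sr = alpha * dm, alpha * dr
        else:
            sm, sr = state
            A = np.array([[np.dot(dr, dr), np.dot(dr, sr)],
                          [np.dot(sr, dr), np.dot(sr, sr)]])
            b = -np.array([np.dot(dr, r), np.dot(sr, r)])
            alpha, beta = np.linalg.solve(A, b)
            sm, sr = alpha * dm + beta * sm, alpha * dr + beta * sr
        return m + sm, r + sr, (sm, sr)

    def irls(F, d, wght, niter=10):
        """Nonlinear (iteratively reweighted) least squares, following the
        template: residual and weights are recomputed at each iteration."""
        m = np.zeros(F.shape[1])
        state = None
        for _ in range(niter):
            r = F @ m - d                # r  <- F m - d   (recomputed each pass)
            w = wght(r)                  # diagonal of W
            r = w * r                    # r  <- W r
            dm = F.T @ (w * r)           # dm <- F' W' r
            dr = w * (F @ dm)            # dr <- W F dm
            m, r, state = cgstep(m, r, dm, dr, state)
        return m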

Now let us see how the weighting functions relate to robust estimation. Notice in the code template that $\bold W$ is applied twice in the definition of $\Delta\bold m$. Thus $\bold W$ is the square root of the diagonal operator in equation (13).
\begin{displaymath}
\bold W \eq {\bf diag}\left( \frac{1}{\sqrt{\sqrt{1+r_i^2/\bar r^2}}} \right)
\end{displaymath} (14)
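In code, the diagonal of $\bold W$ in equation (14) is a one-line weighting function. The sketch below (hypothetical name) takes $\bar r$ to be the RMS of the current residual; the text does not fix this choice here, so treat it as an assumption.

    import numpy as np

    def cauchy_weight(r):
        """Diagonal of W from equation (14): w_i = (1 + r_i^2/rbar^2)^(-1/4),
        so that W'W applies 1/sqrt(1 + r_i^2/rbar^2) as in equation (13)."""
        rbar = np.sqrt(np.mean(r * r))   # assumed choice: RMS residual
        return (1.0 + (r / rbar) ** 2) ** -0.25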

Module solver_irls [*] implements the computational template above. In addition to the usual set of arguments of the solver() subroutine [*], it accepts a user-defined function (parameter wght) for computing residual weights. Parameters nmem and nfreq control the restarting schedule of the iterative scheme.
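The text does not spell out the schedule itself, so the following variant of the irls() sketch above is only one plausible reading: recompute the weights every nfreq iterations, and discard the conjugate-direction memory (restarting from steepest descent) every nmem iterations. Treat these semantics as an assumption, not as the actual behavior of solver_irls.

    def irls_restart(F, d, wght, niter=20, nmem=4, nfreq=2):
        """Variant of irls() with a restarting schedule (assumed semantics:
        nfreq = iterations between weight updates, nmem = iterations
        between conjugate-direction restarts)."""
        m, state = np.zeros(F.shape[1]), None
        w = np.ones(len(d))
        for it in range(niter):
            r = F @ m - d
            if it % nfreq == 0:
                w = wght(r)              # refresh the weights only periodically
            if it % nmem == 0:
                state = None             # drop cgstep memory: steepest descent
            r = w * r
            dm = F.T @ (w * r)
            dr = w * (F @ dm)
            m, r, state = cgstep(m, r, dm, dr, state)
        return m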

We can ask whether cgstep(), which was not designed with nonlinear least-squares in mind, is doing the right thing with the weighting function. First, we know we are doing weighted linear least-squares correctly. Then we recall that on the first iteration, the conjugate-directions technique reduces to steepest descent, which amounts to a calculation of the scale factor $\alpha$ with  
\begin{displaymath}
\alpha \eq -\, \frac{\Delta\bold r \cdot \bold r}{\Delta\bold r \cdot \Delta\bold r}
\end{displaymath} (15)
Of course, cgstep() knows nothing about the weighting function, but notice that the iteration loop above nicely inserts the weighting function both in $\bold r$ and in $\Delta\bold r$, as required by (15).
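This is easy to check numerically with the sketches above, reusing cauchy_weight() and cgstep(): on its first call, cgstep() computes exactly the $\alpha$ of equation (15), and it sees $\bold r$ and $\Delta\bold r$ with the weights already folded in. The random test problem is purely illustrative.

    rng = np.random.default_rng(0)
    F = rng.normal(size=(20, 5))
    d = rng.normal(size=20)

    m = np.zeros(5)
    r = F @ m - d
    w = cauchy_weight(r)                      # weights enter r and dr ...
    r = w * r
    dm = F.T @ (w * r)
    dr = w * (F @ dm)

    alpha = -np.dot(dr, r) / np.dot(dr, dr)   # ... so equation (15) sees them
    m1, _, _ = cgstep(m, r, dm, dr, None)     # first call: steepest descent
    assert np.allclose(m1, m + alpha * dm)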

Experience shows that difficulties arise when the weighting function varies rapidly from one iteration to the next. Naturally, the conjugate-direction method, which remembers the previous iteration, will have an inappropriate memory if the weighting function changes too rapidly. A practical approach is to make sure the weighting function changes slowly.
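One simple way to keep the weights slowly variable, offered here as an illustration rather than the book's prescription, is to blend each freshly computed set of weights with the previous one:

    def damped_weights(w_old, r, wght, lam=0.3):
        """Illustrative damping (an assumption, not the text's method):
        move only a fraction lam of the way toward the freshly computed
        weights, so the diagonal of W drifts rather than jumps."""
        return (1.0 - lam) * w_old + lam * wght(r)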

