** Next:** Understanding CG magic and
** Up:** THE WORLD OF CONJUGATE
** Previous:** Coding nonlinear fitting problems

The conjugate-direction method is really a family of methods.
Mathematically, where there are *n* unknowns, these algorithms all
converge to the answer in *n* (or fewer) steps. The various methods
differ in numerical accuracy, treatment of underdetermined systems,
accuracy in treating ill-conditioned systems, space requirements, and
numbers of dot products. Technically, the method of CD used in the
`cgstep` module is not the
conjugate-gradient method itself, but is equivalent to it. This
method is more properly called the conjugate-direction method
with a memory of one step. I chose this method for its clarity and
flexibility. If you would like a free introduction and summary of
conjugate-gradient methods, I particularly recommend *An
Introduction to Conjugate Gradient Method Without Agonizing Pain *
by Jonathon Shewchuk, which you can downloadhttp://www.cs.cmu.edu/afs/cs/project/quake/public/papers/painless-conjugate-gradient.ps.
I suggest you skip over the remainder of this section and return
after you have seen many examples and have developed some expertise,
and have some technical problems.

The conjugate-gradient method was introduced
by Hestenes and Stiefel in 1952.
To read the standard literature and relate it to this book,
you should first realize that when I write fitting goals like

| |
(90) |

| (91) |

they are equivalent to minimizing the quadratic form:
| |
(92) |

The optimization theory (OT) literature starts from a minimization of
| |
(93) |

To relate equation (92) to (93)
we expand the parentheses in (92)
and abandon the constant term .Then gather the quadratic term in and the linear term in .There are two terms linear in that are transposes of each other.
They are scalars so they are equal.
Thus, to invoke ``standard methods,'' you take
your problem-formulation operators , , and create two modules that apply the operators
| |
(94) |

| (95) |

The operators and operate on model space.
Standard procedures do not require their adjoints
because is its own adjoint and reduces model space to a scalar.
You can see that computing and requires
one temporary space the size of data space
(whereas `cgstep` requires two).
When people have trouble with conjugate gradients or conjugate
directions, I always refer them to the Paige and Saunders
algorithm `LSQR`. Methods that form explicitly or
implicitly (including both the standard literature and the book3
method) square the condition number, that is, they are twice as
susceptible to rounding error as is `LSQR`.

** Next:** Understanding CG magic and
** Up:** THE WORLD OF CONJUGATE
** Previous:** Coding nonlinear fitting problems
Stanford Exploration Project

4/27/2004