The previous lemma is very important since it shows that starting from an initial spd Hessian approximation $H_0$, the next approximation of the Hessian is spd (given that the curvature condition $y_k^\top s_k > 0$ holds). This guarantees the existence of a minimizer for the function $f$ (the inverse $H_{k+1}^{-1}$ is also spd).
It can be shown (Kelley, 1999) that, under some assumptions, the BFGS iterates $\{x_k\}$ are defined and converge q-superlinearly to the local minimizer $x^*$.
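Recall that, in this notation, the BFGS rank-two update of the Hessian approximation takes the standard form
\[
H_{k+1} = H_k - \frac{H_k s_k s_k^\top H_k}{s_k^\top H_k s_k}
              + \frac{y_k y_k^\top}{y_k^\top s_k},
\qquad y_k^\top s_k > 0,
\]
where the curvature condition $y_k^\top s_k > 0$ is precisely what guarantees that $H_{k+1}$ inherits the symmetric positive definiteness of $H_k$.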
In practice, the storage needed to compute the update and the possibility that the curvature condition $y_k^\top s_k > 0$ fails are important issues. The updated Hessian is computed recursively at each iteration. For this, we need to store a solution step vector $s_k = x_{k+1} - x_k$ and a gradient step vector $y_k = \nabla f(x_{k+1}) - \nabla f(x_k)$ after each iteration. While this storage is not an issue for small problems, it may become critical for large-scale problems. Unfortunately, such large-scale problems occur in geophysics, and we need a better storage strategy. Nocedal (1980) gives an interesting answer to this problem.
Instead of keeping all the $s_k$ and $y_k$ from the past iterations, we update the Hessian using the information from the $m$ previous iterations only, where $m$ is chosen by the end user. This means that when the number of iterations is smaller than $m$, we have a ``real'' BFGS update, and when it is larger than $m$, we have a limited-memory BFGS (L-BFGS) update.
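To illustrate the limited-memory idea, here is a minimal Python sketch of the classical two-loop recursion that applies the L-BFGS inverse-Hessian approximation, built only from the stored pairs $(s_k, y_k)$, to the current gradient. The function names and the initial scaling $\gamma_k$ (a common heuristic) are our own choices, not prescriptions from the text.
\begin{verbatim}
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Two-loop recursion: apply the limited-memory inverse-Hessian
    approximation, built from the stored (s, y) pairs (oldest first),
    to the current gradient and return the search direction."""
    rhos = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]
    q = np.array(grad, dtype=float)
    alphas = []
    # First loop: newest pair to oldest pair
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        alpha = rho * np.dot(s, q)
        q -= alpha * y
        alphas.append(alpha)
    # Initial scaling of the inverse Hessian (a common heuristic choice)
    if s_list:
        gamma = np.dot(s_list[-1], y_list[-1]) / np.dot(y_list[-1], y_list[-1])
    else:
        gamma = 1.0
    r = gamma * q
    # Second loop: oldest pair to newest pair
    for (s, y, rho), alpha in zip(zip(s_list, y_list, rhos), reversed(alphas)):
        beta = rho * np.dot(y, r)
        r += (alpha - beta) * s
    return -r  # descent direction

def store_pair(s_list, y_list, s_new, y_new, m):
    """After each iteration, append the new pair and keep only the m
    most recent ones, discarding the oldest."""
    s_list.append(s_new)
    y_list.append(y_new)
    if len(s_list) > m:
        s_list.pop(0)
        y_list.pop(0)
\end{verbatim}
With this scheme, only the $m$ most recent pairs are kept, i.e., $2m$ vectors of length $n$, so the memory cost is $\mathcal{O}(mn)$ instead of the $\mathcal{O}(n^2)$ needed to store a full Hessian approximation.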