One common method of solving linear inversion problems of the form $\mathbf{A}\mathbf{x} = \mathbf{b}$ is Lanczos's method (Lanczos, 1950; Golub and Van Loan, 1983; van der Sluis and van der Vorst, 1987). This method introduces a sequence of orthonormal vectors $\mathbf{y}^{(1)}, \mathbf{y}^{(2)}, \ldots$ through a process of tridiagonalization. Applied to the normal equations $\mathbf{A}^T\mathbf{A}\,\mathbf{x} = \mathbf{A}^T\mathbf{b}$, Lanczos's method is a projection procedure equivalent to the following:
$$
\mathbf{y}^{(1)}\left(\mathbf{y}^{(1)}\right)^T\mathbf{A}^T\mathbf{b} = \mathbf{A}^T\mathbf{b}, \tag{lanczos1}
$$
$$
\left[\mathbf{y}^{(1)}\left(\mathbf{y}^{(1)}\right)^T + \mathbf{y}^{(2)}\left(\mathbf{y}^{(2)}\right)^T\right]\mathbf{A}^T\mathbf{A}\,\mathbf{y}^{(1)} = \mathbf{A}^T\mathbf{A}\,\mathbf{y}^{(1)}, \tag{lanczos2}
$$
and, for $k \ge 2$,
$$
\left[\mathbf{y}^{(k-1)}\left(\mathbf{y}^{(k-1)}\right)^T + \mathbf{y}^{(k)}\left(\mathbf{y}^{(k)}\right)^T + \mathbf{y}^{(k+1)}\left(\mathbf{y}^{(k+1)}\right)^T\right]\mathbf{A}^T\mathbf{A}\,\mathbf{y}^{(k)} = \mathbf{A}^T\mathbf{A}\,\mathbf{y}^{(k)}. \tag{lanczos3}
$$
It is now clear that (lanczos1) defines $\mathbf{y}^{(1)}$ as the unit vector in the direction of $\mathbf{A}^T\mathbf{b}$, which therefore can have no contribution from the right null space of $\mathbf{A}$. Equation (lanczos2) defines $\mathbf{y}^{(2)}$ as the unit vector found by removing the component of $\mathbf{A}^T\mathbf{A}\,\mathbf{y}^{(1)}$ in the direction of $\mathbf{y}^{(1)}$ and then normalizing. Equation (lanczos3) determines $\mathbf{y}^{(k+1)}$ as the unit vector along the component of $\mathbf{A}^T\mathbf{A}\,\mathbf{y}^{(k)}$ that is orthogonal to both $\mathbf{y}^{(k)}$ and $\mathbf{y}^{(k-1)}$. By construction, $\left(\mathbf{y}^{(i)}\right)^T\mathbf{y}^{(j)} = \delta_{ij}$ (orthogonality to the earlier vectors $\mathbf{y}^{(j)}$ with $j < k-1$ follows by induction), so the vectors are orthonormal.
Defining the constants
$$
D_k = \left(\mathbf{y}^{(k)}\right)^T\mathbf{A}^T\mathbf{A}\,\mathbf{y}^{(k)} \quad \text{for } k = 1, 2, \ldots,
$$
and
$$
N_{k+1} = \left(\mathbf{y}^{(k+1)}\right)^T\mathbf{A}^T\mathbf{A}\,\mathbf{y}^{(k)} \quad \text{for } k = 1, 2, \ldots,
$$
it is then clear that equations (lanczos1)-(lanczos3) determine a tridiagonal system of the form
$$
\left(\mathbf{0}, \ldots, \mathbf{0},\, \mathbf{y}^{(k+1)} N_{k+1}\right) + \mathbf{Y}_k
\begin{pmatrix}
D_1 & N_2 & & & \\
N_2 & D_2 & N_3 & & \\
 & N_3 & D_3 & N_4 & \\
 & & \ddots & \ddots & \ddots \\
 & & & N_k & D_k
\end{pmatrix}
= \mathbf{A}^T\mathbf{A}\,\mathbf{Y}_k
$$
for $2 \le k \le r$, where the orthogonal matrix composed of the resulting orthonormal vectors is
$$
\mathbf{Y}_k = \left(\mathbf{y}^{(1)}, \mathbf{y}^{(2)}, \ldots, \mathbf{y}^{(k)}\right).
$$
The process stops when $k = r$ (the rank of the matrix $\mathbf{A}$), because then $N_{r+1} = 0$ or is numerically negligible. It follows that this tridiagonalization process results in the identity
$$
\mathbf{A}^T\mathbf{A} = \mathbf{Y}_r\,\mathbf{T}_r\,\mathbf{Y}_r^T,
$$
where the tridiagonal matrix of coefficients is defined by
$$
\mathbf{T}_k =
\begin{pmatrix}
D_1 & N_2 & & & \\
N_2 & D_2 & N_3 & & \\
 & N_3 & D_3 & N_4 & \\
 & & \ddots & \ddots & \ddots \\
 & & & N_k & D_k
\end{pmatrix}
$$
for $2 \le k \le r$.
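For concreteness, the recurrence implied by (lanczos1)-(lanczos3), together with the constants $D_k$ and $N_{k+1}$ and the matrices $\mathbf{Y}_k$ and $\mathbf{T}_k$, can be sketched in a few lines of NumPy. The function name and the relative tolerance `tol` used to decide when $N_{r+1}$ is numerically negligible are illustrative choices, not part of the derivation above:

```python
import numpy as np

def lanczos_normal_equations(A, b, tol=1e-12):
    """Lanczos tridiagonalization of A^T A, started from A^T b.

    Returns Y (columns y^(1), ..., y^(k)), the tridiagonal T, and N_1.
    """
    ATb = A.T @ b
    N = [np.linalg.norm(ATb)]       # N_1 = ||A^T b||
    Y = [ATb / N[0]]                # y^(1): unit vector along A^T b
    D = []
    while True:
        w = A.T @ (A @ Y[-1])       # A^T A y^(k), without forming A^T A
        D.append(Y[-1] @ w)         # D_k = (y^(k))^T A^T A y^(k)
        w = w - D[-1] * Y[-1]       # remove the component along y^(k) ...
        if len(Y) > 1:
            w = w - N[-1] * Y[-2]   # ... and along y^(k-1) (three-term recurrence)
        n_next = np.linalg.norm(w)  # N_{k+1}
        if n_next <= tol * N[0]:    # N_{r+1} = 0 stops the process at k = r
            break
        N.append(n_next)
        Y.append(w / n_next)        # y^(k+1): normalized remainder
    T = np.diag(D) + np.diag(N[1:], 1) + np.diag(N[1:], -1)
    return np.column_stack(Y), T, N[0]
```

(In floating point, a long recurrence slowly loses orthogonality, so practical implementations often reorthogonalize $\mathbf{y}^{(k+1)}$ against the stored vectors.)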
Since $\mathbf{T}_r$ is invertible (by the definition of the rank $r$ of $\mathbf{A}$), the solution to the least-squares inversion problem may therefore be written as
$$
\hat{\mathbf{x}} = \mathbf{Y}_r\left(\mathbf{T}_r\right)^{-1}\mathbf{Y}_r^T\mathbf{A}^T\mathbf{b} = N_1\,\mathbf{Y}_r\left(\mathbf{T}_r\right)^{-1}\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \tag{Adagger}
$$
where I used (lanczos1) and (d1), which together give $\mathbf{A}^T\mathbf{b} = N_1\,\mathbf{y}^{(1)}$, and the fact that
$$
\mathbf{Y}_r^T\,\mathbf{y}^{(1)} = \left(1, 0, \ldots, 0\right)^T.
$$
It follows that only the first column of the inverse of $\mathbf{T}_r$ is needed to solve this inversion problem.
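Numerically, this observation means that one small tridiagonal solve with right-hand side $N_1(1, 0, \ldots, 0)^T$ replaces any explicit inversion of $\mathbf{T}_r$. A sketch building on the (illustrative) `lanczos_normal_equations` above:

```python
def lanczos_solve(A, b):
    """Least-squares solution x_hat = N_1 Y_r T_r^{-1} (1, 0, ..., 0)^T."""
    Y, T, N1 = lanczos_normal_equations(A, b)
    rhs = np.zeros(T.shape[0])
    rhs[0] = N1                         # N_1 (1, 0, ..., 0)^T
    return Y @ np.linalg.solve(T, rhs)  # only the first column of T^{-1} is used

# For a generic full-rank A this agrees with the least-squares solution:
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 8))
b = rng.standard_normal(20)
print(np.allclose(lanczos_solve(A, b), np.linalg.lstsq(A, b, rcond=None)[0]))
```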
Since Lanczos's method directly produces a sequence of orthonormal vectors in the model space, it is straightforward to see that the model resolution matrix for this method is given by
$$
\mathbf{R}_{\rm model} = \mathbf{A}^T\left(\mathbf{A}\mathbf{A}^T\right)^{\dagger}\mathbf{A} = \mathbf{Y}_r\,\mathbf{Y}_r^T = \sum_{k=1}^{r} \mathbf{y}^{(k)}\left(\mathbf{y}^{(k)}\right)^T, \tag{modelreslanczos}
$$
which is also clearly symmetric. It is, however, more difficult to compute the data resolution matrix. Using the fact that
$$
\mathbf{R}_{\rm data} = \mathbf{A}\,\mathbf{A}^{\dagger} = \mathbf{A}\left(\mathbf{A}^T\mathbf{A}\right)^{\dagger}\mathbf{A}^T
$$
together with (Adagger), I find that
$$
\mathbf{R}_{\rm data} = \mathbf{A}\,\mathbf{Y}_r\left(\mathbf{T}_r\right)^{-1}\mathbf{Y}_r^T\mathbf{A}^T. \tag{datareslanczos}
$$
It is clear that both of the resolution matrices are symmetric when the full Lanczos inverse has been computed. If the process is terminated early, so that $k < r$, by setting the constant $N_{k+1}$ equal to zero, then a Lanczos approximate inverse is given by
$$
\hat{\mathbf{A}}_k^{-1} = \mathbf{Y}_k\left(\mathbf{T}_k\right)^{-1}\mathbf{Y}_k^T\mathbf{A}^T,
$$
so that $\hat{\mathbf{A}}_k^{-1}\mathbf{A}\,\hat{\mathbf{A}}_k^{-1} = \hat{\mathbf{A}}_k^{-1}$ for all $k$, but $\mathbf{A}\,\hat{\mathbf{A}}_k^{-1}\mathbf{A} \neq \mathbf{A}$ if $k < r$.
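Both properties are easy to check numerically. The sketch below forms the truncated inverse explicitly, which is fine for small illustrative problems, though one would never build these dense matrices at scale:

```python
def lanczos_approximate_inverse(A, Y, T, k):
    """Truncated Lanczos inverse  Y_k T_k^{-1} Y_k^T A^T."""
    Yk, Tk = Y[:, :k], T[:k, :k]
    return Yk @ np.linalg.solve(Tk, Yk.T @ A.T)

# With Y, T from lanczos_normal_equations(A, b) and r = Y.shape[1]:
#   Ak = lanczos_approximate_inverse(A, Y, T, k)
#   np.allclose(Ak @ A @ Ak, Ak)  ->  True for every k
#   np.allclose(A @ Ak @ A, A)    ->  True only once k = r
```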
The effective resolution matrices are given by $\mathbf{R}_{{\rm model},k} = \mathbf{Y}_k\,\mathbf{Y}_k^T$ and $\mathbf{R}_{{\rm data},k} = \mathbf{A}\,\mathbf{Y}_k\left(\mathbf{T}_k\right)^{-1}\mathbf{Y}_k^T\mathbf{A}^T$; that is, they are found by replacing $\mathbf{Y}_r$ and $\mathbf{T}_r$ in (modelreslanczos) and (datareslanczos) by $\mathbf{Y}_k$ and $\mathbf{T}_k$, respectively. Clearly, the effective resolutions are also symmetric.
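Both effective resolution matrices then follow from the truncated factorization; a short illustrative sketch:

```python
def resolution_matrices(A, Y, T, k=None):
    """R_model = Y_k Y_k^T and R_data = A Y_k T_k^{-1} Y_k^T A^T.

    With k = None the full (k = r) resolutions are returned; k < r gives
    the effective resolutions. Both are symmetric since T_k is symmetric.
    """
    if k is not None:
        Y, T = Y[:, :k], T[:k, :k]
    R_model = Y @ Y.T                               # sum of y^(j) (y^(j))^T
    R_data = A @ Y @ np.linalg.solve(T, Y.T @ A.T)  # A Y T^{-1} Y^T A^T
    return R_model, R_data
```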
An alternative method of computing the resolution matrices involves noting that the vector sequence $\mathbf{y}^{(1)}, \mathbf{y}^{(2)}, \ldots$ may also be used as the starting set of vectors for the method of conjugate directions. I can then produce a set of conjugate vectors from this orthogonal set and easily compute both the model resolution matrix and the data resolution matrix if desired. This alternative requires additional work to produce the sequence of conjugate vectors, however, and it generally produces a model resolution matrix that is not symmetric and is therefore difficult to interpret. From this point of view, the Lanczos procedure appears to be superior to conjugate directions/gradients.
The main disadvantage of the Lanczos method is that the amount of storage increases with the iteration number $k$, since all of the vectors $\mathbf{y}^{(k)}$ must be stored until the final calculation of $\hat{\mathbf{x}}$. In contrast, conjugate gradients requires a fixed amount of storage, but an easily interpretable model resolution matrix must be sacrificed to gain this advantage.