The method of Lanczos may be applied to any square, symmetric matrix inversion
problem. To solve the least-squares inversion problem, I applied the
method to the normal matrix in the preceding section. But I could
instead have ``completed the square'' and considered the problem
\begin{equation*}
\begin{pmatrix} \mathbf{I} & G \\ G^T & \mathbf{0} \end{pmatrix}
\begin{pmatrix} \mathbf{e} \\ \mathbf{m} \end{pmatrix}
=
\begin{pmatrix} \mathbf{d} \\ \mathbf{0} \end{pmatrix},
\tag{PSLSQR}
\end{equation*}
where $\mathbf{e} = \mathbf{d} - G\mathbf{m}$ is the residual vector.
The resulting modification of Lanczos's method is the LSQR algorithm of
Paige and Saunders (1982).
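Before deriving the algorithm, here is a minimal numerical check (NumPy; the
variable names are mine, not Paige and Saunders') that solving (PSLSQR)
reproduces the ordinary least-squares solution and residual:
\begin{verbatim}
# Check that the augmented system (PSLSQR) reproduces least squares.
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 3
G = rng.standard_normal((m, n))     # forward operator
d = rng.standard_normal(m)          # data

# Symmetric (m+n) x (m+n) system [[I, G], [G^T, 0]] (e; m) = (d; 0).
H = np.block([[np.eye(m), G], [G.T, np.zeros((n, n))]])
sol = np.linalg.solve(H, np.concatenate([d, np.zeros(n)]))
e, mhat = sol[:m], sol[m:]

assert np.allclose(mhat, np.linalg.lstsq(G, d, rcond=None)[0])
assert np.allclose(e, d - G @ mhat)   # e is the residual vector
\end{verbatim}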
To derive this algorithm, first apply Lanczos's method to (PSLSQR) and
divide the $(m+n)$-vector $\mathbf{z}^{(k)}$ into the two pieces:
$\mathbf{g}^{(k)}$ in the $m$-dimensional data space and
$\mathbf{h}^{(k)}$ in the $n$-dimensional model space. Thus,
\begin{equation*}
\mathbf{z}^{(k)} = \begin{pmatrix} \mathbf{g}^{(k)} \\ \mathbf{h}^{(k)} \end{pmatrix}.
\tag{zgh}
\end{equation*}
Then, the tridiagonalization process takes the form
\begin{equation*}
\bigl[\mathbf{z}^{(1)}(\mathbf{z}^{(1)})^T + \mathbf{z}^{(2)}(\mathbf{z}^{(2)})^T\bigr]
\begin{pmatrix} \mathbf{I} & G \\ G^T & \mathbf{0} \end{pmatrix} \mathbf{z}^{(1)}
= \begin{pmatrix} \mathbf{I} & G \\ G^T & \mathbf{0} \end{pmatrix} \mathbf{z}^{(1)},
\tag{Hlanczos1}
\end{equation*}
and, for $k \ge 2$,
\begin{equation*}
\bigl[\mathbf{z}^{(k-1)}(\mathbf{z}^{(k-1)})^T + \mathbf{z}^{(k)}(\mathbf{z}^{(k)})^T
+ \mathbf{z}^{(k+1)}(\mathbf{z}^{(k+1)})^T\bigr]
\begin{pmatrix} \mathbf{I} & G \\ G^T & \mathbf{0} \end{pmatrix} \mathbf{z}^{(k)}
= \begin{pmatrix} \mathbf{I} & G \\ G^T & \mathbf{0} \end{pmatrix} \mathbf{z}^{(k)}.
\tag{Hlanczos2}
\end{equation*}
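Because the $\mathbf{z}^{(j)}$ are orthonormal, the bracketed sums in
(Hlanczos1) and (Hlanczos2) are projectors onto the span of the neighboring
Lanczos vectors, so these equations are simply a restatement of the familiar
three-term recurrence,
\begin{equation*}
\begin{pmatrix} \mathbf{I} & G \\ G^T & \mathbf{0} \end{pmatrix} \mathbf{z}^{(k)}
\in \operatorname{span}\bigl\{\mathbf{z}^{(k-1)},\, \mathbf{z}^{(k)},\, \mathbf{z}^{(k+1)}\bigr\}.
\end{equation*}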
Solving equations (Hlanczos1) and (Hlanczos2) after substituting (zgh)
shows that $\mathbf{g}^{(2)} = \mathbf{0}$ and
\begin{equation*}
\mathbf{h}^{(2)}(\mathbf{h}^{(2)})^T\, G^T\mathbf{g}^{(1)} = G^T\mathbf{g}^{(1)},
\end{equation*}
and, for $k \ge 1$,
\begin{equation*}
\bigl[\mathbf{g}^{(2k-1)}(\mathbf{g}^{(2k-1)})^T + \mathbf{g}^{(2k+1)}(\mathbf{g}^{(2k+1)})^T\bigr]\,
G\,\mathbf{h}^{(2k)} = G\,\mathbf{h}^{(2k)}
\quad\text{and}\quad \mathbf{h}^{(2k+1)} = \mathbf{0},
\end{equation*}
and
\begin{equation*}
\mathbf{g}^{(2k+2)} = \mathbf{0} \quad\text{and}\quad
\bigl[\mathbf{h}^{(2k)}(\mathbf{h}^{(2k)})^T + \mathbf{h}^{(2k+2)}(\mathbf{h}^{(2k+2)})^T\bigr]\,
G^T\mathbf{g}^{(2k+1)} = G^T\mathbf{g}^{(2k+1)}.
\end{equation*}
Thus, the tridiagonal system of Lanczos has been reduced to two coupled
bidiagonal systems in LSQR.
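A small numerical sketch (NumPy, with full reorthogonalization; all names are
mine) makes this splitting visible: plain Lanczos on the augmented matrix,
started from $(\mathbf{d};\mathbf{0})/\|\mathbf{d}\|$, produces vectors whose
data-space and model-space halves vanish alternately:
\begin{verbatim}
# Lanczos on the augmented matrix: the halves vanish alternately.
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 3
G = rng.standard_normal((m, n))
d = rng.standard_normal(m)

H = np.block([[np.eye(m), G], [G.T, np.zeros((n, n))]])
Z = [np.concatenate([d, np.zeros(n)]) / np.linalg.norm(d)]
for _ in range(2 * n):
    w = H @ Z[-1]
    for zj in Z:                     # full reorthogonalization
        w -= (zj @ w) * zj
    if np.linalg.norm(w) < 1e-10:
        break                        # Krylov space exhausted
    Z.append(w / np.linalg.norm(w))

for j, zj in enumerate(Z, start=1):
    if j % 2:                        # odd z^(j) = (g^(j); 0)
        assert np.allclose(zj[m:], 0)
    else:                            # even z^(j) = (0; h^(j))
        assert np.allclose(zj[:m], 0)
\end{verbatim}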
Defining the constants
\begin{equation*}
b_1 = \|\mathbf{d}\| = (\mathbf{g}^{(1)})^T\mathbf{d},
\tag{b1}
\end{equation*}
and, for $k \ge 1$,
\begin{equation*}
b_{2k} = (\mathbf{h}^{(2k)})^T G^T \mathbf{g}^{(2k-1)}
       = (\mathbf{g}^{(2k-1)})^T G\, \mathbf{h}^{(2k)},
\end{equation*}
and
\begin{equation*}
b_{2k+1} = (\mathbf{h}^{(2k)})^T G^T \mathbf{g}^{(2k+1)}
         = (\mathbf{g}^{(2k+1)})^T G\, \mathbf{h}^{(2k)}.
\end{equation*}
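Reading these definitions together with the orthogonality relations above
gives an explicit coupled recursion, which is how the $b_j$ and the Lanczos
pieces are actually computed in practice (if I have unwound the projectors
correctly, this is precisely the Golub--Kahan bidiagonalization recurrence
that underlies LSQR):
\begin{equation*}
\mathbf{g}^{(1)} = \mathbf{d}/b_1, \qquad
b_2\,\mathbf{h}^{(2)} = G^T\mathbf{g}^{(1)},
\end{equation*}
and, for $k \ge 1$,
\begin{equation*}
b_{2k+1}\,\mathbf{g}^{(2k+1)} = G\,\mathbf{h}^{(2k)} - b_{2k}\,\mathbf{g}^{(2k-1)},
\qquad
b_{2k+2}\,\mathbf{h}^{(2k+2)} = G^T\mathbf{g}^{(2k+1)} - b_{2k+1}\,\mathbf{h}^{(2k)},
\end{equation*}
with each $b_j$ fixed by requiring the new vector to have unit length.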
Then, the orthogonal matrix of the $\mathbf{g}$ vectors is defined by
\begin{equation*}
\mathbf{G}_k = \begin{pmatrix} \mathbf{g}^{(1)} & \mathbf{g}^{(3)} & \cdots & \mathbf{g}^{(2k-1)} \end{pmatrix},
\end{equation*}
while the corresponding matrix of the $\mathbf{h}$ vectors is defined by
\begin{equation*}
\mathbf{H}_k = \begin{pmatrix} \mathbf{h}^{(2)} & \mathbf{h}^{(4)} & \cdots & \mathbf{h}^{(2k)} \end{pmatrix}.
\end{equation*}
Now the two bidiagonal systems may be written, for $k \ge 1$, as
\begin{equation*}
\mathbf{H}_k
\begin{pmatrix}
b_2 & b_3 &        &          \\
    & b_4 & \ddots &          \\
    &     & \ddots & b_{2k-1} \\
    &     &        & b_{2k}
\end{pmatrix}
= G^T \mathbf{G}_k,
\tag{HBG}
\end{equation*}
and
\begin{equation*}
\begin{pmatrix} \mathbf{0} & \cdots & \mathbf{0} & b_{2k+1}\,\mathbf{g}^{(2k+1)} \end{pmatrix}
+ \mathbf{G}_k
\begin{pmatrix}
b_2 &        &          &        \\
b_3 & b_4    &          &        \\
    & \ddots & \ddots   &        \\
    &        & b_{2k-1} & b_{2k}
\end{pmatrix}
= G\,\mathbf{H}_k.
\tag{GBH}
\end{equation*}
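The recurrence written out earlier is easy to code directly. The following
sketch (NumPy; the helper bidiag and the variable names are mine) computes the
$b_j$ and the two families of vectors, builds the lower bidiagonal matrix of
the $b$'s (introduced formally just below), and checks (HBG) and (GBH),
including the single nonzero correction column in (GBH):
\begin{verbatim}
# Golub-Kahan-style recurrence for the b's and the g, h vectors,
# followed by a numerical check of (HBG) and (GBH).
import numpy as np

def bidiag(G, d, steps):
    b1 = np.linalg.norm(d)
    g = [d / b1]                    # g^(1), g^(3), ...
    h, beven, bodd = [], [], []     # b_2, b_4, ... and b_3, b_5, ...
    w = G.T @ g[0]
    for _ in range(steps):
        beven.append(np.linalg.norm(w))        # b_{2k}
        h.append(w / beven[-1])                # h^(2k)
        u = G @ h[-1] - beven[-1] * g[-1]
        bodd.append(np.linalg.norm(u))         # b_{2k+1}
        if bodd[-1] < 1e-12:
            break                              # exact termination
        g.append(u / bodd[-1])                 # g^(2k+1)
        w = G.T @ g[-1] - bodd[-1] * h[-1]
    return b1, np.array(g).T, np.array(h).T, beven, bodd

rng = np.random.default_rng(2)
G = rng.standard_normal((6, 4))
d = rng.standard_normal(6)              # generically inconsistent
k = 4
b1, Gk, Hk, beven, bodd = bidiag(G, d, k)
Gk = Gk[:, :k]
B = np.diag(beven) + np.diag(bodd[:k - 1], -1)   # lower bidiagonal

assert np.allclose(Hk @ B.T, G.T @ Gk)           # (HBG)
resid = G @ Hk - Gk @ B                          # (GBH): only the
assert np.allclose(resid[:, :k - 1], 0)          # last column survives
\end{verbatim}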
Then, introducing the lower bidiagonal matrix
\begin{equation*}
\mathbf{B}_k =
\begin{pmatrix}
b_2 &        &          &        \\
b_3 & b_4    &          &        \\
    & \ddots & \ddots   &        \\
    &        & b_{2k-1} & b_{2k}
\end{pmatrix},
\end{equation*}
it is clear that, when $b_{2r+1} = 0$ (or is numerically negligible),
multiplying (GBH) on the right by $(\mathbf{B}_r)^{-1}\mathbf{G}_r^T$ yields
the formal relation
\begin{equation*}
G^{-g}_r = \mathbf{H}_r(\mathbf{B}_r)^{-1}\mathbf{G}_r^T
\end{equation*}
for the approximate generalized inverse. The coupled equations (HBG) and
(GBH) are equivalent to
\begin{equation*}
\mathbf{H}_r\,\mathbf{B}_r^T\mathbf{B}_r = G^T G\,\mathbf{H}_r,
\end{equation*}
which should be compared to (tridiagonal). If the tridiagonal matrix
satisfies
\begin{equation*}
\mathbf{T}_r = \mathbf{B}_r^T\mathbf{B}_r
\end{equation*}
(so that $D_k = b_{2k}^2 + b_{2k+1}^2$ and $N_k = b_{2k-1}\,b_{2k}$), then
Lanczos's orthogonal matrix for the least-squares problem satisfies
\begin{equation*}
\mathbf{Q}_r = \mathbf{H}_r.
\end{equation*}
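Continuing the numerical sketch above, both identities are easy to confirm:
$\mathbf{B}_r^T\mathbf{B}_r$ is tridiagonal with exactly the entries just
quoted, and $\mathbf{H}_r$ satisfies the Lanczos relation for the normal
matrix up to a last-column term proportional to $b_{2r+1}$:
\begin{verbatim}
# Continuing the previous sketch: T = B^T B is tridiagonal with
# D_k = b_{2k}^2 + b_{2k+1}^2 and N_k = b_{2k-1} b_{2k}.
T = B.T @ B
D = [beven[j]**2 + bodd[j]**2 for j in range(k - 1)]
N = [bodd[j] * beven[j + 1] for j in range(k - 1)]
assert np.allclose(np.diag(T)[:k - 1], D)
assert np.allclose(np.diag(T, 1), N)

# Lanczos relation on the normal matrix, G^T G Hk = Hk T, up to a
# last-column correction proportional to b_{2k+1}.
resid2 = G.T @ G @ Hk - Hk @ T
assert np.allclose(resid2[:, :k - 1], 0)
\end{verbatim}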
The solution of the inversion problem may now be written as
\begin{equation*}
\mathbf{m}_r = G^{-g}_r\,\mathbf{d}
= \mathbf{H}_r(\mathbf{B}_r)^{-1}\mathbf{G}_r^T\mathbf{d}
= b_1\,\mathbf{H}_r(\mathbf{B}_r)^{-1}
\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix},
\tag{MHBG}
\end{equation*}
where I used (gh1), (b1), and the fact that
\begin{equation*}
\mathbf{G}_r^T\,\mathbf{g}^{(1)} = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.
\end{equation*}
It follows that only the first column of the inverse of $\mathbf{B}_r$ is
needed to solve the inversion problem.
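Because $\mathbf{B}_r$ is lower bidiagonal, that first column follows from a
two-term forward substitution, so the model estimate costs almost nothing
beyond the recurrence itself. Continuing the numerical sketch above:
\begin{verbatim}
# First column of B^{-1} by forward substitution, then the model
# estimate m_r = b_1 Hk B^{-1} (1, 0, ..., 0)^T from (MHBG).
w = np.zeros(k)
w[0] = 1.0 / beven[0]                        # 1 / b_2
for j in range(1, k):
    w[j] = -bodd[j - 1] * w[j - 1] / beven[j]
assert np.allclose(B @ w, np.eye(k)[:, 0])   # w = 1st column of B^{-1}
mr = b1 * (Hk @ w)                           # the model estimate
\end{verbatim}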
Once I have the approximate generalized inverse via (MHBG), I can write the
model resolution matrix as
\begin{equation*}
\mathbf{R}_{\mathrm{model}} = G^{-g}_r G
= \mathbf{H}_r(\mathbf{B}_r)^{-1}\mathbf{G}_r^T G
= \mathbf{H}_r\mathbf{H}_r^T.
\end{equation*}
Although it is tempting to think that the data resolution is given similarly
by
\begin{equation*}
\mathbf{R}_{\mathrm{data}} = G\, G^{-g}_r = \mathbf{G}_r\mathbf{G}_r^T,
\end{equation*}
this is not true in general! It is easy to see that
\begin{equation*}
\mathbf{G}_r\mathbf{G}_r^T\,\mathbf{d} = \mathbf{d}
\tag{counterexample}
\end{equation*}
by construction. Yet the absurd result seemingly implied by
(counterexample), namely that the data resolution is always perfect for this
method, cannot hold generally for $\mathbf{R}_{\mathrm{data}}$: it would
imply that the data can be perfectly matched in all problems (all possible
choices of $G$ and $\mathbf{d}$), which is clearly false.
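With the sketch above, the paradox is easy to exhibit numerically:
$\mathbf{G}_r\mathbf{G}_r^T$ returns $\mathbf{d}$ to machine precision even
though the misfit $\|\mathbf{d} - G\mathbf{m}_r\|$ stays finite for these
(generically inconsistent) random data.
\begin{verbatim}
# Continuing the sketch: Gk Gk^T d = d by construction, even though
# the data are not matched by the model estimate mr.
assert np.allclose(Gk @ (Gk.T @ d), d)    # (counterexample)
print(np.linalg.norm(d - G @ mr))         # nonzero misfit
\end{verbatim}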
The difficulty is that the initial vector $\mathbf{g}^{(1)}$ in LSQR is
proportional to the data vector $\mathbf{d}$ itself, which may contain
contributions from the data null space if $G$ and $\mathbf{d}$ are
inconsistent (i.e., there may exist a vector $\mathbf{y}$ such that
$\mathbf{y}^T\mathbf{d} \ne 0$ but $G^T\mathbf{y} = \mathbf{0}$). Since the
LSQR process never modifies this initial vector and never multiplies it by
$GG^T$ (which would annihilate the null-space part), these terms in the data
null space never get stripped off during the LSQR computation. This
difficulty is equivalent to the observation that $b_{2r+1} \ne 0$ for the
LSQR algorithm if $G$ and $\mathbf{d}$ are inconsistent.
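This last observation can be checked with the bidiag sketch from above (the
construction of the null-space vector via the SVD is mine): for data lying in
the range of $G$ the recurrence terminates with a vanishing odd coefficient,
while adding a data null-space component keeps every $b_{2k+1}$ away from
zero, termination then occurring, in exact arithmetic, through $b_{2k+2}$
instead.
\begin{verbatim}
# Consistent data: the recurrence stops with b_{2r+1} = 0.
m_true = rng.standard_normal(4)
*_, bodc = bidiag(G, G @ m_true, 4)
assert bodc[-1] < 1e-8                # terminates: b_{2r+1} = 0

# Inconsistent data: add a component y with G^T y = 0; now every
# b_{2r+1} stays finite and the null-space part is never stripped.
y = np.linalg.svd(G)[0][:, -1]        # data null space: G^T y = 0
*_, bodi = bidiag(G, G @ m_true + y, 4)
assert min(bodi) > 1e-8
\end{verbatim}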