To fix ideas, let us examine a toy example. The data and the first three rows of the matrix below are random numbers truncated to integers. The model roughening operator is a first-differencing operator scaled by 100.
 d(m)                           F(m,n)                           iter      Norm
 ----  ----------------------------------------------------------  ----  -----------
  41.  -55.  -90.  -24.  -13.  -73.   61.  -27.  -19.   23.  -55.     1  20.00396538
  33.    8.  -86.   72.   87.  -41.   -3.  -29.   29.  -66.   50.     2  12.14780140
 -58.   84.  -49.   80.   44.  -52.  -51.    8.   86.   77.   50.     3   8.94393635
   0.  100.    0.    0.    0.    0.    0.    0.    0.    0.    0.     4   6.04517126
   0. -100.  100.    0.    0.    0.    0.    0.    0.    0.    0.     5   2.64737511
   0.    0. -100.  100.    0.    0.    0.    0.    0.    0.    0.     6   0.79238468
   0.    0.    0. -100.  100.    0.    0.    0.    0.    0.    0.     7   0.46083349
   0.    0.    0.    0. -100.  100.    0.    0.    0.    0.    0.     8   0.08301232
   0.    0.    0.    0.    0. -100.  100.    0.    0.    0.    0.     9   0.00542009
   0.    0.    0.    0.    0.    0. -100.  100.    0.    0.    0.    10   0.00000565
   0.    0.    0.    0.    0.    0.    0. -100.  100.    0.    0.    11   0.00000026
   0.    0.    0.    0.    0.    0.    0.    0. -100.  100.    0.    12   0.00000012
   0.    0.    0.    0.    0.    0.    0.    0.    0. -100.  100.    13   0.00000000
Notice that at the tenth iteration the residual suddenly plunges by about three orders of magnitude. Since there are ten unknowns and the matrix clearly has full column rank, conjugate-gradient theory tells us to expect the exact solution at the tenth iteration. This is the first miracle of conjugate gradients.
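For readers who want to experiment, here is a minimal sketch in Python/NumPy of the same setup; it is not the code that produced the table above. The random integers are freshly drawn, so the printed numbers will differ, and the quantity printed is the norm of the gradient (the normal-equations residual), which may not be exactly how the Norm column above was computed; the plunge near iteration ten should nevertheless reappear.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10                                   # number of unknowns

    # Three random data-fitting equations, truncated to integers.
    F = rng.uniform(-100.0, 100.0, size=(3, n)).astype(int).astype(float)
    d = rng.uniform(-100.0, 100.0, size=3).astype(int).astype(float)

    # Model roughener: first differencing times 100 (ten rows, zero data).
    A = 100.0 * (np.eye(n) - np.eye(n, k=-1))
    G = np.vstack([F, A])                    # stacked operator, 13 x 10
    rhs = np.concatenate([d, np.zeros(n)])   # stacked data

    # Conjugate gradients on the normal equations G'G m = G' rhs.
    m = np.zeros(n)
    r = G.T @ (rhs - G @ m)                  # gradient at the start
    p = r.copy()                             # first search direction
    for it in range(1, 14):
        q = G.T @ (G @ p)
        alpha = (r @ r) / (p @ q)            # step length
        m = m + alpha * p
        r_new = r - alpha * q
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
        print(it, np.linalg.norm(r))         # plunges near iteration n = 10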
The second miracle of conjugate gradients is exhibited below. The data and the data-fitting matrix are the same, but the model damping is simplified from a roughener to a scaled identity.
 d(m)                           F(m,n)                           iter      Norm
 ----  ----------------------------------------------------------  ----  -----------
  41.  -55.  -90.  -24.  -13.  -73.   61.  -27.  -19.   23.  -55.     1   3.64410686
  33.    8.  -86.   72.   87.  -41.   -3.  -29.   29.  -66.   50.     2   0.31269890
 -58.   84.  -49.   80.   44.  -52.  -51.    8.   86.   77.   50.     3  -0.00000021
   0.  100.    0.    0.    0.    0.    0.    0.    0.    0.    0.     4  -0.00000066
   0.    0.  100.    0.    0.    0.    0.    0.    0.    0.    0.     5  -0.00000080
   0.    0.    0.  100.    0.    0.    0.    0.    0.    0.    0.     6  -0.00000065
   0.    0.    0.    0.  100.    0.    0.    0.    0.    0.    0.     7  -0.00000088
   0.    0.    0.    0.    0.  100.    0.    0.    0.    0.    0.     8  -0.00000074
   0.    0.    0.    0.    0.    0.  100.    0.    0.    0.    0.     9  -0.00000035
   0.    0.    0.    0.    0.    0.    0.  100.    0.    0.    0.    10  -0.00000037
   0.    0.    0.    0.    0.    0.    0.    0.  100.    0.    0.    11  -0.00000018
   0.    0.    0.    0.    0.    0.    0.    0.    0.  100.    0.    12   0.00000000
   0.    0.    0.    0.    0.    0.    0.    0.    0.    0.  100.    13   0.00000000
Even though the matrix still has full rank, we see the residual drop about six decimal places after only the third iteration! The reason is that with identity damping the normal-equations operator becomes a scaled identity plus a rank-3 update (one rank per data-fitting equation), so it has at most four distinct eigenvalues, and conjugate gradients converges in at most that many iterations.
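This eigenvalue count can be checked numerically. The sketch below (same illustrative setup as before, not the book's code) stacks the three random fitting rows on top of 100 times an identity and counts the distinct eigenvalues of the normal-equations operator, rounding to absorb floating-point noise.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10
    F = rng.uniform(-100.0, 100.0, size=(3, n)).astype(int).astype(float)

    # Identity damping: G'G = F'F + 100^2 I is a scaled identity
    # plus a rank-3 update, so at most four distinct eigenvalues.
    G = np.vstack([F, 100.0 * np.eye(n)])
    eigs = np.linalg.eigvalsh(G.T @ G)
    print(len(np.unique(np.round(eigs, 6))))  # generically prints 4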
Practitioners usually don't like the identity operator for model shaping, but they do want the accelerated convergence. These goals are reconciled by a change of variables, also known as preconditioning: we iterate on new variables, the roughened model, so that the damping applied to them is an identity, and afterwards recover the model by inverse roughening. The helix gives us a large family of invertible multidimensional roughening operators.
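Here is a minimal sketch of this change of variables under the same toy setup (the names F, A, p and the scale factors are illustrative, not the book's notation). We keep the first-differencing regularization of the first experiment but iterate on p = A m, so the damping term on p is an identity and convergence behaves like the second table; the model is recovered at the end as m = A^{-1} p.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10
    F = rng.uniform(-100.0, 100.0, size=(3, n)).astype(int).astype(float)
    d = rng.uniform(-100.0, 100.0, size=3).astype(int).astype(float)
    A = 100.0 * (np.eye(n) - np.eye(n, k=-1))   # invertible roughener
    Ainv = np.linalg.inv(A)

    # Minimize |F Ainv p - d|^2 + |p|^2: the same objective as the
    # roughened regularization on m, but with identity damping on p.
    G = np.vstack([F @ Ainv, np.eye(n)])
    rhs = np.concatenate([d, np.zeros(n)])

    p = np.zeros(n)
    r = G.T @ (rhs - G @ p)                     # gradient at the start
    s = r.copy()                                # search direction
    for it in range(1, 6):
        q = G.T @ (G @ s)
        alpha = (r @ r) / (s @ q)
        p = p + alpha * s
        r_new = r - alpha * q
        s = r_new + ((r_new @ r_new) / (r @ r)) * s
        r = r_new
        print(it, np.linalg.norm(r))            # tiny after a few steps

    m = Ainv @ p                                # back to the original model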