Now let us examine the theory and coding behind the above examples.
Define a roughening filter $A$ and a data signal $Y$ at some stage of interpolation. The fitting goal is $0 \approx AY$, where the filter $A$ has at least one time-domain coefficient constrained to be nonzero and the data contains both known and missing values. Think of perturbations $\Delta A$ and $\Delta Y$. We neglect the nonlinear term $\Delta A\,\Delta Y$ as follows:
$$0 \;\approx\; (A + \Delta A)(Y + \Delta Y) \tag{31}$$

$$0 \;\approx\; A\,\Delta Y + Y\,\Delta A + A\,Y + \Delta A\,\Delta Y \tag{32}$$

$$0 \;\approx\; A\,\Delta Y + Y\,\Delta A + A\,Y \tag{33}$$
Linearizing a nonlinear fitting goal such as (33) seems reasonable; indeed, that is how I first obtained the results of this chapter. Nevertheless, I find linearizing a dangerous procedure that quickly leads to many problems. In reality, the neglected term $\Delta A\,\Delta Y$ might not be small, and the solution might not converge. You don't know how small is small, because these are not scalars but operators. The answer will depend on where you start from. Strange things happen, and you can feel a sense of frustration.
There is good news: two-stage linear least squares offers relief. First use linear least squares to get close to your answer. Generally, when finding the filter $A$ we have plenty of regression equations, so we can afford to neglect the ones that depend upon unknown data. Then, in a second stage, fix the filter and find the missing data (see the sketch below). If you don't have enough regression equations because your data is irregularly distributed, you can use binning. Still not enough? Try coarser bins.
The point is that nonlinear solvers will not work unless you
first get sufficiently close to the solution,
and the way you get close is by arranging first to
solve a sensible (though approximate) linearized problem.
Only as a last resort, after you have gotten as near as you can,
should you use the nonlinear least-squares techniques described below.
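To make the two-stage strategy concrete, here is a minimal NumPy sketch. This is my own illustrative code, not the book's modules: names such as `bin_data`, `estimate_pef`, and `fill_missing` are hypothetical, and plain `np.linalg.lstsq` stands in for an iterative least-squares solver.

```python
import numpy as np

def bin_data(t, d, t0, dt, n):
    """Average irregularly sampled data (t, d) into n regular bins of width dt."""
    idx = np.floor((t - t0) / dt).astype(int)
    ok = (idx >= 0) & (idx < n)
    num = np.bincount(idx[ok], weights=d[ok], minlength=n)
    cnt = np.bincount(idx[ok], minlength=n)
    known = cnt > 0                           # empty bins become "missing data"
    return np.where(known, num / np.maximum(cnt, 1), 0.0), known

def estimate_pef(y, known, na):
    """Stage 1: fit a prediction-error filter a = (1, a_2, ..., a_na) using
    only the regression equations whose inputs are all known samples."""
    rows, rhs = [], []
    for t in range(na - 1, len(y)):
        if known[t - na + 1 : t + 1].all():   # discard equations touching unknowns
            rows.append([y[t - k] for k in range(1, na)])
            rhs.append(-y[t])                 # leading coefficient fixed at 1
    tail, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return np.r_[1.0, tail]

def fill_missing(y, known, a):
    """Stage 2: hold the filter fixed and pick the missing samples that
    minimize the output power of the filter (linear least squares)."""
    n, na = len(y), len(a)
    A = np.zeros((n + na - 1, n))             # convolution matrix of the filter
    for i in range(n):
        A[i : i + na, i] = a
    free = ~known
    dy, *_ = np.linalg.lstsq(A[:, free], -A[:, known] @ y[known], rcond=None)
    out = y.copy()
    out[free] = dy
    return out

# Hypothetical usage: fill a gap in a smooth signal.
y = np.sin(0.4 * np.arange(30.0))
y[12:18] = 0.0
known = np.ones(30, dtype=bool)
known[12:18] = False
a = estimate_pef(y, known, na=3)
y_filled = fill_missing(y, known, a)
```

The essential design choice is in stage 1: any regression equation whose lags touch a missing sample is simply discarded, which is affordable whenever the equations greatly outnumber the filter coefficients.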
Let us use matrix algebraic notation to rewrite the fitting goal (33). For this we need mask matrices (diagonal matrices with ones on the diagonal where variables are free and zeros where they are constrained, i.e., where $\Delta a_t = 0$ and $\Delta y_t = 0$). The free-mask matrix for missing data is denoted $\mathbf{J}$, and that for the PE filter is $\mathbf{K}$. The fitting goal (33) becomes
$$\mathbf{0} \;\approx\; \mathbf{A}\mathbf{J}\,\Delta\mathbf{y} \;+\; \mathbf{Y}\mathbf{K}\,\Delta\mathbf{a} \;+\; \mathbf{A}\mathbf{y} \tag{34}$$

Defining the original residual as $\bar{\mathbf{r}} = \mathbf{A}\mathbf{y}$ (which, by commutativity of convolution, equals $\mathbf{Y}\mathbf{a}$), this becomes

$$\mathbf{0} \;\approx\; \mathbf{A}\mathbf{J}\,\Delta\mathbf{y} \;+\; \mathbf{Y}\mathbf{K}\,\Delta\mathbf{a} \;+\; \bar{\mathbf{r}} \tag{35}$$
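In code, $\mathbf{J}$ and $\mathbf{K}$ are just 0/1 diagonals. A tiny sketch, assuming a hypothetical boolean array `known` that marks the measured samples:

```python
import numpy as np

known = np.array([True, True, False, False, True, True, True])
J = np.diag((~known).astype(float))       # free exactly where data is missing
na = 3
K = np.diag(np.r_[0.0, np.ones(na - 1)])  # all filter coefficients free except
                                          # the constrained leading one
```

In practice nobody forms these as dense matrices; applying $\mathbf{J}$ or $\mathbf{K}$ amounts to elementwise multiplication by the diagonal.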
For a 3-term filter and a 7-point data signal, the fitting goal (34) becomes
$$
\mathbf{0} \;\approx\;
\begin{bmatrix}
a_1 & \cdot & \cdot & \cdot & \cdot & \cdot & \cdot & y_1 & \cdot & \cdot \\
a_2 & a_1 & \cdot & \cdot & \cdot & \cdot & \cdot & y_2 & y_1 & \cdot \\
a_3 & a_2 & a_1 & \cdot & \cdot & \cdot & \cdot & y_3 & y_2 & y_1 \\
\cdot & a_3 & a_2 & a_1 & \cdot & \cdot & \cdot & y_4 & y_3 & y_2 \\
\cdot & \cdot & a_3 & a_2 & a_1 & \cdot & \cdot & y_5 & y_4 & y_3 \\
\cdot & \cdot & \cdot & a_3 & a_2 & a_1 & \cdot & y_6 & y_5 & y_4 \\
\cdot & \cdot & \cdot & \cdot & a_3 & a_2 & a_1 & y_7 & y_6 & y_5 \\
\cdot & \cdot & \cdot & \cdot & \cdot & a_3 & a_2 & \cdot & y_7 & y_6 \\
\cdot & \cdot & \cdot & \cdot & \cdot & \cdot & a_3 & \cdot & \cdot & y_7
\end{bmatrix}
\begin{bmatrix} \mathbf{J}\,\Delta\mathbf{y} \\ \mathbf{K}\,\Delta\mathbf{a} \end{bmatrix}
\;+\; \bar{\mathbf{r}}
\tag{36}
$$

Recall that $\bar{r}_t$ is the convolution of $a_t$ with $y_t$, namely $\bar{r}_1 = y_1 a_1$, $\bar{r}_2 = y_2 a_1 + y_1 a_2$, etc. To optimize this fitting goal, we first initialize $\mathbf{a} = (1, 0, 0, \dots)$ and insert zeros for the missing values in $\mathbf{y}$. Then we iterate over equations (37)-(41):

$$\mathbf{r} \;\leftarrow\; \mathbf{A}\,\mathbf{y} \tag{37}$$

$$\begin{bmatrix} \Delta\mathbf{y} \\ \Delta\mathbf{a} \end{bmatrix} \;\leftarrow\; \begin{bmatrix} \mathbf{J}'\mathbf{A}' \\ \mathbf{K}'\mathbf{Y}' \end{bmatrix} \mathbf{r} \tag{38}$$

$$\Delta\mathbf{r} \;\leftarrow\; \begin{bmatrix} \mathbf{A}\mathbf{J} & \mathbf{Y}\mathbf{K} \end{bmatrix} \begin{bmatrix} \Delta\mathbf{y} \\ \Delta\mathbf{a} \end{bmatrix} \tag{39}$$

$$\mathbf{y} \;\leftarrow\; \operatorname{cgstep}(\mathbf{y}, \Delta\mathbf{y}) \tag{40}$$

$$\mathbf{a} \;\leftarrow\; \operatorname{cgstep}(\mathbf{a}, \Delta\mathbf{a}) \tag{41}$$
This is the same idea as all the linear fitting goals we have been solving, except that now we recompute the residual $\mathbf{r}$ inside the iteration loop, so that as convergence is achieved (if it is achieved), the neglected nonlinear term $\Delta A\,\Delta Y$ tends to zero.
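For a concrete picture of one pass of (37)-(41), here is a rough NumPy sketch. It is my own illustration, not the book's missif module: a plain steepest-descent step with an exact line search stands in for cgstep(), and names like `convmtx` and `missing_and_filter` are hypothetical.

```python
import numpy as np

def convmtx(v, n):
    """(len(v)+n-1) x n matrix whose product with x is the convolution v * x."""
    M = np.zeros((len(v) + n - 1, n))
    for i in range(n):
        M[i : i + len(v), i] = v
    return M

def missing_and_filter(y, known, na, niter=100):
    y = y.astype(float)
    a = np.r_[1.0, np.zeros(na - 1)]      # leading coefficient constrained to 1
    jmask = (~known).astype(float)        # diagonal of J: free where data missing
    kmask = np.r_[0.0, np.ones(na - 1)]   # diagonal of K: lead coefficient frozen
    for _ in range(niter):
        A = convmtx(a, len(y))            # convolution with the current filter
        Y = convmtx(y, na)                # convolution with the current data
        r = A @ y                         # (37): residual recomputed every pass
        dy = jmask * (A.T @ r)            # (38): masked gradient for the data...
        da = kmask * (Y.T @ r)            #       ...and for the filter
        dr = A @ dy + Y @ da              # (39): resulting residual perturbation
        denom = dr @ dr
        if denom == 0.0:                  # already converged
            break
        alpha = -(dr @ r) / denom         # exact step along the joint direction
        y += alpha * dy                   # (40): descent step in place of cgstep
        a += alpha * da                   # (41)
    return y, a
```

Because $\mathbf{r}$ is rebuilt from the current $\mathbf{a}$ and $\mathbf{y}$ at the top of every pass, the update directions always come from the freshly linearized problem rather than from a residual carried over from the previous linearization.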
I no longer exhibit the nonlinear solver missif until I find an example where it produces noticeably better results than multistage linear least squares.