If we are not careful, our calculation of the PEF could fall into the pitfall of using the missing data to find the PEF, and hence getting the wrong PEF. To avoid this pitfall, imagine a PEF finder that uses weighted least squares where the weighting function vanishes on those fitting equations that involve missing data and is unity elsewhere. In practice, instead of weighting bad results by zero, we simply will not compute them: the residual there is initialized to zero and never changed. Likewise for the adjoint, these components of the residual never contribute to a gradient. So now we need a convolution program that produces no outputs where missing inputs would spoil them.
Recall there are two ways of writing convolution, equation () when we are interested in finding the filter inputs, and equation () when we are interested in finding the filter itself. We have already coded equation (), operator helicon. That operator was useful in missing-data problems. Now we want to find a prediction-error filter, so we need the other case, equation (), and we need to ignore the outputs that will be broken because of missing inputs. The operator module hconest does the job.
Module hconest: helix convolution, adjoint is the filter.
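To make the idea concrete, here is a minimal Python sketch of a convolution whose model is the filter and which skips flagged outputs. It is not the hconest module itself: the function name `conv_filter_model`, its argument list, the boolean mask `mis`, and the use of plain one-dimensional lags are all assumptions made for illustration.

```python
import numpy as np

def conv_filter_model(adj, a, y, x, lags, mis):
    """Convolution with the filter as the model (hypothetical sketch).

    forward (adj=False): y[iy] += a[ia] * x[iy - lags[ia]]
    adjoint (adj=True):  a[ia] += y[iy] * x[iy - lags[ia]]
    Outputs flagged in `mis` are never computed and never feed the adjoint.
    """
    for ia, lag in enumerate(lags):
        for iy in range(lag, len(y)):
            if mis[iy]:
                continue            # skip equations spoiled by missing inputs
            ix = iy - lag
            if adj:
                a[ia] += y[iy] * x[ix]
            else:
                y[iy] += a[ia] * x[ix]

# Illustrative data: six samples, a three-term filter, outputs 2..5 flagged.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
lags = [0, 1, 2]
a = np.array([1.0, -1.0, 0.5])
mis = np.array([False, True, True, True, True, False])

y_out = np.zeros(6)
conv_filter_model(False, a, y_out, x, lags, mis)   # forward: filter -> residual

grad = np.zeros(3)
conv_filter_model(True, grad, y_out, x, lags, mis) # adjoint: residual -> filter space
```

The flagged residual components stay at their initial zero, and the adjoint loop skips the same indices, so the pair still passes the dot-product test.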
Now identify the broken regression equations, those that use missing data. Suppose that $y_2$ and $y_3$ were missing or bad data values in the fitting goal (27). That would spoil the 2nd, 3rd, 4th, and 5th fitting equations, because each of them touches $y_2$ or $y_3$. Thus we would want to be sure that $w_2$, $w_3$, $w_4$, and $w_5$ were zero. (We would still be left with enough equations to find $(a_2,a_3)$.)
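(27) [equation not recovered: the weighted fitting goal, with weights $w_i$ on the regression equations]
(28) [equation not recovered: the convolution that yields the $m_i$]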
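For orientation, one plausible form of these two equations is sketched below. It is a reconstruction from the surrounding text, not the original typesetting: the choice of six data values, the transient (zero-padded) convolution matrix, and the leading filter coefficient fixed at 1 are assumptions. The second equation replaces the data by an indicator that is 1 at the missing values $y_2$ and $y_3$, and the filter by a vector of ones.

```latex
% Assumed form of the weighted fitting goal (27): six data values, three-term filter
\begin{equation*}
\mathbf{0} \;\approx\; \mathbf{W}\,\mathbf{Y}\,\mathbf{a} \;=\;
\operatorname{diag}(w_1,\ldots,w_8)
\begin{bmatrix}
y_1 & 0   & 0   \\
y_2 & y_1 & 0   \\
y_3 & y_2 & y_1 \\
y_4 & y_3 & y_2 \\
y_5 & y_4 & y_3 \\
y_6 & y_5 & y_4 \\
0   & y_6 & y_5 \\
0   & 0   & y_6
\end{bmatrix}
\begin{bmatrix} 1 \\ a_2 \\ a_3 \end{bmatrix}
\end{equation*}

% Assumed form of (28): missing-data indicator convolved with a filter of ones
\begin{equation*}
\begin{bmatrix} m_1 \\ m_2 \\ m_3 \\ m_4 \\ m_5 \\ m_6 \\ m_7 \\ m_8 \end{bmatrix}
=
\begin{bmatrix}
0 & 0 & 0 \\
1 & 0 & 0 \\
1 & 1 & 0 \\
0 & 1 & 1 \\
0 & 0 & 1 \\
0 & 0 & 0 \\
0 & 0 & 0 \\
0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 1 \\ 2 \\ 2 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
\end{equation*}
```

Under these assumptions the nonzero $m_i$ are exactly $m_2$ through $m_5$, matching the four spoiled fitting equations.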
From this example we recognize a general method for identifying defective regression equations and weighting them by zero: Prepare a vector like the data $\mathbf{y}$ with ones where data is missing and zeros where the data is known. Prepare a vector like the filter $\mathbf{a}$ where all values are ones. These are the vectors we put in equation (28) to find the $m_i$ and hence the needed weights $w_i$: wherever $m_i$ is nonzero, the $i$th equation touches missing data and $w_i$ must be zero. It is all done in module misinput.
Module misinput: mark bad regression equations.
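The marking procedure can be sketched in a few lines of Python. This is a simplified one-dimensional illustration and not the misinput module: the function name `mark_bad_equations`, the boolean `known` array, and the use of `numpy.convolve` with contiguous lags in place of a helix filter are assumptions.

```python
import numpy as np

def mark_bad_equations(known, nfilt):
    """Flag regression equations that touch missing data (hypothetical sketch).

    known : boolean array, True where the data value is known
    nfilt : number of filter coefficients
    Returns a boolean array of length len(known) + nfilt - 1,
    True where the corresponding fitting equation is spoiled.
    """
    missing = np.where(known, 0.0, 1.0)   # ones where data is missing, zeros elsewhere
    ones = np.ones(nfilt)                 # filter-shaped vector of all ones
    m = np.convolve(missing, ones)        # transient convolution gives the m_i
    return m > 0.0

# Worked example from the text: y_2 and y_3 missing, three-term filter.
# Six data values are assumed here for illustration.
known = np.array([True, False, False, True, True, True])
bad = mark_bad_equations(known, 3)
weights = np.where(bad, 0.0, 1.0)         # w_i = 0 on spoiled equations, 1 elsewhere
```

With these inputs the flagged equations are the 2nd through 5th, so `weights` vanishes exactly at $w_2$, $w_3$, $w_4$, and $w_5$.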