The downward extrapolation is done for each frequency independently and the imaging summation is performed inside each processor. The FFT subroutine is part of Connection Machine Scientific Software Library (CMSSL) and therefore presumably highly optimized, saving the programmer the annoyance of coding his own parallel 2-D and 3-D transforms. The code is extremely condensed, adding to the already compressed formulation of the Fortran 90 language.
A simplified version of the actual program is provided
to give a flavor of the simplicity of algorithm representation.
The subroutine fft_time
executes a FFT over the time axis,
while the subroutines fft_xy
and fft_inv_xy
perform
forward and inverse FFTs in surface coordinates.
All the variables terminated in ``CM'' are Connection Machine parallel
arrays. The phase multiplication is done using the array phasorCM
and the wave extrapolation field slice for different velocities is stored in the
parallel array PkyzCM
.
The Split-step Fourier method can be easily obtained by minimally modifying the PSPI algorithm. The difference is that the downward extrapolation is done with a single velocity and the perturbation slowness phase shift is performed after the downward extrapolation step. Computationally the major difference between PSPI and Split-step is that PSPI performs three Fourier transforms inside the frequency loop while Split-step performs only two. On the Connection Machine this is reflected accurately in the run-time ratio of the two programs. I get a ratio of 1.45 to 1 between the execution time of the kernels of the two programs (158.01 seconds for PSPI and 108.77 seconds for Split-step). Oddly enough, on the Convex I get a ratio of 7.28 to 1 between the two PSPI and Split-step methods.