Next: PGF90 compiler issues Up: Biondi et al.: Testing Previous: Large problem results

KIRCHHOFF MIGRATION

Ideally, Kirchhoff offset migration can be written to be almost perfectly parallel. Each node is given a region of image space and the data within a given aperture of the imaging volume. Each node then independently reads in the portion of the traveltime table it needs and sums the corresponding input data to form its output model. No communication between nodes is needed until all threads have finished their imaging volumes.

Such an implementation was not possible for this test. To implement the parallel scheme above, each processor must be able to seek and read the traveltime table while no other thread is operating on the file. OpenMP accounts for this difficulty with the CRITICAL construct. Unfortunately, the CRITICAL construct is not handled correctly by the Portland Group's pgf90 compiler. To overcome this limitation, we read in a section of the traveltime table and then parallelized over the output CMPs within that region.

For each of the four computers, we tested the speed both within the parallel region (Figure 7) and of the entire program (Figure 8). Within the parallel region all four machines scaled fairly well, the notable exception being the Origin 200 when going across the CrayLink cable (from two to three processors). For overall speed within the parallel region, the four-processor Xeon machine performed the best.

kirch.par
Figure 7: Relative speed of the parallel portion of the Kirchhoff migration on the various testing platforms. 1 is the SGI Origin 200; 2 is the SGI 1400L; 3 is the VA Start X MP; and 4 is the SGI Power Challenge. In each case the solid line represents actual performance, the dashed line ideal performance.

kirch.tot
Figure 8: Relative speed of the total execution time of the Kirchhoff migration on the various testing platforms. 1 is the SGI Origin 200; 2 is the SGI 1400L; 3 is the VA Start X MP; and 4 is the SGI Power Challenge. In each case the solid line represents actual performance, the dashed line ideal performance.

For the entire code the results were more interesting. Because Kirchhoff migration is so I/O intensive, the performance of each machine's I/O subsystem became important. Whereas the advantage of the Xeon over the Power Challenge was nearly 4:1 on four processors within the parallel region, it was only 2.5:1 when I/O was taken into account. The VA Start X performed particularly poorly when accounting for I/O: its speed advantage over the Power Challenge dropped from 3.9 to 2.1. The VA's I/O problem seems to be hardware-related rather than OS-related, since the Xeon was running the same OS and saw its speed advantage drop only from 4.1 to 2.7.



 
Stanford Exploration Project
10/25/1999