and
show
the results of real and complex matrix multiplication on a matrix
of four million elements. The Intel code built with machine-specific
instructions is significantly faster than the corresponding PGF code.
The speed advantage offered by the Intel compiler led
us to choose the dual P3 option for our cluster.
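
As a rough illustration of how such a timing can be structured, the following minimal Python sketch times a naive square matrix product for real and complex entries. It is not the benchmark code used for the figures, which was compiled code and is not reproduced here. The names `matmul` and `time_matmul` and the size parameter `n` are hypothetical. A 2000 x 2000 matrix would correspond to the four-million-element case; the sketch uses a much smaller size so that it finishes quickly.

```python
import time


def matmul(a, b):
    """Naive product of two square matrices stored as lists of lists."""
    # Transpose b once so the inner sum walks both operands sequentially.
    bt = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def time_matmul(n, complex_valued=False):
    """Time an n x n product; returns (elapsed seconds, product matrix).

    n and complex_valued are illustrative parameters, not from the paper.
    """
    if complex_valued:
        a = [[complex(i + 1, j + 1) for j in range(n)] for i in range(n)]
    else:
        a = [[float(i + j + 1) for j in range(n)] for i in range(n)]
    start = time.perf_counter()
    c = matmul(a, a)
    return time.perf_counter() - start, c


if __name__ == "__main__":
    # n = 2000 gives four million elements; 100 keeps the run short.
    for label, flag in (("real", False), ("complex", True)):
        seconds, _ = time_matmul(100, flag)
        print(f"{label:8s} {seconds:.4f} s")
```

Because this runs in an interpreter, its absolute timings say nothing about the compiler differences discussed here; it only shows the shape of the measurement, with the same kernel timed once for real and once for complex data.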
Figure 2 [float-comp]: Speed comparison for matrix multiplication. The horizontal axis is the machine (P2-550, P3-800, P4-1700). The curves represent the PGF compiler, the Intel compiler, and the Intel compiler with machine-specific instructions. Note the significant advantage of the Intel machine-specific code.

Figure 3 [complex-comp]: Speed comparison for complex matrix multiplication. The horizontal axis is the machine (P2-550, P3-800, P4-1700). The curves represent the PGF compiler, the Intel compiler, and the Intel compiler with machine-specific instructions. Note the significant advantage of the Intel machine-specific code.