Next: Programming on the CM
Up: Kneib: Velocity Analysis on
Previous: Introduction
The Thinking Machines Corporation, Cambridge, Massachusetts, released the
first CM-2 in April 1987. Since then, more than 40 units have been installed
in the US and Europe.
The CM-2 is one of only a few
parallel computers that follow the one-data-one-processor concept,
concentrating on increasing the number of processors rather than on using
faster processors.
The most important feature of the CM's architecture is the concept of
virtual processors (VPs). Each element of a parallel variable (usually
an array element or an array dimension) is associated with one virtual
processor. VPs are
analogous to virtual memory: In either case the user can address a
nearly infinite number of processors, or memory cells, independent
of the actual hardware configuration.
If, as usual, the number of physical processors NP is larger than the number of
virtual processors NV, then each VP has a physical
processor of its own. If NV > NP, at least some physical processors have more
than one VP associated with them. The ratio NV/NP is called the
VP-ratio. Because
calculations can involve only the currently active set of VPs, i.e. the
VPs currently present in the physical processors, one has
to switch from one VP set to the next if the VP-ratio is larger than one.
This is called VP-looping.
Extensive VP-looping is costly because it requires physically shifting
data between the external storage system and the processors' memories.
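
The bookkeeping behind the VP-ratio and VP-looping can be sketched in a few lines of Python. This is only an illustration and not CM system code: the function names vp_ratio and vp_loop are hypothetical, the ratio is rounded up to a whole number of VP sets, and the assignment of consecutive elements to one VP set is an assumption made to keep the example simple.

```python
def vp_ratio(nv: int, np_phys: int) -> int:
    """Number of VP sets needed to host nv virtual processors on
    np_phys physical processors (rounded up, at least 1)."""
    return max(1, -(-nv // np_phys))  # integer ceiling of nv / np_phys


def vp_loop(data, np_phys, op):
    """Apply op to every element of data, one VP set at a time.

    Each pass handles at most np_phys elements, mimicking the fact that
    only the currently active VP set can take part in a calculation.
    The block-wise assignment of elements to VP sets is an assumption.
    """
    ratio = vp_ratio(len(data), np_phys)
    result = []
    for s in range(ratio):                      # one iteration per VP set
        active = data[s * np_phys:(s + 1) * np_phys]
        result.extend(op(x) for x in active)    # "parallel" step for this set
    return result, ratio


if __name__ == "__main__":
    values, ratio = vp_loop(list(range(10)), 4, lambda x: 2 * x)
    print(ratio, values)
```

With 8192 physical processors, an array of 20000 elements would need three VP sets under these assumptions, so every parallel operation on it is carried out in three passes.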
The user of a high-level language can always think of the CM processors
as forming
a Cartesian grid, because Cartesian grids can be embedded in the
12-dimensional hypercube that forms the CM's hardware interconnectivity.
The n-dimensional grid of VPs is described by the processor geometry,
which the user can define according to the problem being addressed.
You will find an example in my first program sample.
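
How a Cartesian grid can be embedded in a hypercube may be illustrated with a binary-reflected Gray code, a standard technique for this purpose. The text above does not say which embedding the CM software actually uses, so the sketch below is an assumption, and the names gray, grid_to_node, and hamming are hypothetical. It maps a 64 x 64 grid onto the 4096 nodes of a 12-dimensional hypercube so that grid neighbors differ in exactly one address bit, i.e. they are directly connected.

```python
def gray(i: int) -> int:
    """Binary-reflected Gray code: consecutive integers differ in one bit."""
    return i ^ (i >> 1)


def grid_to_node(x: int, y: int, bits_y: int = 6) -> int:
    """Hypercube node address of grid point (x, y).

    The Gray codes of x and y are concatenated; with 6 bits per axis the
    result is a 12-bit address (an assumed layout, for illustration only).
    """
    return (gray(x) << bits_y) | gray(y)


def hamming(a: int, b: int) -> int:
    """Number of differing address bits, i.e. the hypercube distance."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # Neighbors along either grid axis are one hypercube link apart.
    print(hamming(grid_to_node(10, 20), grid_to_node(11, 20)))
    print(hamming(grid_to_node(10, 20), grid_to_node(10, 21)))
```

Grids with more dimensions can be handled the same way by splitting the address bits among more axes, as long as each side length is a power of two.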
The user accesses the CM via a front-end computer.
All editing, compiling, and linking are
done on the front end. Programs run on the front end, too, but parallelizable
program parts are executed on the CM.
The Connection Machine itself consists of one, two, or four portions of 8192
or 16384 processors each. Each portion can be used by a different user
because it has its own sequencer, which
interprets the parallel instructions sent by the front end and
drives its group of processors accordingly. I always used 8k processors,
the smallest CM portion available.
Each CM processor has its own 8k memory and arithmetic logic unit.
Stanford Exploration Project
1/13/1998