Cray-3/SSS
The Cray-3/SSS was a pioneering massively parallel supercomputer project that bonded a Cray-3 to a new SIMD processing unit based entirely in the computer's main memory. It was apparently later considered as an add-on for the Cray T90 series in the form of the T94/SSS, but it seems highly unlikely this was ever built.
The SSS project started after a Cray Computer Corporation (CCC) engineer, Ken Iobst, noticed a novel way to implement a parallel computer. Previous massively parallel SIMD designs, like the Connection Machines, were built from a large number of individual processing elements, each consisting of a simple processor and some local memory. Results that needed to be passed from element to element, of which there were always many, travelled along networking links at relatively slow speeds. Iobst's idea was to use the very fast scatter/gather hardware from the Cray-3 to move the data around instead of using a separate network. This would offer at least an order of magnitude better performance than systems based on "commodity" hardware. Better yet, the machine would still include a complete Cray-3 CPU, allowing the machine as a whole to use either SIMD or vector instructions depending on the particulars of the problem.
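The role scatter/gather plays as a substitute for an inter-processor network can be illustrated with a minimal sketch. This is not the Cray-3 hardware interface; the function names `gather` and `scatter` and the list-based "memory" are assumptions made purely for illustration. A gather reads memory through an index vector and a scatter writes through one, so any element-to-element exchange becomes a choice of indices.

```python
def gather(memory, indices):
    """Read memory[indices[k]] into position k of the result."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Return a copy of memory with values[k] written to slot indices[k]."""
    out = list(memory)
    for i, v in zip(indices, values):
        out[i] = v
    return out


# Hypothetical example: four processing elements each hold one result,
# and every element needs the result held by its left-hand neighbour.
results = [10, 20, 30, 40]
n = len(results)
left_neighbour = [(k - 1) % n for k in range(n)]   # [3, 0, 1, 2]
shifted = gather(results, left_neighbour)          # [40, 10, 20, 30]
```

In this picture, changing the communication pattern means changing only the index vector, which is the sense in which the memory system stands in for a dedicated network.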
What remained was the selection of a processor. Since the machine had a vector processor for heavy lifting, the SIMD processors themselves could be considerably simpler, handling only the most basic instructions. This is where the SSS concept was truly unique: since the problem with most SIMD machines was moving data around, Iobst suggested that the processors be built into the SRAM chips themselves. Memory is normally organized within the RAM chips in a row/column format, with a controller on the chip reading requested data from the chip in parallel across the rows, then assembling the results into 32- or 64-bit words for processing by the CPU. In the SSS concept the chips would also be equipped with a series of single-bit computers operating on a particular column of all the rows at once. This meant that the processors could access data about 100x as fast as normal. Add to this the speed of the "network" implemented by the scatter/gather hardware, and the system could be scaled to sizes considerably greater than existing SIMD systems.
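How single-bit processors attached to memory columns can still perform multi-bit arithmetic can be shown with a small simulation. It is a sketch under stated assumptions, not the actual SSS design: it assumes one single-bit processor per column, operands stored bit by bit down the rows (least significant bit first), and a one-bit carry latch per processor. The helper names `to_bit_rows`, `from_bit_rows` and `bit_serial_add` are hypothetical.

```python
def to_bit_rows(values, width):
    """Lay out one integer per column; row i holds bit i of every value."""
    return [[(v >> i) & 1 for v in values] for i in range(width)]


def from_bit_rows(rows):
    """Reassemble the integer stored down each column."""
    n_cols = len(rows[0])
    return [sum(rows[i][c] << i for i in range(len(rows)))
            for c in range(n_cols)]


def bit_serial_add(a_rows, b_rows):
    """Add column-wise operands one row (bit position) at a time."""
    n_cols = len(a_rows[0])
    carry = [0] * n_cols  # assumed one-bit carry latch per column processor
    sum_rows = []
    for a_row, b_row in zip(a_rows, b_rows):
        # Every column processor performs the same full-adder step in lockstep.
        sum_rows.append([a ^ b ^ c for a, b, c in zip(a_row, b_row, carry)])
        carry = [(a & b) | (c & (a ^ b))
                 for a, b, c in zip(a_row, b_row, carry)]
    sum_rows.append(carry)  # final carry becomes the top bit of each sum
    return sum_rows


a = to_bit_rows([3, 5, 250, 0], width=8)
b = to_bit_rows([4, 5, 10, 0], width=8)
totals = from_bit_rows(bit_serial_add(a, b))  # [7, 10, 260, 0]
```

The loop runs once per bit of the operand regardless of how many columns there are, so in this model an 8-bit addition takes eight steps whether the memory holds four values or two thousand.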
In 1994 the NSA contracted CCC to build a 512,000-processor design with 2048 processors per RAM chip. National Semiconductor was selected to produce Iobst's design, where Mark Norder and Jennifer Schrader modified the design and laid it out for production. However, the contract was cancelled in January 1994, long before the machine reached the full prototype stage, and the SSS concept was abandoned.