Abstract
Many new high-performance computers for supercomputing and real-time applications are emerging from DARPA's Strategic Computing Initiative. Traditional companies and start-ups are also rallying to build new, parallel computers. The technology is based on parallelism and on gains in VLSI and packaging.
The power available from a single computer is doubling almost every year through parallelism, while the power gained from circuit technology at the maximum clock speed continues to double only every 5 years, or roughly 14% per year. This maximum power will not be available to a single job unless parallelism is conquered. Neither the user community nor the computer science community is moving rapidly enough to understand and exploit the potential performance gains that come from an increasing number of parallel processors which are scalable and compatible to a very high degree.
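As a rough check of these figures (a back-of-the-envelope calculation, not part of the original argument), a doubling every five years corresponds to an annual growth factor of

    2^{1/5} ≈ 1.15,

that is, roughly 14-15% per year, compared with the 100% annual gain implied by a doubling every year through parallelism.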
The “main line” of scientific computing for the next generation (1988-1994) will be the vector multiprocessor, starting with 4-6 processors and evolving to 64. A range of supers (>$10M), minisupers (<$1M), and personal or solo supers (<$100K) continues to evolve. No economy of scale, measured in processing operations per second per dollar, will be observable over this range. In fact, the new class of graphics supers appears to provide a diseconomy of scale for general-purpose computing. The lack of a high-performance National Research Network to couple users to regional computers, together with the increasing need to visualize results, favors a highly distributed environment for most applications.
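As an illustrative sketch (not from the original text), the kind of kernel these vector multiprocessors target is a simple vectorizable loop such as DAXPY, which a vectorizing compiler maps onto vector instructions and which can also be divided among the processors:

    /* Illustrative sketch: DAXPY, y = a*x + y.  A vectorizing
       compiler maps this loop onto vector instructions; on a
       vector multiprocessor the iterations can also be split
       across the 4-64 processors. */
    void daxpy(int n, double a, const double *x, double *y)
    {
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }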
Multiprocessors without vector processing, built with a very large number of processors, have not yet been broadly adopted. Poor microprocessor floating-point performance and the lack of parallelizing compilers limit them in technical applications. However, such computers are clearly superior to existing large systems for transaction processing, batch, program development, and real-time work. Furthermore, a wide range of computers, from two to several hundred processors, can be constructed from the same basic components, achieving economy through scalability.
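A minimal sketch of the shared-memory style such multiprocessors support (POSIX threads are assumed here purely for illustration): the same program divides its work among however many processors are present, from two to several hundred.

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4            /* two to several hundred, per the text */
    #define N 1000000

    static double data[N];
    static double partial[NTHREADS];

    /* Each thread sums a contiguous slice of the shared array. */
    static void *worker(void *arg)
    {
        long id = (long)arg;
        long lo = id * (N / NTHREADS);
        long hi = (id + 1) * (N / NTHREADS);
        double s = 0.0;
        for (long i = lo; i < hi; i++)
            s += data[i];
        partial[id] = s;
        return NULL;
    }

    int main(void)
    {
        pthread_t t[NTHREADS];
        for (long i = 0; i < N; i++) data[i] = 1.0;
        for (long i = 0; i < NTHREADS; i++)
            pthread_create(&t[i], NULL, worker, (void *)i);
        double total = 0.0;
        for (long i = 0; i < NTHREADS; i++) {
            pthread_join(t[i], NULL);
            total += partial[i];
        }
        printf("sum = %f\n", total);
        return 0;
    }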
Plain Old one-chip micro-Processors (POPs) are becoming very fast, approaching the scalar/integer speed of the largest mainframes and supers. Attached vector units will make such uniprocessors very useful and cost-effective in workstations and small computers. In the short term, a further factor of 5-10 is feasible by the early 1990s using ECL technology.
Multicomputers, collections of 32-1024 interconnected computers that communicate with one another by passing messages, are the most cost-effective machines for single scientific jobs, by a factor of 10, provided the problem is compatible with the computer. Multicomputers require reprogramming, are used on one problem at a time, achieve supercomputer power, and cost $25K-$1M depending on the number of computers and their power.
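A minimal sketch of the message-passing style of a multicomputer (written with MPI, a later standard used here only as an illustration): each node computes on its own portion of the problem, and partial results are combined by exchanging messages.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);    /* e.g. 32-1024 nodes */

        double local = (double)rank;             /* this node's partial result */
        double total = 0.0;

        /* Combine the partial results by passing messages. */
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum over %d nodes = %f\n", size, total);

        MPI_Finalize();
        return 0;
    }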
A single-instruction, multiple-data (SIMD) computer operating on massive data sets, the Connection Machine, has become the supercomputer for several applications. The current Connection Machine, the CM-2, is scalable and comes in a variety of sizes, from 8K to 64K processing elements. Like other programmed, application-specific computers, the Connection Machine runs only one (or a few) programs at a given time, with a resulting performance/price advantage of a factor of 10 over a general-purpose computer.
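A sketch of the data-parallel (SIMD) style, written as an ordinary C loop for clarity: conceptually, each processing element holds one data element and all elements are updated by the same instruction at once, rather than being iterated over serially.

    /* Illustrative sketch of the data-parallel (SIMD) style.
       On a machine like the CM-2, the 8K-64K processing elements
       would apply this same test and update to every element of
       the array simultaneously; the loop here is only a serial
       rendering of that idea. */
    void saturate(int n, float *field, float limit)
    {
        for (int i = 0; i < n; i++)      /* one "virtual processor" per i */
            if (field[i] > limit)        /* same test on every element    */
                field[i] = limit;        /* same update where it applies  */
    }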