The Cydra 5 Departmental Supercomputer: Design Philosophies, Decisions, and Trade-Offs

Authors:

B. Ramakrishna Rau,

David W. L. Yen,

Wei Yen,

Ross A. Towle

Computer, Volume 22, Issue 1

Pages 12-26, 28-30, 32-35

https://doi.org/10.1109/2.19820

Published: 01 January 1989

Abstract

The Cydra 5 is a heterogeneous multiprocessor system that targets small work groups or departments of scientists and engineers. The two types of processors are functionally specialized for the different components of the workload found in a departmental setting. The Cydra 5 numeric processor, based on a directed-data-flow architecture, provides consistently high performance on a broader class of numerical computations. The interactive processors offload all nonnumeric work from the numeric processor, leaving it free to spend all its time on the numeric application. The I/O processors permit high-bandwidth I/O transactions with minimal involvement from the interactive or numeric processors. The system architecture and data-flow architecture are described. The design decisions and trade-offs for the numeric processor are examined, and the main memory system is discussed. Some reflections on the design issues are offered.

Reviews

Reviewer: Peter C. Patton

The Cydra 5 departmental supercomputer was designed for small work groups or departments of scientists and engineers. It costs about the same as a high-end superminicomputer but it can achieve about one-third to one-half the performance of a supercomputer. It does this by using high-speed, air-cooled, emitter-coupled-logic technology in a product that includes many architectural innovations. The Cydra 5 is a heterogeneous multiprocessor system. The two types of processors are functionally specialized for the different components of the workload found in a departmental setting. The numeric processor, a proprietary device based on directed-dataflow architecture, is aided by a stride-insensitive high-bandwidth main memory system. Interactive processors offload all nonnumeric work from the numeric processor, leaving it free for numerical computation.

The Cydra 5 grew out of eight years of research and development that dates back to work done at TRW Array Processors and at ESL (a subsidiary of TRW). The polycyclic architecture developed at TRW and ESL was a precursor of the directed-dataflow architecture developed at Cydrome starting in 1984. The common theme that linked both efforts was the desire to support a dataflow model of computation with as simple a hardware platform as possible.

The driving force behind the development of the Cydra 5 was the desire to achieve increased performance over superminis on numerically intensive computations in such a way that the user would not have to discard the software, algorithms, training, or techniques acquired over the years. As a result, the user would be able to move up from the supermini to the minisuper in a transparent fashion. Transparency is important for a product such as the Cydra 5, which is aimed at the growth phase of the minisupercomputer market. Such a product must cater to a broader and less forgiving user group than the pioneers and early adopters who purchased first-generation minisupercomputers. The ideal sought by the design group was to match the “feel” of a conventional minicomputer, such as a VAX, with much higher performance.

This paper provides an excellent study of the practical application of a dataflow architecture to a numerical computation environment. A number of Cydra 5 systems are in use at customer sites, and their performance has met the design team's expectations. On such industry-standard benchmarks as Linpack and the Livermore FORTRAN Kernels, the Cydra 5 delivers 15.4 and 5.8 Mflops, respectively. This is the highest performance of any minisupercomputer, including those whose peak performance is twice that of the Cydra, and about one-third the performance of a Cray X-MP, which has nine times the Cydra 5's peak performance. Across a spectrum of typical applications the Cydra 5 achieves between one-fourth and two-thirds of the performance of a Cray X-MP single processor, depending on the extent to which the application is vectorizable.
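
The peak-versus-delivered comparison in the review can be restated as a rough efficiency ratio. This is a back-of-the-envelope reading of the reviewer's rounded figures, not a result taken from the paper. The symbols D (delivered benchmark performance), P (peak performance), and e = D/P are introduced here only for illustration.

```latex
\begin{align*}
P_{\text{X-MP}} &\approx 9\,P_{\text{Cydra}}, \qquad
D_{\text{Cydra}} \approx \tfrac{1}{3}\,D_{\text{X-MP}} \\
e_{\text{Cydra}} = \frac{D_{\text{Cydra}}}{P_{\text{Cydra}}}
  &\approx \frac{\tfrac{1}{3}\,D_{\text{X-MP}}}{\tfrac{1}{9}\,P_{\text{X-MP}}}
  = 3\,\frac{D_{\text{X-MP}}}{P_{\text{X-MP}}}
  = 3\,e_{\text{X-MP}}
\end{align*}
```

Under these approximations, the Cydra 5 sustains roughly three times as large a fraction of its peak rate as the Cray X-MP on the quoted benchmarks.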

Information & Contributors

Information

Published In

Computer Volume 22, Issue 1

January 1989

87 pages

ISSN:0018-9162

Editor:
Bruce Shriver
Univ. of Hawaii

Publisher

IEEE Computer Society Press

Washington, DC, United States

Qualifiers

Research-article
