Effects of building blocks on the performance of super-scalar architecture

Authors:

Edil S. T. Fernandes,

Fernando M. B. Barbosa

ACM SIGARCH Computer Architecture News, Volume 20, Issue 2

Pages 36 - 45

https://doi.org/10.1145/146628.139681

Published: 01 April 1992

Abstract

The inherent low-level parallelism of Super-Scalar architectures plays an important role in the processing power provided by these machines: independent functional units create opportunities for executing several machine operations simultaneously. From the viewpoint of the hardware designer, it is very important to assess the influence of each functional unit, and of the way the units communicate, on the overall performance of the machine. In particular, it is highly desirable to determine an upper bound on the number of additional functional units that still yield a significant performance improvement.

This work describes experiments carried out to assess the effect of alternative instruction issue mechanisms, multiple functional units, instruction queues, a common data bus, and other hardware solutions on the performance of Super-Scalar machines. The assessment was obtained by interpreting non-optimized object code of an actual processor on some basic machine models. The paper outlines the main aspects of the research and shows that speed-up ratios of up to 3.35 were observed during the interpretation of benchmark programs.
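To make the idea of a speed-up ratio concrete, the following is a minimal sketch of a toy machine model, not the authors' interpreter or any of their machine models. It assumes identical one-cycle functional units, considers only true (read-after-write) register dependencies, and uses a made-up six-instruction trace; the names `cycles_needed`, `speedup`, and `trace` are hypothetical.

```python
from collections import defaultdict

def cycles_needed(trace, units):
    """Cycles to run `trace` on a toy machine with `units` identical,
    one-cycle functional units (assumption: only true dependencies matter)."""
    ready = {}               # register -> first cycle in which its value can be read
    used = defaultdict(int)  # cycle -> number of functional units already busy
    last = 0
    for dest, sources in trace:
        # Earliest cycle in which all source operands are available.
        cycle = max((ready.get(r, 0) for r in sources), default=0)
        # Delay the operation until a functional unit is free.
        while used[cycle] >= units:
            cycle += 1
        used[cycle] += 1
        ready[dest] = cycle + 1
        last = max(last, cycle + 1)
    return last

def speedup(trace, units):
    """Ratio of single-unit cycles to cycles with `units` functional units."""
    return cycles_needed(trace, 1) / cycles_needed(trace, units)

# Hypothetical trace: (destination register, source registers).
trace = [
    ("r1", []),
    ("r2", []),
    ("r3", ["r1", "r2"]),
    ("r4", []),
    ("r5", ["r4"]),
    ("r6", ["r3", "r5"]),
]

for n in (1, 2, 3, 4):
    print(n, cycles_needed(trace, n), speedup(trace, n))
```

On this toy trace the cycle count stops improving once a third unit is added, because the dependency chain, rather than the number of units, becomes the limit. This is the kind of saturation point the abstract refers to as an upper bound on useful additional functional units.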

Published In

ACM SIGARCH Computer Architecture News Volume 20, Issue 2

Special Issue: Proceedings of the 19th annual international symposium on Computer architecture (ISCA '92)

May 1992

429 pages

ISSN:0163-5964

DOI:10.1145/146628

Editor:
Allan Gottlieb
New York Univ., New York, NY

ISCA '92: Proceedings of the 19th annual international symposium on Computer architecture
May 1992
439 pages
ISBN:0897915097
DOI:10.1145/139669
Chairman:
Allan Gottlieb
New York Univ., New York, NY

Copyright © 1992 Authors.

Publisher

Association for Computing Machinery

New York, NY, United States
