
DSNS (dynamically-hazard-resolved statically-code-scheduled, nonuniform superscalar): yet another superscalar processor architecture

Published: 01 July 1991

Abstract

A new superscalar processor architecture, called DSNS (Dynamically-hazard-resolved, Statically-code-scheduled, Nonuniform Superscalar), is proposed and discussed. DSNS has the following major architectural features.

  1. Dynamically-hazard-resolved superscalar: DSNS is object-code compatible with respect to the degree of superscalar. Pipeline interlock hardware should be provided for detecting and resolving hazards at run time.
  2. Statically-code-scheduled superscalar: The performance of DSNS may not scale with the degree of superscalar. Compilers must be responsible for scheduling instructions to reduce pipeline stalls for a particular degree of superscalar.
  3. Nonuniform superscalar: Although a nonuniform superscalar potentially suffers from instruction-class conflicts, it can be more cost-effective than a uniform superscalar. Again, compilers must take care that class conflicts do not increase structural hazards.
  4. Static memory disambiguation: The DSNS architecture provides three types of LOAD/STORE instructions: strongly ordered, weakly ordered, and unordered. Memory disambiguation at compile time is responsible for marking each LOAD/STORE instruction. At run time, processors need not detect or resolve data hazards for any of the three types; they simply perform memory accesses in order for strongly or weakly ordered instructions, and in arbitrary order for unordered ones.
  5. Static branch prediction with branch-target buffer: Branch instructions predicted as taken by compilers are stored in the branch-target buffer. Hardware never guesses the outcomes of branch instructions.
  6. Early branch resolution with advanced conditioning: Advanced conditioning allows branch decisions to be made further ahead of the corresponding branches. It reduces the branch delay and results in branches being resolved early in the pipeline.
  7. Conditional mode execution with dual register files: Dual register files help maintain the precise machine state that might otherwise be violated by speculative execution such as conditional mode.
  8. Weakly precise interrupts: The DSNS architecture defines interrupts as being somewhat imprecise but restartable with the help of interrupt handlers. This definition relaxes the hardware constraints that strictly precise interrupts would impose.

This paper also presents an implementation of the DSNS architecture. The DSNS processor prototype under development is a four-stage pipelined processor of superscalar degree four. The instruction pipelines, especially the branch pipeline, are discussed in detail.
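
As a rough illustration of feature 3, the sketch below counts issue cycles on a toy degree-four machine whose functional units are not uniform. It is not taken from the paper: the unit mix in `UNITS`, the in-order issue rule, and the function name `issue_cycles` are all assumptions made for the example, and data hazards are ignored entirely. It only shows why instruction-class conflicts make the compiler's ordering matter.

```python
from collections import Counter

# Hypothetical unit mix for a degree-4 nonuniform machine. This is an
# assumption for illustration, not the DSNS prototype's configuration.
UNITS = {"alu": 2, "mem": 1, "branch": 1}
DEGREE = 4


def issue_cycles(classes, units=UNITS, degree=DEGREE):
    """Count cycles needed to issue `classes` in program order.

    Each cycle issues up to `degree` consecutive instructions and stops at
    the first one whose class has no free unit left in that cycle (an
    instruction-class conflict, i.e. a structural hazard). Data hazards
    are deliberately ignored.
    """
    for c in classes:
        if units.get(c, 0) <= 0:
            raise ValueError(f"no functional unit for class {c!r}")

    cycles = 0
    i = 0
    while i < len(classes):
        free = Counter(units)  # all units become free at the start of a cycle
        issued = 0
        while i < len(classes) and issued < degree and free[classes[i]] > 0:
            free[classes[i]] -= 1
            issued += 1
            i += 1
        cycles += 1
    return cycles


# The same eight instructions in two orders. Whether the second order is
# legal depends on data dependences, which this sketch does not model.
unscheduled = ["mem", "mem", "mem", "alu", "alu", "alu", "alu", "branch"]
scheduled = ["mem", "alu", "alu", "mem", "alu", "alu", "mem", "branch"]

print(issue_cycles(unscheduled))  # 4
print(issue_cycles(scheduled))    # 3
```

With only one memory unit, three consecutive `mem` instructions each end an issue group early, so interleaving them with `alu` instructions saves a cycle in this toy model.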


Published In

ACM SIGARCH Computer Architecture News, Volume 19, Issue 4
June 1991
184 pages
ISSN: 0163-5964
DOI: 10.1145/122576

Publisher

Association for Computing Machinery, New York, NY, United States
