DOI: 10.1145/1077603.1077614
Article

Energy-efficient and high-performance instruction fetch using a block-aware ISA

Published: 08 August 2005

Abstract

The front-end in superscalar processors must deliver high application performance in an energy-effective manner. Impediments such as multi-cycle instruction accesses, instruction-cache misses, and mispredictions reduce performance by 48% and increase energy consumption by 21%. This paper presents a block-aware instruction set architecture (BLISS) that defines basic block descriptors in addition to the actual instructions in a program. BLISS allows for a decoupled front-end that reduces the time and energy spent on misspeculated instructions. It also allows for accurate instruction prefetching and energy-efficient instruction access. A BLISS-based front-end leads to 14% IPC, 16% total energy, and 83% energy-delay-squared product improvements for wide-issue processors.
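The abstract reports its headline result as an energy-delay-squared product (ED2P), a metric that multiplies energy by the square of execution time so that performance is weighted more heavily than energy. The following is a minimal sketch of how such a figure might be computed. The function names and the normalized input values are illustrative assumptions and are not taken from the paper.

```python
def ed2p(energy: float, delay: float) -> float:
    """Energy-delay-squared product: E * D^2 (lower is better)."""
    return energy * delay ** 2


def relative_improvement(baseline: float, improved: float) -> float:
    """Fractional reduction of a lower-is-better metric versus a baseline."""
    return (baseline - improved) / baseline


# Hypothetical, normalized measurements (NOT results from the paper).
base_energy, base_delay = 1.0, 1.0
new_energy, new_delay = 0.9, 0.8

gain = relative_improvement(
    ed2p(base_energy, base_delay),
    ed2p(new_energy, new_delay),
)
print(f"ED2P improvement: {gain:.1%}")
```

Because delay enters squared, a given percentage reduction in execution time moves ED2P more than the same percentage reduction in energy.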

Cited By

  • (2017) CG-OoO. ACM Transactions on Architecture and Code Optimization 14:4 (1-26). DOI: 10.1145/3151034. Online publication date: 5-Dec-2017
  • (2013) Low-Power Design of Hybrid Instruction Cache Based on Branch Prediction and Drowsy Cache. The Proceedings of the Second International Conference on Communications, Signal Processing, and Systems (335-343). DOI: 10.1007/978-3-319-00536-2_40. Online publication date: 24-Oct-2013
  • (2007) A low power front-end for embedded processors using a block-aware instruction set. Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems (267-276). DOI: 10.1145/1289881.1289926. Online publication date: 30-Sep-2007

    Published In

    ISLPED '05: Proceedings of the 2005 international symposium on Low power electronics and design
    August 2005
    400 pages
    ISBN:1595931376
    DOI:10.1145/1077603
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. basic blocks
    2. decoupled architecture
    3. energy efficiency
    4. instruction delivery
    5. instruction set architecture

    Qualifiers

    • Article

    Conference

    ISLPED05
    Acceptance Rates

    Overall Acceptance Rate 398 of 1,159 submissions, 34%

    Citations

    Cited By

    • (2006) Online strategies for high-performance power-aware thread execution on emerging multiprocessors. Proceedings of the 20th international conference on Parallel and distributed processing (298-298). DOI: 10.5555/1898699.1898824. Online publication date: 25-Apr-2006
    • (2006) Simultaneously improving code size, performance, and energy in embedded processors. Proceedings of the conference on Design, automation and test in Europe: Proceedings (224-229). DOI: 10.5555/1131481.1131544. Online publication date: 6-Mar-2006
    • (2006) Online strategies for high-performance power-aware thread execution on emerging multiprocessors. Proceedings 20th IEEE International Parallel & Distributed Processing Symposium (8 pp.). DOI: 10.1109/IPDPS.2006.1639598. Online publication date: 2006