research-article

Criticality-driven superscalar design space exploration

Authors:

Sandeep Navada,

Niket K. Choudhary,

Eric Rotenberg

PACT '10: Proceedings of the 19th international conference on Parallel architectures and compilation techniques

Pages 261 - 272

https://doi.org/10.1145/1854273.1854308

Published: 11 September 2010

Abstract

It has become increasingly difficult to perform design space exploration (DSE) of computer systems with a short turnaround time because of exploding design spaces, increasing design complexity, and long-running workloads. Researchers have used classical search/optimization techniques such as simulated annealing and genetic algorithms to accelerate the DSE. While these techniques are better than an exhaustive search, a substantial amount of time must still be dedicated to DSE, which is a serious bottleneck in reducing research/development time. These techniques do not perform the DSE quickly enough, primarily because they treat the computer system as a "black box": they do not leverage any insight into how the different design parameters of a computer system interact to increase or degrade performance at a design point.

We propose using criticality analysis to guide the classical search/optimization techniques. We perform criticality analysis to find the design parameter which is most detrimental to the performance at a given design point. Criticality analysis at a given design point provides a localized view of the region around the design point without performing simulations at the neighboring points. On the other hand, a classical search/optimization technique has a global view of the design space and avoids getting stuck at a local maximum. We use this synergistic behavior between the criticality analysis (good locally) and the classical search/optimization techniques (good globally) to accelerate the DSE.
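The idea of steering a classical search with a local criticality estimate can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the parameter names and sizes in `LEVELS`, the `WORK` weights, the toy `exec_time` objective, and the `bias` probability are all hypothetical, and `stalls` is only a stand-in for a real criticality analysis. The sketch runs the same simulated annealing loop twice, once with a black-box random move and once with a move that usually enlarges the most critical parameter.

```python
import math
import random

# Hypothetical toy design space: each parameter has a few allowed sizes.
LEVELS = {
    "fetch_width": [1, 2, 4, 8],
    "issue_queue": [8, 16, 32, 64],
    "rob": [32, 64, 128, 256],
    "l1_kb": [8, 16, 32, 64],
}
# Hypothetical per-parameter "work" that causes stalls when the resource is small.
WORK = {"fetch_width": 4.0, "issue_queue": 24.0, "rob": 64.0, "l1_kb": 40.0}


def stalls(point):
    """Stall cycles attributed to each parameter (stand-in for criticality analysis)."""
    return {p: WORK[p] / LEVELS[p][i] for p, i in point.items()}


def exec_time(point):
    """Toy objective: (base + stall cycles) * clock period; larger structures slow the clock."""
    cycles = 1.0 + sum(stalls(point).values())
    period = 1.0 + 0.05 * sum(point.values())
    return cycles * period


def most_critical(point):
    """Parameter with the largest stall share that can still be enlarged, else None."""
    s = stalls(point)
    growable = [p for p in point if point[p] < len(LEVELS[p]) - 1]
    return max(growable, key=lambda p: s[p]) if growable else None


def random_neighbor(point, rng):
    """Black-box move: shift one randomly chosen parameter by one level (clamped)."""
    p = rng.choice(sorted(point))
    step = rng.choice([-1, 1])
    new = dict(point)
    new[p] = min(max(point[p] + step, 0), len(LEVELS[p]) - 1)
    return new


def guided_neighbor(point, rng, bias=0.7):
    """With probability `bias`, enlarge the most critical parameter; otherwise move randomly."""
    p = most_critical(point)
    if p is not None and rng.random() < bias:
        new = dict(point)
        new[p] += 1
        return new
    return random_neighbor(point, rng)


def anneal(start, neighbor, steps=200, temp=1.0, cooling=0.97, seed=0):
    """Plain simulated annealing; only the neighbor proposal differs between variants."""
    rng = random.Random(seed)
    cur, cur_t = dict(start), exec_time(start)
    best, best_t = dict(cur), cur_t
    for _ in range(steps):
        cand = neighbor(cur, rng)
        cand_t = exec_time(cand)
        # Always accept improvements; accept worse points with a temperature-dependent probability.
        if cand_t < cur_t or rng.random() < math.exp((cur_t - cand_t) / temp):
            cur, cur_t = cand, cand_t
            if cur_t < best_t:
                best, best_t = dict(cur), cur_t
        temp *= cooling
    return best, best_t


start = {p: 0 for p in LEVELS}                 # smallest size for every parameter
plain = anneal(start, random_neighbor)         # black-box simulated annealing
guided = anneal(start, guided_neighbor)        # criticality-biased simulated annealing
print("plain :", plain)
print("guided:", guided)
```

In this sketch the guided variant still falls back to random moves part of the time, so the annealing loop keeps its ability to escape local optima while the criticality estimate supplies the local direction. How the two are actually combined in the paper, and how criticality is measured, is not specified on this page.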

For the DSE of superscalar processors on SPEC 2000 benchmarks, a criticality-driven walk achieves, on average, a 3.8x speedup over a random walk, and criticality-driven simulated annealing achieves a 2.3x speedup over simulated annealing.

Index Terms

Criticality-driven superscalar design space exploration
1. Computer systems organization
  1. Architectures

Information & Contributors

Information

Published In

PACT '10: Proceedings of the 19th international conference on Parallel architectures and compilation techniques

September 2010

596 pages

ISBN: 9781450301787

DOI: 10.1145/1854273

General Chair: Valentina Salapura (IBM TJ Watson Research Center)
Program Chairs: Michael Gschwind (IBM Systems & Technology Group), Jens Knoop (Technische Universität Wien)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

IFIP WG 10.3: IFIP working group 10.3 on concurrent systems
IEEE CS TCPP: IEEE-CS technical committee on parallel processing
SIGARCH: ACM Special Interest Group on Computer Architecture
IEEE CS TCAA: IEEE CS technical committee on architectural acoustics

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

PACT '10

PACT '10: International Conference on Parallel Architectures and Compilation Techniques

September 11 - 15, 2010

Vienna, Austria

Acceptance Rates

Overall Acceptance Rate 121 of 471 submissions, 26%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 7
Total Downloads: 374
Downloads (Last 12 months): 11
Downloads (Last 6 weeks): 0

Reflects downloads up to 29 Jan 2025

Citations

Cited By

Wang L, Deng Y, Gong R, Shi W, Luo L, Wang Y (2020). CSMO-DSE. ACM Journal on Emerging Technologies in Computing Systems 16:2 (1-22). DOI: 10.1145/3371406. Online publication date: 30-Jan-2020.
https://dl.acm.org/doi/10.1145/3371406
Wang L, Deng Y, Gong R, Shi W, Zhao Z, Dou Q (2018). A Parallel Algorithm for Instruction Dependence Graph Analysis Based on Multithreading. 2018 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom) (716-721). DOI: 10.1109/BDCloud.2018.00108. Online publication date: Dec-2018.
https://doi.org/10.1109/BDCloud.2018.00108
Srinivasan S, Koren I, Kundu S (2016). Improving performance per Watt of non-monotonic Multicore Processors via bottleneck-based online program phase classification. 2016 IEEE 34th International Conference on Computer Design (ICCD) (528-535). DOI: 10.1109/ICCD.2016.7753337. Online publication date: Oct-2016.
https://doi.org/10.1109/ICCD.2016.7753337
Qin F, Wang L, Deng Y, Wang Y, Zhao T (2015). HMCPA: Heuristic Method Utilizing Critical Path Analysis for Design Space Exploration of Superscalar Microprocessors. Computer Engineering and Technology (20-35). DOI: 10.1007/978-3-662-45815-0_3. Online publication date: 2015.
https://doi.org/10.1007/978-3-662-45815-0_3
Forbes E, Choudhary N, Dwiel B, Rotenberg E (2014). Design-effort alloy: Boosting a highly tuned primary core with untuned alternate cores. 2014 IEEE 32nd International Conference on Computer Design (ICCD) (408-415). DOI: 10.1109/ICCD.2014.6974713. Online publication date: Oct-2014.
https://doi.org/10.1109/ICCD.2014.6974713
Navada S, Choudhary N, Wadhavkar S, Rotenberg E, Fensch C, O'Boyle M, Seznec A, Bodin F (2013). A unified view of non-monotonic core selection and application steering in heterogeneous chip multiprocessors. Proceedings of the 22nd international conference on Parallel architectures and compilation techniques (133-144). DOI: 10.5555/2523721.2523743. Online publication date: 7-Oct-2013.
https://dl.acm.org/doi/10.5555/2523721.2523743
Navada S, Choudhary N, Wadhavkar S, Rotenberg E (2013). Jigsaw: scalable software-defined caches. Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques (213-224). DOI: 10.1109/PACT.2013.6618811. Online publication date: Oct-2013.
https://doi.org/10.1109/PACT.2013.6618811
