DOI: 10.1109/IISWC.2015.29
Article

Full Speed Ahead: Detailed Architectural Simulation at Near-Native Speed

Published: 04 October 2015

Abstract

Cycle-level microarchitectural simulation is the de facto standard for estimating the performance of next-generation platforms. Unfortunately, the level of detail needed for accurate simulation requires complex, and therefore slow, simulation models that run thousands of times slower than native execution. With the introduction of sampled simulation, it has become possible to simulate only the key, representative portions of a workload in a reasonable amount of time and reliably estimate its overall performance. These sampling methodologies identify regions for detailed execution, and through microarchitectural state checkpointing, one can quickly and easily determine the performance characteristics of a workload for a variety of microarchitectural changes. While this strategy of sampling simulations to generate checkpoints performs well for static applications, more complex scenarios involving hardware-software co-design (such as co-optimizing both a Java virtual machine and the microarchitecture it runs on) cause this methodology to break down, as new microarchitectural checkpoints are needed for each memory hierarchy configuration and software version. Solutions are therefore needed that enable fast and accurate simulation while also addressing the needs of hardware-software co-design and exploration. In this work we present a methodology to enhance checkpoint-based sampled simulation. Our solution integrates hardware virtualization to provide near-native speed, virtualized fast-forwarding to regions of interest, and parallel detailed simulation. However, as we cannot warm the simulated caches during virtualized fast-forwarding, we develop a novel approach to estimate the error introduced by limited cache warming, through the use of optimistic and pessimistic warming simulations.
Using virtualized fast-forwarding (which operates at 90% of native speed on average), we demonstrate a parallel sampling simulator that accurately estimates the IPC of standard workloads with an average error of 2.2% while still reaching an execution rate of 2.0 GIPS (63% of native) on average. Additionally, we demonstrate that our parallelization strategy scales almost linearly and simulates one core at up to 93% of its native execution rate, 19,000x faster than detailed simulation, while using 8 cores.
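
The abstract does not spell out how the optimistic and pessimistic warming simulations are combined into an error estimate. A minimal sketch of one plausible reading follows. It assumes that each detailed sample is simulated twice, once with accesses to insufficiently warmed cache state treated as misses (pessimistic) and once treated as hits (optimistic), and that the gap between the two resulting IPC estimates bounds the warming error. The `Sample` type, its field names, and `warming_error_bound` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Sample:
    """One detailed sample, simulated under both warming assumptions (hypothetical type)."""
    instructions: int        # instructions executed in the sample
    cycles_pessimistic: int  # cycles when unwarmed accesses are assumed to miss
    cycles_optimistic: int   # cycles when unwarmed accesses are assumed to hit


def warming_error_bound(samples: Iterable[Sample]) -> Tuple[float, float, float]:
    """Return (pessimistic IPC, optimistic IPC, relative gap between them).

    IPC is aggregated as total instructions over total cycles, so every
    sample is weighted by the work it represents rather than averaged per sample.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("need at least one sample")
    instructions = sum(s.instructions for s in samples)
    cycles_low = sum(s.cycles_pessimistic for s in samples)
    cycles_high = sum(s.cycles_optimistic for s in samples)
    if cycles_low <= 0 or cycles_high <= 0:
        raise ValueError("cycle counts must be positive")
    ipc_pessimistic = instructions / cycles_low
    ipc_optimistic = instructions / cycles_high
    relative_gap = (ipc_optimistic - ipc_pessimistic) / ipc_pessimistic
    return ipc_pessimistic, ipc_optimistic, relative_gap


if __name__ == "__main__":
    # Illustrative numbers only; they are not results from the paper.
    demo = [Sample(1000, 800, 500), Sample(1000, 1200, 500)]
    low, high, gap = warming_error_bound(demo)
    print(f"IPC between {low:.2f} and {high:.2f}, relative gap {gap:.0%}")
```

Under this reading, a small gap suggests that the limited warming was sufficient for the sample, while a large gap suggests that the estimate is sensitive to cache state that was never warmed.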

Information

Published In

IISWC '15: Proceedings of the 2015 IEEE International Symposium on Workload Characterization
October 2015
226 pages
ISBN:9781509000883

Publisher

IEEE Computer Society

United States

Publication History

Published: 04 October 2015

Qualifiers

  • Article

Cited By

  • (2024) Pac-Sim: Simulation of Multi-threaded Workloads using Intelligent, Live Sampling. ACM Transactions on Architecture and Code Optimization 21:4 (1-26). DOI: 10.1145/3680548. Online publication date: 20-Nov-2024
  • (2023) K-D Bonsai: ISA-Extensions to Compress K-D Trees for Autonomous Driving Tasks. Proceedings of the 50th Annual International Symposium on Computer Architecture (1-13). DOI: 10.1145/3579371.3589055. Online publication date: 17-Jun-2023
  • (2023) Faster Functional Warming with Cache Merging. Proceedings of the DroneSE and RAPIDO: System Engineering for constrained embedded systems (39-47). DOI: 10.1145/3579170.3579256. Online publication date: 17-Jan-2023
  • (2023) A Novel Integrated Simulation Framework for Cyber-Physical Systems Modelling. IEEE Transactions on Parallel and Distributed Systems 34:10 (2684-2698). DOI: 10.1109/TPDS.2023.3300081. Online publication date: 1-Oct-2023
  • (2022) Scalable deep learning-based microarchitecture simulation on GPUs. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (1-15). DOI: 10.5555/3571885.3571990. Online publication date: 13-Nov-2022
  • (2022) Athena. Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (359-371). DOI: 10.1145/3559009.3569684. Online publication date: 8-Oct-2022
  • (2022) SimNet. Proceedings of the ACM on Measurement and Analysis of Computing Systems 6:2 (1-24). DOI: 10.1145/3530891. Online publication date: 6-Jun-2022
  • (2021) A Reusable Characterization of the Memory System Behavior of SPEC2017 and SPEC2006. ACM Transactions on Architecture and Code Optimization 18:2 (1-20). DOI: 10.1145/3446200. Online publication date: 8-Mar-2021
  • (2020) DSM. Proceedings of the ACM on Measurement and Analysis of Computing Systems 4:2 (1-26). DOI: 10.1145/3392151. Online publication date: 12-Jun-2020
  • (2019) Directed Statistical Warming through Time Traveling. Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture (1037-1049). DOI: 10.1145/3352460.3358264. Online publication date: 12-Oct-2019
