DOI: 10.1145/3023973.3023974
research-article
Open access

Adaptive Cache Warming for Faster Simulations

Published: 23 January 2017

Abstract

The use of hardware-based virtualization allows modern simulators to fast-forward very quickly between sample points and regions of interest. This dramatically reduces the simulation time compared to traditional functional forwarding. However, because the fast-forwarding takes place through virtualized execution on the native hardware, it cannot warm simulated structures such as caches. As a result, sampled simulations that use virtualization for fast-forwarding find their execution time dominated by functional warming.
To address the cost of warming, we present Adaptive Cache Warming (ACW), a new fast method that determines how much warming each sample/phase/application needs. ACW takes advantage of the virtualization-based fast-forwarding to search for the minimum warming time required during simulation. To determine when the cache is sufficiently warm, ACW uses heuristics based on the last-level cache's cold-set misses.
Our results show that the typical practice of conservatively warming last-level caches for around 100M instructions is vast overkill for nearly all checkpoints. By using ACW, we can adapt the warming per sample and speed up the simulation by 92--103× on average (512× maximum speedup), depending on cache size (2--32 MB).
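The search described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not define a cold-set miss or the search strategy, so the sketch assumes that a cold-set miss is a miss in a cache set that still has never-filled ways, and that the warming length is doubled until the sample shows no such misses. It also replays an address trace where ACW would re-run virtualized fast-forwarding. The names `ToyCache`, `cold_misses_in_sample`, and `find_warming_length` are hypothetical.

```python
from collections import OrderedDict


class ToyCache:
    """Tiny set-associative LRU cache that counts cold-set misses."""

    def __init__(self, num_sets, ways, line_bytes=64):
        self.num_sets = num_sets
        self.ways = ways
        self.line_bytes = line_bytes
        self.sets = [OrderedDict() for _ in range(num_sets)]
        self.cold_set_misses = 0

    def access(self, addr):
        line = addr // self.line_bytes
        cache_set = self.sets[line % self.num_sets]
        if line in cache_set:
            cache_set.move_to_end(line)  # hit: mark as most recently used
            return
        if len(cache_set) < self.ways:
            # Assumed definition: a miss in a set that is not yet full
            # cannot be told apart from a warming artifact.
            self.cold_set_misses += 1
        else:
            cache_set.popitem(last=False)  # evict the LRU line
        cache_set[line] = True


def cold_misses_in_sample(trace, sample_start, sample_len, warm_len, num_sets, ways):
    """Warm a fresh cache for warm_len accesses, then count cold-set misses in the sample."""
    cache = ToyCache(num_sets, ways)
    for addr in trace[max(0, sample_start - warm_len):sample_start]:
        cache.access(addr)
    cache.cold_set_misses = 0  # only misses inside the sample count
    for addr in trace[sample_start:sample_start + sample_len]:
        cache.access(addr)
    return cache.cold_set_misses


def find_warming_length(trace, sample_start, sample_len, num_sets, ways, max_cold=0):
    """Double the warming length until the sample has at most max_cold cold-set misses."""
    warm_len = 1
    while True:
        cold = cold_misses_in_sample(trace, sample_start, sample_len, warm_len, num_sets, ways)
        if cold <= max_cold or warm_len >= sample_start:
            return min(warm_len, sample_start), cold
        warm_len *= 2


# A loop over eight cache lines: warming for one pass over the loop is enough.
trace = [i * 64 for i in range(8)] * 100
print(find_warming_length(trace, sample_start=400, sample_len=80, num_sets=4, ways=2))
# → (8, 0)
```

In this toy trace the search stops at 8 warming accesses, even though 400 accesses precede the sample. This mirrors the abstract's point that a fixed, conservative warming length can be far longer than a given sample needs.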


Cited By

  • (2023) Profiling gem5 Simulator. 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pages 103-113. DOI: 10.1109/ISPASS57527.2023.00019. Online publication date: Apr-2023
  • (2020) A Survey of Cache Simulators. ACM Computing Surveys 53:1, pages 1-32. DOI: 10.1145/3372393. Online publication date: 6-Feb-2020
  • (2019) Development of a cycle-accurate simulator of the Elbrus processor core memory subsystem. Radio industry (Russia) 29:2, pages 17-27. DOI: 10.21778/2413-9599-2019-29-2-17-27. Online publication date: 30-May-2019

Published In

RAPIDO '17: Proceedings of the 9th Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools
January 2017
47 pages
ISBN:9781450348409
DOI:10.1145/3023973
This work is licensed under a Creative Commons Attribution International 4.0 License.

In-Cooperation

  • HiPEAC: HiPEAC Network of Excellence

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

RAPIDO '17
RAPIDO '17: Methods and Tools
January 23 - 25, 2017
Stockholm, Sweden

Acceptance Rates

RAPIDO '17 Paper Acceptance Rate 6 of 12 submissions, 50%;
Overall Acceptance Rate 14 of 28 submissions, 50%

