
Optimizing Memory Translation Emulation in Full System Emulators

Published: 09 January 2015

Abstract

The emulation speed of a full system emulator (FSE) determines its usefulness. This work quantitatively measures where time is spent in QEMU [Bellard 2005], an industrial-strength FSE. The analysis finds that memory emulation is one of the most heavily exercised emulator components. For the workloads studied, 38.1% of the emulation time is spent in memory emulation on average, even though QEMU implements a software translation lookaside buffer (STLB) to accelerate dynamic address translation. Despite the amount of time spent in memory emulation, there has been no study on how to further improve its speed. This work analyzes where time is spent in memory emulation and studies the performance impact of a number of STLB optimizations. Although there are several performance optimization techniques for hardware TLBs, this work finds that the trade-offs with an STLB are quite different from those with hardware TLBs. As a result, not all hardware TLB performance optimization techniques are applicable to STLBs and vice versa. The evaluated STLB optimizations target STLB lookups as well as refills, and result in an average emulator performance improvement of 24.4% over the baseline.
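
As a rough illustration of the two STLB operations the abstract names, lookup and refill, the following is a minimal sketch of a direct-mapped software TLB. It is a toy model, not QEMU's implementation and not one of the optimizations evaluated in the article. The 4 KiB page size, the 256-entry table, the dictionary standing in for the guest page table, and all names (SoftTLB, translate, _refill) are assumptions made for the example.

```python
PAGE_BITS = 12                        # assumption: 4 KiB guest pages
PAGE_MASK = (1 << PAGE_BITS) - 1
STLB_ENTRIES = 256                    # assumption: power-of-two table size


class SoftTLB:
    """Toy direct-mapped software TLB (hypothetical, for illustration only)."""

    def __init__(self, page_table):
        # page_table: dict mapping guest virtual page number -> host page base.
        # It stands in for the slow path (a real emulator walks guest page tables).
        self.page_table = page_table
        self.tags = [None] * STLB_ENTRIES   # virtual page number cached per slot
        self.bases = [0] * STLB_ENTRIES     # host page base cached per slot
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        """Lookup: index by the low bits of the page number, compare the tag."""
        vpn = vaddr >> PAGE_BITS
        idx = vpn & (STLB_ENTRIES - 1)
        if self.tags[idx] == vpn:
            self.hits += 1                  # fast path: tag matches
        else:
            self.misses += 1                # slow path: refill the slot
            self._refill(idx, vpn)
        return self.bases[idx] | (vaddr & PAGE_MASK)

    def _refill(self, idx, vpn):
        """Refill: consult the page table and overwrite the conflicting slot."""
        base = self.page_table[vpn]         # KeyError models a guest page fault
        self.tags[idx] = vpn
        self.bases[idx] = base
```

In this model every emulated memory access pays for the index computation and tag compare, and every miss pays for the page-table consultation, which is why the cost of both paths matters. Two pages whose page numbers share the same low 8 bits evict each other, so alternating between them misses on every access.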

References

[1]
Fabrice Bellard. 2005. QEMU, a fast and portable dynamic translator. In Proceedings of the USENIX Annual Technical Conference (ATEC’05). USENIX Association, Berkeley, CA, 41. http://dl.acm.org/citation.cfm?id=1247360.1247401.
[2]
Ravi Bhargava, Benjamin Serebrin, Francesco Spadini, and Srilatha Manne. 2008. Accelerating two-dimensional page walks for virtualized systems. ACM SIGPLAN Notices 43, 3, 26--35.
[3]
Stephen M. Blackburn, Robin Garner, Chris Hoffmann, Asjad M. Khan, Kathryn S. McKinley, Rotem Bentzur, Amer Diwan, Daniel Feinberg, Daniel Frampton, Samuel Z. Guyer, Martin Hirzel, Antony Hosking, Maria Jump, Han Lee, J. Eliot B. Moss, Aashish Phansalkar, Darko Stefanović, Thomas VanDrunen, Daniel von Dincklage, and Ben Wiedermann. 2006. The DaCapo benchmarks: Java benchmarking development and analysis. In Proceedings of the 21st Annual ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA’06). ACM, New York, NY, 169--190.
[4]
Chao-Jui Chang, Jan-Jan Wu, Wei-Chung Hsu, Pangfeng Liu, and Pen-Chung Yew. 2014. Efficient memory virtualization for cross-ISA system mode emulation. In Proceedings of the 10th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE’14). ACM, New York, NY, 117--128.
[5]
J. Bradley Chen, Anita Borg, and Norman P. Jouppi. 1992. A simulation based study of TLB performance. ACM SIGARCH Computer Architecture News 20, 2, 114--123.
[6]
Yufei Chen and Haibo Chen. 2013. Scalable deterministic replay in a parallel full-system emulator. In Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’13). ACM, New York, NY, 207--218.
[7]
Android Developers. 2011. What Is Android.
[8]
Nadeem Firasta, Mark Buxton, Paula Jinbo, Kaveh Nasri, and Shihjong Kuo. 2008. Intel AVX: New Frontiers in Performance Improvements and Energy Efficiency. Intel White Paper.
[9]
Raul A. Garibay Jr., Marc A. Quattromani, and Douglas Beard. 1998. Address translation unit employing a victim TLB. U.S. Patent 5,752,274.
[10]
John L. Hennessy and David A. Patterson. 2003. Computer Architecture: A Quantitative Approach (3rd ed.). Morgan Kaufmann, San Francisco, CA.
[11]
John L. Henning. 2006. SPEC CPU2006 benchmark descriptions. ACM SIGARCH Computer Architecture News 34, 4, 1--17.
[12]
Bruce Jacob and Trevor Mudge. 1998. Virtual memory in contemporary microprocessors. IEEE Micro 18, 4, 60--75.
[13]
Tarush Jain and Tanmay Agrawal. 2013. The Haswell microarchitecture—4th generation processor. International Journal of Computer Science and Information Technologies 4, 3, 477--480.
[14]
JVM JVMTI. 2005. Tool Interface, v1.0.
[15]
Ehsan K. Ardestani and Jose Renau. 2013. ESESC: A fast multicore simulator using time-based sampling. In Proceedings of the IEEE 19th International Symposium on High Performance Computer Architecture (HPCA’13). 448--459.
[16]
Kevin P. Lawton. 1996. Bochs: A portable PC emulator for Unix/X. Linux Journal 1996, 29es, Article No. 7. http://dl.acm.org/citation.cfm?id=326350.326357.
[17]
John Levon. 2004. OProfile Manual. Victoria University of Manchester.
[18]
Chi-Keung Luk, Robert Cohn, Robert Muth, Harish Patil, Artur Klauser, Geoff Lowney, Steven Wallace, Vijay Janapa Reddi, and Kim Hazelwood. 2005. Pin: Building customized program analysis tools with dynamic instrumentation. In Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI’05). ACM, New York, NY, 190--200.
[19]
Peter S. Magnusson, Magnus Christensson, Jesper Eskilson, Daniel Forsgren, Gustav Hållberg, Johan Högberg, Fredrik Larsson, Andreas Moestedt, and Bengt Werner. 2002. Simics: A full system simulation platform. Computer 35, 2, 50--58.
[20]
Bill Ogden. 2013. IBM System z Personal Development Tool: Volume 2 Installation and Basic Use. IBM Redbooks.
[21]
Avadh Patel, Furat Afram, Shunfei Chen, and Kanad Ghose. 2011. MARSS: A full system simulator for multicore x86 CPUs. In Proceedings of the 48th Design Automation Conference (DAC’11). ACM, New York, NY, 1050--1055.
[22]
Georgios Portokalidis, Asia Slowinska, and Herbert Bos. 2006. Argos: An emulator for fingerprinting zero-day attacks for advertised honeypots with automatic signature generation. In Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006 (EuroSys’06). ACM, New York, NY, 15--27.
[23]
Xin Tong, Jack Luo, and Andreas Moshovos. 2013. QTrace: An interface for customizable full system instrumentation. In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS’13). IEEE, Los Alamitos, CA, 132--133.

Published In

ACM Transactions on Architecture and Code Optimization, Volume 11, Issue 4
January 2015
797 pages
ISSN: 1544-3566
EISSN: 1544-3973
DOI: 10.1145/2695583
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 January 2015
Accepted: 01 November 2014
Revised: 01 September 2014
Received: 01 April 2014
Published in TACO Volume 11, Issue 4

Author Tags

  1. Full system emulator
  2. TLB
  3. dynamic address translation

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2023) On-Demand Triggered Memory Management Unit in Dynamic Binary Translator. Advanced Parallel Processing Technologies, 297-309. DOI: 10.1007/978-981-99-7872-4_17. Online publication date: 4-Aug-2023
  • (2019) Cross-ISA machine instrumentation using fast and scalable dynamic binary translation. Proceedings of the 15th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, 74-87. DOI: 10.1145/3313808.3313811. Online publication date: 14-Apr-2019
  • (2017) HyperMAMBO-X64. ACM SIGPLAN Notices 52:7, 228-241. DOI: 10.1145/3140607.3050756. Online publication date: 8-Apr-2017
  • (2017) HyperMAMBO-X64. Proceedings of the 13th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, 228-241. DOI: 10.1145/3050748.3050756. Online publication date: 8-Apr-2017
  • (2015) Optimizing Control Transfer and Memory Virtualization in Full System Emulators. ACM Transactions on Architecture and Code Optimization 12:4, 1-24. DOI: 10.1145/2837027. Online publication date: 8-Dec-2015
