DOI: 10.5555/2535461.2535463
Article

Optimizing VM checkpointing for restore performance in VMware ESXi

Published: 26 June 2013

Abstract

Cloud providers are increasingly looking to use virtual machine checkpointing for new applications beyond fault tolerance. Existing checkpointing systems designed for fault tolerance only optimize for saving checkpointed state, so they cannot support these new applications, which require better restore performance. Improving restore performance requires a predictive technique to reduce the number of disk accesses to bring in the VM's memory on restore. However, complex VM workloads can diverge at any time due to external inputs, background processes, and timing variation, so predicting which pages the VM will access on restore to reduce faults to disk is impossible. Instead, we focus on predicting which pages the VM will access together on restore to improve the efficiency of disk accesses.
To reduce the number of faults to disk on restore, we group memory pages likely to be accessed together into locality blocks. On each fault, we can load a block of pages that are likely to be accessed with the faulting page, eliminating future faults and increasing disk efficiency. We implement support for locality blocks, along with several other optimizations, in a new checkpointing system for VMware ESXi Server called Halite. Our experiments show that Halite reduces restore overhead by up to 94% for a range of workloads.
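
The effect of loading a whole locality block on each fault can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Halite's implementation: the abstract does not say how pages are grouped, so the grouping heuristic here (pages ordered by first access in a trace recorded before the checkpoint, then cut into fixed-size blocks) and the names `build_locality_blocks` and `count_disk_reads` are hypothetical.

```python
from typing import Dict, List, Sequence, Set


def build_locality_blocks(trace: Sequence[int], block_size: int) -> List[List[int]]:
    """Group page numbers into blocks by order of first access.

    Hypothetical heuristic: pages first touched close together in the
    pre-checkpoint trace are assumed likely to be accessed together on restore.
    """
    seen: Set[int] = set()
    ordered: List[int] = []
    for page in trace:
        if page not in seen:
            seen.add(page)
            ordered.append(page)
    return [ordered[i:i + block_size] for i in range(0, len(ordered), block_size)]


def count_disk_reads(accesses: Sequence[int], blocks: List[List[int]]) -> int:
    """Count disk reads during restore when each fault loads a whole block."""
    page_to_block: Dict[int, int] = {
        page: idx for idx, block in enumerate(blocks) for page in block
    }
    resident: Set[int] = set()
    reads = 0
    for page in accesses:
        if page in resident:
            continue  # already in memory, no fault
        reads += 1  # one disk access per fault
        idx = page_to_block.get(page)
        if idx is None:
            resident.add(page)  # page not in any block: load it alone
        else:
            resident.update(blocks[idx])  # load every page in the block
    return reads


if __name__ == "__main__":
    trace = [7, 3, 9, 7, 1, 4, 3, 8]      # accesses seen before the checkpoint
    restore = [3, 7, 9, 4, 1, 2]          # accesses after restore (may diverge)
    blocks = build_locality_blocks(trace, block_size=3)
    per_page = [[p] for p in sorted(set(trace))]
    print("with locality blocks:", count_disk_reads(restore, blocks))
    print("one page per fault:  ", count_disk_reads(restore, per_page))
```

In this toy trace the restore accesses arrive in a different order than before the checkpoint and include a page that was never traced, yet grouping still cuts disk reads from 6 to 3, because a block only needs its pages to be accessed together, not in a predicted order.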



Published In

USENIX ATC'13: Proceedings of the 2013 USENIX conference on Annual Technical Conference
June 2013
364 pages

Sponsors

  • VMware
  • Akamai
  • Google Inc.
  • EMC2
  • Facebook

Publisher

USENIX Association

United States


Qualifiers

  • Article

Cited By

  • (2020) BlankIt library debloating: getting what you want instead of cutting what you don’t. Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation, pages 164-180. DOI: 10.1145/3385412.3386017. Online publication date: 11-Jun-2020
  • (2018) CESSNA. Proceedings of the 2018 Workshop on Mobile Edge Communications, pages 1-6. DOI: 10.1145/3229556.3229558. Online publication date: 7-Aug-2018
  • (2016) HOPE. Proceedings of the 2016 International Conference on Supercomputing, pages 1-12. DOI: 10.1145/2925426.2926257. Online publication date: 1-Jun-2016
  • (2015) Towards VM Consolidation Using a Hierarchy of Idle States. ACM SIGPLAN Notices 50:7, pages 107-119. DOI: 10.1145/2817817.2731195. Online publication date: 14-Mar-2015
  • (2015) PARS. ACM SIGPLAN Notices 50:7, pages 215-228. DOI: 10.1145/2817817.2731190. Online publication date: 14-Mar-2015
  • (2015) Speculative Memory Checkpointing. Proceedings of the 16th Annual Middleware Conference, pages 197-209. DOI: 10.1145/2814576.2814802. Online publication date: 24-Nov-2015
  • (2015) Towards VM Consolidation Using a Hierarchy of Idle States. Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, pages 107-119. DOI: 10.1145/2731186.2731195. Online publication date: 14-Mar-2015
  • (2015) PARS. Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, pages 215-228. DOI: 10.1145/2731186.2731190. Online publication date: 14-Mar-2015
  • (2014) HotRestore. Proceedings of the 28th USENIX conference on Large Installation System Administration, pages 1-16. DOI: 10.5555/2717491.2717492. Online publication date: 9-Nov-2014
  • (2014) DreamServer. Proceedings of International Conference on Systems and Storage, pages 1-11. DOI: 10.1145/2611354.2611362. Online publication date: 30-Jun-2014
