Engineering record and replay for deployability

Published: 12 July 2017

Abstract

The ability to record and replay program executions with low overhead enables many applications, such as reverse-execution debugging, debugging of hard-to-reproduce test failures, and "black box" forensic analysis of failures in deployed systems. Existing record-and-replay approaches limit deployability by recording an entire virtual machine (heavyweight), modifying the OS kernel (adding deployment and maintenance costs), requiring pervasive code instrumentation (imposing significant performance and complexity overhead), or modifying compilers and runtime systems (limiting generality). We investigated whether it is possible to build a practical record-and-replay system that avoids all of these issues. The answer turns out to be yes, provided the CPU and operating system meet certain non-obvious constraints. Fortunately, modern Intel CPUs, Linux kernels, and user-space frameworks do meet these constraints, although this has only recently become true. With some novel optimizations, our system RR records and replays real-world low-parallelism workloads with low overhead, with an entirely user-space implementation, using stock hardware, compilers, runtimes, and operating systems. RR forms the basis of an open-source reverse-execution debugger that sees significant use in practice. We present the design and implementation of RR, describe its performance on a variety of workloads, and identify the constraints on hardware and operating system design required to support our approach.
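
As a purely illustrative sketch of the general record-and-replay idea the abstract refers to, the toy below logs each nondeterministic input during a recording run and feeds the logged values back during replay, so the second run reproduces the first. It is not RR's mechanism, which the abstract does not detail; the names `Recorder`, `Replayer`, `nondet`, and `program` are hypothetical and chosen only for this example.

```python
import random
import time


class Recorder:
    """Runs a program while logging every nondeterministic input it consumes."""

    def __init__(self):
        self.log = []

    def nondet(self, source):
        value = source()        # perform the real nondeterministic operation
        self.log.append(value)  # remember its result for later replay
        return value


class Replayer:
    """Re-runs the program, feeding back logged values instead of real inputs."""

    def __init__(self, log):
        self._values = iter(log)

    def nondet(self, source):
        # Ignore `source` entirely and reuse the recorded result, in order.
        return next(self._values)


def program(env):
    """A toy workload (hypothetical) whose result depends on two nondeterministic inputs."""
    stamp = env.nondet(time.time)
    roll = env.nondet(lambda: random.randint(1, 6))
    return f"{stamp:.6f}:{roll}"


recorder = Recorder()
recorded_output = program(recorder)                # recording run
replayed_output = program(Replayer(recorder.log))  # replay run, no real inputs used
```

Because the replay run never consults the clock or the random number generator, `replayed_output` equals `recorded_output`. The sketch assumes the program reaches every nondeterministic source through `env`; a practical system has to intercept such sources without the program's cooperation.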

Published In

USENIX ATC '17: Proceedings of the 2017 USENIX Conference on Usenix Annual Technical Conference
July 2017
811 pages
ISBN: 9781931971386

Sponsors

  • VMware
  • NetApp
  • Microsoft
  • Facebook
  • Oracle

Publisher

USENIX Association

United States

Qualifiers

  • Article

Cited By

  • (2024) Program Environment Fuzzing. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, pp. 720-734. doi:10.1145/3658644.3690229. Online publication date: 2-Dec-2024
  • (2019) Different is Good. Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pp. 1883-1897. doi:10.1145/3319535.3345654. Online publication date: 6-Nov-2019
  • (2019) You can't debug what you can't see. Proceedings of the Workshop on Hot Topics in Operating Systems, pp. 163-169. doi:10.1145/3317550.3321428. Online publication date: 13-May-2019
  • (2019) Sparse record and replay with controlled scheduling. Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 576-593. doi:10.1145/3314221.3314635. Online publication date: 8-Jun-2019
  • (2019) Replayable Execution Optimized for Page Sharing for a Managed Runtime Environment. Proceedings of the Fourteenth EuroSys Conference 2019, pp. 1-16. doi:10.1145/3302424.3303978. Online publication date: 25-Mar-2019
  • (2019) Transparent tracing of microservice-based applications. Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, pp. 1252-1259. doi:10.1145/3297280.3297403. Online publication date: 8-Apr-2019
  • (2018) Sledgehammer. Proceedings of the 13th USENIX conference on Operating Systems Design and Implementation, pp. 545-560. doi:10.5555/3291168.3291208. Online publication date: 8-Oct-2018
  • (2018) iReplayer: in-situ and identical record-and-replay for multithreaded applications. ACM SIGPLAN Notices 53:4, pp. 344-358. doi:10.1145/3296979.3192380. Online publication date: 11-Jun-2018
  • (2018) SysTaint. Proceedings of the 8th Software Security, Protection, and Reverse Engineering Workshop, pp. 1-12. doi:10.1145/3289239.3289245. Online publication date: 3-Dec-2018
  • (2018) iReplayer: in-situ and identical record-and-replay for multithreaded applications. Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 344-358. doi:10.1145/3192366.3192380. Online publication date: 11-Jun-2018
