DOI: 10.5555/1247360.1247361
Article

Debugging operating systems with time-traveling virtual machines

Published: 10 April 2005

Abstract

Operating systems are difficult to debug with traditional cyclic debugging. They are non-deterministic; they run for long periods of time; they interact directly with hardware devices; and their state is easily perturbed by the act of debugging. This paper describes a time-traveling virtual machine that overcomes many of the difficulties associated with debugging operating systems. Time travel enables a programmer to navigate backward and forward arbitrarily through the execution history of a particular run and to replay arbitrary segments of the past execution. We integrate time travel into a general-purpose debugger to enable a programmer to debug an OS in reverse, implementing commands such as reverse breakpoint, reverse watchpoint, and reverse single step. The space and time overheads needed to support time travel are reasonable for debugging, and movements in time are fast enough to support interactive debugging. We demonstrate the value of our time-traveling virtual machine by using it to understand and fix several OS bugs that are difficult to find with standard debugging tools. Reverse debugging is especially helpful in finding bugs that are fragile due to non-determinism, bugs in device drivers, bugs that require long runs to trigger, bugs that corrupt the stack, and bugs that are detected after the relevant stack frame is popped.
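
The abstract does not spell out how a reverse breakpoint is implemented, so the following is only a minimal sketch of one common approach, assuming periodic checkpoints plus deterministic replay. The toy machine, the class names (`ToyMachine`, `TimeTraveler`), and the checkpoint interval are all hypothetical and are not taken from the paper; a real system would snapshot a whole virtual machine and log its non-deterministic inputs rather than copy a small dictionary.

```python
import copy


class ToyMachine:
    """Hypothetical deterministic 'machine': a step counter and an accumulator."""

    def __init__(self):
        self.state = {"step": 0, "acc": 0}

    def step(self):
        # The transition is deterministic, so replaying from a snapshot
        # reproduces exactly the same sequence of states.
        self.state["step"] += 1
        self.state["acc"] = (self.state["acc"] * 3 + self.state["step"]) % 17


class TimeTraveler:
    """Hypothetical wrapper that adds checkpoints and reverse execution."""

    def __init__(self, machine, interval=4):
        self.machine = machine
        self.interval = interval  # assumed checkpoint spacing, in steps
        self.checkpoints = {0: copy.deepcopy(machine.state)}

    def run_forward(self, n):
        for _ in range(n):
            self.machine.step()
            t = self.machine.state["step"]
            if t % self.interval == 0:
                self.checkpoints[t] = copy.deepcopy(self.machine.state)

    def goto(self, target):
        # Restore the latest checkpoint at or before `target`, then replay.
        base = max(t for t in self.checkpoints if t <= target)
        self.machine.state = copy.deepcopy(self.checkpoints[base])
        while self.machine.state["step"] < target:
            self.machine.step()

    def reverse_continue(self, hit):
        """Move to the latest earlier step where hit(state) is true.

        Returns that step number, or None (leaving the machine where it
        was) if no earlier state matches.
        """
        now = self.machine.state["step"]
        upper = now  # exclusive upper bound of the interval being searched
        bases = sorted((t for t in self.checkpoints if t < now), reverse=True)
        for base in bases:
            self.machine.state = copy.deepcopy(self.checkpoints[base])
            last = base if hit(self.machine.state) else None
            # Replay forward through [base, upper) and remember the last hit.
            while self.machine.state["step"] < upper - 1:
                self.machine.step()
                if hit(self.machine.state):
                    last = self.machine.state["step"]
            if last is not None:
                self.goto(last)
                return last
            upper = base  # nothing here; search the previous interval
        self.goto(now)
        return None


if __name__ == "__main__":
    machine = ToyMachine()
    traveler = TimeTraveler(machine, interval=4)
    traveler.run_forward(10)
    # Reverse "breakpoint": stop at the most recent earlier state with acc == 1.
    print(traveler.reverse_continue(lambda s: s["acc"] == 1), machine.state)
```

The search replays each checkpoint interval once and keeps only the last matching step, which is why going backward costs at most one interval of re-execution per interval searched. The interval length therefore trades checkpoint storage against the latency of a backward move.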



Published In

ATEC '05: Proceedings of the annual conference on USENIX Annual Technical Conference
April 2005
588 pages

Publisher

USENIX Association

United States


