research-article
Open access

Self-Healing in Modern Operating Systems: A few early steps show there’s a long (and bumpy) road ahead.

Published: 01 December 2004

Abstract

Driving the stretch of Route 101 that connects San Francisco to Menlo Park each day, billboard faces smilingly reassure me that all is well in computerdom in 2004. Networks and servers, they tell me, can self-defend, self-diagnose, self-heal, and even have enough computing power left over from all this introspection to perform their owner-assigned tasks.

References

[1] See http://sun.com/software/solaris/10/ and http://sun.com/msg/.
[2] Brown, A., and D. Patterson. 2001. Embracing failure: A case for recovery-oriented computing (ROC). High Performance Transaction Processing Symposium, Asilomar, CA (October 2001); see http://roc.cs.berkeley.edu/.
[3] Candea, G., and A. Fox. 2001. Recursive restartability: Turning the reboot hammer into a scalpel. Proceedings of the 8th Workshop on Hot Topics in Operating Systems (May 2001); see http://i30www.ira.uka.de/conferences/HotOS/.
[4] Mewburn, L. 2001. The design and implementation of the NetBSD rc.d system. Proceedings of the 2001 Usenix Annual Technical Conference, Boston, MA (June 2001); see http://www.mewburn.net/luke/papers/.

Reviews

Bayard Kohlhepp

Blue screens of death and cryptic error messages are all too common in contemporary systems, while graceful recovery and restart are all too rare. This paper documents a first step on the thousand-mile journey to self-healing operating systems. Shapiro provides a clear description of the total lack of self-healing in modern software, and then describes the architectural approach he used in Sun Solaris 10 to address the problem. The paper is easy to read and understand, doesn't waste time on accusation or self-promotion, and provides a small bibliography that will get an interested reader started quickly on topical research. Sun, IBM, HP, and Microsoft are all touting "self-healing," "adaptive," "dynamic," and/or "crashless" system initiatives, but this is the first public revelation of a vendor's actual work that I am aware of. The paper is nine pages long, but reads like five; there are many graphics and figures consuming page real estate. I liked the paper so much I would like to write more praise, but it's so short that verbosity seems inappropriate. Read it; it's good.

Online Computing Reviews Service

Published In

Queue, Volume 2, Issue 9
Programming Languages
December/January 2004-2005
65 pages
EISSN: 1542-7749
DOI: 10.1145/1039511
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article
  • Popular
  • Editor picked

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 2,427
  • Downloads (Last 6 weeks): 240
Reflects downloads up to 21 Dec 2024

Citations

Cited By

  • (2020) Intelligence in cyberspace: the road to cyber singularity. Journal of Experimental & Theoretical Artificial Intelligence, 1-35. DOI: 10.1080/0952813X.2020.1784296. Online publication date: 28-Jun-2020
  • (2016) Toward Smart Embedded Systems. ACM Transactions on Embedded Computing Systems 15:2, 1-27. DOI: 10.1145/2872936. Online publication date: 17-Feb-2016
  • (2016) Building adaptive self-healing systems within a resource contested environment. Heliyon 2:4, e00100. DOI: 10.1016/j.heliyon.2016.e00100. Online publication date: Apr-2016
  • (2014) Iaso. Frontiers of Computer Science: Selected Publications from Chinese Universities 8:3, 378-390. DOI: 10.1007/s11704-014-3503-1. Online publication date: 1-Jun-2014
  • (2013) Self-* in Multimedia Communication Overlays. Computer Communications 36:7, 817-833. DOI: 10.1016/j.comcom.2012.12.009. Online publication date: 1-Apr-2013
  • (2012) CSI Kernel. IEEE Software 29:6, 9-12. DOI: 10.1109/MS.2012.154. Online publication date: 1-Nov-2012
  • (2011) A Scalable Fault Management Architecture for ccNUMA Server. Proceedings of the 2011 Third International Conference on Intelligent Networking and Collaborative Systems, 709-714. DOI: 10.1109/INCoS.2011.35. Online publication date: 30-Nov-2011
  • (2011) A survey on self-healing systems: approaches and systems. Computing 91:1, 43-73. DOI: 10.1007/s00607-010-0107-y. Online publication date: 1-Jan-2011
  • (2011) Runtime Behavior Monitoring and Self-Adaptation in Service-Oriented Systems. Socially Enhanced Services Computing, 117-138. DOI: 10.1007/978-3-7091-0813-0_6. Online publication date: 30-May-2011
  • (2011) Behavior Monitoring in Self-Healing Service-Oriented Systems. Socially Enhanced Services Computing, 95-116. DOI: 10.1007/978-3-7091-0813-0_5. Online publication date: 30-May-2011