The Rio file cache: surviving operating system crashes

Authors: Subhachandra Chandra, Christopher Aycock, Gurushankar Rajamani, David Lowell

ACM SIGOPS Operating Systems Review, Volume 30, Issue 5

Pages 74 - 83

https://doi.org/10.1145/248208.237154

Published: 01 September 1996

Abstract

One of the fundamental limits to high-performance, high-reliability file systems is memory's vulnerability to system crashes. Because memory is viewed as unsafe, systems periodically write data back to disk. The extra disk traffic lowers performance, and the delay period before data is safe lowers reliability. The goal of the Rio (RAM I/O) file cache is to make ordinary main memory safe for persistent storage by enabling memory to survive operating system crashes. Reliable memory enables a system to achieve the best of both worlds: reliability equivalent to a write-through file cache, where every write is instantly safe, and performance equivalent to a pure write-back cache, with no reliability-induced writes to disk. To achieve reliability, we protect memory during a crash and restore it during a reboot (a "warm" reboot). Extensive crash tests show that even without protection, warm reboot enables memory to achieve reliability close to that of a write-through file system. Adding protection makes memory even safer than a write-through file system while adding essentially no overhead. By eliminating reliability-induced disk writes, Rio performs 4-22 times as fast as a write-through file system, 2-14 times as fast as a standard Unix file system, and 1-3 times as fast as an optimized system that risks losing 30 seconds of data and metadata.
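
The disk-traffic argument in the abstract can be illustrated with a toy model. The sketch below is not the paper's implementation: the `ToyCache` class, the three policy names, and the synthetic trace are all hypothetical, and the model ignores memory protection and warm reboot entirely. It only counts how many disk writes each caching policy would issue for the same sequence of file writes, assuming a cache large enough that nothing is ever evicted.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCache:
    """Toy file cache that only counts disk writes; it stores no real data."""
    policy: str                    # "write_through", "delayed", or "write_back"
    flush_interval: float = 30.0   # seconds; used only by the "delayed" policy
    dirty: set = field(default_factory=set)
    disk_writes: int = 0
    last_flush: float = 0.0

    def write(self, block: int, now: float) -> None:
        if self.policy == "write_through":
            # Every write goes to disk immediately.
            self.disk_writes += 1
            return
        if self.policy == "delayed" and now - self.last_flush >= self.flush_interval:
            # Periodic flush: each dirty block costs one disk write.
            self.disk_writes += len(self.dirty)
            self.dirty.clear()
            self.last_flush = now
        # "delayed" and "write_back" both keep the block dirty in memory.
        self.dirty.add(block)


def count_disk_writes(policy: str, trace) -> int:
    cache = ToyCache(policy)
    for now, block in trace:
        cache.write(block, now)
    return cache.disk_writes


# Hypothetical trace: one write per second for 120 s, cycling over 4 blocks.
trace = [(float(t), t % 4) for t in range(120)]
results = {
    policy: count_disk_writes(policy, trace)
    for policy in ("write_through", "delayed", "write_back")
}
print(results)
```

In this model the pure write-back policy issues no disk writes at all, which is only acceptable if memory itself is safe. That is the property the abstract says Rio aims to provide. The counts say nothing about the measured speedups reported in the abstract.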

Index Terms

The Rio file cache: surviving operating system crashes
1. Hardware
  1. Hardware test
    1. Memory test and repair
2. Software and its engineering
  1. Software creation and management
    1. Software post-development issues
      1. Backup procedures
  2. Software organization and properties
    1. Contextual software domains
      1. Operating systems
        File systems management
        Memory management
        Virtual memory
    2. Extra-functional properties
      1. Software fault tolerance

Published In

ACM SIGOPS Operating Systems Review, Volume 30, Issue 5
Dec. 1996
273 pages
ISSN: 0163-5980
DOI: 10.1145/248208

ASPLOS VII: Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
October 1996
290 pages
ISBN: 0897917677
DOI: 10.1145/237090
Chairmen: Bill Dally (Massachusetts Institute of Technology), Susan Eggers (Univ. of Washington, Seattle)

Copyright © 1996 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States


