[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.1145/1159803.1159822acmconferencesArticle/Chapter ViewAbstractPublication PagesicfpConference Proceedingsconference-collections
Article

Stabilizers: a modular checkpointing abstraction for concurrent functional programs

Published: 16 September 2006 Publication History

Abstract

Transient faults that arise in large-scale software systems can often be repaired by re-executing the code in which they occur. Ascribing a meaningful semantics for safe re-execution in multi-threaded code is not obvious, however. For a thread to correctly rexecute a region of code, it must ensure that all other threads which have witnessed its unwanted effects within that region are also reverted to a meaningful earlier state. If not done properly, data inconsistencies and other undesirable behavior may result. however, automatically determining what constitutes a consistent global checkpoint is not straightforward since thread interactions are a dynamic property of the program.In this paper, we present a safe and efficient checkpointing mechanism for Concurrent ML (CML) that can be used to recover from transient faults. We introduce a new linguistic abstraction called stabilizers that permits the specification of per-thread monitors and the restoration of globally consistent checkpoints. Safe global states are computed through lightweight monitoring of communication events among threads (e.g. message-passing operations or updates to shared variables).Our experimental results on several realistic, multithreaded, server-style CML applications, including a web server and a windowing toolkit, show that the overheads to use stabilizers are small, and lead us to conclude that they are a viable mechanism for defining safe checkpoints in concurrent functional programs.

References

[1]
A. Adya, R. Gruber, B. Liskov, and U. Maheshwari. Efficient Optimistic Concurrency Control Using Loosely Synchronized Clocks. SIGMOD Record (ACM Special Interest Group on Management of Data), 24(2):23--34, June 1995.
[2]
Saurabh Agarwal, Rahul Garg, Meeta S. Gupta, and Jose E. Moreira. Adaptive Incremental Checkpointing for Massively Parallel Systems. In ICS '04: Proceedings of the 18th annual international conference on Supercomputing, pages 277--286, New York, NY, USA, 2004. ACM Press.
[3]
Micah Beck, James S. Plank, and Gerry Kingsley. Compiler-Assisted Checkpointing. Technical report, University of Tennessee, Knoxville, TN, USA, 1994.
[4]
Greg Bronevetsky, Daniel Marques, Keshav Pingali, and Paul Stodghill. Automated Application-Level Checkpointing of MPI Programs. In PPoPP '03: Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming, pages 84--94, New York, NY, USA, 2003. ACM Press.
[5]
Greg Bronevetsky, Daniel Marques, Keshav Pingali, Peter Szwed, and Martin Schulz. Application-Level Checkpointing for Shared Memory Programs. In ASPLOS-XI: Proceedings of the 11th international conference on Architectural support for programming languages and operating systems, pages 235--247, New York, NY, USA, 2004. ACM Press.
[6]
R. Bruni, H. Melgratti, and U. Montanari. Theoretical Foundations for Compensations in Flow Composition Languages. In POPL '05: Proceedings of the 32nd ACM SIGPLAN-SIGACT sysposium on Principles of programming languages, pages 209--220, New York, NY, USA, 2005. ACM Press.
[7]
G. Candea, S. Kawamoto, Y. Fujiki, G. Friedman, and A. Fox. Microreboot - A Technique for Cheap Recovery. In 6th Symposium on Operating Systems Design and Implementation, San Francisco, California, 2004.
[8]
Yuqun Chen, James S. Plank, and Kai Li. CLIP: A Checkpointing Tool for Message-Passing Parallel Programs. In Supercomputing '97: Proceedings of the 1997 ACM/IEEE conference on Supercomputing, pages 1--11, New York, NY, USA, 1997. ACM Press.
[9]
Jan Christiansen and Frank Huch. Searching for Deadlocks while Debugging Concurrent Haskell Programs. In ICFP '04: Proceedings of the ninth ACM SIGPLAN international conference on Functional programming, pages 28--39, New York, NY, USA, 2004. ACM Press.
[10]
Panos K. Chrysanthis and Krithi Ramamritham. ACTA: the SAGA continues. In Database Transaction Models for Advanced Applications, pages 349--397. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1992.
[11]
William R. Dieter and James E. Lumpp Jr. A User-level Checkpointing Library for POSIX Threads Programs. In FTCS '99: Proceedings of the Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing, page 224, Washington, DC, USA, 1999. IEEE Computer Society.
[12]
Kevin Donnelly and Matthew Fluet. Transactional events. In ICFP '06: Proceedings of the Eleventh ACM SIGPLAN International Conference on Functional Programming, New York, NY, USA, 2006. ACM Press.
[13]
E.N. (Mootaz) Elnozahy, Lorenzo Alvisi, Yi-Min Wang, and David B. Johnson. A Survey of Rollback-Recovery Protocols in Message-Passing Systems. ACM Comput. Surv., 34(3):375--408, 2002.
[14]
John Field and Carlos A. Varela. Transactors: a Programming Model for Maintaining Globally Consistent Distributed State in Unreliable Environments. In POPL '05: Proceedings of the 32nd ACM SIGPLAN-SIGACT sysposium on Principles of programming languages, pages 195--208, New York, NY, USA, 2005. ACM Press.
[15]
Matthew Flatt and Robert Bruce Findler. Kill-safe Synchronization Abstractions. In PLDI '04: Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation, pages 47--58, New York, NY, USA, 2004. ACM Press.
[16]
Jim Gray and Andreas Reuter. Transaction Processing. Morgan-Kaufmann, 1993.
[17]
Tim Harris and Keir Fraser. Language support for lightweight transactions. In Proceedings of the ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications, pages 388--402. ACM Press, 2003.
[18]
Tim Harris, Simon Marlow, Simon Peyton Jones, and Maurice Herlihy. Composable Memory Transactions. In ACM Conference on Principles and Practice of Parallel Programming, 2005.
[19]
Maurice Herlihy, Victor Luchangco, Mark Moir, and William N. Scherer, III. Software transactional memory for dynamic-sized data structures. In ACM Conference on Principles of Distributed Computing, pages 92--101, 2003.
[20]
http://www.mlton.org.
[21]
D. Hulse. On Page-Based Optimistic Process Checkpointing. In IWOOOS '95: Proceedings of the 4th International Workshop on Object-Orientation in Operating Systems, page 24, Washington, DC, USA, 1995. IEEE Computer Society.
[22]
Mangesh Kasbekar and Chita Das. Selective Checkpointing and Rollback in Multithreaded Distributed Systems. In 21st International Conference on Distributed Computing Systems, 2001.
[23]
H.T. Kung and John T. Robinson. On Optimistic Methods for Concurrency Control. TODS, 6(2):213--226, 1981.
[24]
Kai Li, Jeffrey Naughton, and James Plank. Real-time Concurrent Checkpoint for Parallel Programs. In ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 79--88, 1990.
[25]
John Reppy. Concurrent Programming in ML. Cambridge University Press, 1999.
[26]
Martin Rinard. Effective Fine-Grained Synchronization for Automatically Parallelized Programs Using Optimistic Synchronization Primitives. ACM Transactions on Computer Systems, 17(4):337--371, November 1999.
[27]
Michael F. Ringenburg and Dan Grossman. Atomcaml: first-class atomicity via rollback. In ICFP '05: Proceedings of the Tenth ACM SIGPLAN International Conference on Functional Programming, pages 92--104, New York, NY, USA, 2005. ACM Press.
[28]
Asser N. Tantawi and Manfred Ruschitzka. Performance Analysis of Checkpointing Strategies. ACM Trans. Comput. Syst., 2(2):123--144, 1984.
[29]
Andrew P. Tolmach and Andrew W. Appel. Debugging Standard ML Without Reverse Engineering. In LFP '90: Proceedings of the 1990 ACM conference on LISP and functional programming, pages 1--12, New York, NY, USA, 1990. ACM Press.
[30]
Andrew P. Tolmach and Andrew W. Appel. Debuggable Concurrency Extensions for Standard ML. In PADD '91: Proceedings of the 1991 ACM/ONR workshop on Parallel and distributed debugging, pages 120--131, New York, NY, USA, 1991. ACM Press.
[31]
Adam Welc, Suresh Jagannathan, and Antony Hosking. Safe futures for java. In Proceedings of the ACM SIGPLAN Conference on Object-Oriented Programming Systems, Languages, and Applications, pages 439--453. ACM Press, 2005.
[32]
Adam Welc, Suresh Jagannathan, and Antony L. Hosking. Transactional Monitors for Concurrent Objects. In European Conference on Object-Oriented Programming, pages 519--542, 2004.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ICFP '06: Proceedings of the eleventh ACM SIGPLAN international conference on Functional programming
September 2006
308 pages
ISBN:1595933093
DOI:10.1145/1159803
  • General Chair:
  • John Reppy,
  • Program Chair:
  • Julia Lawall
  • cover image ACM SIGPLAN Notices
    ACM SIGPLAN Notices  Volume 41, Issue 9
    Proceedings of the 2006 ICFP conference
    September 2006
    296 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/1160074
    Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 16 September 2006

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. checkpointing
  2. concurrent ML
  3. concurrent programming
  4. error recovery
  5. exception handling
  6. transactions

Qualifiers

  • Article

Conference

ICFP06
Sponsor:

Acceptance Rates

Overall Acceptance Rate 333 of 1,064 submissions, 31%

Upcoming Conference

ICFP '25
ACM SIGPLAN International Conference on Functional Programming
October 12 - 18, 2025
Singapore , Singapore

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)2
  • Downloads (Last 6 weeks)0
Reflects downloads up to 26 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2013)Towards Efficient Abstractions for Concurrent ConsensusRevised Selected Papers of the 14th International Symposium on Trends in Functional Programming - Volume 832210.1007/978-3-642-45340-3_5(76-90)Online publication date: 14-May-2013
  • (2011)Atomic boxesProceedings of the 25th European conference on Object-oriented programming10.5555/2032497.2032538(634-657)Online publication date: 25-Jul-2011
  • (2011)Communicating memory transactionsACM SIGPLAN Notices10.1145/2038037.194157746:8(157-168)Online publication date: 12-Feb-2011
  • (2011)Communicating memory transactionsProceedings of the 16th ACM symposium on Principles and practice of parallel programming10.1145/1941553.1941577(157-168)Online publication date: 12-Feb-2011
  • (2011)Atomic Boxes: Coordinated Exception Handling with Transactional MemoryECOOP 2011 – Object-Oriented Programming10.1007/978-3-642-22655-7_29(634-657)Online publication date: 2011
  • (2009)Programming in Manticore, a heterogenous parallel functional languageProceedings of the Third summer school conference on Central European functional programming school10.5555/1939128.1939132(94-145)Online publication date: 21-May-2009
  • (2009)Partial memoization of concurrency and communicationACM SIGPLAN Notices10.1145/1631687.159657544:9(161-172)Online publication date: 31-Aug-2009
  • (2009)Partial memoization of concurrency and communicationProceedings of the 14th ACM SIGPLAN international conference on Functional programming10.1145/1596550.1596575(161-172)Online publication date: 31-Aug-2009
  • (2008)Transactional events for MLProceedings of the 13th ACM SIGPLAN international conference on Functional programming10.1145/1411204.1411222(103-114)Online publication date: 20-Sep-2008
  • (2008)Transactional events for MLACM SIGPLAN Notices10.1145/1411203.141122243:9(103-114)Online publication date: 20-Sep-2008
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media