Controller/Precompiler for Portable Checkpointing

Published: 01 February 2006

Abstract

This paper presents CPPC (Controller/Precompiler for Portable Checkpointing), a checkpointing tool designed for heterogeneous clusters and Grid infrastructures. It achieves portability through portable protocols, portable checkpoint files and portable code. It works at the variable level and is user-directed, which keeps checkpoint files small. Parallel processes checkpoint independently, without runtime coordination or message logging; consistency is achieved at restart time by negotiating the restart point. A directive-based checkpointing precompiler has also been implemented to reduce the user's effort. CPPC was designed to work with parallel MPI programs, but it can also be used with sequential ones, and its highly modular design makes it easy to extend to parallel programs written with other message-passing libraries. Experimental results are shown using CPPC with different test applications.
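
The idea of negotiating a restart point can be illustrated with a minimal sketch. The abstract does not describe the protocol, so everything below is an assumption made for illustration: each process is assumed to number its checkpoints identically, to take them at the same points in the code, and to report which checkpoint indices it can still read; the processes then agree on the newest index available to all of them. The function name `negotiate_restart_point` and the dictionary layout are hypothetical and are not part of CPPC. In a real MPI program the agreement would presumably be reached through communication between processes rather than inside a single process.

```python
from typing import Dict, Optional, Set


def negotiate_restart_point(available: Dict[int, Set[int]]) -> Optional[int]:
    """Return the newest checkpoint index held by every process, or None.

    `available` maps a process rank to the set of checkpoint indices that
    process can read locally (hypothetical layout, not CPPC's own format).
    """
    if not available:
        return None
    # Only checkpoints present on every process can give a consistent restart.
    common = set.intersection(*(set(v) for v in available.values()))
    # Restart from the most recent common checkpoint to lose the least work.
    return max(common) if common else None


if __name__ == "__main__":
    # Rank 1 failed before writing checkpoint 3, so 2 is the newest common one.
    local_checkpoints = {0: {1, 2, 3}, 1: {1, 2}, 2: {1, 2, 3, 4}}
    print(negotiate_restart_point(local_checkpoints))  # prints 2
```

Because the decision is deferred to restart, processes in this sketch never need to synchronize while writing checkpoints, which matches the independent checkpointing described in the abstract.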

Cited By

  • (2007) Enhancing fault-tolerance of large-scale MPI scientific applications. Proceedings of the 9th International Conference on Parallel Computing Technologies, pp. 153-161. DOI: 10.5555/2392094.2392111. Online publication date: 3-Sep-2007
  • (2007) CPPC-G. Proceedings of the 7th International Conference on Parallel Processing and Applied Mathematics, pp. 852-859. DOI: 10.5555/1786194.1786294. Online publication date: 9-Sep-2007

Published In

IEICE - Transactions on Information and Systems  Volume E89-D, Issue 2
February 2006
474 pages
ISSN:0916-8532
EISSN:1745-1361

Publisher

Oxford University Press, Inc.

United States

Author Tags

  1. MPI
  2. checkpointing
  3. fault tolerance
  4. parallel programming

Qualifiers

  • Article
