Releases: dmtcp/dmtcp

DMTCP 4.0.0

25 Jun 22:46

This is a major release that introduces a breaking change to the checkpoint-image format. As a result, checkpoint images are not compatible with older releases. Other fixes include:

  • bug-fixes for corner cases in initialization.
  • bug-fixes to support custom malloc libraries.
  • bug-fix related to a regression involving interval checkpointing.
  • fixed a regression involving --restartdir.
  • support for close_range system call.
  • Logging improvements.

Full Changelog: 3.2.0...4.0.0

DMTCP 3.2.0

27 Feb 03:31

This minor release includes:

  • support for [vvar_vclock] memory regions present on modern kernels.
  • bug fix for pthread_cancel handling.
  • bug fix for dlopen(NULL, ...) calls.
  • bug fix for thread handling on RISCV.

Full Changelog: v3.1.2...3.2.0

DMTCP v3.1.2

14 Oct 06:06

A regression in 3.1.1 caused "dmtcp_launch -i XX ..." to fail.
A commit was created to fix this.

DMTCP v3.1.1

08 Oct 14:36
  • jalib/jalloc.cpp: bool_atomic_dwcas() -- Align the storage buffers for DMTCP internal allocations to 128 bits (16 bytes)
  • This affected primarily ARM64. 128-bit data types must be 16-byte aligned, or the CPU throws a SIGBUS error
  • A small number of other minor changes, primarily refactoring for maintenance
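
The alignment requirement above can be illustrated with a minimal sketch. This is not DMTCP's jalib code: the `TaggedPtr` record and the helper names are hypothetical, and `posix_memalign` is swapped in as a generic way to get 16-byte-aligned storage. It only shows how a 128-bit record of the kind a double-word compare-and-swap operates on can be forced onto a 16-byte boundary and checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign, free

// Hypothetical 128-bit record (not a DMTCP type): a pointer plus a version
// tag, the usual shape of a value updated by a double-word CAS.
// alignas(16) forces the 16-byte alignment that 128-bit accesses need.
struct alignas(16) TaggedPtr {
  void *ptr;          // pointer half
  std::uint64_t tag;  // version counter half
};
static_assert(alignof(TaggedPtr) == 16, "TaggedPtr must be 16-byte aligned");

// Returns true when addr is a multiple of 16.
inline bool is_16_byte_aligned(const void *addr) {
  return reinterpret_cast<std::uintptr_t>(addr) % 16 == 0;
}

// Returns storage aligned to 16 bytes, or nullptr on failure.
// The caller releases it with free().
inline void *alloc_aligned16(std::size_t size) {
  void *p = nullptr;
  if (size == 0) size = 16;
  if (posix_memalign(&p, 16, size) != 0) return nullptr;
  return p;
}

// Small self-check: a buffer for four records starts on a 16-byte boundary.
inline void demo_alignment() {
  void *buf = alloc_aligned16(sizeof(TaggedPtr) * 4);
  const bool aligned = buf != nullptr && is_16_byte_aligned(buf);
  assert(aligned);
  (void)aligned;
  free(buf);
}
```

A buffer that fails the `is_16_byte_aligned` check is the situation the release note describes: on ARM64 a 128-bit atomic access to such an address can raise SIGBUS.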

DMTCP v3.1.0

30 Sep 16:59
  • Many bug fixes for robustness, performance
  • Supports: x86_64, aarch64 (ARM64), RISC-V
  • Supports 32-bit arm and x86 (but not recently tested; bug reports welcome)
  • New flags: --stale-timeout (default: 8 hours) and --timeout (default: none)
  • python3 executable is now the standard for DMTCP
  • Obsolete DMTCP plugins removed
  • Enhanced use of atomics for internal lock-free data structures
    (a regression fixed for better performance for OpenMP)
  • DMTCP tested to support new platforms:
    MANA ckpt for MPI (release 1.0.0); CUDA ckpt (experimental);
    McMini (Model Checker: MINImal for easy modification)
    (release 1.0.0; experimental branch for deep debugging)
  • Enhanced util/gdb-dmtcp-utils.py tools for GDB debugging
  • Enhanced tools for debugging user code in GDB after restart
  • See NEWS file for further details

Contributors: @aayushi363 @dahongli @JainTwinkle @karya0 @xuyao0127

DMTCP 3.0 released

10 Jun 02:22

Summary

For some time, it has been recommended to use the latest GitHub master branch for new projects using DMTCP. This release formalizes that status. At this time, the InfiniBand plugin is deprecated and likely doesn't work. Further, the DMTCP flag '--no-coordinator' is not currently supported. It may be brought back to life if important use cases are seen. AARCH64 support may or may not work. Please write to developers if needed. DMTCP now requires C++14.

However, for transparent checkpointing of MPI, please see https://github.com/mpickpt/mana. That project is undergoing intensive testing. Please write to the developers for the latest status.

There is also a highly experimental branch to support transparent checkpointing of CUDA: https://github.com/DMTCP-CRAC/CRAC-early-development. Please write to the developers for plans to replace that experimental version.

Major DMTCP enhancements:

  • The plugin facility for end users has now been made more flexible. In particular, a plugin can now declare a PRESUSPEND phase. See DMTCP test/plugin/presuspend/ for an example plugin using presuspend. See the mpi-proxy-split plugin of the MANA project for a real-world example.

  • DMTCP now includes the ability to create an MTCP restart plugin, for use in split processes (see above). The lower-half application can use the MTCP restart plugin to restore the upper half from its checkpoint image.

  • The DMTCP key-value database (KVDB) was extended, for use by user plugins.

  • A new GDB utility, DMTCP/util/gdb-dmtcp-utils, is provided. Source this file into GDB when debugging DMTCP or other software. 'gdb-dmtcp-utils' does not depend on DMTCP, and can be used more generally.

Other enhancements and bug fixes:

  • Much of the DMTCP coordinator was rewritten to be more flexible, and support the new split process model.
  • DMTCP ordered maps were made more efficient.
  • Support for Linux Hugepages was added.
  • DMTCP supports Microsoft Windows WSL
  • New events, RUNNING and THREAD_RESUME, were added.
  • Added DMTCP_COORD_WRITE_CKPT environment variable
  • Improved DMTCP logging for use when debugging DMTCP
  • DMTCP now simulates vfork using fork.
  • Added ability to truncate append-only/RW files on restart.
  • Add './configure --disable-dlsym-wrapper' for special cases
  • MAP_FIXED_NOREPLACE used for safer execution during restart
  • Preserving user-requested rlimit across checkpoint-restart
  • Fixed SysV msg queue logic
  • Fixed freopen logic
  • Many smaller bug fixes
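
The rlimit item in the list above can be sketched as follows. This is an illustrative example, not DMTCP's implementation: the `SavedLimits` struct and both helper names are hypothetical. It records the soft limits before a checkpoint and reapplies them afterwards, clamping to the hard limit in force at that point, using only the standard `getrlimit`/`setrlimit` calls.

```cpp
#include <cassert>
#include <sys/resource.h>

// Hypothetical container for the limits recorded at checkpoint time.
struct SavedLimits {
  rlimit nofile;  // RLIMIT_NOFILE as seen before the checkpoint
  rlimit stack;   // RLIMIT_STACK as seen before the checkpoint
};

// Records the current limits. Returns false if either query fails.
inline bool save_limits(SavedLimits *out) {
  return getrlimit(RLIMIT_NOFILE, &out->nofile) == 0 &&
         getrlimit(RLIMIT_STACK, &out->stack) == 0;
}

// Reapplies a saved soft limit (rlim_cur). If the current hard limit is
// lower than the saved soft limit, the soft limit is clamped to it, since
// an unprivileged process cannot exceed rlim_max.
inline bool restore_soft_limit(int resource, const rlimit &saved) {
  rlimit now;
  if (getrlimit(resource, &now) != 0) return false;
  rlim_t wanted = saved.rlim_cur;
  if (now.rlim_max != RLIM_INFINITY &&
      (wanted == RLIM_INFINITY || wanted > now.rlim_max)) {
    wanted = now.rlim_max;
  }
  now.rlim_cur = wanted;
  return setrlimit(resource, &now) == 0;
}

// Small self-check: saving and immediately restoring leaves limits usable.
inline void demo_limits() {
  SavedLimits saved;
  const bool ok = save_limits(&saved) &&
                  restore_soft_limit(RLIMIT_NOFILE, saved.nofile) &&
                  restore_soft_limit(RLIMIT_STACK, saved.stack);
  assert(ok);
  (void)ok;
}
```

Clamping rather than failing is a design choice made for this sketch: when restarting on a host with a lower hard limit, the closest permitted value is applied instead of aborting.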

DMTCP 2.6.0

14 Aug 19:01

Version 2.6.0 release notes

Newer flags for configure:

  • Rename --enable-debug to --enable-logging
  • Add --enable-debug: "-Wall -g3 -O0" (for debugging DMTCP)

Newer flags for dmtcp_restart:

  • Add --debug-restart-pause flag to dmtcp_restart

Bug fixes and enhancements:

  • Fixes for glibc versions greater than or equal to 2.24
  • Fix deadlock in system() wrapper when the child crashes
  • Fix deadlock when a process is forked in the resume phase (issue #691)
  • jsocket: Warn user if peer closes socket while draining (issue #701)
  • Fix epoll1 test (initialize addrlen for accept()) (#705)
  • Fix to correctly calculate Coordinator/Host IP:
    Affects some distributed applications
  • Allow restored stack to grow if needed.
  • Fix bug in POSIX timer: race condition manifested in test/timer.c/Ubuntu-18.04
  • Modified InfiniBand plugin for more robust support
    (primarily of interest for MPI)
  • The floating point environment (fegetenv()) is now restored on restart.
    (Formerly, only the rounding mode (fegetround()) was restored.)
  • The current resource limits (rlim_cur) for RLIMIT_NOFILE and RLIMIT_STACK
    are restored if possible.
  • Mutex ownership and robust mutexes are now supported if DMTCP is configured
    with --enable-mutex-wrappers. (However, this configuration can also
    add runtime overhead if mutex operations are called very frequently.)
    [Thanks to Johannes Stoelp, Laurent Buchard, Pankaj Mehta of Synopsys, Inc.]
  • Fix bug if stack grows a lot after a restart.
  • Improved support for pty's
  • util/gdbinit-example added for those who wish to debug DMTCP internals.
  • Many bug fixes
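
The difference between restoring only the rounding mode and restoring the full floating-point environment, mentioned in the list above, can be shown with a short sketch. It uses only the standard `<cfenv>` functions; the helper names are hypothetical and this is not how DMTCP itself structures the restore.

```cpp
#include <cassert>
#include <cfenv>

// Captures the whole floating-point environment: rounding mode, exception
// flags, and whatever other state the platform stores in fenv_t.
inline bool save_fp_env(std::fenv_t *saved) {
  return std::fegetenv(saved) == 0;
}

// Reinstalls a previously captured environment.
inline bool restore_fp_env(const std::fenv_t *saved) {
  return std::fesetenv(saved) == 0;
}

// Sets a non-default rounding mode and an exception flag, snapshots the
// environment, wipes both, restores the snapshot, and checks that BOTH came
// back. Saving only fegetround() would have lost the exception flag.
inline bool fp_env_round_trip() {
  std::fenv_t original;
  if (!save_fp_env(&original)) return false;

  std::fesetround(FE_UPWARD);
  std::feraiseexcept(FE_DIVBYZERO);

  std::fenv_t snapshot;
  if (!save_fp_env(&snapshot)) return false;

  // Simulate a fresh process after restart: default mode, no flags set.
  std::fesetround(FE_TONEAREST);
  std::feclearexcept(FE_ALL_EXCEPT);

  if (!restore_fp_env(&snapshot)) return false;
  const bool ok = std::fegetround() == FE_UPWARD &&
                  std::fetestexcept(FE_DIVBYZERO) != 0;

  // Put the caller's environment back before returning.
  const bool reset = restore_fp_env(&original);
  assert(reset);
  (void)reset;
  return ok;
}
```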

DMTCP 2.5.2

15 Nov 20:31
  • All fixes in Release DMTCP-2.4.9 are incorporated in this release.
  • An incompatibility of DMTCP with Open MPI 1.10 when using orterun (mpirun)
    was discovered. This does not affect recent versions, such as Open MPI 2.x.
  • In some rare cases, open files were not properly restored due to
    a use-after-free bug. This is now fixed.
  • In some rare cases, one process had created a SysV shared memory object,
    and a different process was assigned to restore it on restart. This
    was not handled correctly, and is now fixed.
  • Correctly restore CPU affinities of threads
  • Virtualized SysV shared memory keys to avoid race condition on restart
  • Fixed logic for checking if relative path to file was a duplicate
    of another existing path
  • The NSCD area for name service caching daemon was not handled correctly
    in CentOS 6.8 and later. Fixed now.
  • The Linux sched.h include file for scheduling of cores was added to
    satisfy some older Linux distros that needed it for compiling DMTCP.
  • Fixed a regression in which --enable-debug (for verbose debug logs)
    was not being properly written.
  • The DMTCP coordinator was displaying a spurious warning, "Failed to find
    coordinator IP address", because it did not check for a canonical hostname.
    A related issue prevented DMTCP from working properly on some
    SUSE/openSUSE distros.

DMTCP 2.4.9

14 Nov 22:47

Version 2.4.9 release notes

  • Fixed a regression causing deleted NFS files to be handled incorrectly
  • Fixed handling of glibc for versions greater than glibc-2.24
  • Errors and warnings with gcc-7.x are fixed
  • A rare bug affecting pthread_cancel, etc., created incorrect pid on restart
  • man pages fixed: Description section was always describing dmtcp_command

DMTCP 2.5.1

05 Sep 20:33

Version 2.5.1 release notes

This release mostly provides added robustness. Two notable items of
added functionality are:
i. DMTCP_RESTART_PAUSE and DMTCP_RESTART_PAUSE0 environment variables
for easier debugging upon initial restart
ii. The --debug-logs flag was added to dmtcp_launch/dmtcp_restart.
One can now turn on logging individually for separate plugins,
instead of only turning it on globally.

An incompatibility of DMTCP with Open MPI 1.10 when using orterun (mpirun)
was discovered. This may also affect some other versions of Open MPI 1.10.
This bug will be fixed in a future release.

  • Fixed an issue when starting multiple DMTCP coordinators on same host
    at approximately the same time
  • Fixed issue with PBS scheduler for HPC
  • Fixed issue when restarting on a different host with a larger
    limit on the number of open file descriptors
  • dmtcp_launch/dmtcp_restart now accept '--debug-logs' flag to specify
    which DMTCP plugins should produce logging information
    (It used to be all or nothing.)
  • Improved robustness for IB (InfiniBand) plugin
  • Fixed DMTCP_RESTART_PAUSE and DMTCP_RESTART_PAUSE0 environment variables
    for debugging upon restart
  • The brk() call was failing on restart on Debian due to overly strict assert
  • dmtcp_launch was hanging on some RHEL5 and RHEL6 due to deadlock with
    libc low-level locks. Fixed now.
  • Updated tls_pid_offset in DMTCP to handle newer glibc (versions > 2.24)
  • Fixed launch of 32-bit binary when forking/execing from a 64-bit executable
  • Fixed issue that can affect a parent holding a malloc-lock while forking
  • Fixed issue when a user thread calls 'dmtcp_get_coord_ckpt_dir()'