Release v2025.03.0 · LLNL/RAJAPerf · GitHub

v2025.03.0

rhornung67 released this 19 Jun 15:25
b66b9d7

This release contains new features, bug fixes, and build improvements.

Please download the RAJAPerf-v2025.03.0.tar.gz file below. The others will not work due to the way RAJAPerf uses git submodules.

  • New features and usage changes:

    • Added option to print algorithmic complexity of each kernel (at request of benchmarking team).
    • Removed older RAJA reductions from OpenMP target offload variants of kernels with reductions. The officially supported RAJA reductions for OpenMP target offload are the newer valop reductions.
    • Added resource argument to all RAJA kernel variants for consistency and to follow RAJA usage recommendations.
    • Added EMPTY kernel to Basic group that does nothing inside a loop body to measure the minimal cost of launching a kernel.
    • Added FEMSWEEP kernel, which represents a FEM-based linear sweep used in deterministic transport codes.
    • Added kernel launch tunings to the LTIMES and LTIMES_NOVIEW kernels that use the RAJA::launch API. These tunings are intended to help understand performance differences between the RAJA::kernel and RAJA::launch APIs.
    • Added RAJA Views to base variants of LTIMES kernel.
    • Added citation on GitHub project page to P3HPC paper presented at SC24 on using Caliper and Thicket in RAJA Performance Suite.
    • Added a command-line option to enable custom scan tunings; it defaults to on.
    • For comm kernels, modified the MPI buffer allocation to do one large allocation and dole it out with alignment specified by the --align option.
    • Added Caliper configuration information to some build scripts as examples of how to use it.
  • Build changes / improvements:

    • The RAJA submodule has been updated to v2025.03.2.
    • The BLT submodule has been updated to v0.7.0, which is the version used by the updated RAJA submodule.
    • The Kokkos submodule has been updated to v3.7.02.
  • Bug fixes / improvements:

    • Fixes for Windows builds.
    • Fixed memory issue in SYCL variants of FIR kernel.
    • Fixed issues in OpenMP target offload variants of HISTOGRAM and MULTI_REDUCE kernels.
    • Fixed issue where multiple Caliper files were generated erroneously for a single run of the Suite.
    • Fixed a potential race condition in how data copies are handled in the kernels.
    • The HIP wavefront size is now obtained from the RAJA configuration rather than hard-coded in multiple kernels.
    • Fixed a hang in the HIP custom scan implementation with warp size 32, on consumer cards such as the Radeon 7900 XTX.
    • Fixed compilation issues related to Kokkos.
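
The comm-kernel change noted above (one large allocation, doled out with a specified alignment) can be illustrated with a minimal sketch. This is not the RAJAPerf implementation: the `BufferPool` class, its member names, and the use of `std::malloc` are assumptions made purely for illustration, and the `align` constructor argument only stands in for a value such as the one passed with the --align option.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical helper (not RAJAPerf code): owns one large allocation and
// hands out sub-buffers whose start addresses are multiples of `align`.
class BufferPool
{
public:
  // `align` must be a power of two. The pool over-allocates by `align`
  // bytes so that the first aligned address always fits; padding between
  // later sub-buffers is taken out of `total_bytes`.
  BufferPool(std::size_t total_bytes, std::size_t align)
    : m_base(nullptr), m_align(align),
      m_capacity(total_bytes + align), m_offset(0)
  {
    if (align == 0 || (align & (align - 1)) != 0) {
      throw std::invalid_argument("align must be a power of two");
    }
    m_base = static_cast<char*>(std::malloc(m_capacity));
    if (m_base == nullptr) { throw std::bad_alloc(); }
  }

  ~BufferPool() { std::free(m_base); }

  BufferPool(const BufferPool&) = delete;
  BufferPool& operator=(const BufferPool&) = delete;

  // Return an aligned sub-buffer of `bytes` bytes carved from the single
  // allocation; throws std::bad_alloc if the pool is exhausted.
  void* get(std::size_t bytes)
  {
    const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(m_base);
    const std::uintptr_t mask = static_cast<std::uintptr_t>(m_align) - 1;
    // Round the next free address up to a multiple of the alignment.
    const std::uintptr_t aligned = (base + m_offset + mask) & ~mask;
    assert(aligned % m_align == 0);

    const std::size_t start = static_cast<std::size_t>(aligned - base);
    if (start + bytes > m_capacity) { throw std::bad_alloc(); }

    m_offset = start + bytes;  // the next request begins after this buffer
    return m_base + start;
  }

private:
  char*       m_base;      // start of the single large allocation
  std::size_t m_align;     // required alignment of each sub-buffer, in bytes
  std::size_t m_capacity;  // total bytes owned by the pool
  std::size_t m_offset;    // first unused byte, relative to m_base
};
```

In this sketch, a caller would construct one pool sized for all send and receive buffers and then call `get` once per buffer. The pool releases everything in its destructor, so the sub-buffers must not be freed individually.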