GitHub - NVIDIA/jitify at jitify2

Jitify

A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).

Integrating runtime compilation into existing CUDA applications can be tricky and time-consuming. Jitify aims to simplify this process by hiding the complexities behind a simple, high-level interface.

Jitify2

The latest version of Jitify is now available in jitify2.hpp under the jitify2 namespace. The old version is still available in jitify.hpp, but will not be receiving further updates.

Simple example

#include <jitify2.hpp>
#include <cuda_runtime_api.h>
#include <string>

int main() {
  cudaFree(0);  // Initialize CUDA context

  std::string program_source = R"(
#include <cmath>
#include <cuda_fp16.h>

template <int N, typename T>
__global__ void my_kernel(T* data) { *data = std::pow(*data, T{N}); }
)";
  float h_data = 3.f;
  float* d_data;
  cudaMalloc((void**)&d_data, sizeof(float));
  cudaMemcpy(d_data, &h_data, sizeof(float), cudaMemcpyHostToDevice);

  using jitify2::get_cuda_include_dir, jitify2::Program, jitify2::ProgramCache;
  using jitify2::reflection::Template, jitify2::reflection::Type;

  static ProgramCache<> cache(
      /*max_size=*/100,
      *Program("my_program", program_source)
           // Preprocess source code and load all included headers.
           ->preprocess(
               {"-I" + get_cuda_include_dir(), "-arch=sm_80", "-arch=sm_90"}));

  dim3 grid(1), block(1);
  cache
      // Compile, link, and load the program, and obtain the loaded kernel.
      .get_kernel(Template("my_kernel").instantiate(2, Type<float>()))
      // Configure the kernel launch.
      ->configure(grid, block)
      // Launch the kernel.
      ->launch(d_data);
  return cudaDeviceSynchronize() != cudaSuccess;
}
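The example above launches my_kernel<2, float> on the value 3 but never copies the result back to the host. As a minimal sketch of how the result could be checked, the host-only code below mirrors the kernel body and compares a device result against it. The names my_kernel_reference and matches_reference are hypothetical helpers introduced here for illustration; they are not part of Jitify.

```cpp
#include <cmath>

// Host-side mirror of the my_kernel body from the example above.
template <int N, typename T>
T my_kernel_reference(T value) {
  return std::pow(value, T{N});
}

// Hypothetical helper (not part of Jitify): returns true if a value copied
// back from the device agrees with the host reference within a relative
// tolerance.
template <int N, typename T>
bool matches_reference(T input, T device_result, T rel_tol = T(1e-5)) {
  const T expected = my_kernel_reference<N>(input);
  return std::abs(device_result - expected) <= rel_tol * std::abs(expected);
}
```

To use it, one would copy the result back after the launch with cudaMemcpy(&h_data, d_data, sizeof(float), cudaMemcpyDeviceToHost) and then call matches_reference<2>(3.f, h_data).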

Features

Jitify provides/takes care of the following things:

  • All NVRTC, nvJitLink, and CUDA Driver API calls.
  • Simple kernel instantiation and launch syntax with unmangled names.
  • Caching compiled kernels in memory and on disk.
  • Convenient offline-preprocessing workflow that makes it easy to load and embed all required runtime headers into your application.
  • JIT-safe standard library headers (e.g., float.h, stdint.h, limits).
  • Linking to pre-compiled PTX/CUBIN/FATBIN/object/library files.
  • Easy error handling with optional exceptions.
  • Support for all recent CUDA versions on both Linux and Windows.
  • Support for pre-compiled headers, nvcc runtime compilation, source minification, and much more!

Things you can do with Jitify and NVRTC:

  • Rapidly port existing code to use CUDA Runtime Compilation.
  • Dramatically reduce code volume and offline-compilation times.
  • Increase kernel performance by baking in runtime constants and autotuning.
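To illustrate what "baking in runtime constants" can mean, here is a minimal sketch that builds kernel source text in which a value known only at run time appears as a compile-time constant. The helper make_scale_kernel_source, the scale kernel, and the kUnroll constant are all hypothetical and are not part of Jitify; template instantiation, as in the simple example, is another way to get the same effect.

```cpp
#include <string>

// Hypothetical helper (not part of Jitify): returns CUDA source in which the
// run-time value `unroll` is emitted as the compile-time constant kUnroll, so
// the runtime compiler can fully unroll the inner loop.
std::string make_scale_kernel_source(int unroll) {
  return "constexpr int kUnroll = " + std::to_string(unroll) + ";\n"
         "__global__ void scale(float* data, float factor, int n) {\n"
         "  int base = (blockIdx.x * blockDim.x + threadIdx.x) * kUnroll;\n"
         "  #pragma unroll\n"
         "  for (int i = 0; i < kUnroll; ++i) {\n"
         "    if (base + i < n) data[base + i] *= factor;\n"
         "  }\n"
         "}\n";
}
```

The returned string could be passed as the source argument to Program in the same way as program_source in the simple example. An autotuner could generate the source for several candidate values, time each compiled kernel, and keep the fastest one.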

How to build

Jitify is just a single header file:

#include <jitify2.hpp>

Link with: -ldl (all CUDA libraries are dynamically loaded at runtime by default)

Running tests

The unit tests can be built and run using CMake as follows:

$ mkdir build && cd build && cmake ..
$ make check -j6

Note that the tests in jitify2_test.cu may also be useful as a form of documentation for many Jitify features.

Documentation

User guide

See jitify2_user_guide.md for a detailed guide on using Jitify and all available options.

API documentation

Doxygen documentation can be generated by running:

$ mkdir build && cd build && cmake ..
$ make doc

The HTML and LaTeX results are placed into the doc/ subdirectory.

License

BSD-3-Clause

Principal authors

Ben Barsdell (NVIDIA)

Kate Clark (NVIDIA)
