CUDA support · Issue #8 · sxysxy/OIDN-python · GitHub

CUDA support #8

Open
kadir014 opened this issue Apr 12, 2025 · 3 comments


@kadir014

Thank you for the nice wrapper. I managed to get CUDA support working with the steps below, in case anyone else wants to do the same:

  1. The latest version on PyPI (0.2) doesn't support the new Pythonic API, so clone the repo and install an editable build (pip install -e .).
  2. Download the latest OIDN binaries from here: https://github.com/RenderKit/oidn/releases
  3. Replace all the DLL files with the ones from that release, and copy OpenImageDenoise_device_cuda.dll as well.
  4. In __init__.py, load the CUDA DLL as well:
    ctypes.CDLL(os.path.join(cur_path, "lib.win.x64/OpenImageDenoise_device_cuda.dll"))
  5. Disable the GetDeviceError function (it was raising a struct-related error on my end).
  6. In the Buffer.create function, change the tensor creation call to use the current arguments:
    bf.buffer_delegate = torch.zeros(*storage_shape, dtype=torch.float32)
  7. And done! Thanks to this, I'm running denoising at close to real time in my toy pathtracer project.
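
Step 4 can be made a little more defensive. The sketch below loads a device library only when the file is actually present, so a CPU-only install would keep working. The helper name `load_optional_library` is hypothetical and not part of OIDN-python; the relative path is only an example.

```python
import ctypes
import os
from typing import Optional


def load_optional_library(path: str) -> Optional[ctypes.CDLL]:
    """Load a shared library if the file exists, otherwise return None."""
    if not os.path.isfile(path):
        # Missing device library: report it to the caller instead of raising.
        return None
    return ctypes.CDLL(path)


# Example path only; a real package would resolve it relative to its own directory.
cuda_lib = load_optional_library(
    os.path.join("lib.win.x64", "OpenImageDenoise_device_cuda.dll")
)
print("CUDA device library loaded:", cuda_lib is not None)
```

Returning None rather than raising leaves the decision to fall back to the CPU device with the caller.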

If I can find time I can just open a PR as well.

@kadir014 changed the title from "Running OIDN on CUDA" to "CUDA support" on Apr 12, 2025
@sxysxy
Owner
sxysxy commented Apr 12, 2025

It seems nice. I'd also like to try to improve this project after I finish my master's degree. The current project mainly provides bindings to the raw C API. Those 'pythonic APIs' still expose too many underlying concepts, such as buffers and devices, and are still not elegant. We just want it to denoise images from path tracers. I think it's a good idea to make a very simple interface:

def denoise(image: Union[np.ndarray, PIL.Image.Image, torch.Tensor], **maybe_some_options): ...

This could handle 99% of situations...
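
A minimal sketch of what such a facade might look like, assuming NumPy is available. The names `to_float32_hwc` and `run_filter` are hypothetical: `run_filter` stands in for whatever backend actually executes the OIDN filter, and is replaced by an identity function here so the example runs without a device.

```python
from typing import Any, Callable

import numpy as np


def to_float32_hwc(image: Any) -> np.ndarray:
    """Convert an array-like image to a contiguous float32 (H, W, 3) array."""
    # np.asarray accepts NumPy arrays and nested lists; PIL images and CPU
    # torch tensors are assumed to convert through the same call.
    array = np.asarray(image)
    if array.ndim != 3 or array.shape[2] != 3:
        raise ValueError(f"expected an (H, W, 3) image, got shape {array.shape}")
    if array.dtype == np.uint8:
        # 8-bit images are rescaled to the [0, 1] range.
        return np.ascontiguousarray(array, dtype=np.float32) / 255.0
    return np.ascontiguousarray(array, dtype=np.float32)


def denoise(image: Any, run_filter: Callable[..., np.ndarray], **options: Any) -> np.ndarray:
    """Hide buffers and devices behind one call; options go to the backend."""
    pixels = to_float32_hwc(image)
    return run_filter(pixels, **options)


# Identity backend as a stand-in for the real filter execution.
noisy = np.full((4, 4, 3), 255, dtype=np.uint8)
clean = denoise(noisy, run_filter=lambda pixels: pixels)
print(clean.shape, clean.dtype)
```

Keeping the input normalization separate from the backend call means device and buffer handling can change without touching the public signature.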

@sxysxy
Owner
sxysxy commented Apr 12, 2025

There are some annoying issues with CUDA support. CUDA kernels cannot run on GPUs with an incompatible compute capability (e.g. kernels compiled for compute capability 7.5 cannot run on 8.0 devices such as the NVIDIA A100). Older precompiled binaries from https://github.com/RenderKit/oidn/releases may not be compatible with the newest GPUs.
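
The compatibility rule can be written as a small check. For precompiled device binaries (cubin), a kernel built for compute capability X.y runs only on devices of capability X.z with z >= y. Binaries that also embed PTX can be JIT-compiled for newer architectures, which the sketch models with a flag. The function name and the tuple representation are illustrative assumptions, not an OIDN or CUDA API.

```python
from typing import Tuple

ComputeCapability = Tuple[int, int]  # (major, minor), e.g. (7, 5)


def can_run(
    compiled_for: ComputeCapability,
    device: ComputeCapability,
    has_ptx: bool = False,
) -> bool:
    """Return True if code built for `compiled_for` can run on `device`."""
    compiled_major, compiled_minor = compiled_for
    device_major, device_minor = device
    # Native binaries: same major version, device minor not older than the build.
    if compiled_major == device_major and device_minor >= compiled_minor:
        return True
    # Embedded PTX can be JIT-compiled for an equal or newer capability.
    return has_ptx and device >= compiled_for


print(can_run((7, 5), (8, 6)))                # binary only
print(can_run((7, 5), (8, 6), has_ptx=True))  # binary plus PTX
```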

Possible Solutions:

  • Compile the RenderKit/oidn source code during pip install. But this can cause problems for users who don't know much about building Python native extensions. That code depends on CUDA, which makes it prone to many environment issues.
  • Maintain precompiled versions. The maintainers of this repository would build the oidn binaries from source and test that they work...

@kadir014
Author
kadir014 commented Apr 12, 2025

I used the edited CUDA version to try realtime-ish denoising in my toy pathtracer project.

In my tests, oidn.Filter.execute took ~28% of the time (around 9 ms), whereas managing oidn.Buffers, converting them to numpy arrays, reading the data into moderngl.Textures, etc. was the main bottleneck: it took ~72% of the time (around 20 ms).

I wonder if the oidn.Buffer implementation could be improved for this case.
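
A per-stage breakdown like the one above can be collected with a small timer. This is a sketch using only the standard library; the `StageTimer` class is hypothetical, and the `time.sleep` calls are stand-ins for the real filter execution and buffer/texture transfers.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from typing import Dict, Iterator


class StageTimer:
    """Accumulate wall-clock time per named stage."""

    def __init__(self) -> None:
        self.totals: Dict[str, float] = defaultdict(float)

    @contextmanager
    def stage(self, name: str) -> Iterator[None]:
        start = time.perf_counter()
        try:
            yield
        finally:
            # Add the elapsed time even if the stage raised.
            self.totals[name] += time.perf_counter() - start

    def shares(self) -> Dict[str, float]:
        """Return each stage's share of the total time, in percent."""
        total = sum(self.totals.values())
        if total == 0:
            return {}
        return {name: 100.0 * seconds / total for name, seconds in self.totals.items()}


timer = StageTimer()
with timer.stage("filter"):
    time.sleep(0.01)  # stand-in for the filter execution
with timer.stage("transfer"):
    time.sleep(0.02)  # stand-in for buffer copies and texture uploads

for name, share in timer.shares().items():
    print(f"{name}: {share:.0f}%")
```

If the filter runs asynchronously on the GPU, the device needs to be synchronized before the stage ends; otherwise its cost is attributed to whichever later stage waits on the result.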

But for non-real-time scenarios, even a more straightforward approach like the def denoise(...) idea above would be great.
