[RFC] Cuda support matrix for Release 2.4 #123456

Comments
CUDA 12.x has inexplicably higher memory use than 11.8 when training 2D conditional UNet models.
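To turn a report like this into something comparable across builds, here is a minimal sketch, assuming a hypothetical `train_step` and `model` standing in for the UNet workload (neither is defined in this thread), of capturing peak GPU memory under whichever wheel is installed:

```python
import torch

def measure_peak_memory(model, train_step, batch, device="cuda"):
    """Report peak allocated GPU memory for one training step.

    `model`, `train_step`, and `batch` are hypothetical placeholders for
    the workload being compared (e.g. a 2D conditional UNet); they are
    not defined in this thread.
    """
    model.to(device)
    torch.cuda.reset_peak_memory_stats(device)
    train_step(model, batch)  # forward + backward + optimizer step
    torch.cuda.synchronize(device)
    peak_mib = torch.cuda.max_memory_allocated(device) / 1024**2
    print(f"torch {torch.__version__} / CUDA {torch.version.cuda}: "
          f"peak allocated {peak_mib:.1f} MiB")
    return peak_mib
```

Running the same function against a cu118 wheel and a cu12x wheel on identical inputs would give numbers concrete enough to attach to a separate issue.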
@bghira Could you add a link to the corresponding issue, please?
@atalman Agreed, but CUDA 12.x has also been out for more than a year, and we have been providing PyTorch binaries with CUDA 12 for more than a year as well.
@ptrblck this is internal research that my group worked on, and we never filed an issue because it was unclear at which level the problem was introduced, and we didn't have the resources to dig into it. i can say that the vast (lol, pun) majority of cloud instances/containers/kernels users currently have access to are limited to CUDA 11.8 - it's still common enough that making 12.1 the minimum feels premature, despite how long 12.x has been available. making ROCm 6 the minimum made sense, because everything about ROCm 5.x was awful, other than the fact that it supported a few more GPUs than 6 does. but CUDA 11.8 is very mature and isn't showing its age yet.
In this case the claim is not actionable, especially since we are already using CUDA 12.1 in the default PyTorch binary (installable via a plain `pip install torch`).
Could you share any information here, too?
This RFC focuses on CUDA and we should not discuss ROCm here.
i don't think anyone even installs torch that way due to the high probability of issues. the download page for torch "builds" a command for people to use, which ends up adding the extra index URL. i guess these claims aren't enough to go on, and the newer version will just remain inaccessible for a while.
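As a hedged aside (not from the thread itself): whichever install command was used, the toolkit versions bundled into the installed wheel can be confirmed at runtime, which makes it unambiguous whether a report concerns a cu118, cu121, or cu124 build:

```python
import torch

# Print the toolkit versions the installed wheel was built against.
# torch.version.cuda is None for CPU-only builds.
print("torch:", torch.__version__)
print("built with CUDA:", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())
print("GPU available:", torch.cuda.is_available())
```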
I think the ~25 million downloads/month as of now don't confirm your claim (https://pypistats.org/packages/torch). In any case, if you have any valid issues, please let us know and we are happy to follow up!
i don't think those stats confirm yours.
@bghira Again, if you have concrete issues, please create separate issues for them and we are happy to help.
just because you don't like them doesn't make them invalid. i have trouble understanding why an nvidia representative is being so difficult about keeping support for CUDA 11.8 in a future pytorch release, which is entirely what I am here advocating for. your approach essentially comes across as if CUDA 12.1 is going to be the default unless someone provides really good reasons why it shouldn't be. i thought "we can't use 12.1" would be reason enough; this isn't where issues with CUDA 12.1 get reported. maybe someone else should be handling this ticket, since you are too personally involved. can @atalman be the one to respond from now on? thank you
You are misunderstanding my posts, since I asked about concrete issues to follow up on in my very first response. Speculation just derails the tracking issue here and is not helpful. I also have no problem with keeping the PyTorch + CUDA 11.8 binaries alive longer, as I even raised the concern about dropping compute capabilities.
CUDA 12.1 is already the default, installable via a plain `pip install torch`. This will be my last response to you, @bghira, since you are still derailing this topic without any actionable items.
@atalman For option 3:
Will NCCL be upgraded to 2.21.5 if the CUDA version is 12.4?
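A hedged way to answer this empirically for any published wheel, assuming a Linux CUDA build with distributed support, is to query the NCCL version compiled into it:

```python
import torch
import torch.cuda.nccl as nccl
import torch.distributed as dist

# NCCL is only bundled into the Linux CUDA builds; guard accordingly.
if dist.is_available() and dist.is_nccl_available():
    # version() reports the NCCL compiled into the wheel,
    # e.g. a (major, minor, patch) tuple such as (2, 21, 5).
    print("bundled NCCL:", nccl.version())
else:
    print("this build does not ship NCCL")
```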
Closing this one since the 2.4 release is complete.
🚀 [RFC] Cuda support matrix for Release 2.4
Opening this RFC to discuss CUDA version support for future PyTorch releases:
Option 1 - CUDA 11 and CUDA 12:
CUDA 11.8, CUDNN 8.9.7.29
CUDA 12.4, CUDNN 8.9.7.29 - Version hosted on pypi
Option 2 - CUDA 12:
CUDA 12.1, CUDNN 8.9.7.29
CUDA 12.4, CUDNN 8.9.7.29 - Version hosted on pypi
Option 3 - CUDA 11 and CUDA 12:
CUDA 11.8, CUDNN 8.9.7.29
CUDA 12.1, CUDNN 8.9.7.29 - Version hosted on pypi, as stable
CUDA 12.4, CUDNN 8.9.7.29 - Experimental version
(Please note the CUDNN version listed here, 8.9.7.29, is not final; we may upgrade it for the 2.4 release.)
One advantage of Option 1 is that older CUDA drivers are not compatible with CUDA 12 binaries, so people with older drivers can still benefit from the latest PyTorch.
Please refer to:
https://docs.nvidia.com/deploy/cuda-compatibility/index.html#minor-version-compatibility
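To illustrate the driver-compatibility point above, here is a rough sketch of comparing the local driver against what the installed wheel's CUDA runtime expects; the minimum-driver numbers below are taken from NVIDIA's minor-version-compatibility table linked above and should be re-checked there rather than trusted from this snippet:

```python
import subprocess

import torch

# Approximate minimum Linux driver versions for minor-version compatibility,
# per the NVIDIA compatibility table linked above (verify against the docs).
MIN_DRIVER = {"11": (450, 80, 2), "12": (525, 60, 13)}

def parse(ver: str) -> tuple:
    return tuple(int(x) for x in ver.strip().split("."))

# Query the installed driver version via nvidia-smi (first GPU is enough).
driver = subprocess.check_output(
    ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
    text=True,
).splitlines()[0]

cuda_major = (torch.version.cuda or "0").split(".")[0]
needed = MIN_DRIVER.get(cuda_major)

print(f"wheel built with CUDA {torch.version.cuda}, driver {driver}")
if needed and parse(driver) < needed:
    print("driver is older than the minimum for this CUDA runtime; "
          "a cu118 wheel (Option 1/3) would be needed instead")
```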
cc @seemethere @malfet @osalpekar @ptrblck @ezyang