Does transformerEngine support 2080ti? #1680
@cyanguwa Could you take a look? The 2080 Ti is Turing (sm75).
@SeekPoint could you give more details about the workload that triggered this, e.g. datatype and tensor shapes? Also, if you could rerun with the suggested environment variables (NVTE_DEBUG=1 NVTE_DEBUG_LEVEL=2) and paste the output, that would be super helpful.
I have 4 cards, a special version of the 2080 Ti; each card has 22 GB of GPU memory. The case comes from https://swift.readthedocs.io/zh-cn/latest/Instruction/Megatron-SWIFT%E8%AE%AD%E7%BB%83.html. Since the log is huge, I'm only pasting the traceback:

[rank0]: Traceback (most recent call last):
Hi @SeekPoint, the part of the log that would help us narrow down the issue actually comes before the traceback. With the NVTE_DEBUG and NVTE_DEBUG_LEVEL environment variables set, there should be lines logging the reasons why no backend was chosen.
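For reference, the suggested flags can also be set programmatically, as long as it happens before TransformerEngine is imported. A minimal sketch; only the variable names and values come from this thread, the import placement is an assumption:

```python
import os

# Assumption: these must be set before transformer_engine is imported,
# so the library sees them when it initializes its logging.
os.environ["NVTE_DEBUG"] = "1"        # enable debug logging
os.environ["NVTE_DEBUG_LEVEL"] = "2"  # verbosity level suggested in this thread

# import transformer_engine.pytorch as te  # import only after setting the flags

print(os.environ["NVTE_DEBUG"], os.environ["NVTE_DEBUG_LEVEL"])  # 1 2
```

Running the whole training job this way should make the backend-selection log lines appear on stderr before the traceback.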
While running another project that uses TransformerEngine, it triggered this exception:
https://github.com/NVIDIA/TransformerEngine/blob/main/transformer_engine/pytorch/attention.py

```python
# raise exception if no backend is available
if sum([use_flash_attention, use_fused_attention, use_unfused_attention]) == 0:
    raise ValueError(
        "No dot product attention backend is available for the provided inputs. Please"
        " run with NVTE_DEBUG=1 NVTE_DEBUG_LEVEL=2 to find out the reasons for"
        " disabling all backends."
    )
```
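The guard above can be reproduced in isolation to see when it fires. A minimal sketch; the three booleans stand in for TransformerEngine's internal availability checks, and the preference order shown is an assumption, not the library's actual logic:

```python
def pick_attention_backend(use_flash_attention: bool,
                           use_fused_attention: bool,
                           use_unfused_attention: bool) -> str:
    """Mirror of the guard quoted above: fail only when every
    backend has been disabled for the given inputs."""
    if sum([use_flash_attention, use_fused_attention, use_unfused_attention]) == 0:
        raise ValueError(
            "No dot product attention backend is available for the provided inputs."
        )
    # Hypothetical preference order for this sketch only.
    if use_flash_attention:
        return "flash"
    if use_fused_attention:
        return "fused"
    return "unfused"

print(pick_attention_backend(False, False, True))  # unfused
```

So the error in this issue means all three checks came back False for the given hardware and inputs, which is exactly what the NVTE_DEBUG output would explain.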