Summary
Tracks known issues and the current state of the integration, with the goal of moving it up the priority list with higher confidence.
Tasks
- CUDNN sdp attention causes loss explosion #139298
- Match Query's stride layout: [cuDNN][SDPA] Match `query`'s memory layout ordering for `output` in cuDNN SDPA #138354
  - gradOutput vs Output stride mismatch, blocked by cudnn bump: [cuDNN][SDPA] Match `query`'s memory layout ordering for `output` in cuDNN SDPA #138354 (comment)
  - Fill value twiddling: [cuDNN][SDPA] change fill value for `bool` to floating-point type mask conversion #140837
  - Debugging PR: [cuDNN] Add an option to force cuDNN usage (incl. SDPA) #139699
- Update cuDNN backend to respect `enable_gqa`: [CuDNN Attention] Performance Grouped Query Attention #139586
- Mitigated: SDPA: CUDNN backend error w/ q_seq_len = 1 #138529, but we should check whether it is fixed in a newer version of cuDNN
- WSL: RuntimeError: cuDNN Frontend error: [cudnn_frontend] Error: No execution plans support the graph. huggingface/diffusers#9704 | need to fix or guard correctly
- cuDNN frontend: `FBCODE` bump to frontend version 1.12
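
For readers unfamiliar with the stride-layout task (#138354), here is a rough pure-Python sketch of what "matching `query`'s memory layout ordering for `output`" could mean. It is an illustration only, not the PyTorch or cuDNN implementation: the helper name `strides_like` and the shapes are hypothetical, and it assumes a dense (non-overlapping) layout with no size-1 dimensions.

```python
def strides_like(ref_strides, out_shape):
    """Strides for a dense tensor of shape `out_shape` whose dimensions are
    ordered in memory (innermost to outermost) the same way as `ref_strides`.

    Hypothetical helper for illustration; not a PyTorch API.
    """
    # Dimension indices sorted from innermost (smallest stride) to outermost.
    order = sorted(range(len(ref_strides)), key=lambda d: ref_strides[d])
    strides = [0] * len(out_shape)
    running = 1
    for d in order:
        strides[d] = running
        running *= out_shape[d]
    return tuple(strides)


# Assumed example: query has logical shape (B, H, S, D) = (2, 8, 128, 64)
# but is stored as (B, S, H, D) in memory, as after a transpose.
q_strides = (128 * 8 * 64, 64, 8 * 64, 1)

# The output shares B, H, S with query; the value head dim is assumed to be 32.
out_shape = (2, 8, 128, 32)
out_strides = strides_like(q_strides, out_shape)
print(out_strides)  # (32768, 32, 256, 1)
```

In this sketch the output keeps the head dimension second-innermost, just like the query, instead of defaulting to a contiguous (B, H, S, D) layout. A gradOutput whose strides follow a different ordering than the forward output is the kind of mismatch the sub-item above refers to.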
cc @csarofeen @ptrblck @xwang233 @eqy @msaroufim @mikaylagawarecki