I think it should be self_attn_backend, and it should not be a model setting but rather a training setting layered on top of an inference setting.
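For illustration, a minimal sketch of that layering, assuming pydantic-style config classes; the class names (InferenceConfig, TrainingConfig) and the default value are hypothetical, not eole's actual config hierarchy:

```python
# Hypothetical sketch only: class names and default are illustrative,
# not eole's actual config classes.
from typing import Literal
from pydantic import BaseModel

class InferenceConfig(BaseModel):
    # Runtime choice of attention backend; not part of the saved model config.
    self_attn_backend: Literal["flash2", "pytorch"] = "flash2"

class TrainingConfig(InferenceConfig):
    # Training inherits the inference setting, i.e. it sits "on top of"
    # the inference setting rather than living in the model config.
    pass
```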
Values should be: "flash2", "pytorch".
"pytorch" would include the sdpa_kernels used here: https://github.com/eole-nlp/eole/blob/main/eole/modules/multi_headed_attn.py#L637
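Roughly, the "pytorch" path could look like this, a sketch assuming it wraps F.scaled_dot_product_attention in the same kind of kernel-selection context used in multi_headed_attn.py; the exact enable_* flags here are illustrative:

```python
# Sketch of a "pytorch" backend path; flag choices are illustrative.
import torch
import torch.nn.functional as F

def pytorch_attention(query, key, value, attn_mask=None, dropout_p=0.0):
    # Let PyTorch pick among its fused SDPA kernels (mem-efficient / math);
    # flash is handled by the separate "flash2" backend, so it is disabled here.
    with torch.backends.cuda.sdp_kernel(
        enable_flash=False, enable_math=True, enable_mem_efficient=True
    ):
        return F.scaled_dot_product_attention(
            query, key, value, attn_mask=attn_mask, dropout_p=dropout_p
        )
```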
We could test whether flash2 is installed at training/inference start and adjust the backend if necessary.
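Something along these lines could work for the startup check; resolve_self_attn_backend and the config object are hypothetical names, and the check assumes the flash-attn import name flash_attn:

```python
# Hypothetical startup check: downgrade "flash2" to "pytorch" when
# flash-attn is not importable.
import importlib.util
import logging

logger = logging.getLogger(__name__)

def resolve_self_attn_backend(config):
    if config.self_attn_backend == "flash2":
        if importlib.util.find_spec("flash_attn") is None:
            logger.warning(
                "flash-attn not found; falling back to self_attn_backend='pytorch'"
            )
            config.self_attn_backend = "pytorch"
    return config
```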
That way we could remove the flash2 setting from the MHA module, since it would be redundant with self_attn_backend.