SageAttention fork for build system integration
This repo makes it easy to build SageAttention for multiple Python, PyTorch, and CUDA versions and to distribute the resulting wheels to other people. Prebuilt wheels are available under releases, and the workflow shows how they are built on Windows.
If you only need to build and run on your own machine, clone this repo, install the dependencies listed in pyproject.toml (including the correct torch build, such as torch 2.7.1+cu128), then run python setup.py install (this bypasses pip's environment checks).
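Because python setup.py install skips pip's environment checks, it can help to confirm the installed torch build yourself before compiling. The following is a minimal sketch, not part of this repo: the helper names split_local_version and torch_matches are hypothetical, and the target string 2.7.1+cu128 is only the example version mentioned above. It reads the version from package metadata, so it does not import torch.

```python
from importlib import metadata


def split_local_version(version):
    """Split a version such as '2.7.1+cu128' into ('2.7.1', 'cu128')."""
    public, _, local = version.partition("+")
    return public, (local or None)


def torch_matches(expected, installed=None):
    """Return True if the installed torch version equals `expected` exactly.

    `installed` may be passed explicitly; otherwise it is read from
    package metadata without importing torch.
    """
    if installed is None:
        try:
            installed = metadata.version("torch")
        except metadata.PackageNotFoundError:
            return False  # torch is not installed in this environment
    return split_local_version(installed) == split_local_version(expected)


if __name__ == "__main__":
    # Hypothetical target; substitute the torch build you intend to compile against.
    target = "2.7.1+cu128"
    status = "matches" if torch_matches(target) else "is missing or does not match"
    print(f"Installed torch {status} {target}")
```

Run this in the same environment you plan to build in. A mismatch in the part after the plus sign (for example cu126 instead of cu128) suggests the installed torch wheel targets a different CUDA version than the one you intended.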