- Clone the repository and its submodules:

  ```bash
  git clone --recursive https://github.com/HanGuo97/log-linear-attention.git
  cd log-linear-attention
  ```

- Install the package and its dependencies:

  ```bash
  pip install -e .
  pip install -e flame/
  pip install -r flame/3rdparty/torchtitan/requirements.txt
  ```
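To confirm that the editable installs resolved, you can try importing the packages. This is a minimal sanity check; the module names `hattention` and `flame` are assumptions based on the repository layout, so adjust them if your checkout differs.

```bash
# Sanity check: verify the editable installs are importable
# (module names "hattention" and "flame" are assumptions based on the repo layout)
python -c "import hattention; print('hattention OK')"
python -c "import flame; print('flame OK')"
```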
We provide a Dockerfile for containerized setup. To use it:

```bash
# Build the Docker image
DOCKER_BUILDKIT=1 docker build \
    -t log-linear-attention \
    -f Dockerfile \
    .

# Run the container
docker run -ti \
    --gpus all \
    log-linear-attention \
    bash
```
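If your datasets or checkpoints live on the host, you will likely want to mount them into the container. A minimal sketch, assuming placeholder host paths and a `/workspace` mount point:

```bash
# Example: mount a host data directory and the repository checkout into the container
# (host paths and the /workspace mount point are placeholders; adjust to your setup)
docker run -ti \
    --gpus all \
    -v /path/to/data:/data \
    -v "$(pwd)":/workspace \
    -w /workspace \
    log-linear-attention \
    bash
```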
- Configure the data preprocessing:
  - Open `hattention/preprocess_data.py`
  - Modify the save path to your desired location
- Run the preprocessing script:

  ```bash
  python -m hattention.preprocess_data
  ```

> **Note**
> The data preprocessing step may take a while.
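Since preprocessing can take a while, one option is to run it in the background and follow the log; a small sketch (the log file name is an arbitrary placeholder):

```bash
# Run preprocessing in the background and follow its log
# (preprocess.log is an arbitrary placeholder path)
nohup python -m hattention.preprocess_data > preprocess.log 2>&1 &
tail -f preprocess.log
```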
- Navigate to the training framework:

  ```bash
  cd flame/
  ```

- Launch training with the following command (an example invocation is shown below):

  ```bash
  bash ../scripts/train_flame.sh --name [NAME] --config [CONFIG] --seed [--ac]
  ```

  - `NAME`: Name for the experiment and save path
  - `CONFIG`: Name of the config file in `configs/flame/` (without the `.json` extension)
  - `--seed`: Create a seed checkpoint before training
  - `--ac`: Optional flag to enable activation checkpointing
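For instance, a typical invocation might look like the following; the experiment name and config name are placeholders, so substitute a config that actually exists under `configs/flame/`:

```bash
# Example invocation (names are placeholders):
#   "my-experiment" is an arbitrary experiment name
#   "my_config" is assumed to correspond to configs/flame/my_config.json
bash ../scripts/train_flame.sh --name my-experiment --config my_config --seed --ac
```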
> **Note**
> - Before training, modify the absolute file paths in `scripts/train_flame.sh` to match your setup.
> - The first training step will compile Triton kernels, which may take tens of minutes.
Special thanks to Tianyuan Zhang, Jyo Pari, Adam Zweiger, and Yu Zhang for lots of help and discussions.