
dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching

Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache).

🔥 News

  • [2025/05/23] The code for our paper has been released.
  • [2025/05/22] Our paper has been released.

✨️ Key Highlights


  • Speedup: Achieves up to 9.1x speedup over standard dLLM pipelines, with no performance loss on most tasks.
  • Evaluation: Evaluated on LLaDA 8B and Dream 7B.
  • Latency: Approaches the inference speed of autoregressive models (ARMs) in many scenarios.

🚀 Pipeline

Here's an overview of the process behind our dLLM-Cache method:
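In brief, dLLM-Cache caches the intermediate features computed at each denoising step and reuses them across steps: prompt features are refreshed only at a long interval, while response features are refreshed more frequently and only for the tokens that changed the most. The sketch below is a minimal illustration of this two-interval, similarity-guided caching idea; it is not the repository's actual API, and every name in it (AdaptiveFeatureCache, prompt_interval, response_interval, update_ratio, compute_features, compute_values) is hypothetical.

```python
import torch


class AdaptiveFeatureCache:
    """Toy two-interval feature cache for a diffusion LLM denoising loop.

    Prompt features are refreshed rarely (long interval). Response features
    are partially refreshed at a short interval: only the response tokens
    whose value vectors drifted the most since the last refresh are
    recomputed. All names and defaults are illustrative.
    """

    def __init__(self, prompt_interval=50, response_interval=5, update_ratio=0.25):
        self.prompt_interval = prompt_interval      # full-refresh period (steps)
        self.response_interval = response_interval  # partial-refresh period (steps)
        self.update_ratio = update_ratio            # fraction of response tokens to recompute
        self.features = None                        # cached layer output, [seq_len, hidden]
        self.values = None                          # value vectors at the last refresh

    def step(self, step_idx, prompt_len, compute_features, compute_values):
        """Return features for this step, recomputing as little as possible.

        compute_values() -> [seq_len, hidden] value projections (cheap, always run)
        compute_features(idx) -> [len(idx), hidden] features for the given token indices
        """
        values = compute_values()

        if self.features is None or step_idx % self.prompt_interval == 0:
            # Full refresh: recompute features for every token (prompt + response).
            self.features = compute_features(torch.arange(values.shape[0]))
            self.values = values.clone()
        elif step_idx % self.response_interval == 0:
            # Partial refresh: recompute only the response tokens whose value
            # vectors are least similar to the cached ones.
            resp = slice(prompt_len, values.shape[0])
            sim = torch.cosine_similarity(values[resp], self.values[resp], dim=-1)
            k = max(1, int(self.update_ratio * sim.numel()))
            stale = prompt_len + sim.topk(k, largest=False).indices
            self.features[stale] = compute_features(stale)
            self.values[stale] = values[stale]

        return self.features


# Toy usage: random tensors stand in for a transformer layer's projections.
if __name__ == "__main__":
    seq_len, hidden, prompt_len = 128, 64, 32
    cache = AdaptiveFeatureCache()
    for t in range(20):
        feats = cache.step(
            t, prompt_len,
            compute_features=lambda idx: torch.randn(len(idx), hidden),
            compute_values=lambda: torch.randn(seq_len, hidden),
        )
    print(feats.shape)  # torch.Size([128, 64])
```

Reusing cached prompt features for most steps and touching only a small fraction of response tokens lets the bulk of each denoising step's transformer computation be skipped, which is where the reported speedup comes from.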

🛠️ Installation

To get started with dLLM-Cache, follow the installation instructions below.

  1. Clone the Repository:
git clone https://github.com/maomaocun/dLLM-Cache.git
cd dLLM-Cache
  2. Set Up the Environment: Create a Python environment with conda or virtualenv and install the dependencies:
bash install.sh
  3. Run the Demo (see the example after this list):
python demo_{model_name}.py
  4. Run Experiments: Launch evaluations with the provided scripts:
bash scripts/run_{model_name}_{task_name}_base.sh
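For example, assuming the demo scripts follow the same model naming as the evaluation scripts in the next section, the LLaDA demo would be launched as:
python demo_LLaDA.py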

📘 Example Usage

  1. GSM8K with LLaDA
bash scripts/run_LLaDA_gsm8k_base.sh
  2. BBH with Dream
bash scripts/run_Dream_bbh_base.sh

📮 Contact

If you have any questions, please email yangyicun187@gmail.com.

🎉 Acknowledgements

This repository was built upon LLaDA, Dream, and lm-evaluation-harness.

📌 Citation

If you find dLLM-Cache useful for your research and applications, please cite using this BibTeX:

@misc{liu2025dllm,
      title={dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching}, 
      author={Zhiyuan Liu and Yicun Yang and Yaojie Zhang and Junjie Chen and Chang Zou and Qingyan Wei and Shaobo Wang and Linfeng Zhang},
      year={2025},
      url={https://github.com/maomaocun/dLLM-cache},
}

