We introduce LLaDA 1.5, a competitive large diffusion language model trained with variance-reduced preference optimization (VRPO).
Compared with LLaDA-8B-Instruct, LLaDA 1.5 achieves better performance across a wide range of tasks, including math, code, and alignment benchmarks.
The LLaDA 1.5 model is available on Hugging Face. Please use the `transformers` library to load it:
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('GSAI-ML/LLaDA-1.5', trust_remote_code=True)
model = AutoModel.from_pretrained('GSAI-ML/LLaDA-1.5', trust_remote_code=True, torch_dtype=torch.bfloat16)
The model is based on LLaDA-8B-Instruct, so you can reuse the LLaDA-8B-Instruct inference code directly.
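As a diffusion language model, LLaDA generates text by starting from a fully masked sequence and iteratively filling in positions over several denoising steps, rather than decoding left to right. The toy sketch below illustrates that remasking loop in plain Python; it is not the real LLaDA sampler (use the LLaDA-8B-Instruct generation code for actual inference), and `toy_model`, its vocabulary, and the confidence scores are all invented stand-ins.

```python
import random

random.seed(0)
MASK = "<mask>"

def toy_model(seq):
    # Dummy stand-in for the diffusion LM: returns a (token, confidence)
    # guess for every masked position. A real model would derive both
    # from its output logits.
    vocab = ["the", "cat", "sat", "on", "mat"]
    return {i: (vocab[i % len(vocab)], random.random())
            for i, tok in enumerate(seq) if tok == MASK}

def diffusion_decode(length, steps):
    # Start fully masked, then at each step commit the most confident
    # predictions and leave the rest masked for later steps
    # (an illustrative low-confidence remasking strategy).
    seq = [MASK] * length
    per_step = max(1, length // steps)
    while MASK in seq:
        preds = toy_model(seq)
        ranked = sorted(preds.items(), key=lambda kv: -kv[1][1])
        for i, (tok, _conf) in ranked[:per_step]:
            seq[i] = tok
    return seq

print(diffusion_decode(5, 5))
```

With 5 tokens and 5 steps, one position is committed per step, so the sequence is revealed gradually instead of strictly left to right; the actual model chooses tokens from learned logits rather than a fixed vocabulary.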
If you have any questions, please feel free to contact fengqizhu@ruc.edu.cn.
Please consider citing:
@article{zhu2025llada,
title={LLaDA 1.5: Variance-Reduced Preference Optimization for Large Language Diffusion Models},
author={Zhu, Fengqi and Wang, Rongzhen and Nie, Shen and Zhang, Xiaolu and Wu, Chunwei and Hu, Jun and Zhou, Jun and Chen, Jianfei and Lin, Yankai and Wen, Ji-Rong and others},
journal={arXiv preprint arXiv:2505.19223},
year={2025}
}