fix grpo lora split module #3635

hjh0119 · 2025-03-24T11:20:36Z

Module names in a PEFT model may start with a prefix such as `model.language_model`, so matching them with `startswith` might not work as expected.
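The failure mode described above can be sketched as follows. This is a minimal illustration, not the code changed in this PR: the helper names `matches_prefix` and `matches_components`, the example module name, and the target string are all hypothetical, chosen only to show why a prefix check misses a name once a wrapper prefix is added in front of it.

```python
def matches_prefix(module_name: str, target: str) -> bool:
    """Prefix check: only succeeds when the name begins with the target."""
    return module_name.startswith(target)


def matches_components(module_name: str, target: str) -> bool:
    """Match the target as a contiguous run of dot-separated components,
    wherever it appears in the module name (hypothetical alternative)."""
    parts = module_name.split(".")
    wanted = target.split(".")
    n = len(wanted)
    return any(parts[i:i + n] == wanted for i in range(len(parts) - n + 1))


# Hypothetical module name carrying a wrapper prefix.
name = "model.language_model.layers.0.self_attn.q_proj"
target = "layers.0"

print(matches_prefix(name, target))      # the prefix hides the target
print(matches_components(name, target))  # found regardless of the prefix
```

Comparing whole components rather than raw substrings keeps `layers.1` from matching `layers.10`, which a plain `in` check would allow.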

…0325

* commit '20cfa3396705a50230d5cb64850e54ba5043ee2c': (23 commits)
  compat vllm0.8.1 (modelscope#3656)
  fix grpo vllm tp (modelscope#3658)
  fix label_names (modelscope#3657)
  set grpo multi turn max tokens (modelscope#3655)
  update docs (modelscope#3653)
  fix grpo epsilon(modelscope#3652)
  Fix template torch_dtype (modelscope#3651)
  [grpo] separate the epsilon (modelscope#3599)
  fix grpo pt ddp (modelscope#3648)
  fix prm (modelscope#3647)
  grpo reset prefix cache (modelscope#3640)
  fix grpo warning (modelscope#3630)
  support qwen2_5_vl_32b (modelscope#3642)
  fix reward model (modelscope#3641)
  fix grpo lora split module (modelscope#3635)
  fix grpo cosine reward (modelscope#3638)
  Support deepseek v3 0324 (modelscope#3637)
  update docs (modelscope#3633)
  fix (modelscope#3632)
  support train_sampler_random (modelscope#3631)
  ...

# Conflicts:
#	docs/source/Instruction/GRPO.md
#	docs/source/Instruction/命令行参数.md
#	docs/source_en/Instruction/Command-line-parameters.md
#	docs/source_en/Instruction/GRPO.md

fix

4feb7d9

Jintao-Huang approved these changes Mar 24, 2025

hjh0119 merged commit 3232b69 into modelscope:main Mar 25, 2025
1 of 2 checks passed

hjh0119 deleted the grpo-split-fix branch March 25, 2025 01:58