Release v3.4.1 · modelscope/ms-swift · GitHub

v3.4.1

@Jintao-Huang Jintao-Huang released this 13 May 06:33
· 175 commits to main since this release

English Version

New Features

  1. Sequence Parallelism: Supports the use of Ulysses sequence parallelism during PT/SFT/DPO stages. Compatible with training techniques such as DeepSpeed, packing, flash_attn, and streaming. Refer to the training script here.
  2. GRPO: Supports custom reward model logic. Includes a built-in example of a generative reward model. Refer to the training script here.
  3. Megatron-SWIFT: Updated megatron-core to version 0.12.0. Added the max_epochs parameter, which stops training and saves the weights once the epoch count reaches max_epochs. Added wandb parameters for recording training logs.
  4. Best Practices: Added best practices for quickly training vision-language models from scratch. Refer to the guide here.
  5. External Contributions: GRPO can now use judge0 to execute generated code. The parameters to freeze or activate can be specified with regular expressions. An initialization strategy can be specified for parameters that are left uninitialized when the model is initialized. Thanks to the technical team at China Merchants Bank for these contributions.
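
To illustrate the regular-expression freeze/activate idea from item 5, here is a minimal, library-independent sketch. It is not ms-swift's implementation: the function name `select_trainable`, its arguments, the sample parameter names, and the rule that an activate match overrides a freeze match are all assumptions made for this example.

```python
import re


def select_trainable(param_names, freeze_regex=None, activate_regex=None):
    """Map each parameter name to True (trainable) or False (frozen).

    Hypothetical helper: parameters are trainable by default, a freeze
    match turns training off, and an activate match turns it back on
    (this precedence is an assumption, not documented behavior).
    """
    freeze = re.compile(freeze_regex) if freeze_regex else None
    activate = re.compile(activate_regex) if activate_regex else None

    trainable = {}
    for name in param_names:
        flag = True
        if freeze is not None and freeze.search(name):
            flag = False
        if activate is not None and activate.search(name):
            flag = True
        trainable[name] = flag
    return trainable


# Illustrative parameter names, loosely modeled on a vision-language model.
names = [
    "visual.blocks.0.attn.qkv.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.1.mlp.up_proj.weight",
    "lm_head.weight",
]

# Freeze the vision tower and all decoder layers, then re-activate MLP weights.
flags = select_trainable(
    names,
    freeze_regex=r"^(visual|model\.layers)\.",
    activate_regex=r"\.mlp\.",
)

for name, flag in flags.items():
    print(f"{name}: {'trainable' if flag else 'frozen'}")
```

With a real PyTorch model, the same flags could be applied by iterating over `model.named_parameters()` and assigning `param.requires_grad`; the actual argument names that ms-swift exposes for this feature should be taken from its documentation.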

New Models

  1. XiaomiMiMo/MiMo-7B-RL Series
  2. deepseek-ai/DeepSeek-Prover-V2-7B Series
  3. OpenGVLab/InternVL3-1B-Pretrained Series

Full Changelog: v3.4.0...v3.4.1
