Release v3.3.0 · modelscope/ms-swift · GitHub

v3.3.0

@Jintao-Huang released this 11 Apr 06:36
· 277 commits to main since this release

English Version

New Features

  1. Added support for the DAPO algorithm. Training documentation: https://swift.readthedocs.io/en/latest/Instruction/GRPO.html#dapo
  2. Added support for sequence packing with multimodal models, including qwen2-vl, qwen2.5-vl, qwen2.5-omni, and the internvl2.5 series, giving a 100% increase in training speed. Training scripts: https://github.com/modelscope/ms-swift/tree/main/examples/train/packing
  3. Added SWIFT and Megatron-SWIFT images. Details: https://swift.readthedocs.io/en/latest/GetStarted/SWIFT-installation.html#mirror
  4. Enhanced quantization capabilities for multimodal, Omni, and MoE models. Quantization scripts: https://github.com/modelscope/ms-swift/blob/main/examples/export/quantize
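
To illustrate why sequence packing (feature 2) can speed up training, here is a minimal sketch of the general idea: several short samples are concatenated into one row of at most `max_length` tokens, so fewer rows are spent on padding. This is not the ms-swift implementation. The function `pack_sequences`, the first-fit-decreasing strategy, and the sample lengths are all assumptions chosen for illustration. Real packing also has to keep attention from crossing sample boundaries, which this sketch does not cover.

```python
def pack_sequences(lengths, max_length):
    """Group sample indices into rows whose total token count fits max_length.

    Hypothetical helper: uses greedy first-fit-decreasing bin packing,
    which is one common heuristic and not necessarily what ms-swift uses.
    """
    rows = []       # rows[r] holds the sample indices packed into row r
    free = []       # free[r] is the remaining token capacity of row r
    # Visit the longest samples first so they claim rows before short ones.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    for i in order:
        n = lengths[i]
        if n > max_length:
            raise ValueError(f"sample {i} has {n} tokens, more than max_length")
        for r, capacity in enumerate(free):
            if n <= capacity:
                rows[r].append(i)
                free[r] -= n
                break
        else:
            # No existing row has room, so open a new one.
            rows.append([i])
            free.append(max_length - n)
    return rows


def padding_fraction(lengths, num_rows, max_length):
    """Share of the (num_rows x max_length) token grid that is padding."""
    return 1 - sum(lengths) / (num_rows * max_length)


# Made-up token counts for eight training samples.
lengths = [900, 300, 120, 700, 80, 500, 450, 30]
packed = pack_sequences(lengths, max_length=1024)

print("rows without packing:", len(lengths))
print("rows with packing:   ", len(packed))
print("padding without packing: %.3f" % padding_fraction(lengths, len(lengths), 1024))
print("padding with packing:    %.3f" % padding_fraction(lengths, len(packed), 1024))
```

With these made-up lengths, the eight samples fit into four rows, so the same tokens are processed in half as many padded rows. The actual speedup depends on the length distribution of the dataset.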

New Models

  1. Qwen/Qwen2.5-Omni-7B
  2. LLM-Research/Llama-4-Scout-17B-16E-Instruct series
  3. cognitivecomputations/DeepSeek-V3-0324-AWQ

Full Changelog: v3.2.2...v3.3.0
