[docs] fix seq_parallel by Jintao-Huang · Pull Request #3838 · modelscope/ms-swift · GitHub

[docs] fix seq_parallel #3838

Merged: 1 commit merged on Apr 11, 2025
2 changes: 0 additions & 2 deletions docs/source/GetStarted/SWIFT安装.md
@@ -8,8 +8,6 @@
pip install 'ms-swift'
# For evaluation
pip install 'ms-swift[eval]' -U
# For sequence parallelism
pip install 'ms-swift[seq_parallel]' -U
# Full capabilities
pip install 'ms-swift[all]' -U
```
2 changes: 1 addition & 1 deletion docs/source/Instruction/预训练与微调.md
@@ -65,8 +65,8 @@ ms-swift uses a layered design; users can work with the command-line interface,
- Any-to-Any model training: Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/all_to_all).
- Other capabilities:
- Streaming data reading: Reduces memory usage when handling large datasets. Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/streaming/train.sh).
- Sequence parallelism: Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/sequence_parallel).
- Packing: Combines multiple sequences into one so that each training sample is as close to max_length as possible, improving GPU utilization. Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/packing/train.sh).
- Long-text training: Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/long_text).
- Lazy tokenize: Tokenizes data during training rather than before training (for multi-modal models, this avoids loading all multi-modal resources up front), which avoids preprocessing waits and saves memory. Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/lazy_tokenize/train.sh).

Tips:
2 changes: 0 additions & 2 deletions docs/source_en/GetStarted/SWIFT-installation.md
@@ -8,8 +8,6 @@ You can install it using pip:
pip install 'ms-swift'
# For evaluation usage
pip install 'ms-swift[eval]' -U
# For sequence parallel usage
pip install 'ms-swift[seq_parallel]' -U
# Full capabilities
pip install 'ms-swift[all]' -U
```
2 changes: 1 addition & 1 deletion docs/source_en/Instruction/Pre-training-and-Fine-tuning.md
@@ -68,8 +68,8 @@ Additionally, we offer a series of scripts to help you understand the training c
- Any-to-Any Model Training: Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/all_to_all).
- Other Capabilities:
- Streaming Data Reading: Reduces memory usage when handling large datasets. Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/streaming/train.sh).
- Sequence Parallelism: Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/sequence_parallel).
- Packing: Combines multiple sequences into one, making each training sample as close to max_length as possible to improve GPU utilization. Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/packing/train.sh).
- Long Text Training: Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/long_text).
- Lazy Tokenize: Performs tokenization during training rather than before it (for multi-modal models, this avoids the need to load all multi-modal resources before training), which can reduce preprocessing wait times and save memory. Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/lazy_tokenize/train.sh).
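
For context, the sequence-parallelism entry above now points to the examples directory rather than to a separate `seq_parallel` install extra. Below is a minimal sketch of how such a run might be launched after a plain `pip install 'ms-swift' -U`; the `--sequence_parallel_size` flag and the model/dataset names are illustrative assumptions and are not part of this PR:

```bash
# Sketch only: 2-GPU LoRA fine-tuning with sequence parallelism.
# Assumes the base ms-swift install is sufficient and that
# --sequence_parallel_size is the flag used by the linked examples.
NPROC_PER_NODE=2 \
CUDA_VISIBLE_DEVICES=0,1 \
swift sft \
    --model Qwen/Qwen2.5-7B-Instruct \
    --dataset AI-ModelScope/alpaca-gpt4-data-en \
    --train_type lora \
    --sequence_parallel_size 2 \
    --max_length 8192
```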

