From 1100c3b47a54b707feb010d7f8babd0933a31a6c Mon Sep 17 00:00:00 2001
From: Jintao Huang
Date: Fri, 11 Apr 2025 14:06:26 +0800
Subject: [PATCH] fix docs

---
 "docs/source/GetStarted/SWIFT\345\256\211\350\243\205.md"       | 2 --
 ...\256\255\347\273\203\344\270\216\345\276\256\350\260\203.md" | 2 +-
 docs/source_en/GetStarted/SWIFT-installation.md                 | 2 --
 docs/source_en/Instruction/Pre-training-and-Fine-tuning.md      | 2 +-
 4 files changed, 2 insertions(+), 6 deletions(-)

diff --git "a/docs/source/GetStarted/SWIFT\345\256\211\350\243\205.md" "b/docs/source/GetStarted/SWIFT\345\256\211\350\243\205.md"
index 14b867854e..82229bf6a3 100644
--- "a/docs/source/GetStarted/SWIFT\345\256\211\350\243\205.md"
+++ "b/docs/source/GetStarted/SWIFT\345\256\211\350\243\205.md"
@@ -8,8 +8,6 @@
 pip install 'ms-swift'
 # 使用评测
 pip install 'ms-swift[eval]' -U
-# 使用序列并行
-pip install 'ms-swift[seq_parallel]' -U
 # 全能力
 pip install 'ms-swift[all]' -U
 ```
diff --git "a/docs/source/Instruction/\351\242\204\350\256\255\347\273\203\344\270\216\345\276\256\350\260\203.md" "b/docs/source/Instruction/\351\242\204\350\256\255\347\273\203\344\270\216\345\276\256\350\260\203.md"
index 004928f068..19fe24953e 100644
--- "a/docs/source/Instruction/\351\242\204\350\256\255\347\273\203\344\270\216\345\276\256\350\260\203.md"
+++ "b/docs/source/Instruction/\351\242\204\350\256\255\347\273\203\344\270\216\345\276\256\350\260\203.md"
@@ -65,8 +65,8 @@ ms-swift使用了分层式的设计思想,用户可以使用命令行界面、
 - Any-to-Any模型训练:参考[这里](https://github.com/modelscope/swift/blob/main/examples/train/all_to_all)。
 - 其他能力:
   - 数据流式读取: 在数据量较大时减少内存使用。参考[这里](https://github.com/modelscope/swift/blob/main/examples/train/streaming/train.sh)。
-  - 序列并行: 参考[这里](https://github.com/modelscope/swift/blob/main/examples/train/sequence_parallel)。
   - packing: 将多个序列拼成一个,让每个训练样本尽可能接近max_length,提高显卡利用率,参考[这里](https://github.com/modelscope/swift/blob/main/examples/train/packing/train.sh)。
+  - 长文本训练: 参考[这里](https://github.com/modelscope/swift/blob/main/examples/train/long_text)。
   - lazy tokenize: 在训练期间对数据进行tokenize而不是在训练前tokenize(多模态模型可以避免在训练前读入所有多模态资源),这可以避免预处理等待并节约内存。参考[这里](https://github.com/modelscope/swift/blob/main/examples/train/lazy_tokenize/train.sh)。

 小帖士:
diff --git a/docs/source_en/GetStarted/SWIFT-installation.md b/docs/source_en/GetStarted/SWIFT-installation.md
index fd69f2c0e0..4cf428e976 100644
--- a/docs/source_en/GetStarted/SWIFT-installation.md
+++ b/docs/source_en/GetStarted/SWIFT-installation.md
@@ -8,8 +8,6 @@ You can install it using pip:
 pip install 'ms-swift'
 # For evaluation usage
 pip install 'ms-swift[eval]' -U
-# For sequence parallel usage
-pip install 'ms-swift[seq_parallel]' -U
 # Full capabilities
 pip install 'ms-swift[all]' -U
 ```
diff --git a/docs/source_en/Instruction/Pre-training-and-Fine-tuning.md b/docs/source_en/Instruction/Pre-training-and-Fine-tuning.md
index 6aa8d31eb2..126d534e7f 100644
--- a/docs/source_en/Instruction/Pre-training-and-Fine-tuning.md
+++ b/docs/source_en/Instruction/Pre-training-and-Fine-tuning.md
@@ -68,8 +68,8 @@ Additionally, we offer a series of scripts to help you understand the training c
 - Any-to-Any Model Training: Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/all_to_all).
 - Other Capabilities:
   - Streaming Data Reading: Reduces memory usage when handling large datasets. Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/streaming/train.sh).
-  - Sequence Parallelism: Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/sequence_parallel).
   - Packing: Combines multiple sequences into one, making each training sample as close to max_length as possible to improve GPU utilization. Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/packing/train.sh).
+  - Long Text Training: Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/long_text).
   - Lazy Tokenize: Performs tokenization during training instead of before training (for multi-modal models, this avoids the need to load all multi-modal resources before training), which can reduce preprocessing wait times and save memory. Refer to [here](https://github.com/modelscope/swift/blob/main/examples/train/lazy_tokenize/train.sh).
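
Note (not part of the patch): the "packing" capability the changed docs describe — combining multiple sequences so each training sample approaches max_length — can be sketched as a greedy first-fit bin packer. This is an illustrative sketch only, not ms-swift's implementation; the function name `pack_sequences` and the first-fit-decreasing strategy are assumptions:

```python
def pack_sequences(seqs, max_length):
    """Greedily pack tokenized sequences into bins of at most max_length tokens.

    Packing several short samples into one training sample keeps each batch
    row close to max_length, which improves GPU utilization.
    """
    bins = []  # each bin holds sequences whose total length <= max_length
    for seq in sorted(seqs, key=len, reverse=True):  # longest first
        for b in bins:
            if sum(len(s) for s in b) + len(seq) <= max_length:
                b.append(seq)  # fits into an existing bin
                break
        else:
            bins.append([seq])  # open a new bin
    return bins


if __name__ == "__main__":
    # Token-id sequences of lengths 6, 3, 5, 2 packed into bins of size 8.
    seqs = [[1] * 6, [2] * 3, [3] * 5, [4] * 2]
    packed = pack_sequences(seqs, max_length=8)
    print([sum(len(s) for s in b) for b in packed])  # → [8, 8]
```

Here four samples of lengths 6, 3, 5, and 2 become two fully utilized rows of length 8 instead of four padded rows.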
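
Note (not part of the patch): the "lazy tokenize" bullet can likewise be illustrated generically. This is a minimal sketch of the idea — not ms-swift's dataset API — with a dummy `tokenize` stand-in: the lazy variant defers tokenization to `__getitem__`, so nothing is tokenized (and no multi-modal resource would need loading) before training starts:

```python
class EagerDataset:
    """Tokenizes everything up front: all samples must be processed
    (and any referenced images/audio loaded) before training starts."""

    def __init__(self, texts, tokenize):
        self.samples = [tokenize(t) for t in texts]

    def __getitem__(self, i):
        return self.samples[i]

    def __len__(self):
        return len(self.samples)


class LazyDataset:
    """Stores raw text and tokenizes on access, so preprocessing cost and
    memory are paid per sample during training instead of all at once."""

    def __init__(self, texts, tokenize):
        self.texts = texts
        self.tokenize = tokenize

    def __getitem__(self, i):
        return self.tokenize(self.texts[i])  # tokenized only when accessed

    def __len__(self):
        return len(self.texts)


if __name__ == "__main__":
    tokenize = lambda t: t.split()  # stand-in for a real tokenizer
    ds = LazyDataset(["a b c", "d e"], tokenize)
    print(ds[0])  # → ['a', 'b', 'c']
```

Both variants yield identical samples; they differ only in when the work happens, which is why the docs note lazy tokenize avoids the preprocessing wait and saves memory.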