Train+Inference with Qwen 2.5 VL (3B) #1396

optas · 2025-02-06T20:37:16Z

Description

This PR adds the minimal changes needed to train and run inference with the newer Qwen VL model (Qwen 2.5 VL) inside Oumi, using single-image and multi-turn text data.

Worth noting:

Training was tested on both a single A100 GPU and four A100 GPUs (the latter via DataParallel mode).
Training was tested with the sdpa and flash_attention_2 attention implementations. The latter achieved approx. 12% higher `train_tokens_per_second`.
[Important] Qwen 2.5 VL is integrated in the latest transformers dev version (as of 02/05/25):
-- Specifically, this model was tested in Oumi with transformers 4.49.0.dev0.
-- Upgrading to this version seems necessary to train or use this model without encountering `KeyError: 'qwen2_5_vl'`, but it might break other Oumi utilities, which are currently fully tested for compatibility with transformers>=4.48.0,<4.49.
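
The version constraint above can be checked before launching a job. The following is a minimal sketch, not part of this PR: the helper names (`parse_release`, `supports_qwen2_5_vl`, `in_oumi_tested_range`) are hypothetical, and it assumes that any transformers build from the 4.49 line onward (including dev builds such as 4.49.0.dev0) recognizes the `qwen2_5_vl` model type, which is inferred from the notes above rather than verified against every release.

```python
import re
from importlib import metadata


def parse_release(version: str) -> tuple[int, ...]:
    """Return the leading numeric release segment of a version string.

    Suffixes such as ".dev0" are dropped, so "4.49.0.dev0" becomes (4, 49, 0).
    """
    match = re.match(r"\d+(\.\d+)*", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def supports_qwen2_5_vl(version: str) -> bool:
    # Assumption: support starts with the 4.49 line, dev builds included.
    return parse_release(version)[:2] >= (4, 49)


def in_oumi_tested_range(version: str) -> bool:
    # Mirrors the range quoted above: transformers>=4.48.0,<4.49.
    return (4, 48) <= parse_release(version)[:2] < (4, 49)


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed")
    else:
        print(f"transformers {installed}")
        print(f"  Qwen 2.5 VL expected to load: {supports_qwen2_5_vl(installed)}")
        print(f"  within Oumi's tested range:    {in_oumi_tested_range(installed)}")
```

With these assumptions the two checks are mutually exclusive, which reflects the trade-off described above: a version that loads the model falls outside the range Oumi is currently tested against.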

Towards: OPE-988

Related issues

Fixes # (issue)

Before submitting

This PR only changes documentation. (You can ignore the following checks in that case)
Did you read the contributor guideline Pull Request guidelines?
Did you link the issue(s) related to this PR in the section above?
Did you add / update tests where needed?

Reviewers

At least one review from a member of oumi-ai/oumi-staff is required.

configs/recipes/vision/qwen2_5_vl_3b/sft/gcp_job.yaml

scripts/benchmarks/minimal_multimodal_training.py

src/oumi/core/configs/internal/supported_models.py

src/oumi/datasets/chat_templates/qwen2.5-vl-instruct.jinja

configs/recipes/vision/qwen2_5_vl_3b/sft/train.yaml

tests/unit/datasets/test_chat_templates.py

… optas/qwen_2.5

optas added 3 commits February 5, 2025 22:52

init changes

3b3c58a

cleanup launch job

a4c38de

Merge remote-tracking branch 'origin/main' into optas/qwen_2.5

7130ff0

optas requested review from nikg4 and oelachqar February 6, 2025 20:37