Train+Inference with Qwen 2.5 VL (3B) by optas · Pull Request #1396 · oumi-ai/oumi · GitHub

Train+Inference with Qwen 2.5 VL (3B) #1396


Merged: 9 commits, Feb 7, 2025

Conversation

@optas (Contributor) commented Feb 6, 2025

Description

This PR adds the minimal changes needed to train and run inference with the newer Qwen VL model (Qwen 2.5 VL) inside Oumi, using single-image and multi-turn text data.

Worth noting:

  • Training was tested on a single A100 GPU and on four A100 GPUs (the latter via DataParallel mode).
  • Training was tested with the sdpa and flash_attention_2 attention implementations. The latter yielded approximately 12% higher `train_tokens_per_second`.
  • [Important] Qwen2.5 VL is integrated in the latest transformers dev version (as of 02/05/25):
    -- Specifically, Oumi has tested this model with transformers 4.49.0.dev0.
    -- Upgrading to this version seems necessary to train or use this model without encountering KeyError: 'qwen2_5_vl', but it might break other Oumi utilities, which are currently tested only for compatibility with transformers>=4.48.0,<4.49.
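
The version caveat and the attention-implementation choice above can be sketched as a small pre-flight check. This is only an illustration under stated assumptions: the helper names (`parse_major_minor`, `classify_transformers_version`, `pick_attn_implementation`) are hypothetical and are not part of Oumi or this PR, and the version boundaries are taken directly from the notes above rather than from any Oumi code.

```python
import importlib.util
import re
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '4.49.0.dev0'."""
    match = re.match(r"^(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def classify_transformers_version(version: str) -> str:
    """Hypothetical helper: map a transformers version to the cases noted in the PR."""
    major_minor = parse_major_minor(version)
    if major_minor >= (4, 49):
        # Covers 4.49.0.dev0, the version the PR reports testing with.
        return "qwen2_5_vl available; outside Oumi's tested range"
    if major_minor == (4, 48):
        # Matches the range transformers>=4.48.0,<4.49 quoted in the PR.
        return "inside Oumi's tested range; qwen2_5_vl likely missing"
    return "older than Oumi's tested range"


def pick_attn_implementation(prefer_flash: bool = True) -> str:
    """Hypothetical helper: use flash_attention_2 only if flash_attn is importable."""
    if prefer_flash and importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    # Fall back to PyTorch scaled dot-product attention.
    return "sdpa"


if __name__ == "__main__":
    for candidate in ("4.48.2", "4.49.0.dev0"):
        print(candidate, "->", classify_transformers_version(candidate))
    print("attn_implementation:", pick_attn_implementation())
```

The string returned by `pick_attn_implementation` is meant to be passed as the `attn_implementation` keyword of a transformers `from_pretrained` call; how Oumi itself wires this setting is not shown in this PR description.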

Towards: OPE-988

Related issues

Fixes # (issue)

Before submitting

  • This PR only changes documentation. (You can ignore the following checks in that case)
  • Did you read the contributor guidelines (Pull Request guidelines)?
  • Did you link the issue(s) related to this PR in the section above?
  • Did you add / update tests where needed?

Reviewers

At least one review from a member of oumi-ai/oumi-staff is required.

@optas requested review from nikg4 and oelachqar February 6, 2025 20:37
@nikg4 requested a review from wizeng23 February 6, 2025 21:02
@optas changed the title from "Serving Qwen 2.5 VL (3B)" to "Train+Inference with Qwen 2.5 VL (3B)" Feb 6, 2025
@oumi-ai deleted a comment from nikg4 Feb 7, 2025
@optas merged commit 407d02e into main Feb 7, 2025
2 checks passed
@optas deleted the optas/qwen_2.5 branch February 7, 2025 03:03