Add support for repetition_penalty in GrpoParams by REDDITARUN · Pull Request #1654 · oumi-ai/oumi · GitHub

Add support for repetition_penalty in GrpoParams #1654

Merged

REDDITARUN (Contributor) commented Apr 25, 2025

Description

This PR adds support for the repetition_penalty parameter in the GrpoParams class.

repetition_penalty is a generation parameter commonly used in language models to discourage or encourage the repetition of tokens. By default, it is set to 1.0 (no penalty). Values >1.0 reduce repetition in generated text, and values <1.0 increase it.
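To make the effect of the value concrete, here is a minimal sketch of how a repetition penalty is commonly applied to logits before sampling. It is an illustration only, not code from this PR: the function name `apply_repetition_penalty` and the plain-list representation of logits are assumptions made for the example.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.0):
    """Return a copy of `logits` with already-generated tokens rescaled.

    Hypothetical helper for illustration; not part of Oumi or TRL.
    """
    if penalty <= 0:
        raise ValueError("penalty must be strictly positive")
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # Positive scores are divided and negative scores multiplied, so a
        # penalty > 1.0 always lowers the score of a token seen before.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted


logits = [2.0, -1.0, 0.5]
# Tokens 0 and 1 were already generated; token 2 was not.
print(apply_repetition_penalty(logits, generated_ids=[0, 1], penalty=2.0))
# → [1.0, -2.0, 0.5]
```

With penalty=1.0 the logits are returned unchanged, which matches the "no penalty" default described above.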

This change allows users to tune output repetition behavior during generation through Oumi’s generation interface, bringing it in line with Hugging Face TRL GRPO.
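As a rough sketch of what exposing such a parameter can look like, the dataclass below carries a repetition_penalty field and hands it to a trainer as a keyword argument. The class name `GrpoParamsSketch`, the validation rule, and the `to_trainer_kwargs` method are all hypothetical and are not taken from the PR diff; Oumi's actual GrpoParams may be structured differently.

```python
from dataclasses import dataclass


@dataclass
class GrpoParamsSketch:
    """Hypothetical stand-in for a GRPO parameter group (not Oumi's class)."""

    # 1.0 means no penalty; >1.0 discourages repeats, <1.0 encourages them.
    repetition_penalty: float = 1.0

    def __post_init__(self):
        # Assumed validation: a non-positive penalty has no sensible meaning.
        if self.repetition_penalty <= 0:
            raise ValueError("repetition_penalty must be strictly positive")

    def to_trainer_kwargs(self):
        """Collect the fields that would be forwarded to the trainer config."""
        return {"repetition_penalty": self.repetition_penalty}


params = GrpoParamsSketch(repetition_penalty=1.2)
print(params.to_trainer_kwargs())
# → {'repetition_penalty': 1.2}
```

Keeping the default at 1.0 means existing configurations behave as before unless the user sets the field explicitly.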

Tested that TRL GRPO training works, and also tested that regular training isn't affected.

Related issues

Fixes #1655

Before submitting

  • This PR only changes documentation. (You can ignore the following checks in that case)
  • Did you read the contributor guideline Pull Request guidelines?
  • Did you link the issue(s) related to this PR in the section above?
  • Did you add / update tests where needed?

Reviewers

At least one review from a member of oumi-ai/oumi-staff is required.

nikg4 requested review from nikg4, jgreer013 and wizeng23 on April 25, 2025 19:53
nikg4 (Collaborator) commented Apr 25, 2025

@REDDITARUN Thanks for sending this PR! Also, could you please open a feature request for this task and include relevant context on why it's needed?

REDDITARUN (Contributor, Author) commented

Thanks @nikg4! Just submitted the feature request here: #1655
Let me know if anything else is needed! 🙌

wizeng23 (Contributor) left a comment

Thanks for adding this! Please address the one comment, and then it can be merged.

Co-authored-by: William Zeng <10782997+wizeng23@users.noreply.github.com>
REDDITARUN (Contributor, Author) commented

Thanks! Reverted the newline. Should be good now 🙌

wizeng23 (Contributor) commented

Please make sure that pre-commit run --all-files --show-diff-on-failure passes when run locally. There's a linter error about a comment being too long; could you please resolve it?

REDDITARUN (Contributor, Author) commented

Got it. I'll fix the linter error. Thanks for the heads up!

wizeng23 merged commit 0665cee into oumi-ai:main on May 6, 2025
1 of 2 checks passed
Development

Successfully merging this pull request may close these issues.

[Feature] Add support for repetition_penalty in TRL GRPO
3 participants