[Evaluation] Migrate LM Harness integration point from `simple_evaluate` to `evaluate` by kaisopos · Pull Request #1455 · oumi-ai/oumi · GitHub

[Evaluation] Migrate LM Harness integration point from simple_evaluate to evaluate #1455


Merged: 21 commits, Feb 25, 2025

Conversation

@kaisopos (Contributor) commented Feb 20, 2025

Description

  1. Migrating the LM Harness integration point from simple_evaluate to evaluate, to support LM Harness custom tasks in a follow-up PR.
  2. Adding unit tests for LM Harness and AlpacaEval backends.
  3. Adding integration tests for LM Harness.
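
As a rough illustration of item 1: in LM Harness, `simple_evaluate` takes task names and model settings and does the setup itself, whereas `evaluate` expects an already-built model and task dictionary, which is what leaves room for caller-supplied tasks. The sketch below uses toy stand-ins only. `toy_simple_evaluate`, `toy_evaluate`, `TASK_REGISTRY`, and the model and task shapes are hypothetical names made up for this example; they are not the LM Harness or Oumi API, and this is not the code in this PR.

```python
from typing import Callable, Dict, List, Tuple

# Toy shapes (assumptions): a task is a list of (prompt, expected) pairs,
# and a model is any callable mapping a prompt to an answer.
Task = List[Tuple[str, str]]
Model = Callable[[str], str]

# Hypothetical registry of built-in tasks, looked up by name.
TASK_REGISTRY: Dict[str, Task] = {
    "echo": [("a", "a"), ("b", "b")],
}


def toy_evaluate(model: Model, task_dict: Dict[str, Task]) -> Dict[str, float]:
    """Lower-level entry point: the caller passes ready-made tasks."""
    results = {}
    for name, examples in task_dict.items():
        correct = sum(1 for prompt, expected in examples if model(prompt) == expected)
        results[name] = correct / len(examples)
    return results


def toy_simple_evaluate(model: Model, task_names: List[str]) -> Dict[str, float]:
    """Convenience entry point: resolves registered names, then delegates."""
    task_dict = {name: TASK_REGISTRY[name] for name in task_names}
    return toy_evaluate(model, task_dict)


def identity_model(prompt: str) -> str:
    return prompt


# Before: only tasks known to the registry can be requested, by name.
before = toy_simple_evaluate(identity_model, ["echo"])

# After: the caller assembles the task dict, so a custom task can sit
# alongside a registered one.
custom_task: Task = [("x", "x"), ("y", "z")]
after = toy_evaluate(
    identity_model,
    {"echo": TASK_REGISTRY["echo"], "my_custom_task": custom_task},
)
```

In this toy setup the registered task scores the same through either entry point, and only the lower-level call can accept `my_custom_task`.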

Related issues

Towards OPE-1029, OPE-1097

Before submitting

  • This PR only changes documentation. (You can ignore the following checks in that case)
  • Did you read the contributor guideline Pull Request guidelines?
  • Did you link the issue(s) related to this PR in the section above?
  • Did you add / update tests where needed?

Reviewers

At least one review from a member of oumi-ai/oumi-staff is required.

@oelachqar (Contributor) left a comment

@kaisopos if we don't have them already, this is a good opportunity to add unit tests!

@wizeng23 (Contributor) left a comment

Could you add in your PR description the steps you took to test this PR? Would be good to verify that the results didn't change as a result of this migration, and that the output platform_task_config looks correct.
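
One way to do the kind of check requested here is to dump the per-task metrics from a run before the migration and a run after it, then diff them with a small tolerance. The sketch below is a minimal version of that idea. `diff_metrics` is a hypothetical helper, the nested `{task: {metric: value}}` layout is an assumption rather than the actual LM Harness or Oumi output format, and the numbers are placeholders, not results from this PR.

```python
import math
from typing import Dict, List

Metrics = Dict[str, Dict[str, float]]  # assumed layout: {task: {metric: value}}


def diff_metrics(before: Metrics, after: Metrics, rel_tol: float = 1e-6) -> List[str]:
    """Return human-readable differences between two runs (empty if none)."""
    problems: List[str] = []
    for task in sorted(set(before) | set(after)):
        if task not in before or task not in after:
            problems.append(f"{task}: present in only one run")
            continue
        for metric in sorted(set(before[task]) | set(after[task])):
            old = before[task].get(metric)
            new = after[task].get(metric)
            if old is None or new is None:
                problems.append(f"{task}/{metric}: present in only one run")
            elif not math.isclose(old, new, rel_tol=rel_tol, abs_tol=1e-12):
                problems.append(f"{task}/{metric}: {old} -> {new}")
    return problems


# Placeholder values for illustration only.
before_run = {"task_a": {"acc": 0.5}}
after_same = {"task_a": {"acc": 0.5}}
after_changed = {"task_a": {"acc": 0.75}}

unchanged_report = diff_metrics(before_run, after_same)
changed_report = diff_metrics(before_run, after_changed)
```

An empty report suggests the migration left the metrics unchanged. It is only meaningful if both runs used the same model, task list, seed, and sample limit.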

@taenin (Collaborator) left a comment

+1 to adding unit tests before you submit here.

@kaisopos (Contributor, Author) commented

> +1 to adding unit tests before you submit here.

Yep, will add unit tests before checkin, as discussed.
Unsure about integration tests though (this PR or a follow-up)

@taenin (Collaborator) commented Feb 20, 2025

> > +1 to adding unit tests before you submit here.
>
> Yep, will add unit tests before checkin, as discussed. Unsure about integration tests though (this PR or a follow-up)

Up to you! Personally I like adding integration tests separately, but adding them here is fine as well.

@kaisopos kaisopos merged commit 3aa4f75 into main Feb 25, 2025
2 checks passed
@kaisopos kaisopos deleted the kostas/evaluation_lm_harness_migration branch February 25, 2025 00:44
5 participants