Added provider names #1513

ivanleomk · 2025-05-06T11:46:27Z

This PR adds in a set of provider names that makes it easy for users to be able to find the right provider format type

Important

Add KnownModelName type alias for provider models and integrate into from_provider function for type safety.

Models:
- Add KnownModelName type alias in instructor/models.py for known provider model names.
Functions:
- Update from_provider in instructor/auto_client.py to use KnownModelName for type safety.
- Refactor from_provider to handle max_tokens more cleanly.
CI/CD:
- Add INSTRUCTOR_ENV: CI to environment variables in .github/workflows/test.yml.

^{This description was created by}^{for ad201d9. You can customize this summary. It will automatically update as commits are pushed.}

ellipsis-dev

Important

Looks good to me! 👍

Reviewed everything up to d753255 in 3 minutes and 17 seconds. Click for details.

Reviewed 1791 lines of code in 22 files
Skipped 0 files when reviewing.
Skipped posting 20 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. instructor/auto_client.py:21

Draft comment:
Consider abstracting repetitive provider-specific client instantiation logic into a common helper to DRY up the code.
Reason this comment was not posted:
Confidence changes required: 66% <= threshold 85% None

2. instructor/auto_client.py:78

Draft comment:
Provide a more descriptive error message for missing MISTRAL_API_KEY, maybe include instructions for setting it.
Reason this comment was not posted:
Confidence changes required: 66% <= threshold 85% None

3. instructor/dsl/simple_type.py:60

Draft comment:
The logic for handling list types with unions is complex; consider adding more inline comments or breaking into helper functions for clarity.
Reason this comment was not posted:
Confidence changes required: 66% <= threshold 85% None

4. instructor/models.py:1

Draft comment:
Consider sorting or grouping the literals in KnownModelName for improved maintainability.
Reason this comment was not posted:
Confidence changes required: 50% <= threshold 85% None

5. tests/test_auto_client.py:45

Draft comment:
The CI-based skip logic for providers might be fragile; ensure these conditions are well-documented or reconsider the CI testing strategy.
Reason this comment was not posted:
Confidence changes required: 50% <= threshold 85% None

6. .github/workflows/test.yml:31

Draft comment:
Ensure consistent usage of the 'INSTRUCTOR_ENV: CI' variable across all workflow steps to avoid potential conflicts.
Reason this comment was n 8000 ot posted:
Confidence changes required: 80% <= threshold 85% None

7. README.md:57

Draft comment:
Great provider initialization examples; consider clarifying that async clients require using 'await' on their async methods.
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 85% The comment is purely informative, reminding the author about the use of await with async clients. It doesn't suggest a code change or improvement, nor does it point out a potential issue or ask for confirmation of intent. Therefore, it violates the rule against making purely informative comments.

8. docs/blog/posts/string-based-init.md:1

Draft comment:
Excellent comprehensive documentation. Consider adding provider‐specific notes about extra kwargs required for some providers.
Reason this comment was not posted:
Confidence changes required: 70% <= threshold 85% None

9. examples/automodel/run.py:12

Draft comment:
Consider using a logging framework instead of print statements for better production usage.
Reason this comment was not posted:
Confidence changes required: 70% <= threshold 85% None

10. instructor/auto_client.py:22

Draft comment:
There's repetitive try/except blocks for each provider. Consider refactoring these branches into a common abstraction to reduce duplication.
Reason this comment was not posted:
Confidence changes required: 80% <= threshold 85% None

11. instructor/models.py:1

Draft comment:
Add a module-level docstring explaining the purpose of the 'KnownModelName' alias, and consider using standard type alias syntax for clarity.
Reason this comment was not posted:
Confidence changes required: 60% <= threshold 85% None

12. instructor/dsl/simple_type.py:60

Draft comment:
The special handling for Python < 3.10 via string matching seems brittle. Consider a more robust type introspection or add more detailed comments on its limitations.
Reason this comment was not posted:
Confidence changes required: 75% <= threshold 85% None

13. tests/test_auto_client.py:1

Draft comment:
Tests in this file directly call external APIs, which may cause flakiness. Consider using mocks to simulate provider responses for better test stability.
Reason this comment was not posted:
Confidence changes required: 80% <= threshold 85% None

14. tests/test_auto_client.py:110

Draft comment:
The substring check for the error message in 'test_mode_parameter' is fragile. Consider verifying the error type or using a more robust error code check.
Reason this comment was not posted:
Confidence changes required: 80% <= threshold 85% None

15. pyproject.toml:38

Draft comment:
The change of 'reportMissingImports' to 'warning' is significant. Ensure this is intentional as it may mask potential missing import issues during development.
Reason this comment was not posted:
Confidence changes required: 70% <= threshold 85% None

16. docs/blog/index.md:30

Draft comment:
Typo detected: The link text is 'Fine-tuning and Distillation in AI Models', however, the URL is 'posts/distilation-part1.md'. Consider renaming it to 'posts/distillation-part1.md' to fix the typo and maintain consistency.
Reason this comment was not posted:
Comment was on unchanged code.

17. instructor/client_gemini.py:43

Draft comment:
Typographical error: In the error message on line 43, 'genai.generativemodel' should be corrected to 'genai.GenerativeModel' for consistency with the imported class name.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

18. instructor/models.py:70

Draft comment:
I noticed a potential inconsistency in the Mistral models' names. On line 70, the model is named "mistral/codestral-latest" while later we see names like "mistral/mistral-large-latest" and "mistral/mistral-saba-latest". Similarly, on line 73 there's "mistral/pixtral-large-latest" and on line 75 "mistral/ministral-3b-latest". Could you please verify if these spellings ('codestral', 'pixtral', 'ministral') are intentional or if they should consistently use 'mistral'?
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 85% The comment is asking for verification of model names, which violates the rule "Do NOT ask the PR author to confirm their intention or verify things." These appear to be actual model names from Mistral, and the variations are likely intentional product names. Without strong evidence that these are incorrect names, we should assume the author knows the correct model identifiers. The different prefixes could potentially indicate a real naming error that needs to be fixed. The inconsistency is quite noticeable. While the inconsistency exists, these are likely intentional product names from Mistral. Without being able to verify the official model names, we should trust that the author has used the correct identifiers. The comment should be deleted as it merely asks for verification without strong evidence of an actual issue, and speculates about potential problems.

19. tests/test_auto_client.py:16

Draft comment:
Typographical error: In the USER_EXTRACTION_PROMPT, the word 'strays' at line 16 seems out of place. Please verify if it's meant to be 'stays' or another term.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 85% While this is technically a typo, it doesn't impact the test functionality since the tests only verify name="ivan" and age=28. The typo is in example text that's just used for testing. The rules say not to make purely informative comments or comments about obvious things. This seems like a minor cosmetic issue that doesn't affect code quality or functionality. The typo could potentially confuse other developers reading the tests. It's in a constant that's used throughout multiple test cases. While the typo might be slightly confusing, it doesn't impact the actual test coverage or functionality. The tests are clearly focused on extracting specific fields (name and age) which work correctly regardless of this typo. Delete the comment. The typo is a minor cosmetic issue that doesn't affect functionality, and the rules specifically say not to make purely informative comments or comments about obvious things.

20. tests/test_auto_client.py:24

Draft comment:
Typographical error: The provider name 'mistral/ministral-8b-latest' on line 24 might contain a typo. Please check if 'ministral' is the intended spelling or if it should be corrected.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 80% vs. threshold = 85% The typo seems real - "ministral" is likely meant to be "mistral" since Mistral AI is a known LLM provider. However, this is test code and the provider string could be intentionally wrong to test error handling. Looking at the test cases, there are already tests for invalid providers, but this one is in the main PROVIDERS list used for actual testing. I could be wrong about this being a typo - maybe "ministral" is a real model variant. Without being able to check Mistral AI's documentation, I can't be 100% certain. Even with some uncertainty, this appears to be a clear typo since "Mistral" is the company name and all their models use that spelling. The comment is specific and actionable. Keep the comment as it identifies what appears to be a genuine typo in a provider name that would affect the tests.

Workflow ID: wflow_DAH9SQjpYlYjkkBE

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

ivanleomk · 2025-05-06T11:51:58Z

All tests for auto model pass locally

tests/test_auto_client.py ..........................                                                                                                                                   [100%]

With this, we get auto completion of the model names

But if the user decides to pass in a normal string, we don't throw an error either which is really nice

ellipsis-dev

Important

Looks good to me! 👍

Reviewed ad201d9 in 51 seconds. Click for details.

Reviewed 19 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 6 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. instructor/auto_client.py:58

Draft comment:
Use of Union[str, KnownModelName] is in line with our style guidelines. Ensure consistency with similar typing elsewhere.
Reason this comment was not posted:
Confidence changes required: 0% <= threshold 85% None

2. instructor/auto_client.py:60

Draft comment:
Changed type annotation for 'mode' to use Union[instructor.Mode, None]. Confirm that this remains consistent with our overall type hinting style.
Reason this comment was not posted:
Confidence changes required: 0% <= threshold 85% None

3. instructor/auto_client.py:62

Draft comment:
Return type union now uses Union[Instructor, AsyncInstructor]. Verify that using the alias 'InstructorType' is not more appropriate for DRY purposes.
Reason this comment was not posted:
Confidence changes required: 0% <= threshold 85% None

4. instructor/auto_client.py:58

Draft comment:
Consider whether you want to stick with explicit Union[] for type hints, since Python 3.10+ allows using ‘|’. The use of ‘# noqa: UP007’ suggests a backward compatibility decision—ensure this is intended.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 85% None

5. instructor/auto_client.py:60

Draft comment:
The replacement of 'instructor.Mode | None' with 'Union[instructor.Mode, None]' is consistent. Double-check if the noqa comments are needed or if the project can adopt the newer syntax.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 85% None

6. instructor/auto_client.py:62

Draft comment:
Consider standardizing the return type syntax: using 'Union[Instructor, AsyncInstructor]' vs. 'Instructor | AsyncInstructor' for consistency with other parts of the codebase.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 85% None

Workflow ID: wflow_8bGTPNOtBCnQLaYU

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

ivanleomk added 2 commits May 6, 2025 19:42

feat: added and tested the indiivdual model names

6d17408

fix: adding instructor ci env var

d753255

ivanleomk changed the base branch from main to automodel May 6, 2025 11:47

github-actions bot added documentation Improvements or additions to documentation enhancement New feature or request python Pull requests that update python code size:L This PR changes 100-499 lines, ignoring generated files. labels May 6, 2025

ellipsis-dev bot reviewed May 6, 2025

View reviewed changes

jxnl approved these changes May 6, 2025

View reviewed changes

fix: changed to union type

ad201d9

ellipsis-dev bot reviewed May 6, 2025

View reviewed changes

ivanleomk merged commit 052fea5 into automodel May 6, 2025
5 of 6 checks passed

ivanleomk deleted the feat/add-provider-names branch May 6, 2025 12:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Added provider names #1513

Added provider names #1513

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Added provider names #1513

Added provider names #1513

Uh oh!

Conversation

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!