Description
Problem:
Currently, the corpora_ai module only has an implementation for the OpenAI backend (i.e., corpora_ai_openai). Without a second provider, it's difficult to test if the corpora_ai interface is truly provider-agnostic and not too tightly coupled with OpenAI-specific implementations.
Goal:
Integrate a second Large Language Model (LLM) provider to demonstrate and validate the abstraction layer in corpora_ai. Possible alternatives include:
- Anthropic/Claude: This can be considered if it meets the operational requirements and is accessible.
- Self-Hosted Solution: Utilizing models available from Hugging Face that can be hosted locally to ensure variety in LLM providers without dependency on external APIs.
Requirements:
- Develop a module, like corpora_ai_openai, for a new provider.
- Ensure the new provider follows the existing standard API defined by corpora_ai.
- Modify provider_loader.py to support dynamic loading of the new provider based on an environment variable.
- Create unit tests for the new provider similar to test_openai_provider_success in test_provider_loader.py.
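As a rough illustration of the loader requirement, here is a minimal sketch of environment-driven provider selection. It is not the current contents of provider_loader.py: the interface name LLMProvider, its complete() method, the stand-in provider classes, the load_provider() function, and the LLM_PROVIDER, OPENAI_API_KEY, and SECOND_PROVIDER_API_KEY variable names are all assumptions made for the example, and the real corpora_ai interface may look different.

```python
import os
from abc import ABC, abstractmethod
from typing import Dict, Mapping, Optional, Tuple, Type


class LLMProvider(ABC):
    """Hypothetical stand-in for the provider-agnostic interface in corpora_ai."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Return a text completion for the prompt."""


class FakeOpenAIProvider(LLMProvider):
    """Placeholder for the existing corpora_ai_openai backend (no network calls)."""

    def complete(self, prompt: str) -> str:
        return f"openai: {prompt}"


class FakeSecondProvider(LLMProvider):
    """Placeholder for the new backend (e.g. Claude or a self-hosted model)."""

    def complete(self, prompt: str) -> str:
        return f"second: {prompt}"


# Provider name -> (implementation class, env var holding its credential).
# All names here are assumptions for illustration.
_PROVIDERS: Dict[str, Tuple[Type[LLMProvider], str]] = {
    "openai": (FakeOpenAIProvider, "OPENAI_API_KEY"),
    "second": (FakeSecondProvider, "SECOND_PROVIDER_API_KEY"),
}


def load_provider(env: Optional[Mapping[str, str]] = None) -> LLMProvider:
    """Pick a provider from LLM_PROVIDER (default "openai") and build it."""
    env = os.environ if env is None else env
    name = env.get("LLM_PROVIDER", "openai").lower()
    if name not in _PROVIDERS:
        raise ValueError(f"Unsupported LLM provider: {name!r}")
    provider_cls, key_var = _PROVIDERS[name]
    api_key = env.get(key_var)
    if not api_key:
        raise ValueError(f"{key_var} must be set for provider {name!r}")
    return provider_cls(api_key)


def test_second_provider_success() -> None:
    """Sketch of a unit test mirroring the existing OpenAI loader test."""
    env = {"LLM_PROVIDER": "second", "SECOND_PROVIDER_API_KEY": "test-key"}
    provider = load_provider(env)
    assert isinstance(provider, FakeSecondProvider)
    assert provider.api_key == "test-key"
```

Passing the environment as a mapping keeps the test free of global state. If the real loader reads os.environ directly, pytest's monkeypatch.setenv would serve the same purpose.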
Benefits:
- Ensure the corpora_ai abstraction layer is sufficiently generic to allow plug-and-play functionality for different LLM providers.
- Identify and correct any assumptions or bias towards the OpenAI implementation in the current interface.
Further Considerations:
- Evaluate the ease of integration and documentation of potential providers.
- Consider performance implications and evaluate any specific API limitations or rate limits.
- Explore licensing or community support for the chosen provider to ensure long-term viability.
By completing this task, we will have a better-designed interface that is extensible and adaptable for multiple LLM providers, enhancing the versatility of corpora_ai. This will facilitate easier integration of new technologies as they become available, without extensive refactoring.
Avoiding sole reliance on OpenAI's API also keeps the infrastructure versatile as it scales and integrates with other platforms.