Open
Description
Currently, the RAG SDK only supports hosted models. It would be great if we could enable the use of local models, similar to web-llm. The only issue is that while these models are OpenAI-compatible, they don't expose an HTTP endpoint to query. I believe we could write a simple HTTP server using Bun and start it only when the user opts into one of the local LLMs.
For example:
- Start the model.
- Hook it up to a simple server.
- Expose a predefined URL and pass it to our base LLM client, registering the local model like any other model.
The rest should be straightforward.
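A minimal sketch of the server step, assuming Bun's `Bun.serve` API. The names here (`handleChatCompletion`, `generateLocally`, the port) are hypothetical; `generateLocally` stands in for whatever call the local model (e.g. web-llm) actually exposes. The handler speaks the OpenAI chat-completions shape so the base LLM client can point at it unchanged:

```typescript
type ChatMessage = { role: string; content: string };

// Placeholder for the local model call (assumption — in the real SDK this
// would invoke web-llm or a similar in-process engine).
async function generateLocally(messages: ChatMessage[]): Promise<string> {
  const last = messages[messages.length - 1];
  return `echo: ${last.content}`;
}

// Handle one OpenAI-style chat completion request and return a response
// in the same JSON shape the hosted models produce.
export async function handleChatCompletion(req: Request): Promise<Response> {
  const body = (await req.json()) as { model: string; messages: ChatMessage[] };
  const text = await generateLocally(body.messages);
  return Response.json({
    id: "chatcmpl-local",
    object: "chat.completion",
    model: body.model,
    choices: [
      {
        index: 0,
        message: { role: "assistant", content: text },
        finish_reason: "stop",
      },
    ],
  });
}

// Under Bun, hooking this up and producing the predefined URL would look
// roughly like this (sketch — port and path are assumptions):
//
// Bun.serve({
//   port: 8080,
//   fetch: (req) =>
//     new URL(req.url).pathname === "/v1/chat/completions"
//       ? handleChatCompletion(req)
//       : new Response("not found", { status: 404 }),
// });
//
// The base LLM client would then be given http://localhost:8080/v1
// as its base URL, like any other OpenAI-compatible provider.
```

Keeping the handler as a plain `Request -> Response` function means it can be unit-tested without starting the server, and swapped between Bun and any other fetch-style runtime.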
Metadata
Assignees
Labels
No labels