New InferenceClient endpoint type #1813
Conversation
}

export const endpointHfInferenceParametersSchema = z.object({
	type: z.literal("hfinference"),
I would name it either "inference-providers" (the name of the product) or "InferenceClient" (the name of the library class).
Otherwise it's confusing with provider="hf-inference".
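Under the suggested rename, the schema would discriminate on the new literal. A minimal sketch with a hand-rolled type guard standing in for the zod schema (the interface and function names here are hypothetical, not from the PR):

```typescript
// Hypothetical shape after the rename (the PR currently uses "hfinference")
interface EndpointInferenceProvidersParameters {
	type: "inference-providers";
}

// Stand-in for endpointHfInferenceParametersSchema.parse(input):
// accepts only the new discriminant, rejects everything else
function parseInferenceProvidersParameters(
	input: unknown
): EndpointInferenceProvidersParameters {
	if (
		typeof input !== "object" ||
		input === null ||
		(input as { type?: unknown }).type !== "inference-providers"
	) {
		throw new Error('expected { type: "inference-providers" }');
	}
	return input as EndpointInferenceProvidersParameters;
}
```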
endpointHfInferenceParametersSchema.parse(input);

const client = baseURL
	? new InferenceClient(config.HF_TOKEN).endpoint(baseURL)
Suggested change:
-	? new InferenceClient(config.HF_TOKEN).endpoint(baseURL)
+	? new InferenceClient(config.HF_TOKEN, { endpointUrl: baseURL })
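Both forms bind the client to a fixed base URL; the suggestion just moves that binding into the constructor options. A self-contained sketch with a stand-in class (the real InferenceClient comes from @huggingface/inference, so the class below is a simplified mock, not the library's implementation):

```typescript
// Mock of InferenceClient, reduced to the parts the suggestion touches
class MockInferenceClient {
	constructor(
		private token: string,
		private options: { endpointUrl?: string } = {}
	) {}

	// Chainable form: returns a client bound to the given URL
	endpoint(url: string): MockInferenceClient {
		return new MockInferenceClient(this.token, { ...this.options, endpointUrl: url });
	}

	get baseUrl(): string | undefined {
		return this.options.endpointUrl;
	}
}

const baseURL = "http://localhost:8080";
// Current form in the PR: chainable .endpoint()
const a = new MockInferenceClient("hf_token").endpoint(baseURL);
// Reviewer-suggested form: constructor option
const b = new MockInferenceClient("hf_token", { endpointUrl: baseURL });
```

Either way, every subsequent request goes to `baseURL` instead of the routed provider endpoint.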
src/lib/server/models.ts (outdated)

@@ -380,6 +306,8 @@ const addEndpoint = (m: Awaited<ReturnType<typeof processModel>>) => ({
		return endpoints.tgi(args);
	case "local":
		return endpoints.local(args);
	case "hfinference":
Suggested change:
-	case "hfinference":
+	case "inference-providers":
This will become the default endpoint type. It already lets us enable tool calling on a lot of models where we didn't have it before.
Making sure everything works for all the prod models and switching to it in this PR.
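With the suggested rename applied, the dispatch in models.ts would gain the new case alongside the existing ones. A sketch with simplified stand-ins for the real endpoint factories (the factory bodies and the args shape here are placeholders, not the actual chat-ui code):

```typescript
type EndpointType = "tgi" | "local" | "inference-providers";

// Stand-ins for the real factories in src/lib/server/endpoints
const endpoints = {
	tgi: (args: { model: string }) => `tgi:${args.model}`,
	local: (args: { model: string }) => `local:${args.model}`,
	"inference-providers": (args: { model: string }) => `inference-providers:${args.model}`,
};

// Mirrors the switch in addEndpoint, extended with the new case
function addEndpoint(type: EndpointType, args: { model: string }): string {
	switch (type) {
		case "tgi":
			return endpoints.tgi(args);
		case "local":
			return endpoints.local(args);
		case "inference-providers":
			return endpoints["inference-providers"](args);
	}
}
```

Keeping the union type exhaustive lets the compiler flag any endpoint type that is added to the schema but not handled in the switch.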