Releases: InftyAI/llmaz
v0.1.4
What's Changed
🚀 Major Features:
- Initialize documentation site by @rudeigerc in #388
- feat: add TensorRT-LLM as backend by @cr7258 in #392
- RFC: Gateway Metric Aggregator by @kerthcet in #404
- Proposal for karpenter integration by @carlory in #439
✨ Features:
- feat: add preStop hook for llamacpp and tgi in the BackendRuntime by @cr7258 in #381
- feat: support speculative decoding for llamacpp by @cr7258 in #402
- Add global configmap by @kerthcet in #431
- Add dispatcher & memoryStore & latencyAwarePlugin by @kerthcet in #440
- feat: support runai streamer for vllm by @cr7258 in #423
🐛 Bugs:
- feat: update sglang version to v0.4.5 to fix /health_generate endpoint 404 error by @cr7258 in #383
- fix: remove trailing slashes from envoyproxy repository URLs in Chart.yaml by @OKevinoo in #407
♻️ Cleanups:
- use lws as sub chart by @carlory in #408
- Disable prometheus by default by @kerthcet in #416
- Clarify Kubernetes version requirement and fallback plan in Key Features by @SanjanShiv in #380
- Add ci test with helm chart by @kerthcet in #432
- fix: add ut for backend runtime by @X1aoZEOuO in #428
- Add inftyai-scheduler support and config updates by @carlory in #447
New Contributors
- @cr7258 made their first contribution in #370
- @kenwoodjw made their first contribution in #372
- @SanjanShiv made their first contribution in #380
- @rudeigerc made their first contribution in #388
- @OKevinoo made their first contribution in #407
- @IRONICBo made their first contribution in #409
- @X1aoZEOuO made their first contribution in #428
Full Changelog: v0.1.3...v0.1.4
v0.1.3
What's Changed
🚀 Major Features:
- Add open-webui as the default chatbot by @kerthcet in #357
- Add envoy ai gateway (helm dependencies + doc) by @pacoxu in #360
✨ Features:
- feature: install lws controller in llmaz-system namespace by @googs1025 in #354
♻️ Cleanups:
- Add default to configName by @kerthcet in #349
- feat: move lws controller replicas to global values by @Whitea029 in #358
- kubectl apply using --server-side to avoid apply failure by @pacoxu in #363
New Contributors
- @Whitea029 made their first contribution in #358
Full Changelog: v0.1.2...v0.1.3
v0.1.2
What's Changed
✨ Features:
- feat: spread container env to initContainers by @nayihz in #294
- feature(BackendRuntime): support lifecycle hook fields for BackendRuntime by @googs1025 in #303
- feat: add e2e test to verify service is available by @nayihz in #310
- feat: make elasticConfig.maxReplicas a required parameter by @nayihz in #331
- feat: metrics support for llmaz by @googs1025 in #316
♻️ Cleanups:
- build(deps): bump github.com/google/go-cmp from 0.6.0 to 0.7.0 by @dependabot in #283
- API refactor by @kerthcet in #285
- Refactor multi-host example by @kerthcet in #286
- cleanup: upgrade vllm version to v0.7.3 by @nayihz in #289
- Enable debug mode in workflow by @kerthcet in #290
- fix: fix some warnings when building images by @nayihz in #295
- build(deps): bump sigs.k8s.io/structured-merge-diff/v4 from 4.5.0 to 4.6.0 by @dependabot in #298
- build(deps): bump sigs.k8s.io/controller-runtime from 0.20.2 to 0.20.3 by @dependabot in #299
- build(deps): bump github.com/onsi/ginkgo/v2 from 2.22.2 to 2.23.0 by @dependabot in #297
- cleanup: upgrade e2e test tools by @nayihz in #296
- chore: fix BackendRuntime crds field Commands -> Command by @googs1025 in #315
- feature: add in-tree vllm BackendRuntime for preStop Hook by @googs1025 in #319
- cleanup: remove kube-rbac-proxy by @nayihz in #321
- use local GOPROXY by @pacoxu in #322
- add image repo for lws and all backend runtimes, like llama.cpp by @pacoxu in #328
- build(deps): bump github.com/onsi/gomega from 1.36.3 to 1.37.0 by @dependabot in #337
- build(deps): bump github.com/onsi/ginkgo/v2 from 2.23.3 to 2.23.4 by @dependabot in #336
- build(deps): bump sigs.k8s.io/structured-merge-diff/v4 from 4.6.0 to 4.7.0 by @dependabot in #338
- chore: refactor handleUnexpectedCondition func by @googs1025 in #344
- fix(env): ensure BackendRuntimeConfig.Envs overrides base Envs by @googs1025 in #341
New Contributors
Full Changelog: v0.1.1...v0.1.2
v0.1.1
v0.1.0
v0.0.9
What's Changed
✨ Features:
- Support ollama by @qinguoyi in #193
- Feat: add argName to specify server arguments by @kerthcet in #226
🐛 Bugs:
- fix: helm INSTALLATION FAILED by @googs1025 in #203
- Fix: model-loader will not load any files by @kerthcet in #228
♻️ Cleanups:
- Update Features of Overview by @kerthcet in #214
- Remove ElasticConfig from Service by @kerthcet in #224
Full Changelog: v0.0.8...v0.0.9
v0.0.8
What's Changed
🚀 Major Features:
✨ Features:
- feat: support applying llmaz to any namespace by @qinguoyi in #172
- feat: update model loader by @qinguoyi in #178
🐛 Bugs:
♻️ Cleanups:
- Add release checklist by @kerthcet in #159
- chore: bump LWS version to v0.4.0 by @googs1025 in #162
- Bump sigs.k8s.io/lws from 0.4.0 to 0.4.1 by @dependabot in #185
- feature(webhook): add BackendRuntimeConfig resources validation by @googs1025 in #170
- fix: load models cost seconds by @qinguoyi in #175
- Update Revision default to main by @kerthcet in #176
- Downsize model-loader image by @qinguoyi in #179
New Contributors
- @googs1025 made their first contribution in #162
- @qinguoyi made their first contribution in #168
Full Changelog: v0.0.7...v0.0.8
v0.0.7
What's Changed
🚀 Major Features:
- [1/N] Add backendRuntime CRD by @kerthcet in #138
- [2/N] Add backendRuntime implementation by @kerthcet in #139
- Add helm chart support by @kerthcet in #142
✨ Features:
🐛 Bugs:
- Fix resource limits could be smaller than requests by @kerthcet in #136
- Fix filename error by @kerthcet in #147
♻️ Cleanups:
Full Changelog: v0.0.6...v0.0.7
v0.0.6
What's Changed
🚀 Major Features:
✨ Features:
- Add model label to Playground by @kerthcet in #111
- Add new conditions to Playground by @kerthcet in #120
- Change ModelClaims API by @kerthcet in #125
🐛 Bugs:
- fix wrong field path in the openmodel webhook by @carlory in #107
- Playground should be triggered to create Services and then Pods once the model is created by @carlory in #109
- Fix watch for changes to LeaderWorkerSet created by llmaz and trigger a Reconcile for the owner by @carlory in #108
♻️ Cleanups:
New Contributors
Full Changelog: v0.0.5...v0.0.6