Insights: kserve/kserve
Overview
27 Pull requests merged by 14 people

- Stop and resume a model [Raw Deployment] (#4455, merged Jun 11, 2025)
- [Bug] Fixes error in trace logging (#4514, merged Jun 11, 2025)
- Fix: do not update poetry dependency when install hf cpu deps (#4516, merged Jun 11, 2025)
- Fix pss restricted warnings (#4327, merged Jun 6, 2025)
- Initial segregation of the storage module from KServe SDK (#4391, merged Jun 5, 2025)
- Upgrade vLLM to v0.9.0.1 (#4507, merged Jun 5, 2025)
- upgrade vllm to v0.9.0 and Torch to v2.7.0 (#4501, merged Jun 4, 2025)
- Update kserve-resources helm chart to disable desired servingruntimes (#4485, merged Jun 2, 2025)
- Rename CRD file to reflect all KServe CRDs (Fixes #4396) (#4494, merged Jun 2, 2025)
- Add Jooho to approvers in OWNERS file (#4504, merged May 30, 2025)
- fix: Update TextIteratorStreamer to skip special tokens (#4490, merged May 29, 2025)
- chore: drop pydantic v1 support (#4353, merged May 29, 2025)
- Upgrade Torch to v2.6.0 everywhere (#4450, merged May 28, 2025)
- chore: remove 'default' suffix compatibility (#4178, merged May 27, 2025)
- Generate Release 0.15.2 (#4497, merged May 27, 2025)
- fix: update workflow to use ubuntu-latest for rerun PR tests (#4496, merged May 27, 2025)
- docs: enhance security documentation with detailed reporting and prevention mechanisms (#4495, merged May 26, 2025)
- Add predictor_config to ModelServer init function (#4491, merged May 23, 2025)
- Rework the order in which the knative autoscaler configmap is read during reconciliation (#4471, merged May 23, 2025)
- fix: huggingface e2e test output mismatch and add tests for stream requests (#4482, merged May 23, 2025)
- config: enable ModelCar by default (#4316, merged May 22, 2025)
- Fixes CVE-2025-43859 (#4468, merged May 16, 2025)
- Fixes vLLM V1 failures: Revert back the approach to initiate the background engine task (#4470, merged May 15, 2025)
- Improve code coverage (#4385, merged May 15, 2025)
- Fix: add type specification for nthread argument in argument parser (#4410, merged May 15, 2025)
- Publish 0.15.1 release (#4466, merged May 15, 2025)
- LMCache Integration with vLLM runtime (#4320, merged May 14, 2025)
12 Pull requests opened by 9 people

- 4380 - Inference logging to blob storage (#4473, opened May 14, 2025)
- Add support overriding model mount path in model server container (follow-up to PR #3814) (#4478, opened May 19, 2025)
- Huggingface ARM64 Support & Refactor multi-platform build workflows (#4480, opened May 20, 2025)
- Add Code Coverage change report for PRs (#4487, opened May 22, 2025)
- feat: support remote storage URI injection for serving runtimes (#4492, opened May 24, 2025)
- Allow support with latest xgboost models (#4493, opened May 25, 2025)
- ci: PR style check (#4499, opened May 27, 2025)
- Improve KServe model server observability with metrics and distributed tracing (#4508, opened Jun 5, 2025)
- Moved webhdfs dependencies to pyproject.toml file (#4518, opened Jun 11, 2025)
- Add the option to configure knative ns from values file (#4519, opened Jun 12, 2025)
- [API] Define LLMInferenceService and LLMInferenceServiceConfig types and CRDs (#4522, opened Jun 13, 2025)
- Refactor KServe to use global context for PredictorConfig (#4526, opened Jun 13, 2025)
20 Issues closed by 7 people

- Huggingface Server trace logging throw error (#4515, closed Jun 11, 2025)
- one question about custom runtime, thanks very much. (#4512, closed Jun 8, 2025)
- Support for Multiple ContainerStorageContainers (#4361, closed Jun 5, 2025)
- Make storage initializer install only what is needed for it to run (#3489, closed Jun 5, 2025)
- Upgrade vLLM to v0.9.0.1 (#4506, closed Jun 5, 2025)
- Upgrade vllm to v0.9.0 and Torch to v2.7.0 (#4500, closed Jun 4, 2025)
- A few Python component still not updated to Torch v2.6.0 CPU (#4449, closed May 28, 2025)
- No manifest for version v0.15.2 (#4498, closed May 27, 2025)
- Predictor health check can only be set via global args (#4452, closed May 23, 2025)
- Configuration of by isvc causing breaking issue with rollout (#4486, closed May 23, 2025)
- Collocate custom transformer and predictor: [errno 98] address already in use in kserve-container (#4463, closed May 23, 2025)
- ModelCar stable, but not available by default (#4315, closed May 22, 2025)
- Missing YAML manifests in release of v0.15.1 (#4481, closed May 22, 2025)
- Enable custom schema upload for Inference Service Payload Logging (#4483, closed May 21, 2025)
- Update h11 package to 0.16.0 (CVE-2025-43859) (#4457, closed May 16, 2025)
- pytorch does not support enable swagger ui (#4475, closed May 16, 2025)
- xgb server converts nthread to str causing errors (#4322, closed May 15, 2025)
- Integrate LMCache for improved performance with KV Cache sharing (#4203, closed May 14, 2025)
20 Issues opened by 14 people

- [LLMInferenceService] [Router] Reconcile managed `HTTPRoute(s)` (#4525, opened Jun 13, 2025)
- [LLMInferenceService] [Router] Reconcile managed `Ingress` (#4524, opened Jun 13, 2025)
- [LLMInferenceService] Implement reconciliation for single-node Deployment (#4523, opened Jun 13, 2025)
- [API] Define LLMInferenceService and LLMInferenceServiceConfig types and CRDs (#4521, opened Jun 13, 2025)
- Unified LLM Inference Service API and disaggregated p/d serving support (#4520, opened Jun 12, 2025)
- Can't create Serverless InferenceService when knative is installed in non-default namespace (#4517, opened Jun 11, 2025)
- Container fails to start with CUDA version error when using `kserve/huggingfaceserver:latest-gpu` (#4513, opened Jun 10, 2025)
- Predictor config is required but marked as Optional in Model class (#4511, opened Jun 5, 2025)
- Predictor health check - support custom preprocess function that mutates model name and version (#4510, opened Jun 5, 2025)
- KServe CPU Spikes in transformer (#4509, opened Jun 5, 2025)
- MIG support for huggingface runtime (#4505, opened Jun 2, 2025)
- Track code coverage changes for each PR and possibly add a minimum coverage check (#4502, opened May 29, 2025)
- onnx model cannot be start successfully. (#4489, opened May 22, 2025)
- Enable User-Supplied Schemas for Payload Logging (#4484, opened May 21, 2025)
- Allow more verbose inference client logging (#4479, opened May 20, 2025)
- Relax dependency versioning (#4476, opened May 15, 2025)
- Does KServe support Huawei's graphics card? (#4474, opened May 15, 2025)
- Adding name field to non-custom predictor incorrectly triggers custom container logic (#4472, opened May 14, 2025)
- Remove torchserve from KServe? (#4469, opened May 14, 2025)
30 Unresolved conversations

Sometimes conversations happen on old items that aren't yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

- feat: support secure access to prometheus in keda (#4384, commented on Jun 5, 2025; 9 new comments)
- chore: update image push from docker to ghcr (#4358, commented on Jun 12, 2025; 8 new comments)
- Auto-update annotation for isvc. (#4342, commented on Jun 11, 2025; 8 new comments)
- Allow OCI for multi-node/multi-gpu (#4441, commented on May 30, 2025; 4 new comments)
- Add id2label support, fix CUDA bug in return_probabilities (#4444, commented on Jun 13, 2025; 1 new comment)
- Implement authorization for Raw InferenceGraphs (#4421, commented on May 21, 2025; 1 new comment)
- fix: Allow CA bundle path without config map (#4451, commented on Jun 13, 2025; 0 new comments)
- Remove default resources requests and limits - they should be set exp… (#4448, commented on Jun 3, 2025; 0 new comments)
- fix: 4439 - update telepresence-setup.sh to adopt tooling changes (#4440, commented on Jun 12, 2025; 0 new comments)
- Refactor LocalModelNodeAgent (#4431, commented on Jun 12, 2025; 0 new comments)
- feat: refactor storage initializer resources configuration (#4411, commented on Jun 13, 2025; 0 new comments)
- Switch Kserve from poetry to uv (#4407, commented on Jun 5, 2025; 0 new comments)
- [WIP] Custom Schema Support for Payload Logging (#4392, commented on May 21, 2025; 0 new comments)
- Remove kube-rbac-proxy (#4378, commented on May 15, 2025; 0 new comments)
- No error reported in the status when storageUri protocol validation fails (#4359, commented on Jun 13, 2025; 0 new comments)
- Envoy AI Gateway integration (#4319, commented on Jun 5, 2025; 0 new comments)
- On InferenceService deletion, validate no references to it (#4235, commented on May 31, 2025; 0 new comments)
- Allow to set custom timeouts for `InferenceGraph` router (#4218, commented on May 27, 2025; 0 new comments)
- A new DistributedInfereneceService CRD (#4433, commented on Jun 12, 2025; 0 new comments)
- REST API Support for Creating InferenceService Resources (#4432, commented on Jun 11, 2025; 0 new comments)
- Harmonizing OCI Image model support (#4083, commented on Jun 9, 2025; 0 new comments)
- KServe Easy Deploy: Helm-based Onboarding Experience for ML Developers (#4393, commented on Jun 8, 2025; 0 new comments)
- How can you deploy a model artifact from Kubeflow Pipelines with KServe? (#4269, commented on Jun 7, 2025; 0 new comments)
- Support multiple StorageUri in Inference Service (#3413, commented on Jun 5, 2025; 0 new comments)
- Extend Model Caching to Serverless with Cloud DataCache Integration (#4408, commented on Jun 5, 2025; 0 new comments)
- Severe security issue in local-node-agent (#4402, commented on Jun 4, 2025; 0 new comments)
- Add Support or Documentation for Running Hugging Face/vLLM Server on AMD GPUs (#3963, commented on May 30, 2025; 0 new comments)
- Store original STORAGE_URI value in an environment variable (#3428, commented on May 17, 2025; 0 new comments)
- KServe: HPA: Support Custom Metric Definition (#4259, commented on May 14, 2025; 0 new comments)
- [Bug] KServe InferenceService fails with "NoSupportingRuntime" for sklearn despite correct predictor.sklearn spec (v0.13.1) (#4464, commented on May 14, 2025; 0 new comments)