Container fails to start with CUDA version error when using `kserve/huggingfaceserver:latest-gpu` · Issue #4513 · kserve/kserve · GitHub
Open
@xrwang8

Description

/kind bug

What steps did you take and what happened:

When deploying the `kserve/huggingfaceserver:latest-gpu` image in Kubernetes, the container fails to start because of a CUDA version mismatch. The error shows that the image requires CUDA >=12.8, but the NVIDIA driver on the host node does not satisfy that requirement.

Error Log:

  Normal   Created    115s                kubelet            Created container storage-initializer
  Normal   Started    115s                kubelet            Started container storage-initializer
  Warning  Failed     62s (x4 over 105s)  kubelet            Error: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: requirement error: unsatisfied condition: cuda>=12.8, please update your driver to a newer version, or use an earlier cuda container: unknown
  Warning  BackOff  29s (x8 over 104s)  kubelet  Back-off restarting failed container kserve-container in pod dbp-9fd2968b-009d-4d9e-9d30-32b9511a1e8e-predictor-7f4b8d4dx6q5_default(4dd91701-b2ab-4441-ae75-dba42264ef38)
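
To make the mismatch concrete, here is a minimal Python sketch that extracts the `cuda>=X.Y` requirement from the hook error above and compares it with the CUDA version a node's driver reports. The function names are hypothetical and not part of KServe or the NVIDIA tooling. It assumes the driver's supported CUDA version is taken from the `CUDA Version:` field in the header that `nvidia-smi` prints on the node; the sample header string below is made up for illustration and is not output from the reporter's host.

```python
import re

def parse_required_cuda(log_text):
    """Return (major, minor) from a 'cuda>=X.Y' requirement error, or None."""
    match = re.search(r"unsatisfied condition: cuda>=(\d+)\.(\d+)", log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

def parse_driver_cuda(smi_header):
    """Return (major, minor) from the 'CUDA Version:' field of nvidia-smi output, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_header)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

def driver_satisfies(driver_cuda, required_cuda):
    """Tuples compare element by element, so (12, 4) < (12, 8)."""
    return driver_cuda >= required_cuda

# Error text copied from the log in this issue.
error_line = (
    "nvidia-container-cli: requirement error: unsatisfied condition: "
    "cuda>=12.8, please update your driver to a newer version, "
    "or use an earlier cuda container: unknown"
)
# Hypothetical nvidia-smi header fragment, for illustration only.
sample_header = "| NVIDIA-SMI 550.54.15    Driver Version: 550.54.15    CUDA Version: 12.4 |"

required = parse_required_cuda(error_line)
driver = parse_driver_cuda(sample_header)
print("required:", required, "driver:", driver, "ok:", driver_satisfies(driver, required))
```

If the check fails on a node, the error message itself names the two options: update the host driver, or use an image built against an earlier CUDA version.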

What's the InferenceService yaml:
[To help us debug please run kubectl get isvc $name -n $namespace -oyaml and paste the output]

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

Environment:

  • Istio Version:
  • Knative Version:
  • KServe Version:
  • Kubeflow version:
  • Cloud Environment:[k8s_istio/istio_dex/gcp_basic_auth/gcp_iap/aws/aws_cognito/ibm]
  • Minikube/Kind version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):
