cri: error out if CRI request has CDI devices but CDI is disabled. by klihub · Pull Request #11975 · containerd/containerd · GitHub

cri: error out if CRI request has CDI devices but CDI is disabled. #11975

Open
wants to merge 1 commit into main from fixes/main/check-and-reject-disabled-cdi-injection

Conversation

klihub
Member
@klihub klihub commented Jun 11, 2025

Currently, if a container creation request asks for CDI device injection using the dedicated CRI message field, the request is processed normally but with the CDI devices silently dropped. The in-line request usually indicates that a DRA driver requested injection of the devices. Silently failing to honour the request can result in hard-to-understand, hard-to-debug problems later if the container does manage to start up.

This patch addresses this by explicitly checking and rejecting container creation if devices were requested but CDI support is disabled.
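
For illustration, a minimal sketch of the kind of guard described above. This is not the actual diff: the helper name, its signature, and its call site are assumptions, although the `enable_cdi` config flag and the `GetCDIDevices` getter do exist in the containerd CRI config and the Kubernetes CRI API.

```go
package cri

import (
	"fmt"

	runtime "k8s.io/cri-api/pkg/apis/runtime/v1"
)

// validateCDIDevices rejects a CreateContainer request that carries CDI
// devices in the dedicated CRI field while CDI support is disabled, instead
// of silently dropping the devices.
func validateCDIDevices(enableCDI bool, config *runtime.ContainerConfig) error {
	if devices := config.GetCDIDevices(); !enableCDI && len(devices) > 0 {
		return fmt.Errorf("%d CDI devices requested but CDI support is disabled in the CRI config", len(devices))
	}
	return nil
}
```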

@github-project-automation github-project-automation bot moved this to Needs Triage in Pull Request Review Jun 11, 2025
@klihub klihub requested a review from mikebrow June 11, 2025 12:29
@dosubot dosubot bot added the area/cri Container Runtime Interface (CRI) label Jun 11, 2025
@klihub
Member Author
klihub commented Jun 11, 2025

/retest

Member
@mikebrow mikebrow left a comment


ouch

the default for enable_cdi (CDI devices) is enabled (true)...

I agree we should not be silent about the failure to call CDI setup when it is enabled and requested, but we may need to allow the container create even if it is configured with CDI devices via annotation or the CDIDevices field... otherwise rejecting the container, at this point in time, could possibly cause regressions.

See additional clarification suggestion.

Perhaps a new config flag EnableRejectingContainerCreateForDisabledCDI (default false)

@mikebrow
Member

Another thing to consider is deprecating the option to disable cdi

@klihub
Member Author
klihub commented Jun 12, 2025

ouch

the default for enable_cdi (CDI devices) is enabled (true)...

Yes, and TBH that is partly why it took me a while to realize what went wrong. I assumed it to be enabled (so I didn't check), and since containers with DRA-injected CDI devices (seemingly) started up normally, it didn't occur to me to suspect that CDI was disabled.

I agree we should not be silent about the failure to call CDI setup when it is enabled and requested, but we may need to allow the container create even if it is configured with CDI devices via annotation or the CDIDevices field...

This patch only prevents the container from starting when device injection was requested using the dedicated CRI protocol field, IOW when it is very explicit that a DRA driver is active.

otherwise rejecting the container, at this point in time, could possibly cause regressions.

But isn't this behavioral change likely to cause regressions only if the former behavior was either deliberate, or accidental but well-known and useful enough that some people started to use it as a feature and rely on it? Here it's clearly not the first case, and it is hard to see any usefulness that could lead to the second.

See additional clarification suggestion.

Perhaps a new config flag EnableRejectingContainerCreateForDisabledCDI (default false)

Would this really be helpful then? I mean, to prevent the situation I ran into, you would need to alter the configuration to ask containerd to fail container creation if you accidentally have CDI disabled somewhere. But why would anybody change the configuration like that, when they could just enable CDI instead?

@mikebrow
Member
mikebrow commented Jun 12, 2025

my scenario is PyTorch....

If a PyTorch-enabled application finds the GPU device, it will use it; if not, it will fall back to CPU. In this scenario, let's say I have a pod spec with a CDI device asking all hosts where the pod is running to mount the GPU accelerator device. In this context I want to run inference with CPU or GPU on each host. Some hosts may have CDI disabled, some may have it explicitly enabled or enabled by default. Further, some hosts may be on various versions of containerd where CDI was not originally enabled by default. In other words, there is flexibility at the hosts, in both the container runtimes and the PyTorch applications, to run successfully whether or not CDI injection succeeds on each host.

Now, with this change, many of those hosts will start failing: those where the config has CDI disabled and the pod spec has a CDI device.

@klihub
Member Author
klihub commented Jun 12, 2025

my scenario is PyTorch....

If a PyTorch-enabled application finds the GPU device, it will use it; if not, it will fall back to CPU. In this scenario, let's say I have a pod spec with a CDI device asking all hosts where the pod is running to mount the GPU accelerator device. In this context I want to run inference with CPU or GPU on each host. Some hosts may have CDI disabled, some may have it explicitly enabled or enabled by default. Further, some hosts may be on various versions of containerd where CDI was not originally enabled by default. In other words, there is flexibility at the hosts, in both the container runtimes and the PyTorch applications, to run successfully whether or not CDI injection succeeds on each host.

Now, with this change, many of those hosts will start failing: those where the config has CDI disabled and the pod spec has a CDI device.

But do you really run a GPU DRA plugin in such a cluster? And on all cluster nodes, with the plugin oblivious to the fact that some hosts do not support CDI, still allocating GPUs and seemingly doing GPU injection at the CRI protocol level, with the whole cluster thinking that everything works just fine? And in reality everything only "works" because PyTorch falls back to CPU-based inference on the hosts where CDI is disabled and DRA-based allocation silently fails?

If that is really the case, or if for some other reason this PR approach is deemed to be a definite no-go, then I think the only alternative we have left is updating this PR to give a big fat warning without erroring out.

@mikebrow
Member

my scenario is PyTorch....
If a PyTorch-enabled application finds the GPU device, it will use it; if not, it will fall back to CPU. In this scenario, let's say I have a pod spec with a CDI device asking all hosts where the pod is running to mount the GPU accelerator device. In this context I want to run inference with CPU or GPU on each host. Some hosts may have CDI disabled, some may have it explicitly enabled or enabled by default. Further, some hosts may be on various versions of containerd where CDI was not originally enabled by default. In other words, there is flexibility at the hosts, in both the container runtimes and the PyTorch applications, to run successfully whether or not CDI injection succeeds on each host.
Now, with this change, many of those hosts will start failing: those where the config has CDI disabled and the pod spec has a CDI device.

But do you really run a GPU DRA plugin in such a cluster? And on all cluster nodes, with the plugin oblivious to the fact that some hosts do not support CDI, still allocating GPUs and seemingly doing GPU injection at the CRI protocol level, with the whole cluster thinking that everything works just fine? And in reality everything only "works" because PyTorch falls back to CPU-based inference on the hosts where CDI is disabled and DRA-based allocation silently fails?

If that is really the case, or if for some other reason this PR approach is deemed to be a definite no-go, then I think the only alternative we have left is updating this PR to give a big fat warning without erroring out.

Is there a way to know at this level whether the CDI request is a DRA allocation vs. the other uses of CDI and/or the CDI annotation experiments?

At any rate it will be a behavior change, arguably for the better, on the off chance someone has been running with CDI disabled in the CRI config: pods that ran before, and arguably should have failed, now won't run due to the misconfiguration. Yes, a misconfiguration made worse now by the new DRA cases. We should have deprecated the config switch when it was introduced and removed it when CDI became official (no longer an experimental annotation).

Our CRI config deprecation policy https://github.com/containerd/containerd/blob/main/docs/cri/config.md#deprecation follows https://kubernetes.io/docs/reference/using-api/deprecation-policy/#deprecating-a-flag-or-cli

from the config document for the v1.x details:

  # enable_cdi enables support of the Container Device Interface (CDI)
  # For more details about CDI and the syntax of CDI Spec files please refer to
  # https://tags.cncf.io/container-device-interface.
  # TODO: Deprecate this option when either Dynamic Resource Allocation(DRA)
  # or CDI support for the Device Plugins are graduated to GA.
  # `Dynamic Resource Allocation` KEP:
  # https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/3063-dynamic-resource-allocation
  # `Add CDI devices to device plugin API` KEP:
  # https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4009-add-cdi-devices-to-device-plugin-api
  enable_cdi = true

@klihub
Member Author
klihub commented Jun 14, 2025

Is there a way to know at this level whether the CDI request is a DRA allocation vs. the other uses of CDI and/or the CDI annotation experiments?

I think that the injection request coming in the dedicated CRI CDIDevices field is an indirect indication that it was requested by a DRA driver (or else someone is playing around with crictl on the node). That's why this patch proposes to error out only in that case, and not if the request comes only via annotations.

At any rate it will be a behavior change, arguably for the better, on the off chance someone has been running with CDI disabled in the CRI config: pods that ran before, and arguably should have failed, now won't run due to the misconfiguration. Yes, a misconfiguration made worse now by the new DRA cases. We should have deprecated the config switch when it was introduced and removed it when CDI became official (no longer an experimental annotation).

Our CRI config deprecation policy https://github.com/containerd/containerd/blob/main/docs/cri/config.md#deprecation follows https://kubernetes.io/docs/reference/using-api/deprecation-policy/#deprecating-a-flag-or-cli

from the config document for the v1.x details:

  # enable_cdi enables support of the Container Device Interface (CDI)
  # For more details about CDI and the syntax of CDI Spec files please refer to
  # https://tags.cncf.io/container-device-interface.
  # TODO: Deprecate this option when either Dynamic Resource Allocation(DRA)
  # or CDI support for the Device Plugins are graduated to GA.
  # `Dynamic Resource Allocation` KEP:
  # https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/3063-dynamic-resource-allocation
  # `Add CDI devices to device plugin API` KEP:
  # https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4009-add-cdi-devices-to-device-plugin-api
  enable_cdi = true

Yes that makes sense once DRA is GA.

@mikebrow But what's the verdict for this patch? I can update it to only give a warning instead of erroring out for driver-requested injections with CDI off, if you think that is the right thing to do. Just let me know which way we should go.
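
To illustrate the distinction drawn above between the dedicated CDIDevices field and the older annotation-based mechanism, here is a hedged sketch of treating the two paths differently. The cdi.k8s.io/ prefix matches the usual CDI annotation convention, but the helper names and the exact policy split are assumptions, not the PR's code.

```go
package cri

import (
	"fmt"
	"strings"

	runtime "k8s.io/cri-api/pkg/apis/runtime/v1"
)

// hasCDIAnnotations reports whether the request uses the older,
// annotation-based CDI mechanism ("cdi.k8s.io/..." keys).
func hasCDIAnnotations(config *runtime.ContainerConfig) bool {
	for key := range config.GetAnnotations() {
		if strings.HasPrefix(key, "cdi.k8s.io/") {
			return true
		}
	}
	return false
}

// checkCDIRequest errors out only for the dedicated CDIDevices field (the
// path a DRA driver uses). Annotation-only requests (hasCDIAnnotations) are
// left alone, preserving the existing behavior for non-DRA uses of CDI.
func checkCDIRequest(enableCDI bool, config *runtime.ContainerConfig) error {
	if !enableCDI && len(config.GetCDIDevices()) > 0 {
		return fmt.Errorf("CDI devices requested via the CRI CDIDevices field but CDI is disabled")
	}
	return nil
}
```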

Currently if a container creation requests CDI device injection
using the dedicated CRI message field, the request is processed
normally but with the CDI devices silently dropped. The in-line
request usually indicates that a DRA driver requested injection
of the devices. Silent failure to honour the request can result
in hard to understand/debug problems later if the container can
start up.

This patch addresses this by explicitly checking and rejecting
container creation if devices were requested but CDI support is
disabled.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
@klihub klihub force-pushed the fixes/main/check-and-reject-disabled-cdi-injection branch from 1656165 to 8009d91 on June 15, 2025 10:19
@klihub klihub requested a review from mikebrow June 16, 2025 13:30
@mikebrow
Member
mikebrow commented Jun 17, 2025

Is there a way to know at this level whether the CDI request is a DRA allocation vs. the other uses of CDI and/or the CDI annotation experiments?

I think that the injection request coming in the dedicated CRI CDIDevices field is an indirect indication that it was requested by a DRA driver (or else someone is playing around with crictl on the node). That's why this patch proposes to error out only in that case, and not if the request comes only via annotations.

At any rate it will be a behavior change, arguably for the better, on the off chance someone has been running with CDI disabled in the CRI config: pods that ran before, and arguably should have failed, now won't run due to the misconfiguration. Yes, a misconfiguration made worse now by the new DRA cases. We should have deprecated the config switch when it was introduced and removed it when CDI became official (no longer an experimental annotation).
Our CRI config deprecation policy https://github.com/containerd/containerd/blob/main/docs/cri/config.md#deprecation follows https://kubernetes.io/docs/reference/using-api/deprecation-policy/#deprecating-a-flag-or-cli
from the config document for the v1.x details:

  # enable_cdi enables support of the Container Device Interface (CDI)
  # For more details about CDI and the syntax of CDI Spec files please refer to
  # https://tags.cncf.io/container-device-interface.
  # TODO: Deprecate this option when either Dynamic Resource Allocation(DRA)
  # or CDI support for the Device Plugins are graduated to GA.
  # `Dynamic Resource Allocation` KEP:
  # https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/3063-dynamic-resource-allocation
  # `Add CDI devices to device plugin API` KEP:
  # https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4009-add-cdi-devices-to-device-plugin-api
  enable_cdi = true

Yes that makes sense once DRA is GA.

@mikebrow But what's the verdict for this patch? I can update it to only give a warning instead of erroring out for driver-requested injections with CDI off, if you think that is the right thing to do. Just let me know which way we should go.

Go ahead and deprecate it for 2.2... let's make this a warning in 2.2/2.1, announcing the 2.2 deprecation.

Do we not migrate this field? Wondering how you ended up with enable_cdi = false in your config... IOW, is this a pervasive problem where we added a new field with a default of true but did not migrate existing configs from "field does not exist" to "field exists and is set to true"?
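
A hedged sketch of the warning-only variant discussed here, using containerd's log package; the exact wording, field names, and placement are assumptions rather than the PR's final code.

```go
package cri

import (
	"context"

	"github.com/containerd/log"
	runtime "k8s.io/cri-api/pkg/apis/runtime/v1"
)

// warnDisabledCDI logs (rather than errors out) when CDI devices are
// requested while CDI support is disabled, and points at the planned
// deprecation of the enable_cdi switch.
func warnDisabledCDI(ctx context.Context, enableCDI bool, config *runtime.ContainerConfig) {
	if enableCDI || len(config.GetCDIDevices()) == 0 {
		return
	}
	log.G(ctx).WithField("cdi-devices", len(config.GetCDIDevices())).
		Warn("CDI devices requested but CDI support is disabled; the devices will not be injected " +
			"(disabling CDI via enable_cdi = false is planned for deprecation)")
}
```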

Labels
area/cri Container Runtime Interface (CRI) size/XS
Projects
Status: Needs Triage
3 participants