Open
Description
Describe the bug
On AWS EKS, nodes are set to SchedulingDisabled
and pods are evicted in batches (not cordoned). With knative serving deployed using the operator, some workloads will never drain when HA is set to 3.
Expected behavior
The Knative Operator should allow these components to drain without user interaction.
To Reproduce
- Deploy Knative Serving on EKS with HA set to 3 using the Knative Operator
- Upgrade the version of Kubernetes by updating the AMI template. This will trigger AWS to do a rolling upgrade of the nodes
- Notice that knative-serving components cause the operation to hand indefinitely until a human forcibly kills the pod in question.
Knative release version
1.13.0
Additional context
I have enough nodes that the PDB shouldn't be violated.