Support scheduling tolerating workloads on NotReady Nodes · Issue #45717 · kubernetes/kubernetes · GitHub
Support scheduling tolerating workloads on NotReady Nodes #45717

Closed
luxas opened this issue May 12, 2017 · 44 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling.

Comments

@luxas
Member
luxas commented May 12, 2017

This has come up a few times; right now the kubelet goes NotReady if the CNI network isn't set up which makes the scheduler not even consider scheduling to that node, not even hostNetwork Pods.

I think that it's in the works to mark Node problems with Taints, where are we with that effort @kubernetes/sig-scheduling-misc?

We should make it possible to schedule workloads on nodes that are NotReady if the workload tolerates all the required taints for the specific condition.
This is critical for kubeadm self-hosting for example.

I will add more context on this later, but starting the discussion at least here... didn't find any other issue although it may exist

@kubernetes/sig-node-feature-requests @kubernetes/sig-scheduling-feature-requests @kubernetes/sig-network-feature-requests

@k8s-ci-robot k8s-ci-robot added sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/network Categorizes an issue or PR as relevant to SIG Network. labels May 12, 2017
@jagosan
Contributor
jagosan commented May 12, 2017

/kind feature

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label May 12, 2017
@bsalamat
Member

@luxas The title is confusing. If a node is not reachable at all, why should the scheduler schedule any pods on it, and even if the scheduler does, how does the node know that it should run the pod? I can see how this may work if the node is not ready for reasons other than reachability, though.

@gyliu513
Contributor

+1 to this. Some customers want to use host networking and therefore do not have Calico installed; in that case we should still allow the pod to be created. Perhaps we could introduce a new taint so that the pod can still be scheduled to the node even if Calico is not running.

@k82cn
Member
k82cn commented May 13, 2017

Similar question to @bsalamat's: is there any condition to identify "network" issues? If so, it would be great to have such a feature.

@sbezverk
Contributor

+1 for this feature. On bare-metal nodes you will probably use host networking for performance reasons. You would not want to cripple it by using overlay networking.

@luxas luxas changed the title Support scheduling workloads on Nodes with no network Support scheduling tolerating workloads on NotReady Nodes May 13, 2017
@luxas
Member Author
luxas commented May 14, 2017

@luxas The title is confusing. If a node is not reachable at all, why should the scheduler schedule any pods on it, and even if the scheduler does, how does the node know that it should run the pod? I can see how this may work if the node is not ready for reasons other than reachability, though.

Similar question to @bsalamat's: is there any condition to identify "network" issues? If so, it would be great to have such a feature.

@bsalamat @k82cn Sorry for the confusion, was in a hurry when I wrote the issue description initially. This is not about connectivity, full connectivity is assumed at all times.

This is about nodes that have some known "problem" e.g. NetworkNotReady. That condition means that the node is reachable over the network but no CNI network (e.g. kubenet, Weave or Calico) is installed.

However, currently, when any one of the five-ish node conditions that exist is falsy, the Node goes unschedulable due to the NodeNotReady condition. If one sub-condition is falsy (e.g. NetworkNotReady is true), the full node condition goes falsy.

We should be able to do more fine-grained scheduling than that. In the CNI networking case you might want to run some workloads (of type hostNetwork=true obviously) on nodes with no CNI network installed for bootstrapping or other reasons.

One possible solution (there are quite a few):

  • Make the kubelet create taints for all the falsy conditions it experiences
    • e.g. the kubelet will create the node.alpha.kubernetes.io/networkNotReady taint if the CNI network isn't initialized.
  • Enable scheduling on NodeReady==ConditionFalse nodes (possibly behind a flag on the scheduler to keep backwards compatibility)
    • Or make it possible for "special" workloads to tolerate being scheduled to NotReady nodes in some way.
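A minimal sketch of the first bullet, assuming the kubelet (or some controller) derives one taint per unhealthy condition. The types are simplified stand-ins rather than the real `v1` API types, `TaintsForConditions` is a hypothetical helper, and only the `notReady` and `networkNotReady` keys appear in this thread; the other keys are invented for illustration.

```go
package main

import "fmt"

// Condition is a simplified stand-in for a node condition; it is not the
// real v1.NodeCondition type.
type Condition struct {
	Type   string
	Status string // "True", "False" or "Unknown"
}

// problemTaints maps a condition type to the status that means "healthy"
// and to a taint key. Only the notReady and networkNotReady keys appear in
// this thread; the remaining keys are hypothetical.
var problemTaints = map[string]struct {
	healthy string
	taint   string
}{
	"Ready":              {"True", "node.alpha.kubernetes.io/notReady"},
	"NetworkUnavailable": {"False", "node.alpha.kubernetes.io/networkNotReady"},
	"OutOfDisk":          {"False", "node.alpha.kubernetes.io/outOfDisk"},
	"MemoryPressure":     {"False", "node.alpha.kubernetes.io/memoryPressure"},
	"DiskPressure":       {"False", "node.alpha.kubernetes.io/diskPressure"},
}

// TaintsForConditions returns one taint key for every known condition whose
// status differs from its healthy value. Unknown condition types are skipped.
func TaintsForConditions(conds []Condition) []string {
	var taints []string
	for _, c := range conds {
		rule, known := problemTaints[c.Type]
		if !known || c.Status == rule.healthy {
			continue
		}
		taints = append(taints, rule.taint)
	}
	return taints
}

func main() {
	conds := []Condition{
		{Type: "Ready", Status: "True"},
		{Type: "NetworkUnavailable", Status: "True"},
	}
	fmt.Println(TaintsForConditions(conds))
}
```

With one taint per condition, a hostNetwork workload would only need to tolerate the network-related taint instead of a catch-all "not ready" one.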

Example of the current state:

$ kubectl describe nodes
Name:			thegopher
Role:			
Labels:			beta.kubernetes.io/arch=amd64
			beta.kubernetes.io/os=linux
			kubernetes.io/hostname=thegopher
			node-role.kubernetes.io/master=
Annotations:		node.alpha.kubernetes.io/ttl=0
			volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:			node-role.kubernetes.io/master:NoSchedule
CreationTimestamp:	Sat, 13 May 2017 19:20:42 +0300
Phase:			
Conditions:
  Type			Status	LastHeartbeatTime			LastTransitionTime			Reason				Message
  ----			------	-----------------			------------------			------				-------
  OutOfDisk 		False 	Sat, 13 May 2017 19:25:23 +0300 	Sat, 13 May 2017 19:20:42 +0300 	KubeletHasSufficientDisk 	kubelet has sufficient disk space available
  MemoryPressure 	False 	Sat, 13 May 2017 19:25:23 +0300 	Sat, 13 May 2017 19:20:42 +0300 	KubeletHasSufficientMemory 	kubelet has sufficient memory available
  DiskPressure 		False 	Sat, 13 May 2017 19:25:23 +0300 	Sat, 13 May 2017 19:20:42 +0300 	KubeletHasNoDiskPressure 	kubelet has no disk pressure
  Ready 		False 	Sat, 13 May 2017 19:25:23 +0300 	Sat, 13 May 2017 19:20:42 +0300 	KubeletNotReady 		runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Addresses:		192.168.1.130,192.168.1.130,thegopher
Capacity:
 cpu:		4
 memory:	16357296Ki
 pods:		110
Allocatable:
 cpu:		4
 memory:	16254896Ki
 pods:		110
System Info:
 Machine ID:			5b7f728ac5f04795aee419a7f700eec7
 System UUID:			03C34D3C-6C9C-E411-A26A-F0761C62F136
 Boot ID:			15e5a1e8-cd70-42d4-ab60-36d7aff75ca7
 Kernel Version:		4.10.0-20-generic
 OS Image:			Ubuntu 17.04
 Operating System:		linux
 Architecture:			amd64
 Container Runtime Version:	docker://1.12.6
 Kubelet Version:		v1.6.2
 Kube-Proxy Version:		v1.6.2
ExternalID:			thegopher
Non-terminated Pods:		(4 in total)
  Namespace			Name						CPU Requests	CPU Limits	Memory Requests	Memory Limits
  ---------			----						------------	----------	---------------	-------------
  kube-system			etcd-thegopher					0 (0%)		0 (0%)		0 (0%)		0 (0%)
  kube-system			kube-controller-manager-thegopher		200m (5%)	0 (0%)		0 (0%)		0 (0%)
  kube-system			kube-scheduler-thegopher			100m (2%)	0 (0%)		0 (0%)		0 (0%)
  kube-system			self-hosted-kube-apiserver-zxbgg		250m (6%)	0 (0%)		0 (0%)		0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests	CPU Limits	Memory Requests	Memory Limits
  ------------	----------	---------------	-------------
  550m (13%)	0 (0%)		0 (0%)		0 (0%)
Events:
  FirstSeen	LastSeen	Count	From			SubObjectPath	Type		Reason			Message
  ---------	--------	-----	----			-------------	--------	------			-------
  5m		5m		1	kubelet, thegopher			Normal		Starting		Starting kubelet.
  5m		5m		1	kubelet, thegopher			Warning		ImageGCFailed		unable to find data for container /
  5m		4m		18	kubelet, thegopher			Normal		NodeHasSufficientDisk	Node thegopher status is now: NodeHasSufficientDisk
  5m		4m		18	kubelet, thegopher			Normal		NodeHasSufficientMemory	Node thegopher status is now: NodeHasSufficientMemory
  5m		4m		18	kubelet, thegopher			Normal		NodeHasNoDiskPressure	Node thegopher status is now: NodeHasNoDiskPressure

$ kubectl -n kube-system get po
NAME                                         READY     STATUS    RESTARTS   AGE
etcd-thegopher                               1/1       Running   0          4m
kube-controller-manager-thegopher            1/1       Running   1          3m
kube-scheduler-thegopher                     1/1       Running   1          3m
self-hosted-kube-apiserver-zxbgg             1/1       Running   1          5m
self-hosted-kube-scheduler-825579390-d1t99   0/1       Pending   0          4m

$ kubectl -n kube-system describe po self-hosted-kube-scheduler-825579390-d1t99
Name:		self-hosted-kube-scheduler-825579390-d1t99
Namespace:	kube-system
Node:		/
Labels:		component=kube-scheduler
		k8s-app=self-hosted-kube-scheduler
		pod-template-hash=825579390
		tier=control-plane
Annotations:	kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"kube-system","name":"self-hosted-kube-scheduler-825579390","uid":"38a1faa8-37f8-1...
Status:		Pending
IP:		
Controllers:	ReplicaSet/self-hosted-kube-scheduler-825579390
Containers:
  self-hosted-kube-scheduler:
    Image:	gcr.io/google_containers/kube-scheduler-amd64:v1.6.3
    Port:	
    Command:
      kube-scheduler
      --address=127.0.0.1
      --leader-elect=true
      --kubeconfig=/etc/kubernetes/scheduler.conf
    Requests:
      cpu:		100m
    Liveness:		http-get http://127.0.0.1:10251/healthz delay=15s timeout=15s period=10s #success=1 #failure=8
    Environment:	<none>
    Mounts:
      /etc/kubernetes from k8s (ro)
      /var/lock from var-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-q467p (ro)
Conditions:
  Type		Status
  PodScheduled 	False 
Volumes:
  var-lock:
    Type:	HostPath (bare host directory volume)
    Path:	/var/lock
  k8s:
  <unknown>
  default-token-q467p:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	default-token-q467p
    Optional:	false
QoS Class:	Burstable
Node-Selectors:	node-role.kubernetes.io/master=
Tolerations:	node-role.kubernetes.io/master=:NoSchedule
		node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
		node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s
Events:
  FirstSeen	LastSeen	Count	From			SubObjectPath	Type		Reason			Message
  ---------	--------	-----	----			-------------	--------	------			-------
  5m		18s		22	default-scheduler			Warning		FailedScheduling	no nodes available to schedule pods

Here is the code that probably should be changed in some way (plugin/pkg/scheduler/factory/factory.go:475):

func getNodeConditionPredicate() corelisters.NodeConditionPredicate {
	return func(node *v1.Node) bool {
		for i := range node.Status.Conditions {
			cond := &node.Status.Conditions[i]
			// We consider the node for scheduling only when its:
			// - NodeReady condition status is ConditionTrue,
			// - NodeOutOfDisk condition status is ConditionFalse,
			// - NodeNetworkUnavailable condition status is ConditionFalse.
			if cond.Type == v1.NodeReady && cond.Status != v1.ConditionTrue {
				glog.V(4).Infof("Ignoring node %v with %v condition status %v", node.Name, cond.Type, cond.Status)
				return false
			} else if cond.Type == v1.NodeOutOfDisk && cond.Status != v1.ConditionFalse {
				glog.V(4).Infof("Ignoring node %v with %v condition status %v", node.Name, cond.Type, cond.Status)
				return false
			} else if cond.Type == v1.NodeNetworkUnavailable && cond.Status != v1.ConditionFalse {
				glog.V(4).Infof("Ignoring node %v with %v condition status %v", node.Name, cond.Type, cond.Status)
				return false
			}
		}
		// Ignore nodes that are marked unschedulable
		if node.Spec.Unschedulable {
			glog.V(4).Infof("Ignoring node %v since it is unschedulable", node.Name)
			return false
		}
		return true
	}
}

cc @kubernetes/sig-cluster-lifecycle-bugs @kubernetes/sig-cluster-lifecycle-misc

@k8s-ci-robot k8s-ci-robot added the sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. label May 14, 2017
@luxas
Member Author
luxas commented May 14, 2017

Currently we could work around this by using DaemonSets, as they aren't using the default scheduler: #42002
But that'd be a poor solution and will stop working in a future release.

@wanghaoran1988
Contributor

xref : #42001

@davidopp
Member

Make the kubelet create taints for all the falsy conditions it experiences

Indeed someone is already working on this (#42406). Feel free to comment on that PR. I think there are some issues with it, for example it appears to be using one taint for everything instead of one taint per condition. I haven't had a chance to look at it.

Enable scheduling on NodeReady==ConditionFalse nodes (possibly behind a flag on the scheduler to keep backwards compatibility)

We should not hide behavior like this behind flags. If a pod wants to schedule on a node with NetworkNotReady taint, it should have an explicit toleration for it.

@k82cn
Member
k82cn commented May 15, 2017

Just went through #42406; it handles this case with one taint for everything, as davidopp said. I added a comment in #42406 for this case.

@luxas, I think we should not set NodeReady to ConditionFalse if NodeNetworkUnavailable is acceptable. IMO, if NodeReady is False, it means that the node cannot run any Pod even if the Pod tolerates everything, e.g. because of some internal critical issue in the kubelet.

@k82cn
Member
k82cn commented May 15, 2017

Discussed on Slack with luxas: if we do not set NodeReady to ConditionFalse, that may introduce a backward-compatibility issue :).

@gmarek
Contributor
gmarek commented May 15, 2017

NodeReady = false basically means that Kubelet is down or can't reach API server, which in turn means that there's no reason for tolerating NoSchedule Taint corresponding to it (as Kubelet won't notice your Pod either way). To my understanding conditions are supposed to be completely orthogonal, so the state of one should not influence the state of another. One can infer if the Node is "ready enough" by looking at all of them (and yes - there was a pretty long discussion about it roughly 2 years ago @davidopp @bgrant0607).

That being said +1 to having separate Taints for each Condition.

@cmluciano

The example for host-network=true can be leveraged with kubenet. Kubenet is not an official CNI plugin, but it works for the purpose that you mentioned. Does the noop plugin not work anymore?

By default if no kubelet network plugin is specified, the noop plugin is used, which sets net/bridge/bridge-nf-call-iptables=1 to ensure simple configurations (like docker with a bridge) work correctly with the iptables proxy.

@timothysc timothysc added this to the next-candidate milestone May 16, 2017
@timothysc timothysc added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label May 16, 2017
@davidopp
Member

This issue (and #44445) led @gmarek and me to realize that there is not a good understanding and agreement of what the actual and desired behavior is for node conditions (and for the taints that we want to replace them with for scheduling purposes) in general. Before we try to do anything, we need to take a step back and get this understanding. @gmarek and I started talking about all of the scenarios and it turns out that it's incredibly complicated.

Anyway, @gmarek filled a spreadsheet which people should take a look at and make comments/corrections to.

@davidopp
Member

Oops, @gmarek is still working on the spreadsheet, so the link isn't public yet. @gmarek please post to this issue when it is ready to share. Sorry about that.

@gmarek
Contributor
gmarek commented May 18, 2017

Sure - the doc is here

@timothysc
Member

IMO there is a conflation of "Conditions" with an implicit state vs. having an explicit state machine. This leads to the glue logic that exists in kubelet, and if we follow this track it would make its way into the scheduler.

@luxas
Member Author
luxas commented May 18, 2017

We should not hide behavior like this behind flags. If a pod wants to schedule on a node with NetworkNotReady taint, it should have an explicit toleration for it.

@davidopp I meant that the feature to be able to take NotReady nodes into account could be added behind a feature flag for the scheduler possibly already in v1.7?
Of course the workload itself must tolerate everything in order to get scheduled...

That being said +1 to having separate Taints for each Condition.

Great @gmarek, I think that's step number one

NodeReady = false basically means that Kubelet is down or can't reach API server, which in turn means that there's no reason for tolerating NoSchedule Taint corresponding to it (as Kubelet won't notice your Pod either way).

@gmarek I thought that NodeReady=ConditionFalse meant that not every sub-condition is true; the kubelet experiences some problems. And that NodeReady=ConditionUnknown meant that the node isn't reachable by the control plane. Isn't that understanding right?

IMO there is a conflation of "Conditions" with an implicit state vs. having an explicit state machine. This leads to the glue logic that exists in kubelet, and if we follow this track it would make its way into the scheduler.

@timothysc I'm not sure I understand how this glue logic would make its way into the scheduler. There wouldn't be any new code added to the scheduler really, only proposal (from me, at least) is to allow taking NodeReady=ConditionFalse nodes into account. (still exclude the unreachable ones)
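A rough sketch of what that proposal could look like, written against simplified stand-in types rather than the real scheduler code. `relaxedNodeConditionPredicate` is a hypothetical name, and treating `Unknown` as "unreachable" follows the reading of the conditions given in this comment, which is itself under discussion in the thread.

```go
package main

import "fmt"

// ConditionStatus and the types below are simplified stand-ins for the v1 API.
type ConditionStatus string

const (
	ConditionTrue    ConditionStatus = "True"
	ConditionFalse   ConditionStatus = "False"
	ConditionUnknown ConditionStatus = "Unknown"
)

type NodeCondition struct {
	Type   string
	Status ConditionStatus
}

type Node struct {
	Name          string
	Unschedulable bool
	Conditions    []NodeCondition
}

// relaxedNodeConditionPredicate is a hypothetical variant of the predicate
// quoted earlier in the thread. It keeps nodes whose Ready condition is False
// in the candidate set and only drops nodes whose Ready status is Unknown.
// The OutOfDisk and NetworkUnavailable checks are left as in the original.
func relaxedNodeConditionPredicate(node Node) bool {
	for _, cond := range node.Conditions {
		switch cond.Type {
		case "Ready":
			if cond.Status == ConditionUnknown {
				return false
			}
		case "OutOfDisk", "NetworkUnavailable":
			if cond.Status != ConditionFalse {
				return false
			}
		}
	}
	// Nodes explicitly marked unschedulable are still ignored.
	return !node.Unschedulable
}

func main() {
	node := Node{
		Name:       "thegopher",
		Conditions: []NodeCondition{{Type: "Ready", Status: ConditionFalse}},
	}
	fmt.Println(relaxedNodeConditionPredicate(node))
}
```

This only widens the candidate set; a pod would still have to tolerate whatever taints describe the node's problem before it could actually land there.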

@gmarek
Contributor
gmarek commented May 18, 2017

@luxas - my understanding is different. I think that all conditions are independent, and NodeReady is also independent of others, and it means that there's something fundamentally wrong with the Node (I believe that it mostly means that the runtime is down). I guess it's best to ask @kubernetes/sig-node-proposals

@timothysc timothysc added this to the v1.7 milestone May 19, 2017
@timothysc
Member

I'm bumping priority down now that I understand the context of where kubeadm self hosting is at.

@timothysc timothysc added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels May 25, 2017
@timothysc timothysc modified the milestones: next-candidate, v1.7 May 25, 2017
@caseydavenport
Member

Currently we could work around this by using DaemonSets, as they aren't using the default scheduler: #42002
But that'd be a poor solution and will stop working in a future release.

Yeah, I think we need to fix this behavior before we change the DaemonSet controller to use the default scheduler. Various CNI network providers (Calico, Flannel, Weave) rely on DaemonSets being able to schedule hostNetwork pods when NetworkUnavailable is true.

Another use case for this that we hear a lot is the ability to deploy networking via helm. Helm requires the tiller pod to run on the cluster before being able to deploy any charts. Given the current behavior this makes it impossible to deploy network daemonsets via helm.

@luxas
Member Author
luxas commented May 30, 2017

Ah, this is even worse than I thought...

@timothysc

@luxas So we talked about this issue today in sig-scheduling. There is not a plan of record to leverage taints and conditions. The NetworkNotReady condition seems to be conflated with the NotReady status.
It would be good to know when and how the node gets into the NotReady status and how exactly that has changed.

See: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_node_status.go#L658

var newNodeReadyCondition v1.NodeCondition
rs := append(kl.runtimeState.runtimeErrors(), kl.runtimeState.networkErrors()...)
if len(rs) == 0 {
	newNodeReadyCondition = v1.NodeCondition{
		Type:              v1.NodeReady,
		Status:            v1.ConditionTrue,
		Reason:            "KubeletReady",
		Message:           "kubelet is posting ready status",
		LastHeartbeatTime: currentTime,
	}
} else {
	newNodeReadyCondition = v1.NodeCondition{
		Type:              v1.NodeReady,
		Status:            v1.ConditionFalse,
		Reason:            "KubeletNotReady",
		Message:           strings.Join(rs, ","),
		LastHeartbeatTime: currentTime,
	}
}

I took for granted without even looking deeper at it that NetworkReady actually was a Kubernetes Node condition, but it isn't. It's a CRI Condition, which can't be used for scheduling.
What we have now is a blob in the KubeletNotReady condition that can fail for a number of different reasons.

I really think that the NetworkPluginNotReady CRI condition should be reported as a Node Condition as well instead of being baked into the general KubeletNotReady condition. Now we lack granularity in all kinds of ways. And we can't remove the NetworkPluginNotReady -> KubeletNotReady behavior anymore, since that's GA functionality now.

    - lastHeartbeatTime: 2017-05-30T13:06:04Z
      lastTransitionTime: 2017-05-30T13:05:04Z
      message: 'runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady
        message:docker: network plugin is not ready: cni config uninitialized'
      reason: KubeletNotReady
      status: "False"
      type: Ready
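To make the suggestion concrete, here is a minimal sketch of reporting the network state as its own condition while leaving the aggregated Ready condition unchanged. It uses simplified stand-in types; `splitConditions` and the `NetworkPluginReady` condition type are hypothetical names for illustration, not existing Kubernetes API.

```go
package main

import (
	"fmt"
	"strings"
)

// NodeCondition is a simplified stand-in for v1.NodeCondition.
type NodeCondition struct {
	Type, Status, Reason, Message string
}

// splitConditions keeps the existing behavior (any runtime or network error
// makes Ready go False) and additionally reports network errors as a separate
// condition. The "NetworkPluginReady" type is a hypothetical name.
func splitConditions(runtimeErrs, networkErrs []string) []NodeCondition {
	ready := NodeCondition{
		Type:    "Ready",
		Status:  "True",
		Reason:  "KubeletReady",
		Message: "kubelet is posting ready status",
	}
	all := append(append([]string{}, runtimeErrs...), networkErrs...)
	if len(all) > 0 {
		ready = NodeCondition{
			Type:    "Ready",
			Status:  "False",
			Reason:  "KubeletNotReady",
			Message: strings.Join(all, ","),
		}
	}

	network := NodeCondition{Type: "NetworkPluginReady", Status: "True", Reason: "NetworkPluginReady"}
	if len(networkErrs) > 0 {
		network = NodeCondition{
			Type:    "NetworkPluginReady",
			Status:  "False",
			Reason:  "NetworkPluginNotReady",
			Message: strings.Join(networkErrs, ","),
		}
	}
	return []NodeCondition{ready, network}
}

func main() {
	for _, c := range splitConditions(nil, []string{"cni config uninitialized"}) {
		fmt.Printf("%s=%s (%s)\n", c.Type, c.Status, c.Reason)
	}
}
```

A consumer such as the scheduler could then tell "not ready only because of the network plugin" apart from other kubelet failures without parsing the Ready message string.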

PTAL @kubernetes/sig-node-bugs. Is there any specific reason the CRI condition reporting was designed as-is?

@yujuhong
Contributor

PTAL @kubernetes/sig-node-bugs. Is there any specific reason the CRI condition reporting was designed as-is?

Reporting unready network plugins as part of the node ready condition predated CRI. In fact, the logic was there at least on or before kubernetes 1.2, before taints/toleration was introduced.

I do think we should clarify/re-define the conditions and interactions with the control plane.

@timothysc
Member

I think this is bad, but I don't think it's release blocker bad.

I do agree it's feature blocker bad for both "self-hosted" and daemonsets->scheduler.

Does anyone have cycles to investigate?

@luxas
Member Author
luxas commented May 30, 2017

I think this is bad, but I don't think it's release blocker bad.

@timothysc Totally agree. Won't affect the release since the behavior has existed a couple of releases already.

I do agree it's feature blocker bad for both "self-hosted" and daemonsets->scheduler.
Does anyone have cycles to investigate?

I might have some time, let's see how things turn out

I do think we should clarify/re-define the conditions and interactions with the control plane.

@yujuhong Where should we have this conversation?

@gmarek
Contributor
gmarek commented May 30, 2017

@luxas @yujuhong - there's work started in #42406 that I didn't have time to properly drive. The first step of cleaning up all this mess was gathering the data I shared in a trix. The next step is to decide what semantics we want to have (and which behavior we want to break), which I planned to do after the freeze (and after I dig myself out of the stuff that accumulates during it).

@yujuhong
Contributor

@gmarek, thanks and looking forward to seeing the progress on this!

@yujuhong Where should we have this conversation?

@luxas, it's unlikely that we'll have time to tackle this before the 1.7 release settles down. I think @gmarek's plan sounds great.

@gmarek
Contributor
gmarek commented May 31, 2017

cc @davidopp

@kfox1111

Any update on this? I just ran into this issue again.

@k82cn
Member
k82cn commented Dec 15, 2017

There's an alpha feature in 1.9 (TaintNodesByCondition), please try it :).

@luxas
Member Author
luxas commented Dec 22, 2017

I think we can close this as it's implemented in alpha and the feature tracking state is here: kubernetes/enhancements#382

@luxas luxas closed this as completed Dec 22, 2017
@deitch
Contributor
deitch commented Mar 1, 2018

What was the final result of this? I could not find documentation that said, "here is how you schedule a Pod on a node that has network not yet ready."

I think the net result is that you:

  1. Enable tainting the node by condition
  2. Add a toleration for the specific taint (in this case, network not being ready, if you are using hostNetwork)

But I am unsure. Is that correct? How do I enable them and what is the correct toleration? Is there an example?

@swade1987

@deitch I am also interested, as I would like to deploy Calico via a Helm chart, which would require this functionality.

@resouer
Contributor
resouer commented Apr 2, 2018

@swade1987 It's already been implemented, the taint key is: node.kubernetes.io/network-unavailable

@kfox1111
kfox1111 commented Apr 2, 2018

Last I looked, running tiller as a statefulset/deployment with net=host and the toleration still didn't let it schedule. maybe that has changed though?

@jsafrane
Member

For the record, this has worked for me with today's master (pre-1.11):

  • Enabled feature gate TaintNodesByCondition=true everywhere
  • Run a pod with toleration:
    apiVersion: v1
    kind: Pod
    metadata:
      name: testpod
    spec:
      tolerations:
        - key: node.kubernetes.io/network-unavailable
      ...
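For readers wondering why a toleration with only a `key` is enough here, the following is a simplified sketch of toleration matching. It mirrors, to the best of my understanding, the upstream rules (an empty operator behaves like `Equal`, an empty effect matches any effect), but it uses stand-in types and is not the real `k8s.io/api` implementation. The empty value and `NoSchedule` effect on the example taint are assumptions about how the condition taint is applied.

```go
package main

import "fmt"

// Taint and Toleration are simplified stand-ins for the v1 API types.
type Taint struct{ Key, Value, Effect string }

type Toleration struct{ Key, Operator, Value, Effect string }

// Tolerates reports whether a single toleration matches a single taint.
func Tolerates(tol Toleration, taint Taint) bool {
	// An empty effect on the toleration matches every taint effect.
	if tol.Effect != "" && tol.Effect != taint.Effect {
		return false
	}
	// An empty key (used with Exists) matches every taint key.
	if tol.Key != "" && tol.Key != taint.Key {
		return false
	}
	switch tol.Operator {
	case "", "Equal": // an unset operator is treated as Equal
		return tol.Value == taint.Value
	case "Exists":
		return true
	default:
		return false
	}
}

// ToleratesAll reports whether every taint is matched by some toleration.
// Unlike the real scheduler, it does not filter taints by effect first.
func ToleratesAll(tols []Toleration, taints []Taint) bool {
	for _, taint := range taints {
		matched := false
		for _, tol := range tols {
			if Tolerates(tol, taint) {
				matched = true
				break
			}
		}
		if !matched {
			return false
		}
	}
	return true
}

func main() {
	// Assumed shape of the condition taint: no value, NoSchedule effect.
	taint := Taint{Key: "node.kubernetes.io/network-unavailable", Effect: "NoSchedule"}
	tol := Toleration{Key: "node.kubernetes.io/network-unavailable"}
	fmt.Println(Tolerates(tol, taint))
}
```

Under these rules the key-only toleration matches because both the toleration and the taint carry an empty value; if the taint carried a value, `operator: Exists` would be needed instead.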
    

@bgrant0607
Member

Documentation:
https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/#taint-based-evictions

Could definitely be more clear. Not possible to find without knowing the name of the taint.
