1. What kops version are you running? The command kops version will display
this information.
1.27.3
2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.
1.26.5
3. What cloud provider are you using?
AWS
4. What commands did you run? What is the simplest way to reproduce this issue?
Create an instance group with minSize: 2, maxSize: 2 and maxUnavailable: 1.
Create a deployment with two replicas and pod anti-affinity forcing them onto different nodes in the instance group.
Change something like rootVolumeSize and run rolling-update.
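For reference, the repro setup looks roughly like this (a sketch, not my exact manifests: names, image, and the rootVolumeSize value are placeholders):

```yaml
# InstanceGroup (kops edit ig test): two fixed nodes, one allowed unavailable
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: test
spec:
  role: Node
  minSize: 2
  maxSize: 2
  rollingUpdate:
    maxUnavailable: 1
  rootVolumeSize: 64   # change this value to make the group NeedsUpdate
---
# Deployment: two replicas pinned to different nodes via pod anti-affinity
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: test
            topologyKey: kubernetes.io/hostname
      containers:
      - name: test
        image: nginx
```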
5. What happened after the commands executed?
Kops detaches 1 node (node A) from ASG
Kops waits for a new ASG node (C) to join the cluster and become healthy
Kops evicts pods running on nodes A and B at the same time
Pod A starts running on node C
Pod B stays in a pending state until a new node (D) joins the cluster
Logs legend:
i-0c3b3448a4ae12e1c is node A
i-00a2a7024e57bb5f8 is node B
i-05ea3b6651f6467c0 is node C
default/test-57c5db579d-cz5c9 is the pod running on node A
default/test-57c5db579d-sbx7g is the pod running on node B
(base) ➜ ~ kops-1.27.3 rolling-update cluster --name k8s-test-02.my-tld --yes
Detected single-control-plane cluster; won't detach before draining
NAME STATUS NEEDUPDATE READY MIN TARGET MAX NODES
master-us-east-1a-1 Ready 0 1 1 1 1 1
test NeedsUpdate 2 0 2 2 2 2
I0322 11:13:39.217785 27816 instancegroups.go:501] Validating the cluster.
I0322 11:13:42.292254 27816 instancegroups.go:537] Cluster validated.
I0322 11:13:42.292469 27816 instancegroups.go:342] Tainting 2 nodes in "test" instancegroup.
I0322 11:13:42.601527 27816 instancegroups.go:602] Detaching instance "i-0c3b3448a4ae12e1c", node "i-0c3b3448a4ae12e1c", in group "test.k8s-test-02.my-tld".
I0322 11:13:43.394529 27816 instancegroups.go:203] waiting for 15s after detaching instance
I0322 11:13:58.396383 27816 instancegroups.go:501] Validating the cluster.
...
I0322 11:16:54.258561 27816 instancegroups.go:560] Cluster did not pass validation, will retry in "30s": machine "i-05ea3b6651f6467c0" has not yet joined cluster, system-node-critical pod "calico-node-qrd8s" is pending, system-node-critical pod "ebs-csi-node-bnpw8" is pending, system-node-critical pod "kube-proxy-i-05ea3b6651f6467c0" is pending, system-node-critical pod "node-problem-detector-dx8qx" is pending.
I0322 11:17:30.338670 27816 instancegroups.go:540] Cluster validated; revalidating in 10s to make sure it does not flap.
I0322 11:17:44.110497 27816 instancegroups.go:537] Cluster validated.
I0322 11:17:44.111576 27816 instancegroups.go:431] Draining the node: "i-00a2a7024e57bb5f8".
I0322 11:17:44.111584 27816 instancegroups.go:431] Draining the node: "i-0c3b3448a4ae12e1c".
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-s2xm4, kube-system/ebs-csi-node-nrhnc, kube-system/node-problem-detector-44ntz
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-hqrgz, kube-system/ebs-csi-node-wkhph, kube-system/node-problem-detector-fhrvm
evicting pod kube-system/calico-typha-7b67f47cf4-x4vhf
evicting pod kube-system/cluster-autoscaler-677b59697d-x78vr
evicting pod kube-system/metrics-server-7f46fdc79c-gph5v
evicting pod kube-system/pod-identity-webhook-8b88fdcd9-mwfcx
evicting pod default/test-57c5db579d-sbx7g
evicting pod kube-system/cluster-autoscaler-677b59697d-c7b5h
evicting pod kube-system/calico-typha-7b67f47cf4-cp44r
evicting pod default/test-57c5db579d-cz5c9
I0322 11:18:17.220535 27816 instancegroups.go:708] Waiting for 5s for pods to stabilize after draining.
I0322 11:18:17.300659 27816 instancegroups.go:708] Waiting for 5s for pods to stabilize after draining.
I0322 11:18:22.225207 27816 instancegroups.go:625] Stopping instance "i-00a2a7024e57bb5f8", node "i-00a2a7024e57bb5f8", in group "test.k8s-test-02.sre-dev.habitat.zone" (this may take a while).
I0322 11:18:22.301825 27816 instancegroups.go:625] Stopping instance "i-0c3b3448a4ae12e1c", node "i-0c3b3448a4ae12e1c", in group "test.k8s-test-02.sre-dev.habitat.zone" (this may take a while).
I0322 11:18:23.127212 27816 instancegroups.go:467] waiting for 15s after terminating instance
I0322 11:18:37.720373 27816 instancegroups.go:501] Validating the cluster.
I0322 11:18:41.649081 27816 instancegroups.go:560] Cluster did not pass validation, will retry in "30s": InstanceGroup "test" did not have enough nodes 1 vs 2, system-node-critical pod "calico-node-d8td6" is pending, system-node-critical pod "ebs-csi-node-22m6c" is pending, system-node-critical pod "kube-proxy-i-00a2a7024e57bb5f8" is not ready (kube-proxy), system-node-critical pod "kube-proxy-i-0c3b3448a4ae12e1c" is not ready (kube-proxy), system-node-critical pod "node-problem-detector-stkz8" is pending.
6. What did you expect to happen?
Kops would only evict pods from the node that was detached (A), not from A and B at the same time.
7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.
I have no rollingUpdate configuration in my cluster.yaml
8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.
I will upload later if needed.
9. Anything else we need to know?
If I set maxUnavailable: 0, only pods from detached nodes are evicted. Is this the expected behavior?
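For clarity, the maxUnavailable: 0 workaround above is set on the instance group's rollingUpdate block (a sketch; field names per the kops InstanceGroup API):

```yaml
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: test
spec:
  rollingUpdate:
    maxUnavailable: 0   # with this set, only the detached node's pods were evicted
```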
/kind bug