
vm restart when kubelet restart #11662

Open
wavezhang opened this issue Apr 8, 2024 · 5 comments

wavezhang commented Apr 8, 2024

What happened:
VM restarts when kubelet restarts.

What you expected to happen:
VMs should not be affected when kubelet restarts.

How to reproduce it (as minimally and precisely as possible):
Restart kubelet until you see the VM pod stop.

Additional context:

Kubelet log:

predicate.go:129] "Predicate failed on Pod" pod="default/virt-launcher-vm-ftiq8-sk8jk" err="Predicate NodeAffinity failed"

func (h *HeartBeat) heartBeat(heartBeatInterval time.Duration, stopCh chan struct{}) {
	// ensure that the node is synchronized with the actual state
	// especially setting the node to unschedulable if device plugins are not yet ready is very important
	// otherwise workloads get scheduled but are immediately terminated by the kubelet
	h.do()
	// Now wait for 10 seconds for the device plugins to be initialized
	// This is more than fast enough to be not treated as unschedulable by the cluster
	// and ensures that the cluster gets marked as scheduled as soon as the device plugin is ready
	h.waitForDevicePlugins(stopCh)

	// from now on periodically update the node status
	wait.JitterUntil(h.do, heartBeatInterval, 1.2, true, stopCh)
}

It seems that h.waitForDevicePlugins(stopCh) should be moved into h.do()?
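
A rough sketch of that suggestion (my assumption, not a tested patch; the stopCh field, updateNodeStatus, and the stub bodies below are invented for illustration):

package main

import (
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

type HeartBeat struct {
	stopCh chan struct{} // hypothetical: stored so do() can bound its own wait
}

func (h *HeartBeat) waitForDevicePlugins(stopCh chan struct{}) { /* as in virt-handler */ }
func (h *HeartBeat) updateNodeStatus()                         { /* hypothetical: the node status update */ }

// do waits for device-plugin readiness before every status update, not
// only once at startup, so a heartbeat that fires while kubelet is
// restarting blocks until the plugins re-register instead of marking
// the node unschedulable.
func (h *HeartBeat) do() {
	h.waitForDevicePlugins(h.stopCh)
	h.updateNodeStatus()
}

func (h *HeartBeat) heartBeat(heartBeatInterval time.Duration, stopCh chan struct{}) {
	h.stopCh = stopCh
	wait.JitterUntil(h.do, heartBeatInterval, 1.2, true, stopCh)
}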

Environment:

  • KubeVirt version (use virtctl version): N/A
  • Kubernetes version (use kubectl version): N/A
  • VM or VMI specifications: N/A
  • Cloud provider or hardware configuration: N/A
  • OS (e.g. from /etc/os-release): N/A
  • Kernel (e.g. uname -a): N/A
  • Install tools: N/A
  • Others: N/A
akalenyu (Contributor) commented Apr 8, 2024

We had a similar issue a while ago, and it was fixed and backported (kubernetes/kubernetes#118635)
Could you check if the k8s version you're using is impacted?

wavezhang (Author) replied:

> We had a similar issue a while ago, and it was fixed and backported (kubernetes/kubernetes#118635) Could you check if the k8s version you're using is impacted?

It's not the same problem; see the kubelet logs.

victortoso (Member) commented Apr 8, 2024

@wavezhang what KubeVirt and k8s versions are you running?

fabiand (Member) commented Apr 8, 2024

And what container runtime are you using?

But IIRC, as part of the discussion around kubernetes/kubernetes#118635, we today assume that kubelet restarts lead to container restarts.

  1. In controlled cases, we only see node maintenance causing kubelet restarts; in that case we assume the node has been drained (i.e., no VMs are running on it).
  2. In uncontrolled cases, i.e., an error, we do expect VMs to be killed.

Homura222 commented Apr 24, 2024

The virt-handler device_controller watches the kubelet.sock file; when kubelet restarts, dpi.initialized is set to false. If virt-handler runs a heartbeat at that moment, the kubevirt.io/schedulable label of the node hosting that virt-handler is set to "false". After kubelet restarts, it kills pods that do not match the node's labels (the virt-launcher pod has nodeSelector kubevirt.io/schedulable: "true"), which results in every VM on the node being restarted.
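
To make that failure chain concrete, here is a minimal client-go sketch of the label flip (setSchedulable is a hypothetical helper for illustration, not KubeVirt's actual heartbeat code):

package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

// setSchedulable patches only the kubevirt.io/schedulable label on the
// node. Every virt-launcher pod carries
// nodeSelector kubevirt.io/schedulable: "true", so flipping the label
// to "false" makes those pods fail the NodeAffinity admission check
// when kubelet re-admits pods after a restart.
func setSchedulable(client kubernetes.Interface, nodeName string, schedulable bool) error {
	patch := []byte(fmt.Sprintf(`{"metadata":{"labels":{"kubevirt.io/schedulable":"%t"}}}`, schedulable))
	_, err := client.CoreV1().Nodes().Patch(context.TODO(), nodeName,
		types.StrategicMergePatchType, patch, metav1.PatchOptions{})
	return err
}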

Related issues:
kubernetes/kubernetes#123980
kubernetes/kubernetes#124586

How to reproduce it (as minimally and precisely as possible):
Restart kubelet every 1s until the kubevirt.io/schedulable label on the node becomes "false". Or: delete the virt-handler pod, set the kubevirt.io/schedulable label on the node to "false", then restart kubelet.

I believe the solution to this problem is one of:
Fix kubelet killing pods that do not match the node labels when it restarts (kubernetes/kubernetes#124367), or improve the virt-handler heartbeat (a rough sketch of the latter follows below).
cc @victortoso @fabiand @akalenyu @rmohr
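
One possible shape for the heartbeat improvement (purely illustrative; pluginsReady, unhealthySince, and gracePeriod are hypothetical names, not existing KubeVirt fields): report the node unschedulable only once the device plugins have been unhealthy for longer than a grace period, so a brief kubelet restart never flips the label while a persistent outage still does.

package main

import "time"

// Hypothetical fields; KubeVirt's real HeartBeat struct differs.
type HeartBeat struct {
	pluginsReady   bool
	unhealthySince time.Time
	gracePeriod    time.Duration
}

// schedulable debounces the unschedulable transition.
func (h *HeartBeat) schedulable(now time.Time) bool {
	if h.pluginsReady {
		// plugins healthy again: clear any pending transition
		h.unhealthySince = time.Time{}
		return true
	}
	if h.unhealthySince.IsZero() {
		// first observation of the outage
		h.unhealthySince = now
	}
	// stay schedulable until the outage persists past the grace period
	return now.Sub(h.unhealthySince) < h.gracePeriod
}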
