
e2e flake: OCI runtime create failed: runc create failed: unable to start container process: unable to apply cgroup configuration: failed to write #109182

Closed
liggitt opened this issue Mar 31, 2022 · 52 comments · Fixed by kubernetes-sigs/kind#2709
Labels
kind/flake Categorizes issue or PR as related to a flaky test. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/node Categorizes an issue or PR as relevant to SIG Node. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@liggitt
Member

liggitt commented Mar 31, 2022

Looks like we just got a spike of a new run failure message in master: https://storage.googleapis.com/k8s-triage/index.html?pr=1&text=unable%20to%20apply%20cgroup%20configuration&xjob=1-2

Seen in https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/109178/pull-kubernetes-conformance-kind-ga-only-parallel/1509397620936675328

s: "pod \"oidc-discovery-validator\" failed with status: {Phase:Failed Conditions:[{Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2022-03-31 05:35:20 +0000 UTC Reason: Message:} {Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2022-03-31 05:35:20 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [oidc-discovery-validator]} {Type:ContainersReady Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2022-03-31 05:35:20 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [oidc-discovery-validator]} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2022-03-31 05:35:20 +0000 UTC Reason: Message:}] Message: Reason: NominatedNodeName: HostIP:172.18.0.2 HostIPs:[{IP:172.18.0.2}] PodIP:10.244.1.130 PodIPs:[{IP:10.244.1.130}] StartTime:2022-03-31 05:35:20 +0000 UTC InitContainerStatuses:[] ContainerStatuses:[{Name:oidc-discovery-validator State:{Waiting:nil Running:nil Terminated:&ContainerStateTerminated{ExitCode:128,Signal:0,Reason:StartError,Message:failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: unable to apply cgroup configuration: failed to write 36721: write /sys/fs/cgroup/rdma/kubelet/kubepods/besteffort/pod4c5127ae-797f-4b89-9aa9-7f66226768cd/61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8/cgroup.procs: no such device: unknown,StartedAt:1970-01-01 00:00:00 +0000 UTC,FinishedAt:2022-03-31 05:35:21 +0000 UTC,ContainerID:containerd://61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8,}} LastTerminationState:{Waiting:nil Running:nil Terminated:nil} Ready:false RestartCount:0 Image:k8s.gcr.io/e2e-test-images/agnhost:2.36 ImageID:k8s.gcr.io/e2e-test-images/agnhost@sha256:f5241226198f5a54d22540acf2b3933ea0f49458f90c51fc75833d0c428687b8 ContainerID:containerd://61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8 Started:0xc000d223ea}] QOSClass:BestEffort EphemeralContainerStatuses:[]}", } pod "oidc-discovery-validator" failed with status: {Phase:Failed Conditions:[{Type:Initialized Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2022-03-31 05:35:20 +0000 UTC Reason: Message:} {Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2022-03-31 05:35:20 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [oidc-discovery-validator]} {Type:ContainersReady Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2022-03-31 05:35:20 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [oidc-discovery-validator]} {Type:PodScheduled Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2022-03-31 05:35:20 +0000 UTC Reason: Message:}] Message: Reason: NominatedNodeName: HostIP:172.18.0.2 HostIPs:[{IP:172.18.0.2}] PodIP:10.244.1.130 PodIPs:[{IP:10.244.1.130}] StartTime:2022-03-31 05:35:20 +0000 UTC InitContainerStatuses:[] ContainerStatuses:[{Name:oidc-discovery-validator State:{Waiting:nil Running:nil Terminated:&ContainerStateTerminated{ExitCode:128,Signal:0,Reason:StartError,Message:failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: unable to apply cgroup configuration: failed to write 36721: write 
/sys/fs/cgroup/rdma/kubelet/kubepods/besteffort/pod4c5127ae-797f-4b89-9aa9-7f66226768cd/61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8/cgroup.procs: no such device: unknown,StartedAt:1970-01-01 00:00:00 +0000 UTC,FinishedAt:2022-03-31 05:35:21 +0000 UTC,ContainerID:containerd://61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8,}} LastTerminationState:{Waiting:nil Running:nil Terminated:nil} Ready:false RestartCount:0 Image:k8s.gcr.io/e2e-test-images/agnhost:2.36 ImageID:k8s.gcr.io/e2e-test-images/agnhost@sha256:f5241226198f5a54d22540acf2b3933ea0f49458f90c51fc75833d0c428687b8 ContainerID:containerd://61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8 Started:0xc000d223ea}] QOSClass:BestEffort EphemeralContainerStatuses:[]}

/milestone v1.24
/sig node

@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Mar 31, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.24 milestone Mar 31, 2022
@k8s-ci-robot
Contributor

@liggitt: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Mar 31, 2022
@liggitt liggitt added kind/flake Categorizes issue or PR as related to a flaky test. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 31, 2022
@pacoxu
Member

pacoxu commented Apr 1, 2022

kubelet log:

Mar 31 05:35:21 kind-worker kubelet[260]: E0331 05:35:21.552197 260 remote_runtime.go:453] "StartContainer from runtime service failed" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: unable to apply cgroup configuration: failed to write 36721: write /sys/fs/cgroup/rdma/kubelet/kubepods/besteffort/pod4c5127ae-797f-4b89-9aa9-7f66226768cd/61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8/cgroup.procs: no such device: unknown" containerID="61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8"

containerd log:

Mar 31 05:35:21 kind-worker containerd[178]: time="2022-03-31T05:35:21.447710609Z" level=info msg="CreateContainer within sandbox "951d4117fa12f522b3eac01fdc3575df7a7d4d4cb1d467913ac0c9d5529ce909" for &ContainerMetadata{Name:oidc-discovery-validator,Attempt:0,} returns container id "61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8""
Mar 31 05:35:21 kind-worker containerd[178]: time="2022-03-31T05:35:21.448420497Z" level=info msg="StartContainer for "61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8""
Mar 31 05:35:21 kind-worker containerd[178]: time="2022-03-31T05:35:21.525049577Z" level=info msg="shim disconnected" id=61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8
Mar 31 05:35:21 kind-worker containerd[178]: time="2022-03-31T05:35:21.525138474Z" level=warning msg="cleaning up after shim disconnected" id=61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8 namespace=k8s.io
Mar 31 05:35:21 kind-worker containerd[178]: time="2022-03-31T05:35:21.525149912Z" level=info msg="cleaning up dead shim"
Mar 31 05:35:21 kind-worker containerd[178]: time="2022-03-31T05:35:21.537393536Z" level=warning msg="cleanup warnings time="2022-03-31T05:35:21Z" level=info msg="starting signal loop" namespace=k8s.io pid=36723 runtime=io.containerd.runc.v2\ntime="2022-03-31T05:35:21Z" level=warning msg="failed to read init pid file" error="open /run/containerd/io.containerd.runtime.v2.task/k8s.io/61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8/init.pid: no such file or directory" runtime=io.containerd.runc.v2\n"
Mar 31 05:35:21 kind-worker containerd[178]: time="2022-03-31T05:35:21.537681531Z" level=error msg="copy shim log" error="read /proc/self/fd/268: file already closed"
Mar 31 05:35:21 kind-worker containerd[178]: time="2022-03-31T05:35:21.537997837Z" level=error msg="Failed to pipe stdout of container "61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8"" error="reading from a closed fifo"
Mar 31 05:35:21 kind-worker containerd[178]: time="2022-03-31T05:35:21.538005119Z" level=error msg="Failed to pipe stderr of container "61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8"" error="reading from a closed fifo"
Mar 31 05:35:21 kind-worker containerd[178]: time="2022-03-31T05:35:21.550346256Z" level=error msg="StartContainer for "61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8" failed" error="failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: unable to apply cgroup configuration: failed to write 36721: write /sys/fs/cgroup/rdma/kubelet/kubepods/besteffort/pod4c5127ae-797f-4b89-9aa9-7f66226768cd/61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8/cgroup.procs: no such device: unknown"

I'm inclined to think this is a containerd-related issue: failed to read init pid file.

@dims
Member

dims commented Apr 4, 2022

@ehashman
Member

ehashman commented Apr 4, 2022

/triage accepted
/priority critical-urgent

/cc @kolyshkin @rphillips

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels Apr 4, 2022
@mrunalp
Contributor

mrunalp commented Apr 4, 2022

Was the runc (or containerd) binary updated on these jobs?

@ehashman ehashman added this to Triage in SIG Node CI/Test Board Apr 4, 2022
@kolyshkin
Contributor

write /sys/fs/cgroup/rdma/kubelet/kubepods/besteffort/pod4c5127ae-797f-4b89-9aa9-7f66226768cd/61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8/cgroup.procs: no such device: unknown" containerID="61b3e1f7568f23bb1503c2309e9e254c1ac0103d0de059958f9555ff6548b5c8"

Support for the RDMA controller was indeed added in runc 1.1 (in opencontainers/runc#2883).

The error message says that the write failed. That write happens right after the mkdir, which (apparently) succeeded.
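
For readers less familiar with the cgroup v1 join sequence, here is a minimal sketch of that mkdir-then-write pattern (a simplified illustration using plain filesystem calls; the joinCgroup helper and the example path are hypothetical, not runc's actual code):

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// joinCgroup is a simplified stand-in for the two-step join discussed above:
// create the per-container cgroup directory, then write the pid into its
// cgroup.procs file. In this flake, the mkdir succeeded and the write failed
// with "no such device".
func joinCgroup(path string, pid int) error {
	if err := os.MkdirAll(path, 0o755); err != nil {
		return fmt.Errorf("mkdir %s: %w", path, err)
	}
	procs := filepath.Join(path, "cgroup.procs")
	if err := os.WriteFile(procs, []byte(strconv.Itoa(pid)), 0o644); err != nil {
		return fmt.Errorf("write %s: %w", procs, err)
	}
	return nil
}

func main() {
	// Hypothetical path shaped like the one in the error message above.
	path := "/sys/fs/cgroup/rdma/kubelet/kubepods/besteffort/pod-example/ctr-example"
	if err := joinCgroup(path, os.Getpid()); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```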

@kolyshkin
Contributor

The kubelet log also shows (this is the earliest mention of rdma):

Mar 31 05:30:14 kind-control-plane kubelet[723]: time="2022-03-31T05:30:14Z" level=warning msg="Failed to remove cgroup (will retry)" error="rmdir /sys/fs/cgroup/rdma/kubelet/kubepods/pod93035424-72ce-42df-be98-d0a9a47ec7c3/7cd75055222b05261b435a4face3561cd52ba91f54fa9ce6b194962767e5cddd: device or resource busy"
Mar 31 05:30:14 kind-control-plane kubelet[723]: time="2022-03-31T05:30:14Z" level=warning msg="Failed to remove cgroup (will retry)" error="rmdir /sys/fs/cgroup/unified/kubelet/kubepods/besteffort/pod2ca9d41c-9ba9-4151-bac4-ef5d66467871/1aa8f37127b89423f3927beddbde27c0af1a1d45b1c3dc2a80b4079621412c35: device or resource busy"
Mar 31 05:30:14 kind-control-plane kubelet[723]: time="2022-03-31T05:30:14Z" level=warning msg="Failed to remove cgroup (will retry)" error="rmdir /sys/fs/cgroup/unified/kubelet/kubepods/pod93035424-72ce-42df-be98-d0a9a47ec7c3/7cd75055222b05261b435a4face3561cd52ba91f54fa9ce6b194962767e5cddd: device or resource busy"
Mar 31 05:30:14 kind-control-plane kubelet[723]: time="2022-03-31T05:30:14Z" level=warning msg="Failed to remove cgroup (will retry)" error="rmdir /sys/fs/cgroup/rdma/kubelet/kubepods/besteffort/pod2ca9d41c-9ba9-4151-bac4-ef5d66467871/1aa8f37127b89423f3927beddbde27c0af1a1d45b1c3dc2a80b4079621412c35: device or resource busy"
Mar 31 05:30:15 kind-control-plane kubelet[723]: time="2022-03-31T05:30:15Z" level=error msg="Failed to remove cgroup" error="rmdir /sys/fs/cgroup/rdma/kubelet/kubepods/pod93035424-72ce-42df-be98-d0a9a47ec7c3/7cd75055222b05261b435a4face3561cd52ba91f54fa9ce6b194962767e5cddd: device or resource busy"
Mar 31 05:30:15 kind-control-plane kubelet[723]: time="2022-03-31T05:30:15Z" level=error msg="Failed to remove cgroup" error="rmdir /sys/fs/cgroup/unified/kubelet/kubepods/besteffort/pod2ca9d41c-9ba9-4151-bac4-ef5d66467871/1aa8f37127b89423f3927beddbde27c0af1a1d45b1c3dc2a80b4079621412c35: device or resource busy"
Mar 31 05:30:15 kind-control-plane kubelet[723]: time="2022-03-31T05:30:15Z" level=error msg="Failed to remove cgroup" error="rmdir /sys/fs/cgroup/unified/kubelet/kubepods/pod93035424-72ce-42df-be98-d0a9a47ec7c3/7cd75055222b05261b435a4face3561cd52ba91f54fa9ce6b194962767e5cddd: device or resource busy"
Mar 31 05:30:15 kind-control-plane kubelet[723]: time="2022-03-31T05:30:15Z" level=error msg="Failed to remove cgroup" error="rmdir /sys/fs/cgroup/rdma/kubelet/kubepods/besteffort/pod2ca9d41c-9ba9-4151-bac4-ef5d66467871/1aa8f37127b89423f3927beddbde27c0af1a1d45b1c3dc2a80b4079621412c35: device or resource busy"
Mar 31 05:30:15 kind-control-plane kubelet[723]: I0331 05:30:15.180441     723 pod_container_manager_linux.go:192] "Failed to delete cgroup paths" cgroupName=[kubelet kubepods pod93035424-72ce-42df-be98-d0a9a47ec7c3] err="unable to destroy cgroup paths for cgroup [kubelet kubepods pod93035424-72ce-42df-be98-d0a9a47ec7c3] : Failed to remove paths: map[:/sys/fs/cgroup/unified/kubelet/kubepods/pod93035424-72ce-42df-be98-d0a9a47ec7c3 rdma:/sys/fs/cgroup/rdma/kubelet/kubepods/pod93035424-72ce-42df-be98-d0a9a47ec7c3]"
Mar 31 05:30:15 kind-control-plane kubelet[723]: I0331 05:30:15.180471     723 pod_container_manager_linux.go:192] "Failed to delete cgroup paths" cgroupName=[kubelet kubepods besteffort pod2ca9d41c-9ba9-4151-bac4-ef5d66467871] err="unable to destroy cgroup paths for cgroup [kubelet kubepods besteffort pod2ca9d41c-9ba9-4151-bac4-ef5d66467871] : Failed to remove paths: map[:/sys/fs/cgroup/unified/kubelet/kubepods/besteffort/pod2ca9d41c-9ba9-4151-bac4-ef5d66467871 rdma:/sys/fs/cgroup/rdma/kubelet/kubepods/besteffort/pod2ca9d41c-9ba9-4151-bac4-ef5d66467871]"

Note that support for the hybrid unified hierarchy also first appeared in runc 1.1.

@kolyshkin
Contributor

The removal failure happens because there are allegedly processes left in the rdma and unified cgroups, which prevents the cgroup removal. I can't figure out how this can ever happen (kubelet knows nothing about rdma or unified, but that should not break things).

My preliminary theory is that the inability to write a pid to the rdma cgroup is caused by too many rdma cgroups.

In any case, we should figure out why rdma and unified are not empty upon removal. Following the source code, kubelet kills all the processes in these cgroups before trying to remove them, so I am puzzled.
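
As a rough way to double-check the "no processes left" claim, one could walk the pod's cgroup directory and dump every pid listed in cgroup.procs before the rmdir. The sketch below is illustrative only (the pidsInCgroup helper and the example path are assumptions, not kubelet code):

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// pidsInCgroup walks a cgroup directory and collects every pid listed in any
// cgroup.procs file underneath it (including child cgroups).
func pidsInCgroup(root string) ([]string, error) {
	var pids []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if d.IsDir() || d.Name() != "cgroup.procs" {
			return nil
		}
		data, err := os.ReadFile(path)
		if err != nil {
			return err
		}
		pids = append(pids, strings.Fields(string(data))...)
		return nil
	})
	return pids, err
}

func main() {
	// Hypothetical pod cgroup path, shaped like the ones in the log excerpt above.
	root := "/sys/fs/cgroup/rdma/kubelet/kubepods/besteffort/pod-example"
	pids, err := pidsInCgroup(root)
	fmt.Printf("pids=%v err=%v\n", pids, err)
}
```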

@mrunalp
Contributor

mrunalp commented Apr 5, 2022

In any case, we should figure out why rdma and unified are not empty upon removal. Following the source code, kubelet kills all the processes in these cgroups before trying to remove them, so I am puzzled.

Any possibility of processes stuck in 'D' state?

@kolyshkin
Contributor

So, I added some debug in #109298 to see what is going on.

Here is an excerpt from the kubelet log:

Apr 05 01:54:38 kind-worker kubelet[261]: time="2022-04-05T01:54:38Z" level=error msg="Failed to remove cgroup" error="rmdir /sys/fs/cgroup/unified/kubelet/kubepods/burstable/pod1883213d8fec799ee2b7bf9f2185a5c7/5b078c521eefb476090c430cac51c128fabcd9094ebcaf0fa225d2b366c13c39: device or resource busy"
Apr 05 01:54:38 kind-worker kubelet[261]: time="2022-04-05T01:54:38Z" level=error msg="Failed to remove cgroup" error="rmdir /sys/fs/cgroup/rdma/kubelet/kubepods/burstable/pod1883213d8fec799ee2b7bf9f2185a5c7/5b078c521eefb476090c430cac51c128fabcd9094ebcaf0fa225d2b366c13c39: device or resource busy"
Apr 05 01:54:38 kind-worker kubelet[261]: I0405 01:54:38.438546 261 pod_container_manager_linux.go:192] "Failed to delete cgroup paths" cgroupName=[kubelet kubepods burstable poda67d4606-bf19-4290-b354-6c8f9f522e74] err="unable to destroy cgroup paths for cgroup [kubelet kubepods burstable poda67d4606-bf19-4290-b354-6c8f9f522e74] : Failed to remove paths: map[:/sys/fs/cgroup/unified/kubelet/kubepods/burstable/poda67d4606-bf19-4290-b354-6c8f9f522e74 rdma:/sys/fs/cgroup/rdma/kubelet/kubepods/burstable/poda67d4606-bf19-4290-b354-6c8f9f522e74]" pids=[]
Apr 05 01:54:38 kind-worker kubelet[261]: I0405 01:54:38.436682 261 cgroup_manager_linux.go:307] "KKK subsystem info" name="" path="/sys/fs/cgroup/unified/kubelet/kubepods/besteffort/pod15fc6eea-8cb0-4d22-8f38-79d81043176f" pids=[0 0] pErr=
Apr 05 01:54:38 kind-worker kubelet[261]: I0405 01:54:38.438656 261 pod_container_manager_linux.go:192] "Failed to delete cgroup paths" cgroupName=[kubelet kubepods burstable pod1e2d8aa5-e95f-4ccd-8a58-70c4d5559d54] err="unable to destroy cgroup paths for cgroup [kubelet kubepods burstable pod1e2d8aa5-e95f-4ccd-8a58-70c4d5559d54] : Failed to remove paths: map[:/sys/fs/cgroup/unified/kubelet/kubepods/burstable/pod1e2d8aa5-e95f-4ccd-8a58-70c4d5559d54 rdma:/sys/fs/cgroup/rdma/kubelet/kubepods/burstable/pod1e2d8aa5-e95f-4ccd-8a58-70c4d5559d54]" pids=[]
Apr 05 01:54:38 kind-worker kubelet[261]: I0405 01:54:38.438668 261 cgroup_manager_linux.go:307] "KKK subsystem info" name="rdma" path="/sys/fs/cgroup/rdma/kubelet/kubepods/burstable/pod1883213d8fec799ee2b7bf9f2185a5c7" pids=[] pErr=
Apr 05 01:54:38 kind-worker kubelet[261]: I0405 01:54:38.436848 261 cgroup_manager_linux.go:307] "KKK subsystem info" name="rdma" path="/sys/fs/cgroup/rdma/kubelet/kubepods/pod9daba603-dc32-4c18-bb17-7110922acc05" pids=[] pErr=
Apr 05 01:54:38 kind-worker kubelet[261]: I0405 01:54:38.438981 261 cgroup_manager_linux.go:307] "KKK subsystem info" name="" path="/sys/fs/cgroup/unified/kubelet/kubepods/burstable/pod1883213d8fec799ee2b7bf9f2185a5c7" pids=[0 0] pErr=
Apr 05 01:54:38 kind-worker kubelet[261]: I0405 01:54:38.439027 261 cgroup_manager_linux.go:307] "KKK subsystem info" name="" path="/sys/fs/cgroup/unified/kubelet/kubepods/pod9daba603-dc32-4c18-bb17-7110922acc05" pids=[0 0] pErr=
Apr 05 01:54:38 kind-worker kubelet[261]: I0405 01:54:38.439225 261 pod_container_manager_linux.go:192] "Failed to delete cgroup paths" cgroupName=[kubelet kubepods besteffort pod15fc6eea-8cb0-4d22-8f38-79d81043176f] err="unable to destroy cgroup paths for cgroup [kubelet kubepods besteffort pod15fc6eea-8cb0-4d22-8f38-79d81043176f] : Failed to remove paths: map[:/sys/fs/cgroup/unified/kubelet/kubepods/besteffort/pod15fc6eea-8cb0-4d22-8f38-79d81043176f rdma:/sys/fs/cgroup/rdma/kubelet/kubepods/besteffort/pod15fc6eea-8cb0-4d22-8f38-79d81043176f]" pids=[]

All this means that

  • there are no processes in the RDMA cgroup, yet it cannot be removed;
  • for some reason runc/libcontainer/cgroups.GetAllPids() returns a list with two 0s in it for the unified controller. This is probably not related to this issue, but I am looking into it;
  • despite there being no processes in the cgroups, they cannot be removed.

Adding more debug to #109298...

My next two suspects are KIND and the kernel. As for KIND, I looked at the sources of the script that prepares cgroups, and found nothing bad.

@kolyshkin
Contributor

Any possibility of processes stuck in 'D' state?

@mrunalp Looks like that's not it; cgroup.procs shows no entries (nor do any of the subdirectories) -- see the previous comment.

@helayoty
Member

helayoty commented Apr 5, 2022

@liggitt 👋 from the release 1.24 bug triage shadow. With the test freeze cutoff tomorrow, do you think this issue will still be included in the current release?

@liggitt
Member Author

liggitt commented Apr 5, 2022

Until the issue is understood, it should remain in the milestone

@ehashman
Member

ehashman commented Apr 5, 2022

/assign @mrunalp

@derekwaynecarr
Member

@mrunalp let's catch up on why rdma is even an available controller on this host.

rdma isn't in this allowed list:

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/cm/cgroup_manager_linux.go#L260

@kolyshkin
Contributor

@mrunalp let's catch up on why rdma is even an available controller on this host.

rdma isn't in this allowed list:

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/cm/cgroup_manager_linux.go#L260

@derekwaynecarr This "allowed list" is merely a way to specify controllers that must be present (I guess its naming is slightly misleading). IOW, the code you refer to ensures that the memory, cpu, etc. paths are present.

It has nothing to do with rdma or unified. runc creates cgroups for all supported controllers/subsystems (and adds containers to all of them).

What's unclear is why these cgroups can't be removed during destroy. I am still looking at it in #109298 (feeling under the weather today, so it's taking longer).
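
To make the distinction concrete, here is a hedged sketch (the required-controller list and the directory scan are illustrative assumptions, not the kubelet's or runc's actual logic): the required list only verifies that certain hierarchies exist, while a runtime may still join every controller that happens to be mounted, rdma included.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// required is an illustrative stand-in for the controllers the check above
// insists on; the real list lives in cgroup_manager_linux.go.
var required = []string{"cpu", "cpuacct", "cpuset", "memory"}

func main() {
	// Everything mounted under /sys/fs/cgroup on a cgroup v1 host.
	entries, err := os.ReadDir("/sys/fs/cgroup")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	mounted := map[string]bool{}
	for _, e := range entries {
		if e.IsDir() {
			mounted[e.Name()] = true
		}
	}
	// The required-list check only cares that these hierarchies exist...
	for _, c := range required {
		if !mounted[c] {
			fmt.Printf("missing required controller: %s\n", c)
		}
	}
	// ...but rdma (and any other mounted controller) may still be joined by the runtime.
	if mounted["rdma"] {
		fmt.Println("rdma controller is mounted at", filepath.Join("/sys/fs/cgroup", "rdma"))
	}
}
```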

@mrunalp
Contributor

mrunalp commented Apr 5, 2022

It has nothing to do with rdma or unified. runc creates cgroups for all supported controllers/subsystems (and adds containers to all of them).

One thing worth trying may be to see whether we still hit the issue if we don't join the rdma controller.
Maybe runc can skip joining it unless rdma settings are specified? (It is a relatively new controller that hasn't been tested as much within containers, so we could avoid potential bugs there.)
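
A minimal sketch of that idea, purely as an assumption about how such a skip could look (the Resources type, field names, and controllersToJoin helper are made up for illustration and are not runc's real types):

```go
package main

import "fmt"

// Resources is a made-up stand-in for the rdma-related part of a container's
// cgroup configuration; the field shape is illustrative only.
type Resources struct {
	Rdma map[string]struct{ HcaHandles, HcaObjects uint32 }
}

// controllersToJoin returns the controllers a hypothetical runtime would join:
// everything that is mounted, except rdma when the spec sets no rdma limits.
func controllersToJoin(mounted []string, res Resources) []string {
	var out []string
	for _, c := range mounted {
		if c == "rdma" && len(res.Rdma) == 0 {
			continue
		}
		out = append(out, c)
	}
	return out
}

func main() {
	mounted := []string{"cpu", "memory", "pids", "rdma"}
	fmt.Println(controllersToJoin(mounted, Resources{})) // [cpu memory pids]
}
```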

@derekwaynecarr
Member

@kolyshkin understood. rdma as an enabled cgroup controller on a target host for kubelet execution is what was new to me, so I was wondering whether there was a change to the test operating system configuration beyond just runc adding awareness.

@kolyshkin
Contributor

The RDMA cgroup controller requires a kernel config parameter to be set. It is obviously set in Ubuntu kernels.

In Fedora 35 kernels, CONFIG_CGROUP_RDMA is not set. Here's the output from my machine:

[kir@kir-rhat run]$ grep RDMA /boot/config-5.16.18*
# CONFIG_CGROUP_RDMA is not set

On the CentOS Stream 9 kernel, it is set:

[root@cirrus-task-6704502681108480 runc]# grep RDMA /boot/config-5.14.0-71.el9.x86_64 
CONFIG_CGROUP_RDMA=y

@kolyshkin
Contributor

Not joining RDMA in the vendored libcontainer did not help. It might help in the runc binary (which I haven't tried).

Looking into the underlying cause.

SIG Node CI/Test Board automation moved this from Issues - In progress to Done Apr 13, 2022
@BenTheElder
Member

BenTheElder commented Apr 15, 2022

I'm not sure this is eliminated:

https://storage.googleapis.com/k8s-triage/index.html?pr=1&text=unable%20to%20apply%20cgroup%20configuration&xjob=1-2

=>

https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-kind-conformance-parallel-ipv6/1515013844353683456

Test started today at 10:08 AM failed after 31m15s

Message:failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: unable to apply cgroup configuration: failed to write 50586: write /sys/fs/cgroup/rdma/kubelet/kubepods/besteffort/poddb92c0c2-8c75-48f5-bb58-14f859e7797b/b67bb9a6e0d9b89e17cd8d123532268d31329876d4f0d7463196afa2a435faa1/cgroup.procs: no such device: unknown

but CI should be using kind @ HEAD and kubernetes-sigs/kind#2709 merged two days ago

@liggitt liggitt reopened this Apr 18, 2022
SIG Node CI/Test Board automation moved this from Done to Issues - In progress Apr 18, 2022
@liggitt
Member Author

liggitt commented Apr 18, 2022

reopening per #109182 (comment) to make sure this is resolved

s0j pushed a commit to jnummelin/k0s that referenced this issue Apr 20, 2022
Looking at some of the issues around k8s/runc, I came across this
issue where runc 1.1.0 didn't properly scope some cgroup objects.

kubernetes/kubernetes#109182

Signed-off-by: Shane Jarych <sjarych@mirantis.com>
jnummelin pushed a commit to jnummelin/k0s that referenced this issue Apr 21, 2022
Looking at some of the issues around k8s/runc, I came across this
issue where runc 1.1.0 didn't properly scope some cgroup objects.

kubernetes/kubernetes#109182

Signed-off-by: Shane Jarych <sjarych@mirantis.com>
jnummelin pushed a commit to jnummelin/k0s that referenced this issue Apr 26, 2022
Looking at some of the issues around k8s/runc, I came across this
issue where runc 1.1.0 didn't properly scope some cgroup objects.

kubernetes/kubernetes#109182

Signed-off-by: Shane Jarych <sjarych@mirantis.com>
SIG Node CI/Test Board automation moved this from Issues - In progress to Done May 2, 2022
@liggitt
Member Author

liggitt commented May 3, 2022

@BenTheElder BenTheElder closed this in [BenTheElder/kind@db40a9b](/BenTheElder/kind/commit/db40a9b58aefcb44abffcab638acb5e44e05f31d) 15 hours ago

should that have actually closed this issue? not seeing how the linked commit in Ben's fork modified kind bringup

@liggitt liggitt reopened this May 3, 2022
SIG Node CI/Test Board automation moved this from Done to Issues - In progress May 3, 2022
@BenTheElder
Member

Oh no, that's that GitHub "feature": I merely synced my fork to upstream, but the commit message contains "fixes". Not sure why CI didn't block this with the invalid-commit label.

@thaJeztah
Contributor

Ah, yes, those are always a pain.

@BenTheElder
Member

Potentially related: ... We should probably update the docker-in-docker in Kubernetes CI, it's going to have an outdated docker install and it's generally not well done, I've been meaning to clean that up ...

https://github.com/kubernetes/test-infra/blob/master/images/krte/Dockerfile is still based on Debian Buster ...

@aojea
Member

aojea commented May 10, 2022

The tests that execute commands on pods seem to be affected by this:

#109928 (comment)

@BenTheElder
Member

BenTheElder commented May 12, 2022

KIND's CI image is now on docker 20.10.15 / runc v1.1.1-0-g52de29d.

Tentatively, after this change we don't see any more logs about rdma cgroups; I've spot-checked a few CI logs from before and after the change with curl $kubelet_log | grep rdma. Which makes sense, since we should be on runc v1.1.1 on both dind and kind within that.

(the CI dind is still naively done and I'm not sure what the underlying hosts are running currently, need to get back to that ...)

@BenTheElder
Member

https://storage.googleapis.com/k8s-triage/index.html?pr=1&text=unable%20to%20apply%20cgroup%20configuration&xjob=1-2 is indeed empty.

https://storage.googleapis.com/k8s-triage/index.html?pr=1&text=unable%20to%20apply%20cgroup%20configuration on all jobs has a few, but those are:

ci-kubernetes-csi-1-22-test-on-kubernetes-master
ci-kubernetes-csi-1-23-test-on-kubernetes-master

periodic-cluster-api-e2e-workload-upgrade-1-21-1-22-release-1-1

@BenTheElder
Member

BenTheElder commented May 27, 2022

The csi-driver-hostpath jobs are due to the kubekins-e2e image not having the updated docker (and possibly not an updated kind).

CAPI is probably the same thing.

These remaining flakes are rare and not affecting CI for this repo.
/close

@k8s-ci-robot
Contributor

@BenTheElder: Closing this issue.

In response to this:

The csi-driver-hostpath jobs are due to the kubekins-e2e image not having the updated docker (and possibly not an updated kind).

CAPI is probably the same thing.

These remaining flakes are rare and not affecting CI for this repo.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

SIG Node CI/Test Board automation moved this from Issues - In progress to Done May 27, 2022