
Support teardown of threaded cgroups #3894

Closed
cjgibson wants to merge 1 commit

Conversation

@cjgibson commented Jun 8, 2023

This is a patch introduced by @amurzeau over in #3821. It is intended less as a merge-ready fix and more as a starting point for discussion with the runc developers about how I could go about addressing this issue.

For some context, cgroupv2 introduced a unified hierarchy and, with it, a threading mode for managing control groups on a per-thread basis instead of solely per-process. If you attempt to take advantage of that feature today, though, you inadvertently create a container that can never be torn down by runc:

failed to handle container TaskExit event: failed to stop container: unknown error after kill: /usr/bin/nvidia-container-runtime did not terminate successfully: exit status 1: read /sys/fs/cgroup/system.slice/kubepods-.../cgroup.procs: operation not supported\n: unknown

When this happens, the controlling pod cannot be cleaned up by kubelet, and the node's resources remain tied up until the host is reset to purge the wedged container and its associated cgroup(s).
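For context on how a container ends up in this state: a cgroup enters threaded mode when the string "threaded" is written to its cgroup.type file. The Go sketch below shows that step; the cgroup path is hypothetical and purely for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	// Hypothetical sub-cgroup that a workload wants to manage per-thread.
	cg := "/sys/fs/cgroup/mygroup/threaded-child"

	if err := os.MkdirAll(cg, 0o755); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	// Flip the cgroup into threaded mode. From this point on, reading
	// cgroup.procs in this cgroup fails; members are listed in
	// cgroup.threads instead.
	if err := os.WriteFile(filepath.Join(cg, "cgroup.type"), []byte("threaded"), 0o644); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("cgroup switched to threaded mode:", cg)
}
```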

This error occurs because when a cgroup is moved out of "domain" mode (read: process-based, effectively behaving as you'd expect coming from cgroupv1) and into "threaded" mode (as indicated by the content of cgroup.type), the file cgroup.procs becomes unreadable in favor of cgroup.threads. The patch included here addresses that by checking for the error raised in that case (ENOTSUP) and swapping cgroup.threads in for cgroup.procs. This works, but it likely isn't what callers further up runc's call stack expect.
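To make that fallback concrete, here is a minimal Go sketch of the idea rather than the actual patch: read cgroup.procs, and if the read fails with ENOTSUP, read cgroup.threads instead. The readProcsOrThreads helper name is made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
	"syscall"
)

// readProcsOrThreads is a made-up helper: it returns the member IDs of a
// cgroupv2 cgroup, falling back from cgroup.procs (PIDs) to cgroup.threads
// (TIDs) when the cgroup is threaded and cgroup.procs cannot be read.
func readProcsOrThreads(cgroupPath string) ([]int, error) {
	data, err := os.ReadFile(filepath.Join(cgroupPath, "cgroup.procs"))
	// ENOTSUP and EOPNOTSUPP share a value on Linux.
	if errors.Is(err, syscall.EOPNOTSUPP) {
		data, err = os.ReadFile(filepath.Join(cgroupPath, "cgroup.threads"))
	}
	if err != nil {
		return nil, err
	}
	var ids []int
	for _, field := range strings.Fields(string(data)) {
		id, err := strconv.Atoi(field)
		if err != nil {
			return nil, err
		}
		ids = append(ids, id)
	}
	return ids, nil
}

func main() {
	ids, err := readProcsOrThreads(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(ids)
}
```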

I'm interested in guidance on how I should go about addressing this change in behavior, and in any other changes the maintainers would like to see before accepting a fix for this issue upstream. Thanks in advance!

When you spawn one or more threaded cgroups in cgroupv2 mode, the file
cgroup.procs becomes unreadable, while cgroup.threads becomes readable.
Today this breaks container teardown as `runc` always attempts to read
the content of cgroup.procs during container termination. This PR switches
between cgroup.procs and cgroup.threads depending on whether the
cgroup itself is threaded.

Signed-off-by: Christian Gibson <cjgibson@users.noreply.github.com>
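As a rough illustration of the switch the commit message describes, one could decide which file to read up front by checking cgroup.type. The isThreaded helper below is a sketch, not runc's actual code:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// isThreaded is an illustrative helper: it reports whether the cgroup at
// cgroupPath is in "threaded" mode by inspecting cgroup.type. A threaded
// cgroup lists its members in cgroup.threads rather than cgroup.procs.
func isThreaded(cgroupPath string) (bool, error) {
	data, err := os.ReadFile(filepath.Join(cgroupPath, "cgroup.type"))
	if err != nil {
		return false, err
	}
	return strings.TrimSpace(string(data)) == "threaded", nil
}

func main() {
	threaded, err := isThreaded(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	file := "cgroup.procs"
	if threaded {
		file = "cgroup.threads"
	}
	fmt.Println("read member IDs from", file)
}
```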
@kolyshkin
Contributor

Well, I guess the problem you're hitting is solved in #3825 -- can you please give it a try?

@cjgibson
Author

cjgibson commented Jun 9, 2023

> Well, I guess the problem you're hitting is solved in #3825 -- can you please give it a try?

Hah, beautiful! I should have done a better job of reviewing the current PRs - will build that branch and report back 👍

@kolyshkin
Contributor

> > Well, I guess the problem you're hitting is solved in #3825 -- can you please give it a try?
>
> Hah, beautiful! I should have done a better job of reviewing the current PRs - will build that branch and report back 👍

@cjgibson any update? Note that #3825 has been merged, so you can just try to build runc from the main branch and see if it works for you.

@cjgibson
Author

> @cjgibson any update? Note that #3825 has been merged, so you can just try to build runc from the main branch and see if it works for you.

Apologies for the wait - I tested a little while ago and confirmed that PR fixes this! Looking forward to the 1.1.8 release 😁

@cjgibson cjgibson closed this Jun 23, 2023
@cjgibson cjgibson deleted the cjgibson/1.1.7-3821-hotpatch branch June 23, 2023 16:26