runc can not wait process in the container exits when share pid namespace #4145

Open
kamizjw opened this issue Dec 15, 2023 · 4 comments

Comments

@kamizjw

kamizjw commented Dec 15, 2023

Description

1. Run a container with docker run --pid=host.
2. A process in the container other than the init process is stuck in D (uninterruptible sleep) state.
3. Run docker rm -f $containerdID.

Steps to reproduce the issue

Describe the results you received and expected

I received:
1. containerd-shim and the init process were reaped.
(screenshot: shim and init reaped)
2. The container's cgroup was left behind.
(screenshot: cgroup residue)

What version of runc are you using?

[root@localhost ~]# runc --version
runc version 1.1.3
commit: 02a436f4f2efd8c5a2ec5c4ed3d196242d4edb77
spec: 1.0.2-dev
go: go1.17.3
libseccomp: 2.5.3

Host OS information

No response

Host kernel information

No response

@kamizjw
Author

kamizjw commented Dec 15, 2023

I think I figured out why the cgroup is left behind.
When a container runs with --pid=host, runc delete handles its processes in the signalAllProcesses func. Because one of the container's processes (not init) is in D state, that process does not exit. But in the signalAllProcesses func, p.wait has no effect: it does not wait for all processes to exit. In the end, the init process exits but the D-state process does not, and the containerd-shim process exits. The cgroup cleanup fails and there is no further chance to clean it up.

@kamizjw
Author

kamizjw commented Dec 15, 2023

So how should we make sure that all processes in the container have exited?

@kolyshkin
Contributor

We have recently made some changes in that area (in particular, see #4102). Also, you are using a somewhat old version of runc (1.1.3); the latest one is 1.1.10.

I suggest you try a version compiled from HEAD (this is a future 1.2.0), and let us know if it fixes your problem.

@kolyshkin
Contributor

Another thing is, a process can only wait(2) for its own children; thus, say, runc delete or runc kill cannot wait for any container processes, as they are not children of this instance of runc.
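The wait(2) restriction can be seen with a small Go sketch that uses only the standard syscall package. It attempts a non-blocking wait on PID 1, which is never a child of this program, and checks for ECHILD. canWait is a hypothetical helper name, not something from runc, and the sketch assumes a Linux host.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// canWait reports whether the calling process may wait(2) on pid.
// The kernel returns ECHILD when pid is not a child of the caller.
// Note: if pid is an already-exited child, this call reaps it.
// (Hypothetical helper for illustration only.)
func canWait(pid int) bool {
	var ws syscall.WaitStatus
	// WNOHANG makes the call return immediately instead of blocking.
	_, err := syscall.Wait4(pid, &ws, syscall.WNOHANG, nil)
	return !errors.Is(err, syscall.ECHILD)
}

func main() {
	// PID 1 is not a child of this program, so the wait is expected to be refused.
	fmt.Println("can wait on pid 1:", canWait(1))
}
```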

And, if the process can't be killed because it is stuck in D state, there's nothing runc can do (except for returning an error which I think is happening after #4102).
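To tell whether a process that survives SIGKILL is stuck in D state, one option is to read the state field from /proc/<pid>/stat. This is a minimal sketch under that assumption; parseProcState and isUninterruptible are hypothetical helper names and are not part of runc.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// parseProcState extracts the state character from the contents of
// /proc/<pid>/stat. The comm field is wrapped in parentheses and may itself
// contain spaces or ')', so the state is the first field after the last ')'.
func parseProcState(stat string) (byte, error) {
	i := strings.LastIndexByte(stat, ')')
	if i < 0 {
		return 0, errors.New("malformed stat line: no closing parenthesis")
	}
	fields := strings.Fields(stat[i+1:])
	if len(fields) == 0 || len(fields[0]) != 1 {
		return 0, errors.New("malformed stat line: missing state field")
	}
	return fields[0][0], nil
}

// isUninterruptible reports whether pid is currently in D state.
func isUninterruptible(pid int) (bool, error) {
	data, err := os.ReadFile(fmt.Sprintf("/proc/%d/stat", pid))
	if err != nil {
		return false, err
	}
	state, err := parseProcState(string(data))
	if err != nil {
		return false, err
	}
	return state == 'D', nil
}

func main() {
	// Inspect this process itself as a harmless demonstration.
	d, err := isUninterruptible(os.Getpid())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("in D state:", d)
}
```

The state is only a snapshot: a process can enter or leave D state between the read and any action taken on the result.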
