
ExecSync did not return according to the timeout set in the request #9568

Closed
Wabct opened this issue Dec 21, 2023 · 9 comments
Labels
kind/test, status/more-info-needed

Comments

Wabct commented Dec 21, 2023

Description

Create a Pod and configure a blocking exec probe. After setting the timeout, I find that the health check does not return when the timeout expires, as expected.

Steps to reproduce the issue

The kubelet runtimeRequestTimeout uses the default value of 2m.
Test with a blocking exec command.
Pod.yaml:

          readinessProbe:
            exec:
              command:
              - /bin/bash
              - -c
              - sleep 300&
            failureThreshold: 3
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 8

Describe the results you received and expected

I found that the probe request returns after 2m8s instead of 8s. The 2m8s interval comes from the following code in the kubelet, which sets the gRPC deadline to r.timeout (the 2m runtimeRequestTimeout) plus the requested timeout (8s):

// kubernetes\pkg\kubelet\cri\remote\remote_runtime.go
func (r *remoteRuntimeService) ExecSync(containerID string, cmd []string, timeout time.Duration) (stdout []byte, stderr []byte, err error) {
        ...
	if timeout != 0 {
		// Use timeout + default timeout (2 minutes) as timeout to leave some time for
		// the runtime to do cleanup.
		ctx, cancel = getContextWithTimeout(r.timeout + timeout)
	} else {
		ctx, cancel = getContextWithCancel()
	}
        ...

ExecSync blocks until the command finishes executing and exits; in this use case, ExecSync blocks for 300s.
I did some debugging and found that ExecSync blocks on <-attachDone in the execInternal function, waiting for the IO streams to be released.

// containerd/pkg/cri/server/container_execsync.go
func (c *criService) execInternal(ctx context.Context, container containerd.Container, id string, opts execOptions) (*uint32, error) {
        ...
	case <-execCtx.Done():
		// Ignore the not found error because the process may exit itself before killing.
		if err := process.Kill(ctx, syscall.SIGKILL); err != nil && !errdefs.IsNotFound(err) {
			return nil, fmt.Errorf("failed to kill exec %q: %w", execID, err)
		}
		// Wait for the process to be killed.
		exitRes := <-exitCh
		log.G(ctx).Debugf("Timeout received while waiting for exec process kill %q code %d and error %v",
			execID, exitRes.ExitCode(), exitRes.Error())
		<-attachDone
		log.G(ctx).Debugf("Stream pipe for exec process %q done", execID)
		return nil, fmt.Errorf("timeout %v exceeded: %w", opts.timeout, execCtx.Err())
}
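
For context on why <-attachDone blocks: the drain of the exec's stdout/stderr only completes once every holder of the pipe's write end has closed it, and a backgrounded child inherits that write end. A minimal standalone Go sketch (not containerd code) reproducing the effect:

package main

import (
	"fmt"
	"io"
	"os/exec"
	"time"
)

func main() {
	// The write end of the stdout pipe is inherited by bash and by the
	// backgrounded "sleep 20", so EOF only arrives once both have closed it.
	cmd := exec.Command("/bin/bash", "-c", "sleep 20&")
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		panic(err)
	}
	if err := cmd.Start(); err != nil {
		panic(err)
	}

	start := time.Now()
	io.Copy(io.Discard, stdout) // blocks ~20s even though bash itself exits almost immediately
	fmt.Printf("got EOF after %v\n", time.Since(start))
	cmd.Wait()
}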

I found the related issue #7832 and tried it, but it didn't solve the problem.
I set drain_exec_sync_io_timeout to 1ns, and drainExecSyncIO was entered as expected. However, when execution reaches execProcess.Delete, the function still blocks in p.io.Wait(), which is still waiting for the IO to be released.

//  containerd/process.go
func (p *process) Delete(ctx context.Context, opts ...ProcessDeleteOpts) (*ExitStatus, error) {
        ...
	if p.io != nil {
		p.io.Cancel()
		p.io.Wait()
		p.io.Close()
	}
	return &ExitStatus{code: r.ExitStatus, exitedAt: r.ExitedAt}, nil
}

What version of containerd are you using?

containerd github.com/containerd/containerd v1.6.14

Any other relevant information

Kubernetes v1.25.6

Show configuration if it is related to CRI plugin.

No response

Wabct added the kind/bug label Dec 21, 2023
Wabct (Author) commented Dec 21, 2023

@fuweid PTAL

fuweid (Member) commented Dec 22, 2023

Hi @Wabct, did you backport that patch, and did you check whether the exec process is a zombie? Or you can try the v1.7 release to verify it again; currently, that patch hasn't been backported to v1.6 yet.

fuweid added the status/more-info-needed and area/ci labels Dec 22, 2023
Wabct (Author) commented Dec 25, 2023

@fuweid, I backported that patch onto v1.6.14 and found it did not work. Then I tried containerd v1.7.11-8-g6d0a3b66c and got the same result. The child process of the ExecSync command did not become a zombie: after the exec process exits, its child is re-parented to the main process of the container, which is pid 1 in the pid namespace, and it keeps running until it exits on its own.

Could we return the timeout error when execCtx.Done() fires, and start a goroutine to clean up the exec process?
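
A rough standalone Go sketch of that idea (hypothetical, not containerd code; execSync below is just an illustrative helper): return to the caller as soon as the timeout fires, and leave the kill and IO drain to a background goroutine.

package main

import (
	"errors"
	"fmt"
	"io"
	"os/exec"
	"time"
)

// execSync is a hypothetical stand-in for the real ExecSync path: it runs a
// command, drains its output, and on timeout returns immediately while a
// background goroutine finishes the cleanup.
func execSync(name string, args []string, timeout time.Duration) error {
	cmd := exec.Command(name, args...)
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return err
	}
	if err := cmd.Start(); err != nil {
		return err
	}

	done := make(chan struct{})
	go func() {
		io.Copy(io.Discard, stdout) // may block while backgrounded children hold the pipe open
		cmd.Wait()
		close(done)
	}()

	select {
	case <-done:
		return nil
	case <-time.After(timeout):
		_ = cmd.Process.Kill() // best effort; the goroutine above completes the drain later
		return errors.New("timeout exceeded")
	}
}

func main() {
	start := time.Now()
	err := execSync("/bin/bash", []string{"-c", "sleep 300&"}, 5*time.Second)
	fmt.Println(err, "- returned after", time.Since(start))
}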

fuweid (Member) commented Dec 25, 2023

@Wabct Hmm, I already added a test case to cover that. Before draining IO, containerd should kill the exec process first. Even if there are children re-parented to pid 1, containerd-shim-runc-v2 will close the IO pipe after 2 seconds, and containerd will only wait for the IO drain up to a timeout (CRI has a setting for this). So which shim are you using?

The exec process doesn't have its own sub-cgroup, so it's unlikely that we can clean up all processes created by the exec process. I was thinking we could introduce a sub-cgroup for exec, but creating a sub-cgroup per exec process might cause performance issues. You know, exec probes fire constantly.

Wabct (Author) commented Dec 25, 2023

@fuweid
Here is my shim version:
containerd-shim-runc-v2:
Version: v1.7.11-8-g6d0a3b66c
Revision: 6d0a3b6
Go version: go1.19.12

I tried the following cases:
crictl exec --timeout 5 --sync 54c8c90efe3f5 /bin/bash -c "sleep 20" returns in 5s
crictl exec --timeout 5 --sync 54c8c90efe3f5 /bin/bash -c "sleep 20&" returns in 20s

I checked the pipes and found that the shim still holds the pipe of the child process (sleep 20&) after 5s.
Is this because the shim is waiting for the child process's pipe to close, so containerd also keeps waiting for the IO to be released?

fuweid (Member) commented Dec 26, 2023

@Wabct Would you please use the following command to run the test below with your image? I can't reproduce it with your steps.
You can also use crictl info to make sure that drain_exec_sync_io_timeout is set correctly.

$ # apply the following diff in your local containerd repo.
$ ENABLE_CRI_SANDBOXES=1 FOCUS="TestContainerDrainExecIOAfterExit" CONTAINERD_RUNTIME="runc.v2" BUILDTAGS="no_aufs no_btrfs" make cri-integration

=== RUN   TestContainerDrainExecIOAfterExit
    container_exec_test.go:35: Create a sandbox
    main_test.go:707: Image "docker.io/library/ubuntu:22.04" already exists, not pulling.
    container_exec_test.go:45: Create a container
    container_exec_test.go:59: Start the container
    container_exec_test.go:65: Exec in container
E1226 03:25:54.014965   93153 remote_runtime.go:412] ExecSync e672dae15c720c51dd3d5fc5efbe3db251b8934094cd4ecb71d188dfbf36cc28 '/bin/bash -c sleep 20&' from runtime service failed: rpc error: code = Unknown desc = failed to exec in container: failed to drain exec process "d5b66ebad3e5863d3c43943d9f6af7cb1bf67c23c9491f1f466385acbb5db561" io: failed to drain exec process "d5b66ebad3e5863d3c43943d9f6af7cb1bf67c23c9491f1f466385acbb5db561" io in 1ns because io is still held by other processes
--- PASS: TestContainerDrainExecIOAfterExit (3.51s)
PASS
diff --git a/integration/container_exec_test.go b/integration/container_exec_test.go
index be2635f68..8b052aa83 100644
--- a/integration/container_exec_test.go
+++ b/integration/container_exec_test.go
@@ -21,10 +21,9 @@ import (
        "testing"
        "time"

-       "github.com/containerd/containerd/integration/images"
-
        "github.com/stretchr/testify/assert"
        "github.com/stretchr/testify/require"
+       criruntime "k8s.io/cri-api/pkg/apis/runtime/v1"
 )

 func TestContainerDrainExecIOAfterExit(t *testing.T) {
@@ -37,7 +36,7 @@ func TestContainerDrainExecIOAfterExit(t *testing.T) {
        sb, sbConfig := PodSandboxConfigWithCleanup(t, "sandbox", "container-exec-drain-io-after-exit")

        var (
-               testImage     = images.Get(images.BusyBox)
+               testImage     = "docker.io/library/ubuntu:22.04"
                containerName = "test-container-exec"
        )

@@ -47,7 +46,8 @@ func TestContainerDrainExecIOAfterExit(t *testing.T) {
        cnConfig := ContainerConfig(
                containerName,
                testImage,
-               WithCommand("sh", "-c", "sleep 365d"),
+               WithCommand("bash", "-c", "sleep 365d"),
+               WithPidNamespace(criruntime.NamespaceMode_POD),
        )

        cn, err := runtimeService.CreateContainer(sb, cnConfig, sbConfig)
@@ -63,10 +63,6 @@ func TestContainerDrainExecIOAfterExit(t *testing.T) {
        }()

        t.Log("Exec in container")
-       _, _, err = runtimeService.ExecSync(cn, []string{"sh", "-c", "sleep 365d &"}, 5*time.Second)
+       _, _, err = runtimeService.ExecSync(cn, []string{"/bin/bash", "-c", "sleep 20&"}, 5*time.Second)
        require.ErrorContains(t, err, "failed to drain exec process")
-
-       t.Log("Exec in container")
-       _, _, err = runtimeService.ExecSync(cn, []string{"sh", "-c", "sleep 2s &"}, 10*time.Second)
-       require.NoError(t, err, "should drain IO in time")
 }
diff --git a/script/test/utils.sh b/script/test/utils.sh
index 8eedc040e..ce3b1681b 100755
--- a/script/test/utils.sh
+++ b/script/test/utils.sh
@@ -51,7 +51,7 @@ if [ -z "${CONTAINERD_CONFIG_FILE}" ]; then
 version=2

 [plugins."io.containerd.grpc.v1.cri"]
-  drain_exec_sync_io_timeout = "10s"
+  drain_exec_sync_io_timeout = "1ns"
 EOF

   if command -v sestatus >/dev/null 2>&1; then

Wabct (Author) commented Dec 26, 2023

I found the problem. The community shim wires the child process's IO through copyIO, so the shim and containerd hold separate pipes and the shim can close its pipe when the exec process exits.
We implemented a shim ourselves. In our implementation, the pipe obtained by containerd is connected directly to the child process, so the shim cannot control when that pipe is closed. Therefore, #7832 does not take effect in our implementation.
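
A rough standalone Go sketch of the difference (hypothetical, not actual shim code): with a relay pipe owned by the "shim" side, the pipe handed to the caller can be closed shortly after the exec process itself exits, even while a backgrounded child still holds the inner pipe open.

package main

import (
	"fmt"
	"io"
	"os"
	"os/exec"
	"time"
)

func main() {
	// Outer pipe: what the caller (containerd) reads from.
	outerR, outerW, _ := os.Pipe()
	// Inner pipe: what the exec process and its backgrounded children write to.
	innerR, innerW, _ := os.Pipe()

	cmd := exec.Command("/bin/bash", "-c", "sleep 20&")
	cmd.Stdout = innerW
	if err := cmd.Start(); err != nil {
		panic(err)
	}
	innerW.Close() // drop the parent's copy; the child processes keep theirs

	// "Shim" side: relay inner -> outer, but close the outer pipe shortly after
	// the exec process itself exits, even if children still hold the inner pipe.
	go func() {
		relayDone := make(chan struct{})
		go func() {
			io.Copy(outerW, innerR)
			close(relayDone)
		}()
		cmd.Wait() // bash exits almost immediately; "sleep 20" keeps running
		select {
		case <-relayDone:
		case <-time.After(2 * time.Second): // small grace period to drain buffered output
		}
		outerW.Close()
	}()

	// "Caller" side: unblocked after ~2s instead of ~20s, because it reads the
	// relayed pipe rather than one connected directly to the child processes.
	start := time.Now()
	io.Copy(io.Discard, outerR)
	fmt.Printf("caller got EOF after %v\n", time.Since(start))
}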

Thanks for your reply. @fuweid

tallclair (Contributor) commented

I don't quite understand the resolution in #9568 (comment), but we just got a duplicate report of this in Kubernetes: kubernetes/kubernetes#123931

Can you clarify what the solution is? Has this been fixed, and there's just a patch we need to pick up?

rtheis commented Apr 19, 2024

#10094
