Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release/1.5] shimv2: handle sigint/sigterm #6509

Merged
merged 1 commit into from Feb 9, 2022

Conversation

vteratipally
Copy link
Contributor

@vteratipally vteratipally commented Feb 3, 2022

backport of #5828 (to fix #5502 for 1.5)

This causes sigint/sigterm to trigger a shutdown of the shim.
It is needed because otherwise the v2 shim hangs system shutdown.

Signed-off-by: Brian Goff cpuguy83@gmail.com
(cherry picked from commit 3ffb6a6)
Signed-off-by: Varsha Teratipally teratipally@google.com

@k8s-ci-robot
Copy link

Hi @vteratipally. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vteratipally
Copy link
Contributor Author

/ok-to-test

@k8s-ci-robot
Copy link

@vteratipally: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/ok-to-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vteratipally
Copy link
Contributor Author

/assign @qiutongs

@k8s-ci-robot
Copy link

@vteratipally: GitHub didn't allow me to assign the following users: qiutongs.

Note that only containerd members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @qiutongs

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vteratipally vteratipally force-pushed the my-backport-branch branch 2 times, most recently from f373c72 to 20f9c7f Compare February 3, 2022 17:48
@thaJeztah
Copy link
Member

^^ backport of #5828 (to fix #5502 for 1.5)

Looks like there's a linting failure;

 Error: type `orderedMounts` is unused (unused)

I don't have edit permissions, but could you also edit the PR title to use [release/1.5] as prefix? (makes it easier to find back back ports)

@vteratipally vteratipally changed the title shimv2: handle sigint/sigterm [release/1.5] shimv2: handle sigint/sigterm Feb 3, 2022
@vteratipally
Copy link
Contributor Author

I don't have edit permissions, but could you also edit the PR title to use [release/1.5] as prefix? (makes it easier to find back back ports)

Updated the title and description

@vteratipally
Copy link
Contributor Author

Error: type orderedMounts is unused (unused)
It is weird that it is failing because the orderedMounts is used in the windows related files.

@thaJeztah
Copy link
Member

It is weird that it is failing because the orderedMounts is used in the windows related files.

Hm... yes, interesting. Looks like it's failing on macOS; doing a quick check on my machine, I see that (on macOS) it's only used in a test, not in any other code, but the same is true for both main and this branch.

Screenshot 2022-02-03 at 21 53 58

Screenshot 2022-02-03 at 21 53 13

Wondering if there's an issue in how the Linter is configured in this branch (causing it to not detect it to be used)

@thaJeztah
Copy link
Member

I opened #6511 (in case that's fixing it)

runtime/v2/shim/shim.go Outdated Show resolved Hide resolved
@vteratipally
Copy link
Contributor Author

I opened #6511 (in case that's fixing it)
Thank you for taking a look.

@vteratipally vteratipally force-pushed the my-backport-branch branch 2 times, most recently from fca2e1c to bc9e5e8 Compare February 4, 2022 03:00
@thaJeztah
Copy link
Member

Interesting failure on Windows; https://github.com/containerd/containerd/runs/5061566490?check_suite_focus=true

failed to remove test root dir failed to cleanup WCOW layers in C:\Program Files\containerd\root-test\io.containerd.snapshotter.v1.windows\snapshots: strconv.Atoi: parsing "rm-27": invalid syntax
exit status 1

I suspect there's some code renaming a path for removal (rm-27), but other (cleanup) code not able to handle that.

MacOS and Cgroups 2 failures (at a glance) look flaky

=== FAIL: gc/scheduler TestTrigger (0.02s)
    scheduler_test.go:135: GC wait timed out
The SSH command responded with a non-zero exit status. Vagrant
assumes that this means the command failed. The output for this command
should be in the log above. Please read the output to determine what
went wrong.
Error: Process completed with exit code 1.

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for transparency; looks like the diff is not a straight cherry-pick as it required some changes due to main and release/1.5 having diverged a bit.

For those interested;

curl -fsSL -O https://github.com/containerd/containerd/commit/3ffb6a61130298cd27636df2b64a53e2161cdced.patch
curl -fsSL -O https://github.com/containerd/containerd/pull/6509.patch

git diff --no-index 3ffb6a61130298cd27636df2b64a53e2161cdced.patch 6509.patch > differences.patch

Attached below:

3ffb6a61130298cd27636df2b64a53e2161cdced.patch.txt
differences.patch.txt
6509.patch.txt

Diff (hard to read a diff of a diff)

diff --git a/3ffb6a61130298cd27636df2b64a53e2161cdced.patch b/6509.patch
index a24c881c3b..68f54469fe 100644
--- a/3ffb6a61130298cd27636df2b64a53e2161cdced.patch
+++ b/6509.patch
@@ -1,4 +1,4 @@
-From 3ffb6a61130298cd27636df2b64a53e2161cdced Mon Sep 17 00:00:00 2001
+From bc9e5e8753c33c73303cc0b3c002147053b36d0b Mon Sep 17 00:00:00 2001
 From: Brian Goff <cpuguy83@gmail.com>
 Date: Wed, 19 Jan 2022 00:23:52 +0000
 Subject: [PATCH] shimv2: handle sigint/sigterm
@@ -7,59 +7,48 @@ This causes sigint/sigterm to trigger a shutdown of the shim.
 It is needed because otherwise the v2 shim hangs system shutdown.
 
 Signed-off-by: Brian Goff <cpuguy83@gmail.com>
+(cherry picked from commit 3ffb6a61130298cd27636df2b64a53e2161cdced)
+Signed-off-by: Varsha Teratipally <teratipally@google.com>
 ---
- runtime/v2/shim/shim.go         | 10 ++++++----
- runtime/v2/shim/shim_unix.go    | 20 +++++++++++++++++++-
- runtime/v2/shim/shim_windows.go |  5 ++++-
- 3 files changed, 29 insertions(+), 6 deletions(-)
+ runtime/v2/shim/shim.go         |  6 ++++--
+ runtime/v2/shim/shim_unix.go    | 21 ++++++++++++++++++++-
+ runtime/v2/shim/shim_windows.go |  6 +++++-
+ 3 files changed, 29 insertions(+), 4 deletions(-)
 
 diff --git a/runtime/v2/shim/shim.go b/runtime/v2/shim/shim.go
-index 289b489d648..59e522cede4 100644
+index c14aacca99b..fbdee6047a2 100644
 --- a/runtime/v2/shim/shim.go
 +++ b/runtime/v2/shim/shim.go
-@@ -305,7 +305,7 @@ func run(ctx context.Context, manager Manager, initFunc Init, name string, confi
+@@ -214,7 +214,7 @@ func run(id string, initFunc Init, config Config) error {
  			"pid":       os.Getpid(),
  			"namespace": namespaceFlag,
  		})
 -		go handleSignals(ctx, logger, signals)
 +		go reap(ctx, logger, signals)
- 		ss, err := manager.Stop(ctx, id)
+ 		response, err := service.Cleanup(ctx)
  		if err != nil {
  			return err
-@@ -428,7 +428,7 @@ func run(ctx context.Context, manager Manager, initFunc Init, name string, confi
- 		}
- 	}
- 
--	if err := serve(ctx, server, signals); err != nil {
-+	if err := serve(ctx, server, signals, sd.Shutdown); err != nil {
- 		if err != shutdown.ErrShutdown {
- 			return err
- 		}
-@@ -450,7 +450,7 @@ func run(ctx context.Context, manager Manager, initFunc Init, name string, confi
- 
- // serve serves the ttrpc API over a unix socket in the current working directory
- // and blocks until the context is canceled
--func serve(ctx context.Context, server *ttrpc.Server, signals chan os.Signal) error {
-+func serve(ctx context.Context, server *ttrpc.Server, signals chan os.Signal, shutdown func()) error {
- 	dump := make(chan os.Signal, 32)
- 	setupDumpStacks(dump)
- 
-@@ -480,7 +480,9 @@ func serve(ctx context.Context, server *ttrpc.Server, signals chan os.Signal) er
+@@ -310,7 +310,9 @@ func (s *Client) Serve() error {
  			dumpStacks(logger)
  		}
  	}()
--	return handleSignals(ctx, logger, signals)
-+
-+	go handleExitSignals(ctx, logger, shutdown)
-+	return reap(ctx, logger, signals)
+-	return handleSignals(s.context, logger, s.signals)
++	ctx, cancel := context.WithCancel(s.context)
++	go handleExitSignals(ctx, logger, cancel)
++	return reap(ctx, logger, s.signals)
  }
  
- func dumpStacks(logger *logrus.Entry) {
+ // serve serves the ttrpc API over a unix socket at the provided path
 diff --git a/runtime/v2/shim/shim_unix.go b/runtime/v2/shim/shim_unix.go
-index 0701fdc6f64..e2dab0931ea 100644
+index a61b6420892..56505a89444 100644
 --- a/runtime/v2/shim/shim_unix.go
 +++ b/runtime/v2/shim/shim_unix.go
-@@ -71,7 +71,7 @@ func serveListener(path string) (net.Listener, error) {
+@@ -1,3 +1,4 @@
++//go:build !windows
+ // +build !windows
+ 
+ /*
+@@ -70,7 +71,7 @@ func serveListener(path string) (net.Listener, error) {
  	return l, nil
  }
  
@@ -68,7 +57,7 @@ index 0701fdc6f64..e2dab0931ea 100644
  	logger.Info("starting signal loop")
  
  	for {
-@@ -79,6 +79,8 @@ func handleSignals(ctx context.Context, logger *logrus.Entry, signals chan os.Si
+@@ -78,6 +79,8 @@ func handleSignals(ctx context.Context, logger *logrus.Entry, signals chan os.Si
  		case <-ctx.Done():
  			return ctx.Err()
  		case s := <-signals:
@@ -77,7 +66,7 @@ index 0701fdc6f64..e2dab0931ea 100644
  			switch s {
  			case unix.SIGCHLD:
  				if err := reaper.Reap(); err != nil {
-@@ -90,6 +92,22 @@ func handleSignals(ctx context.Context, logger *logrus.Entry, signals chan os.Si
+@@ -89,6 +92,22 @@ func handleSignals(ctx context.Context, logger *logrus.Entry, signals chan os.Si
  	}
  }
  
@@ -101,10 +90,15 @@ index 0701fdc6f64..e2dab0931ea 100644
  	return fifo.OpenFifoDup2(ctx, "log", unix.O_WRONLY, 0700, int(os.Stderr.Fd()))
  }
 diff --git a/runtime/v2/shim/shim_windows.go b/runtime/v2/shim/shim_windows.go
-index 4eef8c9ffec..4b098ab1630 100644
+index 7339eb2a2e4..4f8a944d154 100644
 --- a/runtime/v2/shim/shim_windows.go
 +++ b/runtime/v2/shim/shim_windows.go
-@@ -46,10 +46,13 @@ func serveListener(path string) (net.Listener, error) {
+@@ -1,3 +1,4 @@
++//go:build windows
+ // +build windows
+ 
+ /*
+@@ -48,10 +49,13 @@ func serveListener(path string) (net.Listener, error) {
  	return nil, errors.New("not supported")
  }
 

runtime/v2/shim/shim_unix.go Outdated Show resolved Hide resolved
This causes sigint/sigterm to trigger a shutdown of the shim.
It is needed because otherwise the v2 shim hangs system shutdown.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
(cherry picked from commit 3ffb6a6)
Signed-off-by: Varsha Teratipally <teratipally@google.com>
@qiutongs
Copy link
Contributor

qiutongs commented Feb 7, 2022

/lgtm

@dmcgowan dmcgowan added this to New in Code Review via automation Feb 8, 2022
@dmcgowan dmcgowan moved this from New to Ready For Review in Code Review Feb 8, 2022
Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@estesp estesp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@estesp estesp merged commit b863e69 into containerd:release/1.5 Feb 9, 2022
Code Review automation moved this from Ready For Review to Done Feb 9, 2022
@fuweid
Copy link
Member

fuweid commented Feb 9, 2022

for transparency; looks like the diff is not a straight cherry-pick as it required some changes due to main and release/1.5 having diverged a bit.

For those interested;

curl -fsSL -O https://github.com/containerd/containerd/commit/3ffb6a61130298cd27636df2b64a53e2161cdced.patch
curl -fsSL -O https://github.com/containerd/containerd/pull/6509.patch

git diff --no-index 3ffb6a61130298cd27636df2b64a53e2161cdced.patch 6509.patch > differences.patch

Attached below:

3ffb6a61130298cd27636df2b64a53e2161cdced.patch.txt differences.patch.txt 6509.patch.txt

Diff (hard to read a diff of a diff)

diff --git a/3ffb6a61130298cd27636df2b64a53e2161cdced.patch b/6509.patch
index a24c881c3b..68f54469fe 100644
--- a/3ffb6a61130298cd27636df2b64a53e2161cdced.patch
+++ b/6509.patch
@@ -1,4 +1,4 @@
-From 3ffb6a61130298cd27636df2b64a53e2161cdced Mon Sep 17 00:00:00 2001
+From bc9e5e8753c33c73303cc0b3c002147053b36d0b Mon Sep 17 00:00:00 2001
 From: Brian Goff <cpuguy83@gmail.com>
 Date: Wed, 19 Jan 2022 00:23:52 +0000
 Subject: [PATCH] shimv2: handle sigint/sigterm
@@ -7,59 +7,48 @@ This causes sigint/sigterm to trigger a shutdown of the shim.
 It is needed because otherwise the v2 shim hangs system shutdown.
 
 Signed-off-by: Brian Goff <cpuguy83@gmail.com>
+(cherry picked from commit 3ffb6a61130298cd27636df2b64a53e2161cdced)
+Signed-off-by: Varsha Teratipally <teratipally@google.com>
 ---
- runtime/v2/shim/shim.go         | 10 ++++++----
- runtime/v2/shim/shim_unix.go    | 20 +++++++++++++++++++-
- runtime/v2/shim/shim_windows.go |  5 ++++-
- 3 files changed, 29 insertions(+), 6 deletions(-)
+ runtime/v2/shim/shim.go         |  6 ++++--
+ runtime/v2/shim/shim_unix.go    | 21 ++++++++++++++++++++-
+ runtime/v2/shim/shim_windows.go |  6 +++++-
+ 3 files changed, 29 insertions(+), 4 deletions(-)
 
 diff --git a/runtime/v2/shim/shim.go b/runtime/v2/shim/shim.go
-index 289b489d648..59e522cede4 100644
+index c14aacca99b..fbdee6047a2 100644
 --- a/runtime/v2/shim/shim.go
 +++ b/runtime/v2/shim/shim.go
-@@ -305,7 +305,7 @@ func run(ctx context.Context, manager Manager, initFunc Init, name string, confi
+@@ -214,7 +214,7 @@ func run(id string, initFunc Init, config Config) error {
  			"pid":       os.Getpid(),
  			"namespace": namespaceFlag,
  		})
 -		go handleSignals(ctx, logger, signals)
 +		go reap(ctx, logger, signals)
- 		ss, err := manager.Stop(ctx, id)
+ 		response, err := service.Cleanup(ctx)
  		if err != nil {
  			return err
-@@ -428,7 +428,7 @@ func run(ctx context.Context, manager Manager, initFunc Init, name string, confi
- 		}
- 	}
- 
--	if err := serve(ctx, server, signals); err != nil {
-+	if err := serve(ctx, server, signals, sd.Shutdown); err != nil {
- 		if err != shutdown.ErrShutdown {
- 			return err
- 		}
-@@ -450,7 +450,7 @@ func run(ctx context.Context, manager Manager, initFunc Init, name string, confi
- 
- // serve serves the ttrpc API over a unix socket in the current working directory
- // and blocks until the context is canceled
--func serve(ctx context.Context, server *ttrpc.Server, signals chan os.Signal) error {
-+func serve(ctx context.Context, server *ttrpc.Server, signals chan os.Signal, shutdown func()) error {
- 	dump := make(chan os.Signal, 32)
- 	setupDumpStacks(dump)
- 
-@@ -480,7 +480,9 @@ func serve(ctx context.Context, server *ttrpc.Server, signals chan os.Signal) er
+@@ -310,7 +310,9 @@ func (s *Client) Serve() error {
  			dumpStacks(logger)
  		}
  	}()
--	return handleSignals(ctx, logger, signals)
-+
-+	go handleExitSignals(ctx, logger, shutdown)
-+	return reap(ctx, logger, signals)
+-	return handleSignals(s.context, logger, s.signals)
++	ctx, cancel := context.WithCancel(s.context)
++	go handleExitSignals(ctx, logger, cancel)
++	return reap(ctx, logger, s.signals)
  }
  
- func dumpStacks(logger *logrus.Entry) {
+ // serve serves the ttrpc API over a unix socket at the provided path
 diff --git a/runtime/v2/shim/shim_unix.go b/runtime/v2/shim/shim_unix.go
-index 0701fdc6f64..e2dab0931ea 100644
+index a61b6420892..56505a89444 100644
 --- a/runtime/v2/shim/shim_unix.go
 +++ b/runtime/v2/shim/shim_unix.go
-@@ -71,7 +71,7 @@ func serveListener(path string) (net.Listener, error) {
+@@ -1,3 +1,4 @@
++//go:build !windows
+ // +build !windows
+ 
+ /*
+@@ -70,7 +71,7 @@ func serveListener(path string) (net.Listener, error) {
  	return l, nil
  }
  
@@ -68,7 +57,7 @@ index 0701fdc6f64..e2dab0931ea 100644
  	logger.Info("starting signal loop")
  
  	for {
-@@ -79,6 +79,8 @@ func handleSignals(ctx context.Context, logger *logrus.Entry, signals chan os.Si
+@@ -78,6 +79,8 @@ func handleSignals(ctx context.Context, logger *logrus.Entry, signals chan os.Si
  		case <-ctx.Done():
  			return ctx.Err()
  		case s := <-signals:
@@ -77,7 +66,7 @@ index 0701fdc6f64..e2dab0931ea 100644
  			switch s {
  			case unix.SIGCHLD:
  				if err := reaper.Reap(); err != nil {
-@@ -90,6 +92,22 @@ func handleSignals(ctx context.Context, logger *logrus.Entry, signals chan os.Si
+@@ -89,6 +92,22 @@ func handleSignals(ctx context.Context, logger *logrus.Entry, signals chan os.Si
  	}
  }
  
@@ -101,10 +90,15 @@ index 0701fdc6f64..e2dab0931ea 100644
  	return fifo.OpenFifoDup2(ctx, "log", unix.O_WRONLY, 0700, int(os.Stderr.Fd()))
  }
 diff --git a/runtime/v2/shim/shim_windows.go b/runtime/v2/shim/shim_windows.go
-index 4eef8c9ffec..4b098ab1630 100644
+index 7339eb2a2e4..4f8a944d154 100644
 --- a/runtime/v2/shim/shim_windows.go
 +++ b/runtime/v2/shim/shim_windows.go
-@@ -46,10 +46,13 @@ func serveListener(path string) (net.Listener, error) {
+@@ -1,3 +1,4 @@
++//go:build windows
+ // +build windows
+ 
+ /*
+@@ -48,10 +49,13 @@ func serveListener(path string) (net.Listener, error) {
  	return nil, errors.New("not supported")
  }
 

@thaJeztah In the main branch, shim manager has been refactored so that it is different.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

None yet

8 participants