Termination grace period is not respected when the pod is killed. #481
@Nalum could you please give some advice here?
BTW @TarasLykhenko
Unfortunately I'm not sure there is another way to handle this without dealing with the locked state, at least until hashicorp/terraform-exec#334 comes into play. You don't have to use the …
FYI, check out my workaround from #531 (comment).
This is from a write-up I did analysing the problem in some depth.

When Kubernetes terminates a pod, the init process is sent SIGTERM. tf-controller uses tini as its init process: https://github.com/weaveworks/tf-controller/blob/29c37473906d2d664e25db86f76eb5832ec5c7ee/runner-base.Dockerfile#L66

```dockerfile
ENTRYPOINT [ "/sbin/tini", "--", "tf-runner" ]
```

tini handles received signals by forwarding them to its child process (in this case tf-runner):

```c
/* There is a signal to handle here */
switch (sig.si_signo) {
	case SIGCHLD:
		/* Special-cased, as we don't forward SIGCHLD. Instead, we'll
		 * fallthrough to reaping processes.
		 */
		PRINT_DEBUG("Received SIGCHLD");
		break;
	default:
		PRINT_DEBUG("Passing signal: '%s'", strsignal(sig.si_signo));
		/* Forward anything else */
		if (kill(kill_process_group ? -child_pid : child_pid, sig.si_signo)) {
			if (errno == ESRCH) {
				PRINT_WARNING("Child was dead when forwarding signal");
			} else {
				PRINT_FATAL("Unexpected error when forwarding signal: '%s'", strerror(errno));
				return 1;
			}
		}
		break;
}
```

Note that tini can be configured (the `-g` flag) to signal an entire process group, but see later for why this doesn't reach the terraform process.

tf-runner sets itself up to handle signals; when a signal is received, it is placed onto a channel:

```go
// catch the SIGTERM from the kubelet to gracefully terminate
sigterm := make(chan os.Signal, 1)
signal.Notify(sigterm, syscall.SIGTERM)
defer func() {
	signal.Stop(sigterm)
}()
```

This channel is passed through to the TerraformRunnerServer.
Within the server, the Done channel is wired up to cancel the request context:

```go
ctx, cancel := context.WithCancel(ctx)
go func() {
	select {
	case <-r.Done:
		cancel()
	case <-ctx.Done():
	}
}()
```

This cancellation is significant: the assumption here is that the tf-runner process is signalled with SIGTERM, and that this results in the cancellation of the context. The context is passed to github.com/hashicorp/terraform-exec, through to the tf-exec code that delegates execution of the generated command line to Go's exec.Command.
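That delegation is the crux: when a context handed to Go's exec machinery is cancelled, the child is killed with SIGKILL, not SIGTERM. A minimal standalone sketch of that behaviour (`sleep` stands in for terraform; this illustrates Go's default `exec.CommandContext` semantics, not tf-exec's exact code):

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// "sleep" stands in for a long-running terraform apply.
	cmd := exec.CommandContext(ctx, "sleep", "60")
	if err := cmd.Start(); err != nil {
		panic(err)
	}

	// Simulate tf-runner's SIGTERM handler cancelling the context.
	go func() {
		time.Sleep(time.Second)
		cancel()
	}()

	// Wait reports "signal: killed": the child got SIGKILL, with no
	// chance to shut down gracefully or release a state lock.
	fmt.Println(cmd.Wait())
}
```

That "signal: killed" is exactly what terraform sees, so it never gets the chance to release its state lock.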
There are several platform-specific implementations of the command setup; the Linux one does this:

```go
cmd.SysProcAttr = &syscall.SysProcAttr{
	// kill children if parent is dead
	Pdeathsig: syscall.SIGKILL,
	// set process group ID
	Setpgid: true,
}
```

This does two different things. Setpgid: true puts the terraform process into its own process group; this is why tini's process-group signalling, mentioned earlier, never reaches the terraform process. The other thing it does, setting Pdeathsig, we'll come back to.

The current assumption is that the terraform process sets up similar signal handling to tf-runner and shuts down gracefully on SIGTERM. But while the terraform process is executing, tf-runner is receiving the SIGTERM, and so it closes the context, and this results in SIGKILL being sent to terraform. This is bad!

Let's go back to Pdeathsig. This sets an attribute on the spawned process (only on Linux) that defines the signal to be sent when the thread that started the process is terminated. This is Go, we don't do threads... except that goroutines are multiplexed on top of a set of threads that are managed by the Go runtime, and the runtime can create and destroy threads as necessary. So, at any point in time, the thread that started the terraform process could be terminated, which would result (because of Pdeathsig) in terraform being killed with SIGKILL. golang/go#27505 recommends locking the thread around this, which would presumably prevent it being killed off by the runtime. We know that switching from …
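A rough, Linux-only sketch of the locking recommendation from golang/go#27505 (the function name and structure here are my illustration, not tf-exec's code):

```go
package main

import (
	"fmt"
	"os/exec"
	"runtime"
	"syscall"
)

// runPinned runs cmd from a goroutine locked to its OS thread, so the
// thread that set Pdeathsig can't be retired by the Go runtime while
// the child is still alive. Linux-only, as Pdeathsig is a Linux feature.
func runPinned(cmd *exec.Cmd) error {
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Pdeathsig: syscall.SIGKILL, // child is killed if this thread dies
		Setpgid:   true,            // child gets its own process group
	}
	errc := make(chan error, 1)
	go func() {
		// Stay locked for the child's whole lifetime; if this thread were
		// recycled mid-run, Pdeathsig would SIGKILL the child.
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()
		errc <- cmd.Run()
	}()
	return <-errc
}

func main() {
	fmt.Println(runPinned(exec.Command("sleep", "1")))
}
```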
Additionally: since sending SIGTERM to the …
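On the process-group point: because tf-exec starts terraform with Setpgid, delivering SIGTERM to terraform and anything it spawned means signalling the negative process-group ID. A minimal sketch:

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

func main() {
	cmd := exec.Command("sleep", "60") // stand-in for terraform
	// Start the child in its own process group, as tf-exec does.
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	if err := cmd.Start(); err != nil {
		panic(err)
	}
	// A negative PID targets the whole process group, so the child and
	// anything it spawned all receive the SIGTERM.
	if err := syscall.Kill(-cmd.Process.Pid, syscall.SIGTERM); err != nil {
		panic(err)
	}
	fmt.Println(cmd.Wait()) // prints "signal: terminated"
}
```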
The kubelet will use SIGKILL, without SIGTERM and a grace period, in some circumstances: https://kubernetes.io/docs/concepts/scheduling-eviction/node-pressure-eviction/#hard-eviction-thresholds
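In the normal path a stop is SIGTERM, then a grace period, then SIGKILL as a last resort; under hard eviction the kubelet skips straight to SIGKILL. A minimal sketch of the normal sequence (the timeout is illustrative, `sleep` stands in for terraform):

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

func main() {
	cmd := exec.Command("sleep", "60") // stand-in for terraform
	if err := cmd.Start(); err != nil {
		panic(err)
	}
	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }()

	// Ask nicely first, so the process can clean up (release the state lock).
	_ = cmd.Process.Signal(syscall.SIGTERM)

	select {
	case err := <-done:
		fmt.Println("exited after SIGTERM:", err)
	case <-time.After(10 * time.Second): // grace period
		_ = cmd.Process.Kill() // SIGKILL only as a last resort
		fmt.Println("killed after grace period:", <-done)
	}
}
```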
Regarding the kubelet and SIGKILL without SIGTERM: I actually observed this during troubleshooting, where a group of tf-runner pods was evicted due to node pressure caused by disk (imagefs.available). The common factor was that all of the tf-runner pods scheduled on the node with disk pressure hit state-locking issues on subsequent runs.
Hi Team,
We started to notice weird behaviour where many locks were not released when tf-runners were killed, even with the graceful shutdown period set to 1 hour.
After some investigation, we found that when a pod is restarted, SIGKILL forcefully kills the Terraform process, so it never begins a graceful shutdown, which sometimes ends with the state lock not being released.
We found these related issues:
hashicorp/terraform-exec#332
hashicorp/terraform-exec#334
logs:
events:
Is there something we can do to avoid this behaviour without setting force unlock to auto?
Thanks