Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failed to shutdown OpenTelemetry trace exporter gracefully. #6651

Closed
nicolaschotard opened this issue Mar 28, 2024 · 5 comments · Fixed by #6738
Closed

Failed to shutdown OpenTelemetry trace exporter gracefully. #6651

nicolaschotard opened this issue Mar 28, 2024 · 5 comments · Fixed by #6738

Comments

@nicolaschotard
Copy link
Contributor

nicolaschotard commented Mar 28, 2024

Short description

When running OPA with the distributed tracing option enabled, we get the following error when the server is stopped.

Failed to shutdown `OpenTelemetry` trace exporter gracefully.

Steps To Reproduce

  1. Get the latest version of opa (I've tried with version 0.62 and the latest version)
./opa_linux_amd64 version
Version: 0.63.0-dev
Build Commit: d3a4a87f506dec744ed93177c37bea5e2afa186b
Build Timestamp: 2024-03-27T19:08:00Z
Build Hostname: nico-Latitude-7420
Go Version: go1.21.0
Platform: linux/amd64
WebAssembly: available
  1. Run the following command line
./opa_linux_amd64 run --server --set distributed_tracing.type=grpc
  1. Kill the process (Ctrl + C works fine)
  2. The full output should looks like that
./opa_linux_amd64 run --server --set distributed_tracing.type=grpc
{"addrs":[":8181"],"diagnostic-addrs":[],"level":"info","msg":"Initializing server. OPA is running on a public (0.0.0.0) network interface. Unless you intend to expose OPA outside of the host, binding to the localhost interface (--addr localhost:8181) is recommended. See https://www.openpolicyagent.org/docs/latest/security/#interface-binding","time":"2024-03-28T09:18:37+01:00"}
^C{"level":"info","msg":"Shutting down...","time":"2024-03-28T09:18:38+01:00"}
{"level":"info","msg":"Server shutdown.","time":"2024-03-28T09:18:38+01:00"}
{"err":"context canceled","level":"error","msg":"Failed to shutdown OpenTelemetry trace exporter gracefully.","time":"2024-03-28T09:18:38+01:00"}

Expected behavior

The OpenTelemetry trace exporter should be gracefully shut down, and no error should be seen.

Additional context

We have seen this error appeared after upgrading OPA from version 0.59 to 0.62.

@srenatus
Copy link
Contributor

srenatus commented Mar 28, 2024

Thanks for bringing this up. I think it's not wired up correctly: This

opa/runtime/runtime.go

Lines 538 to 543 in d3a4a87

defer func() {
err := rt.traceExporter.Shutdown(ctx)
if err != nil {
rt.logger.WithFields(map[string]interface{}{"err": err}).Error("Failed to shutdown OpenTelemetry trace exporter gracefully.")
}
}()

should happen in the shutdown code path, maybe here

opa/runtime/runtime.go

Lines 845 to 861 in d3a4a87

func (rt *Runtime) gracefulServerShutdown(s *server.Server) error {
if rt.Params.ShutdownWaitPeriod > 0 {
rt.logger.Info("Waiting %vs before initiating shutdown...", rt.Params.ShutdownWaitPeriod)
time.Sleep(time.Duration(rt.Params.ShutdownWaitPeriod) * time.Second)
}
rt.logger.Info("Shutting down...")
ctx, cancel := context.WithTimeout(context.Background(), time.Duration(rt.Params.GracefulShutdownPeriod)*time.Second)
defer cancel()
err := s.Shutdown(ctx)
if err != nil {
rt.logger.WithFields(map[string]interface{}{"err": err}).Error("Failed to shutdown server gracefully.")
return err
}
rt.logger.Info("Server shutdown.")
return nil
}

or here

opa/server/server.go

Lines 225 to 249 in d3a4a87

func (s *Server) Shutdown(ctx context.Context) error {
errChan := make(chan error)
for _, srvr := range s.httpListeners {
go func(s httpListener) {
errChan <- s.Shutdown(ctx)
}(srvr)
}
// wait until each server has finished shutting down
var errorList []error
for i := 0; i < len(s.httpListeners); i++ {
err := <-errChan
if err != nil {
errorList = append(errorList, err)
}
}
if len(errorList) > 0 {
errMsg := "error while shutting down: "
for i, err := range errorList {
errMsg += fmt.Sprintf("(%d) %s. ", i, err.Error())
}
return errors.New(errMsg)
}
return nil
}

@ashutosh-narkar
Copy link
Member

@nicolaschotard if you'd like to submit a fix based on Stephan's feedback feel free to do so. We're happy to help out. Thanks for reporting this.

@nicolaschotard
Copy link
Contributor Author

Hey thank you both for your quick answers. And indeed, moving the rt.traceExporter.Shutdown(ctx) part into the gracefulServerShutdown function seems to fix this issue. I will submit a PR with a fix as soon as I get the time to work on that.

nicolaschotard added a commit to nicolaschotard/opa that referenced this issue Apr 3, 2024
When running OPA with the distributed tracing option enabled,
the OpenTelemetry trace exporter is not gracefully shut down
when the server is stopped.

This PR fixes that issues by moving the trace exporter shutdown
in the gracefulShutdonw function.

Fixes: open-policy-agent#6651
Signed-off-by: Nicolas Chotard <nicolas.chotard@backmarket.com>
@nicolaschotard
Copy link
Contributor Author

And I believe this commit introduced the bug.

Copy link

stale bot commented May 3, 2024

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days. Although currently inactive, the issue could still be considered and actively worked on in the future. More details about the use-case this issue attempts to address, the value provided by completing it or possible solutions to resolve it would help to prioritize the issue.

@stale stale bot added the inactive label May 3, 2024
David-Wobrock pushed a commit to David-Wobrock/opa that referenced this issue May 7, 2024
When running OPA with the distributed tracing option enabled,
the OpenTelemetry trace exporter is not gracefully shut down
when the server is stopped.

This PR fixes that issues by moving the trace exporter shutdown
in the gracefulShutdonw function.

Fixes: open-policy-agent#6651
Signed-off-by: Nicolas Chotard <nicolas.chotard@backmarket.com>
David-Wobrock pushed a commit to David-Wobrock/opa that referenced this issue May 7, 2024
When running OPA with the distributed tracing option enabled,
the OpenTelemetry trace exporter is not gracefully shut down
when the server is stopped.

This PR fixes that issues by moving the trace exporter shutdown
in the gracefulServerShutdown function.

Fixes: open-policy-agent#6651
Signed-off-by: Nicolas Chotard <nicolas.chotard@backmarket.com>
Signed-off-by: David Wobrock <david.wobrock@backmarket.com>
David-Wobrock pushed a commit to David-Wobrock/opa that referenced this issue May 7, 2024
When running OPA with the distributed tracing option enabled,
the OpenTelemetry trace exporter is not gracefully shut down
when the server is stopped.

This PR fixes that issues by moving the trace exporter shutdown
in the gracefulServerShutdown function.

Fixes: open-policy-agent#6651
Signed-off-by: Nicolas Chotard <nicolas.chotard@backmarket.com>
Signed-off-by: David Wobrock <david.wobrock@backmarket.com>
David-Wobrock pushed a commit to David-Wobrock/opa that referenced this issue May 7, 2024
When running OPA with the distributed tracing option enabled,
the OpenTelemetry trace exporter is not gracefully shut down
when the server is stopped.

This PR fixes that issues by moving the trace exporter shutdown
in the gracefulServerShutdown function.

Fixes: open-policy-agent#6651
Signed-off-by: Nicolas Chotard <nicolas.chotard@backmarket.com>
Signed-off-by: David Wobrock <david.wobrock@backmarket.com>
David-Wobrock pushed a commit to David-Wobrock/opa that referenced this issue May 7, 2024
When running OPA with the distributed tracing option enabled,
the OpenTelemetry trace exporter is not gracefully shut down
when the server is stopped.

This PR fixes that issues by moving the trace exporter shutdown
in the gracefulServerShutdown function.

Fixes: open-policy-agent#6651
Signed-off-by: Nicolas Chotard <nicolas.chotard@backmarket.com>
Signed-off-by: David Wobrock <david.wobrock@backmarket.com>
David-Wobrock pushed a commit to David-Wobrock/opa that referenced this issue May 7, 2024
When running OPA with the distributed tracing option enabled,
the OpenTelemetry trace exporter is not gracefully shut down
when the server is stopped.

This PR fixes that issues by moving the trace exporter shutdown
in the gracefulServerShutdown function.

Fixes: open-policy-agent#6651
Signed-off-by: Nicolas Chotard <nicolas.chotard@backmarket.com>
Signed-off-by: David Wobrock <david.wobrock@backmarket.com>
David-Wobrock pushed a commit to David-Wobrock/opa that referenced this issue May 7, 2024
When running OPA with the distributed tracing option enabled,
the OpenTelemetry trace exporter is not gracefully shut down
when the server is stopped.

This PR fixes that issues by moving the trace exporter shutdown
in the gracefulServerShutdown function.

Fixes: open-policy-agent#6651
Signed-off-by: Nicolas Chotard <nicolas.chotard@backmarket.com>
Signed-off-by: David Wobrock <david.wobrock@backmarket.com>
David-Wobrock pushed a commit to David-Wobrock/opa that referenced this issue May 7, 2024
When running OPA with the distributed tracing option enabled,
the OpenTelemetry trace exporter is not gracefully shut down
when the server is stopped.

This PR fixes that issues by moving the trace exporter shutdown
in the gracefulServerShutdown function.

Fixes: open-policy-agent#6651
Signed-off-by: Nicolas Chotard <nicolas.chotard@backmarket.com>
Signed-off-by: David Wobrock <david.wobrock@backmarket.com>
David-Wobrock pushed a commit to David-Wobrock/opa that referenced this issue May 7, 2024
When running OPA with the distributed tracing option enabled,
the OpenTelemetry trace exporter is not gracefully shut down
when the server is stopped.

This PR fixes that issues by moving the trace exporter shutdown
in the gracefulServerShutdown function.

Fixes: open-policy-agent#6651
Signed-off-by: Nicolas Chotard <nicolas.chotard@backmarket.com>
Signed-off-by: David Wobrock <david.wobrock@backmarket.com>
David-Wobrock pushed a commit to David-Wobrock/opa that referenced this issue May 7, 2024
When running OPA with the distributed tracing option enabled,
the OpenTelemetry trace exporter is not gracefully shut down
when the server is stopped.

This PR fixes that issues by moving the trace exporter shutdown
in the gracefulServerShutdown function.

Fixes: open-policy-agent#6651
Signed-off-by: Nicolas Chotard <nicolas.chotard@backmarket.com>
Signed-off-by: David Wobrock <david.wobrock@backmarket.com>
ashutosh-narkar pushed a commit to David-Wobrock/opa that referenced this issue May 8, 2024
When running OPA with the distributed tracing option enabled,
the OpenTelemetry trace exporter is not gracefully shut down
when the server is stopped.

This PR fixes that issues by moving the trace exporter shutdown
in the gracefulServerShutdown function.

Fixes: open-policy-agent#6651
Signed-off-by: Nicolas Chotard <nicolas.chotard@backmarket.com>
Signed-off-by: David Wobrock <david.wobrock@backmarket.com>
David-Wobrock pushed a commit to David-Wobrock/opa that referenced this issue May 10, 2024
When running OPA with the distributed tracing option enabled,
the OpenTelemetry trace exporter is not gracefully shut down
when the server is stopped.

This PR fixes that issues by moving the trace exporter shutdown
in the gracefulServerShutdown function.

Fixes: open-policy-agent#6651
Signed-off-by: Nicolas Chotard <nicolas.chotard@backmarket.com>
Signed-off-by: David Wobrock <david.wobrock@backmarket.com>
ashutosh-narkar pushed a commit that referenced this issue May 10, 2024
When running OPA with the distributed tracing option enabled,
the OpenTelemetry trace exporter is not gracefully shut down
when the server is stopped.

This PR fixes that issues by moving the trace exporter shutdown
in the gracefulServerShutdown function.

Fixes: #6651
Signed-off-by: Nicolas Chotard <nicolas.chotard@backmarket.com>
Signed-off-by: David Wobrock <david.wobrock@backmarket.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants