Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracing: Add Jaeger B3 header support #5553

Closed
wants to merge 2 commits into from

Conversation

ahayworth
Copy link

@ahayworth ahayworth commented Jul 29, 2022

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Jaeger tracing uses its own HTTP/GRPC header to perform context
propagation between services - "uber-trace-id". However, Jaeger
tracing clients also support context propagation via the B3 header
format (as popularized by Zipkin tracing).

This commit adds support for configuring the B3 header format within
thanos tracing. The main motivation for doing this is interoperability
with other systems that may call into or out of Thanos components - for
example, Grafana or Envoy - both of which support Jaeger tracing, but
use B3 headers instead of the Jaeger-specific "uber-trace-id" format headers.

Unfortunately, we cannot directly toggle this as an option or configure
it via the environment; the jaeger-go library requires us to pass it in
as an option when we call NewTracer. In order to facilitate this, we
extend the ParseConfigFromYaml file to not only return a Jaeger
configuration object, but the actual YAML configuration object as well.
This allows us to check and see if the user requested this kind of
header propagation, and we can then construct the necessary additional
options to NewTracer.

Of course, this is all legacy configuration in many ways. The tracing
world is rapidly moving towards OpenTelemetry and W3C headers for
context propagation. However, there is still utility in supporting B3
headers until everyone (including Thanos!) finishes their OpenTelemetry
migration projects.

Verification

Programmatically testing this change is tricky, because we need to actually
inspect the headers of the HTTP and GRPC requests that flow through the
system with tracing enabled. I was able to verify that this worked through
manual testing via the query frontend:

  1. I started a local ngrok tunnel, which allows me to intercept requests
    between the query frontend and a downstream querier (I configured it
    statically for port 10904, which is one of the querier ports).
  2. I launched a local OpenTelemetry Collector and Jaeger all-in-one image
    to receive and display traces.
  3. I modified scripts/quickstart.sh to:
  • Add a query frontend, with the query-frontend.downstream-url pointed
    to the intercepting ngrok tunnel.
  • Add appropriate tracing configurations for all components.

Once those modifications were in place and launched with make quickstart,
I was able to write queries in the query frontend, see that the intercepted
API calls to the downstream querier contained tracing headers in the
expected format, and verify that the traces were ingested into the Jaeger
backend via the OpenTelemetry collector (the OpenTelemetry collector was not
strictly necessary, but I prefer to use it while debugging).

Screenshot showing the intercepted API request, with correct B3 headers:
Screen Shot 2022-07-29 at 11 16 35 AM

Screenshot of the complete trace ingested to the Jaeger backend, with
correct context-propagation and parent/child span relationships preserved:
Screen Shot 2022-07-29 at 11 16 44 AM

ahayworth added a commit to ahayworth/thanos that referenced this pull request Jul 29, 2022
ahayworth added a commit to ahayworth/thanos that referenced this pull request Jul 29, 2022
Signed-off-by: Andrew Hayworth <ahayworth@gmail.com>
@ahayworth
Copy link
Author

cc @fpetkovski, since I think you may be interested in this.

ahayworth added a commit to ahayworth/thanos that referenced this pull request Jul 29, 2022
Signed-off-by: Andrew Hayworth <ahayworth@gmail.com>
ahayworth added a commit to ahayworth/thanos that referenced this pull request Jul 29, 2022
Signed-off-by: Andrew Hayworth <ahayworth@gmail.com>
Jaeger tracing uses its own HTTP/GRPC header to perform context
propagation between services - "uber-trace-id". However, Jaeger
tracing clients also support context propagation via the B3 header
format (as popularized by Zipkin tracing).

This commit adds support for configuring the B3 header format within
thanos tracing. The main motivation for doing this is interoperability
with other systems that may call into or out of Thanos components - for
example, Grafana or Envoy - both of which support Jaeger tracing, but
use B3 headers instead of the Jaeger-specific "uber-trace-id" format headers.

Unfortunately, we cannot directly toggle this as an option or configure
it via the environment; the jaeger-go library requires us to pass it in
as an option when we call `NewTracer`. In order to facilitate this, we
extend the `ParseConfigFromYaml` file to not only return a _Jaeger_
configuration object, but the actual YAML configuration object as well.
This allows us to check and see if the user requested this kind of
header propagation, and we can then construct the necessary additional
options to `NewTracer`.

Of course, this is all legacy configuration in many ways. The tracing
world is rapidly moving towards OpenTelemetry and W3C headers for
context propagation. However, there is still utility in supporting B3
headers until everyone (including Thanos!) finishes their OpenTelemetry
migration projects.

Signed-off-by: Andrew Hayworth <ahayworth@gmail.com>
Signed-off-by: Andrew Hayworth <ahayworth@gmail.com>
Copy link
Member

@GiedriusS GiedriusS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to add these by default?

@yeya24
Copy link
Contributor

yeya24 commented Jul 30, 2022

Since we have another WIP pr for jaeger Opentelemetry migration, maybe it is better to have this after #5411 is done?

@ahayworth
Copy link
Author

ahayworth commented Aug 1, 2022

Is there a reason not to add these by default?

@GiedriusS That depends on exactly what you mean with the question.

  • If you mean "make the default true", then I would say that it's a breaking change for anyone that's relying on the current behavior today (and I don't know how many people that might be).
  • If you mean "why not always add those headers in addition to the normal uber-trace-id header", that's certainly possible. In OpenTelemetry terms, that would be a "composite propagator", and it isn't extraordinarily difficult to implement. However, the OpenTracing standard never defined such a concept and so the jaeger-client-go library didn't implement one (as far as I can see, anyways). We'd just need to write that here. That said: I don't know if enough people would want it to justify the effort.

Since we have another WIP pr for jaeger Opentelemetry migration, maybe it is better to have this after #5411 is done?

@yeya24 If that's going to merge soon, then perhaps it is better to wait. My impression was that the PR may not be close to merging given the length of time it has been open, but that impression may have been mistaken! 😄

That said - it looks like #5411 would opentracing "basic" propagation for all supported tracing backends, which may not be what is desired (or intended). There's a new autoprop package for the otel-go SDK which may be an attractive option, since it would allow us to easily extend support for additional propagation types and also allows for the otel-standard OTEL_PROPAGATORS env var support. If included in #5411, that could make this PR obsolete entirely.

@stale
Copy link

stale bot commented Oct 1, 2022

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale label Oct 1, 2022
@fpetkovski
Copy link
Contributor

Looks like this PR is not ready for review. Does it make sense to focus efforts there @ahayworth ?

@stale stale bot removed the stale label Oct 1, 2022
@ahayworth
Copy link
Author

Looks like this PR is not ready for review.

It was at the time it was submitted! 😄 But, it would have to be reworked since #5411 merged.

Does it make sense to focus efforts there @ahayworth ?

No, I don't think so (and I won't continue this PR). #5411 allows me to solve the problem of interoperability that I faced in an entirely different way: using OpenTelemetry native tracing and propagation. (Not even just different, I think it's a better solution!).

I'll close this for now, but if anyone else needs this support in the future the PR can serve as a helpful guide for anyone that looks to implement B3 header support for the legacy Jaeger tracing.

@ahayworth ahayworth closed this Oct 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants