Skip to content
This repository has been archived by the owner on May 23, 2023. It is now read-only.

Slf4j MDC vs Scopes #206

Open
mabn opened this issue Oct 19, 2017 · 6 comments
Open

Slf4j MDC vs Scopes #206

mabn opened this issue Oct 19, 2017 · 6 comments

Comments

@mabn
Copy link
Contributor

mabn commented Oct 19, 2017

Disclaimer: I don't have a working implementation yet

TL;DR - It seems that a reasonable implementation is possible with Scopes if we wrap Span whenever we start a new Scope - which seems similar to the "ActiveSpan" concept from the previous version.

Goals:

  • I want debug_id (span-id + trace-id) of the active span to be available inside MDC

    • put debug_id inside MDC when scope is activated
    • when the active scope closes remove any MDC changes since the scope was activated
  • When I define key-value attributes of the current (computation) context and put them into MDC then I want them to automatically propagate to other threads using OT in-process propagation. I want this because I don't want to write additional in-process propagation for MDC.

    • propagate custom MDC entries to child scopes (e.g. user_id)
    • add MDC entries as span tags
  • When I define key-value attributes of the current (computation) context I want to set them both on MDC and active Span

    • when they are changed they should change both in MDC and the Span
    • if the value changes multiple times on a single Span it's ok if only the last tag value is reported when Span finishes

The first point (setting debug_id) is trivial - see PR #207. The only requirement is not to share scopes across threads which I assume would be invalid anyway.

When it comes to propagating MDC to other scopes and other threads things get complicated:

Problem A:

In order to add MDC entries as tags in a child scope/thread we somehow need to pass MDC entries to that scope/thread. Right now the only entity that's passed is the Span:

Scope scope = tracer.buildSpan("parent").startActive();
MDC.put("key", "x");

runInThread(() -> {
    try(Scope child = scopeManager.activate(scope.span())) {
        MDC.get("key") // == "x"
    }
});

That means that

  1. we have to glue our data to the span instance
  2. or pass something else - instead of Span or in addition to
Ad 1. Add data to Span instance

This seems doable. Whenever startActive is called and Scope is created we can wrap the Span with our SpanWithContext. Then we can store any extra information we need inside it.

When scopeManager.activate(spanwithcontext) is called we have all the information.

Ad 2. Pass something other than Span

This is solution similar to the one described in the java-examples - the Continuation/AutoFinishScope. Copying MDC manually also belong to this category.
Because of the fact that this approach does not use the official API it's a non-starter - it would require to instrument all concurrency libraries (rxjava, akka, hystrix, ...) with extra capabilities for propagating MDC, but we want to use existing OT-contrib libs which already do the propagation.

With Continuations in the official API this would be the ideal solution - we'd just pass the data inside the Continuation.

Problem B:

How to put MDC data into the active Span?

The thing is that after setting MDC there won't be any additional OT calls so there is no obvious place where to put the code which would copy the value into the active Span.

Scope scope = tracer.buildSpan("parent").startActive();
MDC.put("key", "x"); // after scope created, no OT calls afterwards

runInThread(() -> {
    try(Scope child = scopeManager.activate(scope.span())) {
        MDC.get("key")
    }
});

I see following options here:

  • change MDC implementation to actually both set the value inside MDC and additionally in the active Span. This is what we did at my company. We replace the MDCAdapter using reflection to our own implementation which is opentracing-aware. It's a bit hacky but it works. I'd go for this approach but the upcoming Slf4j 1.8 is modularized (because Java 9) and we won't be able to reach into its internals anymore. Slf4j 1.8 uses ServiceLoader to load the implementation and I'm not sure if we'd be able to hijack this mechanism. All in all approaching it form this side does not look good.

  • Another option is to stop using MDC class directly. Instead of

    MDC.put(key, value)

    Use

    Context.put(key, value)

    This allows us to put our logic inside the Context. Actually we already have such context - it's the Span. So the idea is to use:

    Span.put(key, value)

    A real implementation could allow to configure which tags are also set in the MDC to avoid putting there everything.

Problem C:

There is a single parent MDC "state" but multiple "child" ones starting at different times. What I mean is a situation like this:

Scope scope = tracer.buildSpan("parent").startActive();
scope.span().setTag("key", "x")
// no capture() here

runInThread(() -> {
    Thread.sleep(1000)
    try(Scope child = scopeManager.activate(scope.span())) {
        MDC.get("key") // == "x" or "y" ?
    }
});
scope.span().setTag("key", "y")

With Continuations it would be easy because of the capture() call which would be able to record the "parent" state.
I don't see a good solution for this right now and my approach would be to just ignore this. It only happens if someone changes the tag values and then passes the span to another thread. One workaround is to create a child span and then pass it:

Scope scope = tracer.buildSpan("parent").startActive();
scope.span().setTag("key", "x")
// no capture() here

try(Scope outerChild = scopeManager.activate(scope.span())) {
    runInThread(() -> {
        Thread.sleep(1000)
        try(Scope child = scopeManager.activate(outerChild.span())) {
            MDC.get("key") // guaranteed to be "x"
        }
    });
}

scope.span().setTag("key", "y")

This is possible because despite scope.span() returning the same span from OT perspective, in reality it returns a different instance of SpanWithContext for different active scopes.

Problem D:

  • Need to react to Span start to be able to capture debug ids (or we could do that when scope activates)
  • Need to react to Scope activation to mark the point to which revert MDC changes when Scope closes
  • Need to react to Scope closing to revert MDC changes

Right now the only way is to wrap SpanBuilder. It would be great if the OT API allowed to do that without custom wrappers.

@mabn mabn mentioned this issue Oct 19, 2017
@tedsuo
Copy link
Member

tedsuo commented Oct 31, 2017

@mabn, I wonder: if we can get SpanID and TraceID added to SpanContext, does that make Observers work well enough to accomplish MDC integration without a span wrapper?

@mabn
Copy link
Contributor Author

mabn commented Oct 31, 2017

@tedsuo what do you mean by Observers?

@tedsuo
Copy link
Member

tedsuo commented Oct 31, 2017

I mean an OT SpanObserver API that tracers can adopt, as opposed to wrapping the SpanBuilder. This idea here: opentracing/opentracing-go#136

Basically, if we can add SpanID+TraceID, Observers become useful for many things. Potentially MDC integration as well?

@mabn
Copy link
Contributor Author

mabn commented Nov 15, 2017

@tedsuo I didn't have time to dig deeper but I'm pretty sure that Observers would be sufficient - although they would need a few more integration points - i.e. methods related to Scopes (created, closed, activated).

@fzakaria
Copy link

I would find TraceId and SpanId to the context useful.
Right now I'm casting to my implementation (Jaeger) which exposes both.

Also it might be useful to make the mechanism in which request id generation created a Provider -- consumers may want fixed / less random / more random solutions.

@wsargent
Copy link
Contributor

wsargent commented Jan 4, 2018

Possibly off topic, but it is possible to use markers rather than MDC to pass contextual information around, i.e. https://www.playframework.com/documentation/2.6.x/Highlights26#Logging-Marker-API

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants