Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propose Context-scoped attributes. #207

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

Oberon00
Copy link
Member

@Oberon00 Oberon00 commented Jun 13, 2022

Add Context-scoped telemetry attributes which typically apply to all signals associated with a trace as it crosses a single service.

This OTEP aims to address various related demands that have been brought up in the past, where the scope of resource attributes is too broad, but the scope of span attributes is too narrow. For example, this happens where there is a mismatch between the OpenTelemetry SDK’s (and thus TracerProvider’s, MeterProvider’s) process-wide initialization and the semantic scope of a (sub)service.

The context-scoped attributes allows you to attach attributes to all telemetry signals emitted within a Context (or, following the usual rules of Context values, any child context thereof unless overridden). Context-scoped attributes are normal attributes, which means you can use strings, integers, floating point numbers, booleans or arrays thereof, just like for span or resource attributes. Context-scoped attributes are associated with all telemetry signals emitted while the Context containing the Context-scoped attributes is active and are available to telemetry exporters. For spans, the context within which the span is started applies. Like other telemetry APIs, Context-scoped attributes are write-only for applications. You cannot query the currently set Context-scoped attributes, they are only available on the SDK level (e.g. to telemetry exporters, Samplers and SpanProcessors).

Context-scoped attributes should be thought of equivalent to adding the attribute directly to each single telemetry item it applies to.


Rendered: https://github.com/dynatrace-oss-contrib/oteps/blob/feature/context-scoped-attributes/text/0207-context-scoped-attributes.md
Note: A table of contents can be found under the icon at the top left corner, if you go to the rendered view.

@Oberon00 Oberon00 added the enhancement New feature or request label Jun 13, 2022
@Oberon00 Oberon00 marked this pull request as ready for review June 14, 2022 07:57
@Oberon00 Oberon00 requested a review from a team as a code owner June 14, 2022 07:57
@Oberon00
Copy link
Member Author

(Note that the check-errors step seems stuck at "Waiting for status to be reported")

Copy link
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fully supportive of this proposal, it has come up a number of times for our customers (e.g. capturing tenant id in a multi-tenant service), here's what we have ended up with in our Java auto-instrumentation distro to address this for now, but an official solution would be super.

In Java auto-instrumentation, the local root span is typically created automatically before the user has a chance to add anything to the Context, so this proposal will likely require three steps for users:

  • set the attribute on the current (local root) Span
  • set the context-scoped attribute
  • activate the new context

The one other proposal I like is setting context-scoped attributes on the span, and then inheriting these by any child of the span. This would likely require just one step in the above case:

  • set the context-scoped attribute on the current (local root) Span

In both cases above, if a span has been started prior to setting the context-scoped attribute, that span would not get the context-scoped attribute (which I think is ok, e.g. those spans may have already been ended/exported).

One downside to the second approach above is in cases where you don't have a span, but still want to add context-scoped attributes to logs. I'm not sure if some kind of invalid span (with context-scoped attributes) could solve this, or better, supporting context-scoped attributes on
Context also, which could be very convenient for manual instrumentation, and better integration with samplers.

@Oberon00
Copy link
Member Author

Oberon00 commented Jun 15, 2022

There are many possible design variations on these attributes. What you suggest as workaround with "setting the attributes on the span and then on the Context" could maybe be avoided with Alternative: Associating Context-scoped attributes with the span on end and setting attributes on parent Contexts, in addition to generally using the span as primary/only storage as you say (I think this is Alternative: Associating attributes indirectly through spans). Then you would just set the attributes on the Context and activate it, and they would be associated with the span on end. Less work, but also less intuitive and the attributes are not available to samplers / OnStart callbacks.

@Oberon00
Copy link
Member Author

Mentioned briefly in spec meeting today. @carlosalberto suggests to implement a prototype in some language.

I think for spans this should even be doable without changing core API/SDK code, just by using the SpanProcessor's OnStart callback.

@fbogsany
Copy link

In Ruby, we have implemented a similar mechanism using Context to propagate attributes to HTTP client spans. We've also implemented this for Redis client spans.

An example use from the Koala instrumentation, which instruments a Facebook API client, is:

          def graph_call(path, args = {}, verb = 'get', options = {}, &post_processing)
            OpenTelemetry::Common::HTTP::ClientContext.with_attributes('peer.service' => 'facebook', 'koala.verb' => verb, 'koala.path' => path) do
              super
            end
          end

Faraday is a HTTP client library. It's instrumentation merges in the attributes from the HTTP::ClientContext with:

          def span_creation_attributes(http_method:, url:)
            instrumentation_attrs = {
              'http.method' => http_method,
              'http.url' => url.to_s,
              'net.peer.name' => url.host
            }
            config = Faraday::Instrumentation.instance.config
            instrumentation_attrs['peer.service'] = config[:peer_service] if config[:peer_service]
            instrumentation_attrs.merge!(
              OpenTelemetry::Common::HTTP::ClientContext.attributes
            )
          end

Honeycomb has also implemented a similar mechanism in their "hello world" examples, called CarryOn

  module CarryOn
    extend self


    # the key under which attributes will be stored for CarryOn within a context
    CONTEXT_KEY = ::OpenTelemetry::Context.create_key('carry-on-attrs')
    private_constant :CONTEXT_KEY


    # retrieve the CarryOn attributes from a given or current context
    def attributes(context = nil)
      context ||= ::OpenTelemetry::Context.current
      context.value(CONTEXT_KEY) || {}
    end


    # return a new Context with the attributes given set within CarryOn
    def with_attributes(attributes_hash)
      attributes_hash = attributes.merge(attributes_hash)
      ::OpenTelemetry::Context.with_value(CONTEXT_KEY, attributes_hash) { |c, h| yield h, c }
    end
  end


  # The CarryOnSpanProcessor reads attributes stored under a custom CarryOn key
  # in the span's parent context and adds them to the span.
  #
  # Add this span processor to the pipeline and then keys and values
  # added to CarryOn will appear on subsequent child spans for a trace only
  # within this service. CarryOn attributes do NOT propagate outside this process.
  class CarryOnSpanProcessor < OpenTelemetry::SDK::Trace::SpanProcessor
    def on_start(span, parent_context)
        span.add_attributes(CarryOn.attributes(parent_context))
    end
  end

@Oberon00
Copy link
Member Author

Oberon00 commented Jun 28, 2022

Yes, CarryOn looks exactly like what this proposes (though for Spans only). Does that satisfy the prototype condition, @carlosalberto?

@Oberon00 Oberon00 closed this Jun 28, 2022
@Oberon00 Oberon00 reopened this Jun 28, 2022
@Oberon00
Copy link
Member Author

(close & reopen because CI still seems stuck)

@garthk
Copy link

garthk commented Jul 2, 2022

G'day! I suspect the name Scoped Resources might better match your intent. The word "context" strongly suggests the execution path to Go programmers, and "attributes" suggests to me a shorter lifespan than you've proposed for these key-value pairs eg. the name of a sub-service for the lifetime of the process.

Separately, I can't see how this proposal satisfies the needs for which I proposed a BeforeEnd callback, eg. adding a span attribute for the CPU work done by the thread during the span, or recording ambient but changing information about the system eg. RAM usage. Perhaps I'm missing something. Could you clarify how your proposal would satisfy those use cases?

@garthk
Copy link

garthk commented Jul 2, 2022

I had a nagging thought I'd somehow mis-read the spec, came back, and confirmed it. I think I got confused by the scope attributes comparison, but that's on me; I have no suggested edits for clarity. I've struck out my first paragraph above. Do set me straight if I've similarly blundered on the second.

Under OTLP protocol changes, I'd prefer not to treat these attributes as resources. That might seem appropriate for the http.app example but I'm more tempted to use this feature for enduser.id et al.

@Oberon00
Copy link
Member Author

Oberon00 commented Jul 5, 2022

eg. adding a span attribute for the CPU work done by the thread during the span, or recording ambient but changing information about the system eg. RAM usage. Perhaps I'm missing something. Could you clarify how your proposal would satisfy those use cases?

It doesn't satisfy these, the BeforeEnd callback would still be useful for these. But it solves problems closely related to the original problem the BeforeEnd callback was proposed for, namely attaching attributes to (nearly) all spans of a trace. I also doesn't do that 100% in the way the original author of open-telemetry/opentelemetry-specification#1089 proposed (solutions closer to that are discussed in the alternatives section of the OTEP though).

@Oberon00
Copy link
Member Author

Oberon00 commented Jul 5, 2022

Under OTLP protocol changes, I'd prefer not to treat these attributes as resources.

Right, that doesn't make sense. You probably read one of the alternatives, because the main text suggests to implement the attributes such that the Tracers/Meters/... would just attach the attributes directly to the Spans/Metrics as normal attributes, so that exporters won't even be able to tell the difference, so there is no protocol change required. Merging context-scoped attributes with resources is only mentioned as alternative implementation to an optional addition to an alternative 😃

@garthk
Copy link

garthk commented Jul 6, 2022

Indeed I did, and I endorse your conclusion there. Good luck with your OTEP!

Copy link
Contributor

@jmacd jmacd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent!

service, there will often be exactly one span per Tracer (e.g., one span from the HTTP server instrumentation as the request arrives,
and another one from a HTTP client instrumentation, as a downstream service is called)

I suggest calling Scope Attributes emitter scope attributes for better distinguishability.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify this sentence? I think you mean to rename Scope Attributes to Emitter Scope Attributes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scope is shorthand for Instrumentation Scope. We can call them Instrumentation Scope Attributes.

text/0207-context-scoped-attributes.md Show resolved Hide resolved
@Oberon00
Copy link
Member Author

Oberon00 commented Aug 1, 2023

I would still like to see this OTEP merged eventually. But what would be needed here is agreement on what direction it should take regarding Baggage and configurability and how much is required for an initial version of the OTEP.

@Oberon00
Copy link
Member Author

Oberon00 commented Aug 1, 2023

@bogdandrutu I think you didn't reply on this PR since your "request changes" review #207 (review) to which I did answer. If you think it is still applicable nevertheless, please say so. No matter if this PR is closed or not, it would be useful to know for any reopening/next version.

This OTEP aims to address various related demands that have been brought up in the past, where the scope of resource attributes is too broad, but the scope of span attributes is too narrow. For example, this happens where there is a mismatch between the OpenTelemetry SDK’s (and thus TracerProvider’s, MeterProvider’s) process-wide initialization and the semantic scope of a (sub)service.

A concrete example is posed in open issue [open-telemetry/opentelemetry-specification#335](https://github.com/open-telemetry/opentelemetry-specification/issues/335) “Consider adding a resource (semantic convention) to distinguish HTTP applications (http.app attribute)”.
If the opentelemetry-java agent is injected in a JVM running a classical Java Application Server (such as tomcat), there will be a single resource for the whole JVM.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there will be a single resource for the whole JVM

why is that? AppServers are perfectly capable of distinguishing applications

Copy link
Member Author

@Oberon00 Oberon00 Aug 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as my understanding of the Java agent & app servers goes (@trask maybe you can correct me if I'm completely wrong here):

Yes of course App Servers are capable of that distinction. But we have no way to represent it in a OTel resource as the Java agent is injected in JVMs, not web server apps. And I think it is neither currently possible nor sensible to inject the instrumenting Java agent multiple times into the same JVM, targeting different apps. Using manual instrumentation with the SDK would be a different matter, I think there the OOTB behavior would be that every app has a different instance of the SDK and thus a different resource. But for Java, I think not using the agent is rather exotic, and you usually do want to use the agent.

@yurishkuro
Copy link
Member

yurishkuro commented Aug 1, 2023

My alternative proposal is to implement processors in the collection pipeline (like span processor, i.e. different per signal), which take a simple configuration of two arrays containing baggage keys and context keys, and enrich the respective signal with the attributes extracted from Baggage and/or Context.

  • no changes to public APIs, does not require any new roadmaps
  • opt-in solution, does not allow some run-away instrumentation to screw up the rest of the telemetry
  • a low-level, composable building block that can be simplified in the future with higher-level facade if it proves necessary and a generality can be derived from multiple use cases

From reading earlier comments I do not see what is not being solved with this approach. The only glitch is that Context allows an attribute to be an arbitrary object, but it doesn't feel like a major roadblock, buyer beware, the processor will try to convert the non-primitive objects to a string.

@Oberon00
Copy link
Member Author

Oberon00 commented Aug 2, 2023

that can be simplified in the future with higher-level facade

Isn't my proposal exactly that? You suggest to specify a particular implementation so that low-level context APIs can become the public API, while my OTEP specifies a higher-level API (the implementation would most likely be exactly what you suggest anyway though).

About it being opt-in, I think this point I can still add to the OTEP, as I metioned in #207 (comment):

I suggested an extremely minimal "configuration" I could add in #207 (comment), namely a simple per-signal on-off switch that the user would set e.g. when creating the MeterProvider to indicate whether they want context attributes for that signal.

It seems this runaway instrumentation scenario is your greatest concern here. Would that be resolved if we added that minimal configurability? I would rather have it opt-out though, a runaway instrumentation can always mess up your traces anyway in different ways (we could consider opt-out for traces and opt-in for other signals).

@yurishkuro
Copy link
Member

Isn't my proposal exactly that?

Yes, you propose to skip the low-level building blocks and jump straight to high level convenience API, which may never be needed. And there is a huge difference between making changes to the OTEL API vs. OTEL SDK. API changes impose a new contract on all implementations - what if we find that we don't like that contract later? My proposal can be implemented right now with a contrib library -- no spec, nor API, nor even SDK changes.

There is also the fundamental difference of opt-in vs. opt-out. I don't like giving users a gun they can shoot themselves with. The opt-out solution is even worse - they can shoot someone else. A minor upgrade of some 3rd party library that suddenly decides to set these attributes can triple someone's Datadog bill, break existing dashboards, etc. My proposal is no-op until the actual end user decides to enable this functionality.

@Oberon00
Copy link
Member Author

Oberon00 commented Aug 3, 2023

My proposal can be implemented right now with a contrib library -- no spec, nor API, nor even SDK changes.

Is this also possible for metrics & logs or only traces? How would instrumentations interact with that library?

Note that we have this part in the Context API spec which could be problematic for your proposal (https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/context#create-a-key):

The key name exists for debugging purposes and does not uniquely identify the key.

Consequently, there is no API to get an attribute by name, nor to enumerate all context attributes. So the library would need to somehow receive the actual key objects from the instrumentations. Could be done, with two libraries: An API one which just offers a key registration API (key object + configurable name) and a SDK one, which then would offer the actual functionality and access the registered key list.

And after all, I'd still like to have some kind of cross-technology common description of what the API should do, so I'd still like to have this in the spec, as that is the only place for such things. I would be fine with creating a completely new component beside SDK and API for this API extension, although I wonder if this is maybe too much ceremony

Regarding:

The opt-out solution is even worse - they can shoot someone else.

I think I don't understand this. If they disabled context attributes for tracing, whom would they "shoot"?

@yurishkuro
Copy link
Member

I think I don't understand this. If they disabled context attributes for tracing, whom would they "shoot"?

What I mean is an instrumentation library author will be in position to affect all telemetry of an end-user's application. I don't think this is a reasonable ownership model. Today a library author can only affect their library's instrumentation, but the end user can turn it off via named tracers / meters.

@Oberon00
Copy link
Member Author

Oberon00 commented Aug 3, 2023

the end user can turn it off via named tracers

Although there was that intention originally behind the tracer's scope name parameter, such a mechanism was never specified or implemented. I suppose, users would turn a instrumentation off at a different level, by disabling it in their instrumenting agent or not calling the initialization function for example. This would then also help against libraries polluting the context.

Copy link
Member

@trask trask left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comes up a lot in Java auto-instrumentation, would ❤️ to see it move forward

Copy link

@lmolkova lmolkova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a couple of examples in the semconv where this would be extremely helpful:

None of these examples affect baggage, so I wonder if we need to interop with it at all.

The high-cardinality/verbosity/sensitivity issues can be mitigated by

  • instrumentations reusing only known attributes (namespaces)
  • instrumentations allowing to extend the list of known attribute at configuration time

@austinlparker
Copy link
Member

Curious if this could get a once-over to make sure it's still current. Been a lotta changes over the past few years. Generically, I think it's a good proposal (iirc, @jmacd @tedsuo and I whiteboarded out a similar scoping proposal back in 2019 that didn't make it in) and would solve a lot of problems for folks.

@pyohannes
Copy link
Contributor

I also came across use cases where this would be very useful, one in particular was the ability to associate outgoing with incoming requests. This could be achieved if each incoming request sets a well defined context attribute.

I wonder if there's still interest from the original author @Oberon00 in this.

@Oberon00
Copy link
Member Author

Oberon00 commented Mar 27, 2024

I still think the feature is useful, but we kind of hit a stalemate here in the reviews. If anybody wants to meet the demands of reviewers, for example regarding baggage integration and configurability, feel free to submit an extended version of this OTEP, although IIRC it may not be possible as some requests are mutually exclusive, i.e. you probably cannot find a design that fulfills all requests at the same time. At the moment, I don't plan to update this.

@trask
Copy link
Member

trask commented Apr 3, 2024

for those familiar with Java logging MDC, this can be viewed as a generalization of logging MDC to spans, metrics, and logs

@ppkarwasz
Copy link

for those familiar with Java logging MDC, this can be viewed as a generalization of logging MDC to spans, metrics, and logs

For those interested in this subject, we are currently discussing as similar enhancement in Log4j API (cf. apache/logging-log4j2#2385, apache/logging-log4j2#2419 and apache/logging-log4j2#2438).

@lmolkova
Copy link

lmolkova commented Apr 4, 2024

A very similar approach exists in .NET logging called LoggingScope

    using (_logger.BeginScope(new List<KeyValuePair<string, object>>
        {
            new KeyValuePair<string, object>("TransactionId", Guid.NewGuid().ToString()),
        }))
    {
        _logger.LogInformation(MyLogEvents.GetItem, "Getting item {Id}", id);
    }

logging providers then have access to the scoped property bag and can add zero or more scope properties to logs.

@MikeGoldsmith
Copy link
Member

I think this has a lot of merit and has a similar concept to how the Baggage Span processors which have been contributed to a number of SDK contrib projects. The baggage span processors were born out of a need to replicate behaviour from the Honeycomb Beelines where trace level fields would be copied onto all spans.

I'd be happy to review the current proposal, concerns and try to move this issue forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request priority:p1 stale This issue or PR is stale and will be closed soon unless it is resurrected by the author. triaged
Projects
Status: Spec - Priority Backlog
Development

Successfully merging this pull request may close these issues.

None yet