Allow adding links after span creation #2278

pyohannes · 2022-01-19T23:35:22Z

Fixes #454

In a discussion in the specification SIG, it was deemed feasible to revise a further decision about not allowing adding links to spans after span creation.

Adding links after span creation allows to create more meaningful traces in certain scenarios (several of those scenarios were identified by the messaging workgroup). Best-effort sampling based on links is still possible, only based on links added at creation time (similar to attributes).

Changes

This moves the section about adding span links from "Span creation" to "Span operations", removing mentions of the limitation that links can only be added at creation time. The name AddLink is proposed, wording is kept consistent with the corresponding section for adding attributes.

bogdandrutu · 2022-01-19T23:55:51Z

My 2cents: By approving this PR we are allowing instrumentation to use the new API, instead of trying to pass all the links at start time. This will cause instrumentations to not even consider grouping all the links at the start which will mean in lots of cases the instrumentation will produce incomplete traces since the sampler is not working in this case.

We are opening a pandora box without thinking of implications, and this may cause troubles on short to medium term until a proper solution for the samplers to work is proposed.

pyohannes · 2022-01-20T01:05:24Z

By approving this PR we are allowing instrumentation to use the new API, instead of trying to pass all the links at start time. This will cause instrumentations to not even consider grouping all the links at the start which will mean in lots of cases the instrumentation will produce incomplete traces since the sampler is not working in this case.

That's a valid point: this might change the shape of some traces when one is using custom samplers (none of the specified samplers use links for now).

We are opening a pandora box without thinking of implications, and this may cause troubles on short to medium term until a proper solution for the samplers to work is proposed.

I think it would be beneficial to make a decision as soon as possible (whether to allow links after creation time or not to) for the following reasons:

It will give us a clear framework to work with when coming up with semantic conventions. If we now decide to go with a sub-optimal solution that is based on the current limitation, we will be stuck with this sub-optimal solution (as we want to stabilize some semantic conventions this quarter), even when the spec will remove the limitation at a later point.
When we want to remove this limitation, it's probably better to do it know than later. Links are not widely used now, but probably will be used more and more with time. The longer we wait with a decision, the harder it will be to remove the limitation.

pyohannes · 2022-01-20T02:03:52Z

This will cause instrumentations to not even consider grouping all the links at the start which will mean in lots of cases the instrumentation will produce incomplete traces since the sampler is not working in this case.

Sampling in connection with links is a very tricky thing, as I found out by implementing a link-based sampler myself. Even with the current model, it's impossible to avoid incomplete traces: how will you sample a span that has a non-sampled parent, but sampled links? The only way to guarantee complete traces would be to limit adding links only to root spans of a trace.

Producing incomplete traces with link-based samplers is not a problem that is introduced with this change, although this change increases the scope of this problem.

carlosalberto · 2022-01-20T17:57:44Z

We are opening a pandora box without thinking of implications, and this may cause troubles on short to medium term until a proper solution for the samplers to work is proposed.

The same happens for attributes, but we allow that anyway. I feel like a 'big' warning should be put in the related sections, mentioning that instrumentation code SHOULD try its best to set links/attributes at Span build time.

jmacd · 2022-01-20T19:17:18Z

I want to add a few remarks about sampling and "completeness", since that is a major topic in #2047.

By approving this PR we are allowing instrumentation to use the new API, instead of trying to pass all the links at start time. This will cause instrumentations to not even consider grouping all the links at the start which will mean in lots of cases the instrumentation will produce incomplete traces since the sampler is not working in this case.

Challenging this statement @bogdandrutu: if there's a real problem created as a result of the new API, that is to say that many developers opt to use AddLink() instead of passing to Start(), then users will have "incompleteness" and they may be motivated to ask why the instrumented code did not use the Start() API. If there's a will and a way, the right thing will happen. OTOH, the reason for this change is that sometimes there is literally not a way to know your links when you start, so we should allow this new API.

I also want to challenge the idea that anyone using links actually expects a guarantee of "completeness" when they are sampling. Actually, the term "completeness" is not defined the way it's being used in this discussion. To me, the point of a Link is to connect one trace to another, and there is no notion of completeness for trace and its linked-to-traces that we have defined (see #1929). Completeness is about whether all the spans for a given TraceID have been collected (per #2047).

I do not expect linked-to traces to be collected with a 100% guarantee if I'm sampling. One reason to use links is to avoid huge or never-ending traces, so I am usually explicitly not interested in having them all collected. There are two directions a user could go if they want to ensure a sampled trace has its linked-to traces also collected.

Non-probabilistically: if the cost of tracing is not a concern, making a non-probabilistic decision to sample a trace whenever any of its provided-at-start-time linked-to contexts is also being traced. If the cost of tracing is a concern, it's not clear how to rate-limit this decision.
Probabilistically: if probability sampling is being used, then each linked-to trace has an independent probability of being collected. If you have 1000 links and use 1/2 sampling for the linked-to traces, you should expect approximately 500 of them to be collected. (This, to me, is working as intended.)

jmacd · 2022-01-20T19:20:01Z

how will you sample a span that has a non-sampled parent, but sampled links? The only way to guarantee complete traces would be to limit adding links only to root spans of a trace.

#2047 defines a notion of sub-trace completeness that can be used here. Generally speaking, with sampling you have the potential for incompleteness, think of it as a feature not a bug. 😁

specification/trace/api.md

jmacd · 2022-01-21T00:55:55Z

several of those scenarios were identified by the messaging workgroup

It may be helpful to link to a summary of those scenarios.

the reason for this change is that sometimes there is literally not a way to know your links when you start

One of the reasons I've wanted to add links after a span starts is:

a. Span A and B start
b. Span A acquires a shared resource
c. Span B tries to acquire a shared resource, but is blocked by Span A
d. Instrumentation adds a link to Span B ("blocked by")
e. Instrumentation adds a link to Span A ("blocking")

pyohannes · 2022-01-21T01:13:07Z

The same happens for attributes, but we allow that anyway. I feel like a 'big' warning should be put in the related sections, mentioning that instrumentation code SHOULD try its best to set links/attributes at Span build time.

@carlosalberto This is mentioned (I took this over from attributes), I'm not sure if it should be emphasized more:

Whenever possible, users SHOULD set any already known links at span creation instead of calling AddLink later.

pyohannes · 2022-01-21T01:17:17Z

several of those scenarios were identified by the messaging workgroup

It may be helpful to link to a summary of those scenarios.

One is briefly sketched here: #454 (comment)

We also plan to present our current design for messaging instrumentation to a wider audience, this will be covered there: #454 (comment)

github-actions · 2022-01-28T03:17:03Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

pyohannes · 2022-01-29T01:38:30Z

I will bring this up in the next specification SIG meeting, to discuss next steps for resolving open questions.

Aneurysm9 · 2022-02-08T16:13:23Z

specification/trace/api.md

@@ -402,30 +407,6 @@ A `SpanContext` cannot be set as active in a `Context` directly, but by
 [wrapping it into a Span](#wrapping-a-spancontext-in-a-span).
 For example, a `Propagator` performing context extraction may need this.

-#### Specifying links


Should we keep this section? I don't think that we can remove the existing API for specifying links at creation time. Instead, we should simply remove the last sentence of the first paragraph here:

~~Links cannot be added after Span creation~~

I don't think that we can remove the existing API for specifying links at creation time.

Right, the API for creating a span is still required to accept a list of links (see above). In addition, a method for adding links after span creation is added as a should.

I tried to structure this in a way similar to attributes, where the situation is similar (attributes can be added at creation time and after span creation). I'm open to change this, if it's clearer in another way.

I was about to jump on the same train, but I've found that the diff is misleading. By reading the whole document (without diff context), the order of the document introduces the creation first and adding links comes later and is discouraged in the span creation section.

My only concern is that the "Link"s link in the "Spans encapsulate:" part jumps straight to the "Add Links" section which may be misleading. Perhaps we can create a separate "Links" subsection that describes what a link is (arguably the same could be done for Events however Events don't make sense at create time)

Perhaps we can create a separate "Links" subsection that describes what a link is (arguably the same could be done for Events however Events don't make sense at create time)

Any other opinions on this? Otherwise, I'll go ahead the change the PR according to the suggestion.

specification/trace/api.md

CHANGELOG.md

specification/trace/api.md

pyohannes · 2022-02-15T16:44:41Z

These are the diagrams regarding messaging use cases presented in today's spec SIG. For the pull scenario solution presented below, it would be necessary to add links after span creation.

Details and related discussions are captured in this draft OTEP: open-telemetry/oteps#192

Push scenarios

const consumer = kafka.consumer({ groupId: 'my-group' })
await consumer.run({
 eachBatch: async ({ batch, offset, heartbeat, co, uco, isRunning, isStale }) => {
     for (let message of batch.messages) {
         console.log({
             key: message.key.toString(),
             value: message.value.toString(),
             headers: message.headers,
         })
     }
 },
})

Pull scenarios

const res = await client.receive();
res.messages.forEach(() => console.log('operation with side effect'));

bogdandrutu · 2022-02-18T18:11:36Z

@jmacd

One of the reasons I've wanted to add links after a span starts is:
a. Span A and B start
b. Span A acquires a shared resource
c. Span B tries to acquire a shared resource, but is blocked by Span A
d. Instrumentation adds a link to Span B ("blocked by")
e. Instrumentation adds a link to Span A ("blocking")

This is a great example, but in this case the "link" has also associated a "relation". Which I think is important to have if we start allowing links after.

The "default" case that we had so far was batching and the child of relation. With this new proposal we open the possibilities to way more relations and we probably need to think about this and add that to the API probably.

As per @pyohannes picture for the "Pull scenarios", the relation between the Span and the "linked context" is no longer a child of, so I think we need to document that in the link. The relation can be a semantic convention, but I think is important to have it.

@pyohannes for the "Pull scenarios":

Somehow I don't like the proposed way to model this, the reason is that (independent of adding links at start/end) I would prefer to have another Span Execute that wraps the 2 console.log ops as a sibling to Receive. That would feel much more natural.

pyohannes · 2022-02-22T22:06:51Z

@bogdandrutu

The "default" case that we had so far was batching and the child of relation. With this new proposal we open the possibilities to way more relations and we probably need to think about this and add that to the API probably.

I think in this context, even links in batching use cases modelled "child-of" relations: it's the case where one child has multiple parents.

As per @pyohannes picture for the "Pull scenarios", the relation between the Span and the "linked context" is no longer a child of, so I think we need to document that in the link. The relation can be a semantic convention, but I think is important to have it.

Yes, links could greatly benefit from additional semantic context. In the messaging use cases, we'd always have links between one span with kind producer, and one span with kind consumer. This way semantic context could be inferred from the spans that are linked. But having it explicitly attached to a link. As it is already possible to attach attributes to a link, it's might just be a matter of coming up with semantic conventions for attributes on links?

@pyohannes for the "Pull scenarios":

Somehow I don't like the proposed way to model this, the reason is that (independent of adding links at start/end) I would prefer to have another Span Execute that wraps the 2 console.log ops as a sibling to Receive. That would feel much more natural.

As mentioned, this is independent of adding links after span creation. In the messaging workgroup, we were discussing "Process" spans (same as "Execute" spans). The aim was to come up with a model that can easily be extended by "Process" spans. However, we're not yet clear about all the details, and we'd also like to clear up whether we'd be able to create a "Receive" span as sketched above, as "Process" spans might look different in a different model.

github-actions · 2022-04-06T03:20:08Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

Co-authored-by: Tyler Yahn <MrAlias@users.noreply.github.com>

pyohannes · 2022-04-12T23:53:23Z

This is still being discussed with @bogdandrutu in the Thursday messaging workgroup.

github-actions · 2022-04-20T03:46:08Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

github-actions · 2022-04-27T03:46:36Z

Closed as inactive. Feel free to reopen if this PR is still being worked on.

nerochiaro · 2022-10-14T12:15:17Z

I came here looking for a resolution of this issue, but it seems that the last significant comment was back in April

This is still being discussed with @bogdandrutu in the Thursday messaging workgroup.

I am curious to understand if the discussion has simply stalled and has been abandoned or if there is a possibility that at some point opentelemetry will allow adding links to spans that have already been created.

pyohannes · 2022-10-17T23:27:06Z

I am curious to understand if the discussion has simply stalled and has been abandoned or if there is a possibility that at some point opentelemetry will allow adding links to spans that have already been created.

The discussion has stalled but is not yet settled, issue #454 is still open. @nerochiaro Maybe you could add your use case for adding span links after span creation to this issue?

nerochiaro · 2022-10-18T10:19:36Z

I added more comments to issue #454, but in short the use case would be when using the lambda automatic instrumentation.

A span for the lambda function is created automatically by the instrumentation's function wrapper, and there is no way to tell it to add links (for example, a link to the span created by a lambda authorizer that ran before the function itself).

Oberon00 · 2022-10-18T11:25:40Z

For that use case you don't need to add links after creation, if you can tell the wrapper to add the links right away.

nerochiaro · 2022-10-18T11:53:50Z

For that use case you don't need to add links after creation, if you can tell the wrapper to add the links right away.

Except that, as far as I can tell, the wrapper creates the span then runs the lambda, without accepting any input.
I would have to write my own wrapper which duplicates all the code of the default one, and add code to create links to that.

(Or possibly the wrapper could accept a configuration option that specifies a "hook" function that given the event and context would generate extra links and attributes to add to the lambda, but it seems convoluted and with drawbacks. see my comment on #454 for more details.)

I'm not saying it's not feasible, but certainly not ideal, especially when lambdas running behind API Gateway with an authorizer lambda are a pretty common case. Maybe this scenario could be modeled some other way that I'm not seeing, without the use of links ?

joaopgrassi · 2022-10-19T16:26:33Z

I removed the stale label, but I wonder if we should focus the discussion on the issue or here in this PR. If here, then maybe re-opening the PR to get attention again?

carlosalberto · 2022-10-19T17:01:43Z

I'd recommend discussing this in the issue itself (e.g. sometimes there's more than one possible PR for a given issue, so general discussion should be in the issue itself).

pyohannes requested review from a team as code owners January 19, 2022 23:35

github-actions bot assigned arminru Jan 19, 2022

jmacd approved these changes Jan 20, 2022

View reviewed changes

yurishkuro reviewed Jan 20, 2022

View reviewed changes

specification/trace/api.md Outdated Show resolved Hide resolved

github-actions bot added the Stale label Jan 28, 2022

pyohannes force-pushed the create-span-link-after-creation branch from 9dc2cdd to 5fcc01f Compare January 29, 2022 01:37

github-actions bot removed the Stale label Jan 29, 2022

pyohannes mentioned this pull request Feb 4, 2022

Stabilize messaging semantic conventions for tracing open-telemetry/oteps#192

Closed

Aneurysm9 reviewed Feb 8, 2022

View reviewed changes

MrAlias reviewed Feb 15, 2022

View reviewed changes

specification/trace/api.md Outdated Show resolved Hide resolved

CHANGELOG.md Outdated Show resolved Hide resolved

specification/trace/api.md Outdated Show resolved Hide resolved

specification/trace/api.md Show resolved Hide resolved

bogdandrutu mentioned this pull request Feb 24, 2022

Define span structure for HTTP retries and redirects #2078

Merged

pyohannes mentioned this pull request Feb 25, 2022

Add documentation requirement for adding attributes at span creation #2383

Merged

jmacd mentioned this pull request Mar 1, 2022

Support non-UTF8 payloads (per MQTT specification) mochi-mqtt/server#29

Merged

denisivan0v mentioned this pull request Mar 2, 2022

Add span links for HTTP retries and redirects #2393

Closed

lalitb mentioned this pull request Mar 25, 2022

Links between Spans from different Traces open-telemetry/opentelemetry-cpp#1279

Closed

reyang added area:api Cross language API specification issue spec:trace Related to the specification/trace directory labels Mar 29, 2022

github-actions bot added the Stale label Apr 6, 2022

pyohannes and others added 9 commits April 7, 2022 16:38

Allow adding links after span creation

1bf8935

Add changelog and adapt spec compliance matrix

9c5005e

MUST -> SHOULD (to keep APIs stable)

c2d4b93

Fix broken links

f07c72e

Update CHANGELOG.md

3e96d44

Co-authored-by: Tyler Yahn <MrAlias@users.noreply.github.com>

Add documentation requirement for adding links

15f5b5c

PR comment

91a88f3

Fix changelog

5431922

Fix matrix

8aa3d5d

pyohannes force-pushed the create-span-link-after-creation branch from 152e5b6 to 8aa3d5d Compare April 7, 2022 23:43

github-actions bot removed the Stale label Apr 13, 2022

github-actions bot added the Stale label Apr 20, 2022

github-actions bot closed this Apr 27, 2022

joaopgrassi removed the Stale label Oct 19, 2022

pyohannes mentioned this pull request Feb 7, 2023

Allow adding links after span creation #3186

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow adding links after span creation #2278

Allow adding links after span creation #2278

pyohannes commented Jan 19, 2022

bogdandrutu commented Jan 19, 2022

pyohannes commented Jan 20, 2022 •

edited

pyohannes commented Jan 20, 2022

carlosalberto commented Jan 20, 2022

jmacd commented Jan 20, 2022 •

edited

jmacd commented Jan 20, 2022

jmacd commented Jan 21, 2022

pyohannes commented Jan 21, 2022

pyohannes commented Jan 21, 2022

github-actions bot commented Jan 28, 2022

pyohannes commented Jan 29, 2022

Aneurysm9 Feb 8, 2022

pyohannes Feb 8, 2022

eyjohn Feb 8, 2022

pyohannes Feb 8, 2022

pyohannes commented Feb 15, 2022 •

edited

bogdandrutu commented Feb 18, 2022

pyohannes commented Feb 22, 2022

github-actions bot commented Apr 6, 2022

pyohannes commented Apr 12, 2022

github-actions bot commented Apr 20, 2022

github-actions bot commented Apr 27, 2022

nerochiaro commented Oct 14, 2022

pyohannes commented Oct 17, 2022

nerochiaro commented Oct 18, 2022

Oberon00 commented Oct 18, 2022

nerochiaro commented Oct 18, 2022 •

edited

joaopgrassi commented Oct 19, 2022 •

edited

carlosalberto commented Oct 19, 2022

Allow adding links after span creation #2278

Allow adding links after span creation #2278

Conversation

pyohannes commented Jan 19, 2022

Changes

bogdandrutu commented Jan 19, 2022

pyohannes commented Jan 20, 2022 • edited

pyohannes commented Jan 20, 2022

carlosalberto commented Jan 20, 2022

jmacd commented Jan 20, 2022 • edited

jmacd commented Jan 20, 2022

jmacd commented Jan 21, 2022

pyohannes commented Jan 21, 2022

pyohannes commented Jan 21, 2022

github-actions bot commented Jan 28, 2022

pyohannes commented Jan 29, 2022

Aneurysm9 Feb 8, 2022

Choose a reason for hiding this comment

pyohannes Feb 8, 2022

Choose a reason for hiding this comment

eyjohn Feb 8, 2022

Choose a reason for hiding this comment

pyohannes Feb 8, 2022

Choose a reason for hiding this comment

pyohannes commented Feb 15, 2022 • edited

Push scenarios

Pull scenarios

bogdandrutu commented Feb 18, 2022

pyohannes commented Feb 22, 2022

github-actions bot commented Apr 6, 2022

pyohannes commented Apr 12, 2022

github-actions bot commented Apr 20, 2022

github-actions bot commented Apr 27, 2022

nerochiaro commented Oct 14, 2022

pyohannes commented Oct 17, 2022

nerochiaro commented Oct 18, 2022

Oberon00 commented Oct 18, 2022

nerochiaro commented Oct 18, 2022 • edited

joaopgrassi commented Oct 19, 2022 • edited

carlosalberto commented Oct 19, 2022

pyohannes commented Jan 20, 2022 •

edited

jmacd commented Jan 20, 2022 •

edited

pyohannes commented Feb 15, 2022 •

edited

nerochiaro commented Oct 18, 2022 •

edited

joaopgrassi commented Oct 19, 2022 •

edited