Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression in Webflux test when explicitely providing tracing headers #1846

Closed
jackcat13 opened this issue Feb 11, 2021 · 25 comments
Closed

Comments

@jackcat13
Copy link

Describe the bug
Re-opening spring-projects/spring-cloud-sleuth#1507 because it looks like there is a regression linked to this bug.

Reminder of the encountered issue:

Once using Sleuth and injecting Tracing headers in a WebTestClient I was able to retrieve them in Reactor's context later on like this :

// Test
webTestClient.get()
    .uri("/api/tracing")
    .header("X-B3-TraceId", "463ac35c9f6413ad48485a3953bb6124")
    .header("X-B3-SpanId", "a2fb4a1d1a96d312")

// Controller
@GetMapping
    fun spanId(): Mono<String> {
        return Mono.subscriberContext()
                .filter { it.hasKey(Span::class.java) }
                .map { it[Span::class.java].context() }
                .map { it.spanIdString() }
    }

When comparing spanId and parentId present in the org.springframework.cloud.sleuth.TraceContext object, it appears that values are different than the spanId and parentId set in the headers.

Spring libraries versions used to reproduce this issue:

  • spring Boot 2.4.2.RELEASE
  • spring Framework 5.3.3.RELEASE
  • spring-test 5.3.3.RELEASE
  • spring-cloud-sleuth; 3.0.1

Sample
Sample available here
The branch 2021 contains latest changes to reproduce the issue with latest libraries (project forked from the initial reproducer)
The issue can be reproduced by executing the test com.spring.reproducer.spring.test.sleuth.reproducer.TracingResourceTest

@janekbettinger
Copy link

Is this bug report only about tests and WebTestClient? When specifying a trace and span ID in the request header for Spring Cloud Gateway (based on WebFlux / Reactor) the values are ignored and new traces are generated by the application -- like in the test case linked above.

  • Spring Boot 2.4.2
  • Spring Cloud 2020.0.1
    • Spring Cloud Sleuth 3.0.1
    • Spring Cloud Gateway 3.0.1

@marcingrzejszczak
Copy link
Contributor

What are the headers that are being passed?

@janekbettinger
Copy link

janekbettinger commented Feb 15, 2021

I've tried the B3 headers as well as the newer ones (localhost:9095 is a Spring Cloud Gateway instance):

curl -sv http://localhost:9095/sample-route -H "X-B3-TraceId: explicit_X-B3-TraceId" -H "X-B3-SpanId: explicit_X-B3-SpanId" -H "traceId: explicit_traceId" -H "spanId: explicit_spanId"

The logs (or MDC) does not show the "explicit" values but new traces like fa2ed5b4165d32e9.

Furthermore, also the downstream service that is called by the gateway receives the trace IDs generated by the gateway instead of the manually passed ones:

2021-02-15 10:37:02.214 DEBUG [my-downstream-service,,] 4256 --- [io-8089-exec-10] o.a.coyote.http11.Http11InputBuffer : Received [GET /foo HTTP/1.1
User-Agent: curl/7.68.0
traceId: explicit_traceId
spanId: explicit_spanId
X-Forwarded-Port: 9095
X-Forwarded-Host: 172.28.96.1:9095
X-B3-TraceId: fa2ed5b4165d32e9
X-B3-SpanId: e96af85cf01d6eda
X-B3-ParentSpanId: fa2ed5b4165d32e9
X-B3-Sampled: 0
...
]

Note that the traceId and spanId we see in the request headers of the downstream service is probably just a "pass-through" by the gateway. I'd expect that the X-B3-TraceId value equals explicit_traceId as well.

@marcingrzejszczak
Copy link
Contributor

X-B3-TraceId and X-B3-SpanId must be valid longs. If they are not then AFAIR we're generating new ones

@janekbettinger
Copy link

Gosh... indeed... yeah... it's working as expected with proper values... seems like I've fooled myself during testing ;-)
Thanks for the fast response, @marcingrzejszczak and sorry for hijacking this bug report!

@marcingrzejszczak
Copy link
Contributor

No problem, glad that things are working fine :)

@jackcat13
Copy link
Author

Hello,
For information, I face the issue in the reproducer using both the parentId and spanId (Reproduced with X-B3 field as well as the new spanId and parentId fields). When setting these fields explicitely in the header, I get different ones in the response body

@marcingrzejszczak
Copy link
Contributor

Actually there are no new spanId and parentId - those are only set in MDC in a new format. For B3 headers you can either set the X-B3... or the b3 field that has both span, trace and optionally parent ids.

For information, I face the issue in the reproducer using both the parentId and spanId (Reproduced with X-B3 field as well as the new spanId and parentId fields). When setting these fields explicitely in the header, I get different ones in the response body

you're saying that you send (the ids are just a sample)

$ curl -sv http://localhost:9095/sample-route -H "X-B3-TraceId: 1" -H "X-B3-SpanId: 2

and in the gateway those headers are ignored and traceid and spanid are set to other values?

@jackcat13
Copy link
Author

Yes, as you can see in the following image, I'm doing the curl command and the value set in the header is ignored:

headerIssue

@marcingrzejszczak
Copy link
Contributor

I don't think 1 and 2 are valid values. Can you set propert values there? E.g.

$ curl -sv http://localhost:9095/sample-route -H "X-B3-TraceId: fa2ed5b4165d32e9" -H "X-B3-SpanId: fa2ed5b4165d32e9"

@janekbettinger
Copy link

janekbettinger commented Feb 15, 2021

Interestingly it does not work with ID values smaller than 10. For me it is also not working with X-B3-TraceId: 1 (GW generates new values), but it does with e.g. X-B3-TraceId: 10.

Update: hehe, simultaneous post :)

@jackcat13
Copy link
Author

I also have another value set :

headerIssue

@marcingrzejszczak
Copy link
Contributor

I have no idea what is the thing that you're returning. Are values in the logs wrong? Is the trace data propagated properly? What do the Sleuth logs say?

@jackcat13
Copy link
Author

Let me try to summarize what I'm doing when I execute my test. The first step is a call using the webTestClient where I set the headers:

webTestClientTestCall

At this step, the spandId set in the header is "a2fb4a1d1a96d312"

Then, when I execute this test in debug mode, I'm able to visualize the content of the TraceContext instance (import org.springframework.cloud.sleuth.TraceContext)
I can visualize the following content :

spanIdDifferentInDebug

My understanding is that the value of the spanId is never propagated to the TraceContext.

If you need some sleuth log, I have to setup logging in the reproducer. I can do it in the meantime.

@marcingrzejszczak
Copy link
Contributor

Wait a second - the trace context looks ok, the spanid is the child of the a2f... span. I don't understand what the problem is? We automatically created a child span.

@jackcat13
Copy link
Author

jackcat13 commented Feb 15, 2021

Well, I'm calling only one service from my test and my understanding is that the spandId and parentId would be equal to the ones I'm setting in the header.

Am I misunderstanding how the traceContext should be instanciated?

@marcingrzejszczak
Copy link
Contributor

It depends on what you're calling. What does happen properly is that on the client side the request headers were set properly and the trace context was successfully propagated. I have no idea what your controller does and how it looks like.

@jackcat13
Copy link
Author

jackcat13 commented Feb 15, 2021

The controller does nothing more than what I shared in the reproducer : Controller link here

the steps are:

  1. Call from the JUnit test
  2. SpanId returned by the Controller
  3. The test checks if the spanId returned by the controller is equal to the spanId set in the header

@marcingrzejszczak
Copy link
Contributor

Most likely that's a component somewhere in the reactor instrumentation that is creating a child span. I don't see a bug here nor a regression. The context is properly propagated.

@jackcat13
Copy link
Author

jackcat13 commented Feb 15, 2021

What confuses me is that it was the same issue in the following ticket: pring-projects/spring-cloud-sleuth#1507

In older versions of spring boot, the test in the reproducer actually passed (that's why I consider it a regression because the test passed until we move to Spring boot 2.4). The reproducer is a simple extract of an automated test of our project so we have detected that the behaviour around the propagation of the spanId/parentId seems different

@marcingrzejszczak
Copy link
Contributor

Ok I'll triple check this when I have time. Thanks for the reproducer.

@marcingrzejszczak
Copy link
Contributor

I'm sorry but I can't import the project to my IDE. Can you rewrite the sample to Java and Maven?

@jackcat13
Copy link
Author

Hello @marcingrzejszczak, I'm going to take a look to convert it today in java/maven on a separate brach

@jackcat13
Copy link
Author

@marcingrzejszczak You can find the converted reproducer at the following location : here
The test class to execute is TracingResourceTest.java

@marcingrzejszczak
Copy link
Contributor

marcingrzejszczak commented Mar 18, 2021

I know what the problem is. Brave has the notion of Propagation and Propagation.Factory. When I introduced the composite of propagation types I have not overriden the Propagation.Factory methods. That means that what you see that happens is that supportJoin is equal to false. The trace headers are properly parsed but since support join is disabled it creates a new child span.

As a workaround what you can do is create your own bean like this

@Bean
	BaggagePropagation.FactoryBuilder myBaggagePropagationFactoryBuilder() {
		return BaggagePropagation.newFactoryBuilder(B3Propagation.newFactoryBuilder().injectFormat(B3Propagation.Format.SINGLE_NO_PARENT).build());
	}

and then you'll override the faulty composite one that I'll try to fix asap.

@marcingrzejszczak marcingrzejszczak added this to To do in 2020.0.3 via automation Mar 18, 2021
@marcingrzejszczak marcingrzejszczak added this to the 3.0.3 milestone Mar 18, 2021
@spencergibb spencergibb moved this from To do to Done in 2020.0.3 Apr 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
2020.0.3
  
Done
Development

No branches or pull requests

4 participants