Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add TraceAbleRequestContextStorage for ease of detect context leaks #4232

Merged
merged 46 commits into from Sep 5, 2022

Conversation

klurpicolo
Copy link
Contributor

@klurpicolo klurpicolo commented Apr 26, 2022

Motivation:

  • Context leaks are hard to find because an exception does not tell where/which context is pushed without poping. By using TraceAbleRequestContextStorage, it helps to report the source of context leaks.
  • Details as mentioned in Provide a way to debug a context leak #4100

By the way, Thanks to @anuraaga for giving a reference to read on opentelemetry.

Modifications:

  • Add TraceAbleRequestContextStorage that stores RequestContext stack trace and reports to the user where it happens.
  • Add requestContextLeakDetectionSampler flag that users can use for enable leak detection. Users can enable it by either system property or SPI flag provider.

Result:

How to enable:

  1. By system property
    -Dcom.linecorp.armeria.requestContextLeakDetectionSampler=<sampler-spec>

  2. By providing FlagsProvider SPI

public final class EnableLeakDetectionFlagsProvider implements FlagsProvider {

    @Override
    public Sampler<? super RequestContext> requestContextLeakDetectionSampler() {
        return Sampler.always();
    }
    ...
}
  1. By providing RequestContextStorageProvider SPI (not recommend since RequestContextStorageProvider SPI'll be remove as mentioned in #Remove SPI-based RequestContextStorageProvider lookup  #4211 )
public final class CustomRequestContextStorageProvider implements RequestContextStorageProvider {

    @Override
    public RequestContextStorage newStorage() {
        return new TraceAbleRequestContextStorage(delegate);
    }
}

Use case:
Users problematic code

executor.execute(() -> {
    SafeCloseable leaked = fooCtx.push(); //This causes Request context leaks!
    ...
});

executor.execute(() -> {
    try (SafeCloseable ignored = barCtx.push()) { //Exception happen here
        ...
    }
});

The above code will produce an error as below. Therefore, users can check the stack trace that which line causes context leaks.

java.lang.IllegalStateException: Trying to call object wrapped with context [%New RequestContext%], but context is currently set to TraceableServiceRequestContext[%Previous RequestContext%]
com.linecorp.armeria.internal.common.LeakTracingRequestContextStorage$PendingRequestContextStackTrace: At thread [armeria-testing-eventloop-nio-1-1] previous RequestContext is pushed at the following stacktrace
	at com.linecorp.armeria.internal.common.LeakTracingRequestContextStorage$TraceableServiceRequestContext.<init>(LeakTracingRequestContextStorage.java:111)
	at com.linecorp.armeria.internal.common.LeakTracingRequestContextStorage$TraceableServiceRequestContext.<init>(LeakTracingRequestContextStorage.java:105)
	at com.linecorp.armeria.internal.common.LeakTracingRequestContextStorage.warpRequestContext(LeakTracingRequestContextStorage.java:82)
	at com.linecorp.armeria.internal.common.LeakTracingRequestContextStorage.push(LeakTracingRequestContextStorage.java:62)
	at com.linecorp.armeria.internal.common.RequestContextUtil.getAndSet(RequestContextUtil.java:149)
	at com.linecorp.armeria.server.ServiceRequestContext.push(ServiceRequestContext.java:221)
	at com.linecorp.armeria.internal.common.TraceRequestContextLeakTest.lambda$singleThreadContextLeak$2(TraceRequestContextLeakTest.java:101)  <- This is the line where leaked RequestContext is push 
	...
. This means the callback was called from unexpected thread or forgetting to close previous context.
	at com.linecorp.armeria.internal.common.RequestContextUtil.newIllegalContextPushingException(RequestContextUtil.java:100)
	at com.linecorp.armeria.server.ServiceRequestContext.push(ServiceRequestContext.java:237)
	at com.linecorp.armeria.internal.common.TraceRequestContextLeakTest.lambda$singleThreadContextLeak$3(TraceRequestContextLeakTest.java:107)
	...

@codecov
Copy link

codecov bot commented Apr 26, 2022

Codecov Report

Merging #4232 (1302481) into master (26e57a8) will increase coverage by 0.09%.
The diff coverage is 81.35%.

❗ Current head 1302481 differs from pull request most recent head 9e8e10e. Consider uploading reports for the commit 9e8e10e to get more accurate results

@@             Coverage Diff              @@
##             master    #4232      +/-   ##
============================================
+ Coverage     73.82%   73.92%   +0.09%     
- Complexity    17698    17797      +99     
============================================
  Files          1505     1510       +5     
  Lines         66105    66262     +157     
  Branches       8309     8321      +12     
============================================
+ Hits          48805    48981     +176     
+ Misses        13302    13275      -27     
- Partials       3998     4006       +8     
Impacted Files Coverage Δ
...ava/com/linecorp/armeria/common/FlagsProvider.java 94.73% <0.00%> (+94.73%) ⬆️
...meria/common/ThreadLocalRequestContextStorage.java 93.75% <0.00%> (+6.25%) ⬆️
...ava/com/linecorp/armeria/common/util/Samplers.java 100.00% <ø> (+4.16%) ⬆️
...rp/armeria/common/SystemPropertyFlagsProvider.java 57.74% <33.33%> (+2.03%) ⬆️
.../com/linecorp/armeria/common/util/Unwrappable.java 83.33% <33.33%> (-16.67%) ⬇️
...ernal/common/LeakTracingRequestContextStorage.java 90.32% <90.32%> (ø)
.../linecorp/armeria/client/ClientRequestContext.java 90.47% <100.00%> (ø)
.../linecorp/armeria/common/DefaultFlagsProvider.java 91.42% <100.00%> (+0.12%) ⬆️
...c/main/java/com/linecorp/armeria/common/Flags.java 67.79% <100.00%> (+1.02%) ⬆️
...rp/armeria/internal/common/RequestContextUtil.java 76.05% <100.00%> (+9.90%) ⬆️
... and 41 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

throw pendingRequestCtx.get();
}
}
pendingRequestCtx.set(new PendingRequestContextStackTrace(toPush));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we take a Sampler as a constructor parameter so that users can customize the sampling rate?
If not specified, I prefer to follow DefaultFlagsProvider#VERBOSE_EXCEPTION_SAMPLER_SPEC as the sensible default value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to refer to

if (Flags.verboseExceptionSampler().isSampled(getClass())) {
super.fillInStackTrace();
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, Thanks.🙏

* A {@link RequestContextStorage} which keeps track of {@link RequestContext}s, reporting pushed thread
* information if a {@link RequestContext} is leaked.
*/
public final class TraceAbleRequestContextStorage implements RequestContextStorage {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about?

Suggested change
public final class TraceAbleRequestContextStorage implements RequestContextStorage {
public final class LeakTracingRequestContextStorage implements RequestContextStorage {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good name.

@ikhoon ikhoon added this to the 1.17.0 milestone Apr 28, 2022
@klurpicolo
Copy link
Contributor Author

Do I need to do benchmarking for this pull request?

@ikhoon
Copy link
Contributor

ikhoon commented Apr 29, 2022

Do I need to do benchmarking for this pull request?

+1 It would be good information to set the sensible default value of the sample rate.

Benchmark result:
```
Benchmark                                                              Mode  Cnt          Score         Error  Units
LeakTracingRequestContextStorageBenchmark.baseline_threadLocal        thrpt    5  138905863.970 ± 1832287.597  ops/s
LeakTracingRequestContextStorageBenchmark.leakTracing_always_sample   thrpt    5    1347325.018 ±    7907.388  ops/s
LeakTracingRequestContextStorageBenchmark.leakTracing_never_sample    thrpt    5   10762418.543 ±   43741.747  ops/s
LeakTracingRequestContextStorageBenchmark.leakTracing_random_1        thrpt    5    9351586.968 ±    8154.426  ops/s
LeakTracingRequestContextStorageBenchmark.leakTracing_random_10       thrpt    5    5915767.799 ±   15699.257  ops/s
LeakTracingRequestContextStorageBenchmark.leakTracing_rateLimited_1   thrpt    5    9677373.483 ±   46283.720  ops/s
LeakTracingRequestContextStorageBenchmark.leakTracing_rateLimited_10  thrpt    5   10027495.272 ±  129500.874  ops/s
```
@klurpicolo
Copy link
Contributor Author

Here is the benchmark result.

Benchmark                                                              Mode  Cnt          Score         Error  Units
LeakTracingRequestContextStorageBenchmark.baseline_threadLocal        thrpt    5  138905863.970 ± 1832287.597  ops/s
LeakTracingRequestContextStorageBenchmark.leakTracing_always_sample   thrpt    5    1347325.018 ±    7907.388  ops/s
LeakTracingRequestContextStorageBenchmark.leakTracing_never_sample    thrpt    5   10762418.543 ±   43741.747  ops/s
LeakTracingRequestContextStorageBenchmark.leakTracing_random_1        thrpt    5    9351586.968 ±    8154.426  ops/s
LeakTracingRequestContextStorageBenchmark.leakTracing_random_10       thrpt    5    5915767.799 ±   15699.257  ops/s
LeakTracingRequestContextStorageBenchmark.leakTracing_rateLimited_1   thrpt    5    9677373.483 ±   46283.720  ops/s
LeakTracingRequestContextStorageBenchmark.leakTracing_rateLimited_10  thrpt    5   10027495.272 ±  129500.874  ops/s

Command

./gradlew --no-daemon :benchmarks:jmh:clean :benchmarks:jmh:jmh -Pjmh.includes=com.linecorp.armeria.common.LeakTracingRequestContextStorageBenchmark

Machine

  • I test on my local machine

Observation

  • It seems like ThreadLocalRequestContextStorage has ~100 times ops/s compare to always-sample LeakTracingRequestContextStorage. Also, ThreadLocalRequestContextStorage has ~13 times ops/s compare to never-sample LeakTracingRequestContextStorage
  • One odd thing is that rate-limited 1 op/s is less than rate-limited 10 ops/s. I expect that the more sample rate, the fewer ops/s. I run the test a couple of times and this trend is always presented.

From the data above, should I do something to fix/optimize it? By the way, I'm really new to benchmarking, so might fall into pitfalls. If something is incorrect with the benchmark test, please let me know. Thank you. 🙏

Comment on lines 90 to 91
delegate.pop(current, toRestore);
pendingRequestCtx.remove();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
delegate.pop(current, toRestore);
pendingRequestCtx.remove();
try {
delegate.pop(current, toRestore);
} finally {
pendingRequestCtx.remove();
}

Comment on lines 53 to 56
Sampler<Class<? extends Throwable>> sampler) {
this.delegate = delegate;
pendingRequestCtx = new FastThreadLocal<>();
this.sampler = sampler;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see we need to limit the type to Class<Throwable>.

Suggested change
Sampler<Class<? extends Throwable>> sampler) {
this.delegate = delegate;
pendingRequestCtx = new FastThreadLocal<>();
this.sampler = sampler;
Sampler<?> sampler) {
this.delegate = delegate;
pendingRequestCtx = new FastThreadLocal<>();
this.sampler = (Sampler<Object>) sampler;

Copy link
Contributor

@ikhoon ikhoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot, @klurpicolo! 🚀

Copy link
Member

@minwoox minwoox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you have beautifully made it. 😄
Just left minor nits and suggestions.
Thanks a lot for your effort. 🙇

prevContext.root() == toPush.root()) {
// The delegate has the ClientRequestContext whose root() is the same as toPush.root()
} else {
throw newIllegalContextPushingException(prevContext, toPush, pendingRequestCtx.get());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this?

Suggested change
throw newIllegalContextPushingException(prevContext, toPush, pendingRequestCtx.get());
throw newIllegalContextPushingException(toPush, prevContext, pendingRequestCtx.get());

// Reentrance
return noopSafeCloseable();
}

final ServiceRequestContext root = root();
if (oldCtx.root() == root) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we also use unwrapAll?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should.


private TraceableClientRequestContext(ClientRequestContext delegate) {
super(delegate);
stackTrace = new PendingRequestContextStackTrace();
Copy link
Member

@minwoox minwoox Aug 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using StackTraceElement[] because

  • an exception is not raised yet (we don't need to create an instance of an exception that might not be raised in the future.)
  • all we need in toString is the stack trace?

If we using StackTraceElement[], we could probably do:

private TraceableClientRequestContext(ServiceRequestContext delegate) {
    super(delegate);
    stackTrace = currentThread().getStackTrace();
}

@Override
public String toString() {
    final StringBuilder builder = new StringBuilder(10000); // What's the sensible default?
    builder.append(super.toString());
    builder.append(System.lineSeparator());
    // Start with 1 not to print getStackTrace() method.
    for (int i = 1; i < stackTrace.length; i++) {
        builder.append("\tat ").append(stackTrace[i]).append(System.lineSeparator());
    }
    return builder.toString();
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JFYI, StackTraceWriter uses 512 for the initial buffer.

private static final class StackTraceWriter extends PrintWriter {
StackTraceWriter() {
super(new StringWriter(512));

// Reentrance
return noopSafeCloseable();
}

if (oldCtx.root() == this) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

…acingRequestContextStorage.java

Co-authored-by: minux <songmw725@gmail.com>
Comment on lines 116 to 121
final StringWriter sw = new StringWriter();
stackTrace.printStackTrace(new PrintWriter(sw));
return new StringBuilder().append(getClass().getSimpleName())
.append(super.toString())
.append(System.lineSeparator())
.append(sw).toString();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dedup?

Copy link
Contributor

@ikhoon ikhoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looking good! Left some comments on tests. 😉

import io.netty.util.AttributeKey;

/**
* This is copy from ClientRequestContextTest on core module to test behaviour consistency.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about copying the files using Gradle's copy task?

// Copy common files from boot2-autoconfigure module to gen-src directory in order to use them as a source set.
task generateSources(type: Copy) {
from "${rootProject.projectDir}/spring/boot2-autoconfigure/src/main/java"
into "${project.ext.genSrcDir}/main/java"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice suggestion. Thanks!

executor.execute(() -> {
final ServiceRequestContext ctx = newCtx("/1");
try (SafeCloseable ignore = ctx.push()) {
//Ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Global comment: Could you add a white space after //?

Suggested change
//Ignore
// Ignore

Comment on lines 110 to 111
isThrown.set(true);
exception.set(ex);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For correct happen-before relationship.

Suggested change
isThrown.set(true);
exception.set(ex);
exception.set(ex);
isThrown.set(true);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! Thanks for pointing out that.🙏

Comment on lines 141 to 142
//Leak happened on the first eventLoop shouldn't affect 2nd eventLoop when trying to push
final ServiceRequestContext anotherCtx = newCtx("/2");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we wait until latch.countDown() is invoked?

Copy link
Member

@minwoox minwoox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are almost there.
Left a few minor comments. 😉

@@ -46,7 +46,7 @@ public void pop(RequestContext current, @Nullable RequestContext toRestore) {
requireNonNull(current, "current");
final InternalThreadLocalMap map = InternalThreadLocalMap.get();
final RequestContext contextInThreadLocal = context.get(map);
if (current != contextInThreadLocal) {
if (current.unwrapAll() != contextInThreadLocal.unwrapAll()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also check if contextInThreadLocal is null or not just in case pop is called without push.

RequestContext unwrap) {
final StringBuilder builder = new StringBuilder(512);
builder.append(unwrap).append(System.lineSeparator())
.append("At thread [").append(threadName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need threadName because it's the same thread that raises newIllegalContextPoppingException . (If the thread is changed, the exception is not raised. 😉 )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I got it. Thanks

Comment on lines 205 to 208
public static boolean equalUnwrapAllNullable(@Nullable RequestContext ctx1, @Nullable RequestContext ctx2) {
return Optional.ofNullable(ctx1).map(RequestContext::unwrapAll).orElse(null) ==
Optional.ofNullable(ctx2).map(RequestContext::unwrapAll).orElse(null);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will create additional instances every time the method is called. How about

Suggested change
public static boolean equalUnwrapAllNullable(@Nullable RequestContext ctx1, @Nullable RequestContext ctx2) {
return Optional.ofNullable(ctx1).map(RequestContext::unwrapAll).orElse(null) ==
Optional.ofNullable(ctx2).map(RequestContext::unwrapAll).orElse(null);
}
public static boolean equalsUnwrapping(@Nullable RequestContext ctx1, @Nullable RequestContext ctx2) {
if (ctx1 == ctx2) {
return true;
}
if (ctx1 != null && ctx2 != null) {
return ctx1.unwrapAll() == ctx2.unwrapAll();
}
return false;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can hardly say that it is convenient/easy to compare wrapped objects with this utility method.
How about adding a helper method to Unwrappable like

interface Unwrappable {
    // The method name was inspired by String.equalsIgnoreCase().
    // Better naming is welcome. 😆
    @UnstableApi
    boolean equalsIgnoreWrapper(@Nullable Unwrappable other) {

    }
}

so that we can easily compare the ctxs without complex if else conditions that might distract other important code.

include '**/ClientRequestContextTest.java'
}

tasks.compileJava.dependsOn(generateSources)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a new line. 😉

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I miss it again 😅. Thanks.

Copy link
Member

@minwoox minwoox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for your hard work, @klurpicolo! 🎉 🎉 🎉

Copy link
Contributor

@jrhee17 jrhee17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beautifully done! Thanks @klurpicolo ! 🙇 👍 🙇

return delegate;
}

private static RequestContextWrapper<?> warpRequestContext(RequestContext ctx) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private static RequestContextWrapper<?> warpRequestContext(RequestContext ctx) {
private static RequestContextWrapper<?> wrapRequestContext(RequestContext ctx) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! misspelled. Thanks 🙏

* A {@link RequestContextStorage} which keeps track of {@link RequestContext}s, reporting pushed thread
* information if a {@link RequestContext} is leaked.
*/
@UnstableApi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this class isn't part of the public api

Suggested change
@UnstableApi

@klurpicolo
Copy link
Contributor Author

Thanks to all reviewers for these meaningful suggestions! 🙇🙇🙇

@minwoox minwoox merged commit 5109764 into line:master Sep 5, 2022
@minwoox
Copy link
Member

minwoox commented Sep 5, 2022

Thanks for not giving up despite all the misdirection. 🤣

heowc pushed a commit to heowc/armeria that referenced this pull request Sep 24, 2022
Motivation:

Armeria has some core logic which assumes that the concrete default implementations of `[Client|Service]RequestContext` are `Default[Client|Service]RequestContext`.

e.g. If the `ClientRequestContext` isn't an instance of `DefaultClientRequestContext`, then the response timeout may not be respected.
https://github.com/line/armeria/blob/5edaef039abfd047f152a63cde38b46094c8353f/core/src/main/java/com/linecorp/armeria/client/HttpResponseDecoder.java#L401-L407

This usage pattern mostly stems from the fact that `[Client|Service]RequestContext` are public APIs that users are exposed to a lot (especially in decorators). However, Armeria developers are skeptical of adding public methods that are either 1) very specific and used only in core logic 2) have the potential to break behavior if used incorrectly.

Up to now, we have added such methods to the default implementation and checked with `instanceof` when invoking such methods.

With the introduction of line#4232, we are now planning on introducing and extensively utilizing wrapper classes for `[Client|Service]RequestContext`. However, using wrapper classes means previous assumptions of `instanceof Default[Client|Service]RequestContext` will be invalid.

In this pull request, I propose that:

1) `RequestContext` now extends `Unwrappable`, allowing users to peel `RequestContext`s to the intended type.
2) We expose another interface `[Client]RequestContextExtension`, which contains "extension" methods for `ClientRequestContext`, `RequestContext`. In this interface, we can add methods that Armeria uses internally without worrying about public API exposure.
3) `Default[Client|Service]RequestContext` doesn't need to be exposed as a public API. We may simply expose the interface to users and hide the detailed implementation like we do for the `Http[Response|Request]` series.
    - Note: This move was handled in this PR mainly due to `CancellationScheduler`. While extracting `ClientRequestContextExtension`, `CancellationScheduler` which is an internal class was exposed as the return value of `#responseCancellationScheduler`. I had the choice of exposing `CancellationScheduler`, or hiding `DefaultClientRequestContext`. I chose the latter and also moved `DefaultServiceRequestContext` for consistency.

Modifications:

- `RequestContext` now extends `Unwrappable`. `RequestContextWrapper` now extends `AbstractUnwrappable`
- Convert usages of `instanceof DefaultClientRequestContext` to unwrap instead
    - e.g. `ctx instanceof DefaultClientRequestContext` -> `ctx.as(ClientRequestContextExtension.class)`
- `RequestContextExtension`, `ClientRequestContextExtension` has been added to an internal package
- `DefaultClientRequestContext`, `DefaultServiceRequestContext`, `NonWrappingRequestContext` have been moved to an internal package
    - `pathWithQuery` is moved to `ClientUtil` and made public
    - `ClientThreadLocalState` is moved to an internal package and made public
- Class hierarchy has been modified for `RequestContext`
    - Now there is an extra layer between `ClientRequestContext` and `DefaultClientRequestContext`:
        - `ClientRequestContext` -> `ClientRequestContextExtension` -> `DefaultClientRequestContext`
    - Now there is an extra layer between `RequestContext` and `NonWrappingRequestContext`
        - `RequestContext` -> `RequestContextExtension` -> `NonWrappingRequestContext`

Before:
<img width="300" alt="Screen Shot 2022-06-24 at 12 00 49 PM" src="https://user-images.githubusercontent.com/8510579/175453672-15dbdf09-4c2e-4afa-a186-a2147e97afa2.png">
After:
<img width="300" alt="Screen Shot 2022-06-24 at 12 01 15 PM" src="https://user-images.githubusercontent.com/8510579/175453689-3c00edde-7209-4d1c-9480-e776c58537e0.png">

Result:

- Developers may add more utility functions to `RequestContext` without worrying about public API exposure
- Armeria tries to unwrap `RequestContext` instead of simply checking the type.
heowc pushed a commit to heowc/armeria that referenced this pull request Sep 24, 2022
…ine#4232)

Motivation:

- Context leaks are hard to find because an exception does not tell where/which context is pushed without poping. By using TraceAbleRequestContextStorage, it helps to report the source of context leaks.
- Details as mentioned in line#4100 

By the way, Thanks to @anuraaga for giving a reference to read on [opentelemetry](https://github.com/open-telemetry/opentelemetry-java).

Modifications:

- Add `TraceAbleRequestContextStorage` that stores `RequestContext` stack trace and reports to the user where it happens.
- Add `requestContextLeakDetectionSampler` flag that users can use for enable leak detection. Users can enable it by either system property or SPI flag provider.

Result:

- Closes line#4100 
- `TraceAbleRequestContextStorage` is added, so users can use it to report where context leaks happen.

How to enable:
1) By system property
`-Dcom.linecorp.armeria.requestContextLeakDetectionSampler=<sampler-spec>`

2) By providing FlagsProvider SPI
```java
public final class EnableLeakDetectionFlagsProvider implements FlagsProvider {

    @OverRide
    public Sampler<? super RequestContext> requestContextLeakDetectionSampler() {
        return Sampler.always();
    }
    ...
}
```

3) By providing RequestContextStorageProvider SPI (not recommend since RequestContextStorageProvider SPI'll be remove as mentioned in #line#4211 )
```java
public final class CustomRequestContextStorageProvider implements RequestContextStorageProvider {

    @OverRide
    public RequestContextStorage newStorage() {
        return new TraceAbleRequestContextStorage(delegate);
    }
}
```
Use case:
Users problematic code
```java
executor.execute(() -> {
    SafeCloseable leaked = fooCtx.push(); //This causes Request context leaks!
    ...
});

executor.execute(() -> {
    try (SafeCloseable ignored = barCtx.push()) { //Exception happen here
        ...
    }
});

```
The above code will produce an error as below. Therefore, users can check the stack trace that which line causes context leaks.
```
java.lang.IllegalStateException: Trying to call object wrapped with context [%New RequestContext%], but context is currently set to TraceableServiceRequestContext[%Previous RequestContext%]
com.linecorp.armeria.internal.common.LeakTracingRequestContextStorage$PendingRequestContextStackTrace: At thread [armeria-testing-eventloop-nio-1-1] previous RequestContext is pushed at the following stacktrace
	at com.linecorp.armeria.internal.common.LeakTracingRequestContextStorage$TraceableServiceRequestContext.<init>(LeakTracingRequestContextStorage.java:111)
	at com.linecorp.armeria.internal.common.LeakTracingRequestContextStorage$TraceableServiceRequestContext.<init>(LeakTracingRequestContextStorage.java:105)
	at com.linecorp.armeria.internal.common.LeakTracingRequestContextStorage.warpRequestContext(LeakTracingRequestContextStorage.java:82)
	at com.linecorp.armeria.internal.common.LeakTracingRequestContextStorage.push(LeakTracingRequestContextStorage.java:62)
	at com.linecorp.armeria.internal.common.RequestContextUtil.getAndSet(RequestContextUtil.java:149)
	at com.linecorp.armeria.server.ServiceRequestContext.push(ServiceRequestContext.java:221)
	at com.linecorp.armeria.internal.common.TraceRequestContextLeakTest.lambda$singleThreadContextLeak$2(TraceRequestContextLeakTest.java:101)  <- This is the line where leaked RequestContext is push 
	...
. This means the callback was called from unexpected thread or forgetting to close previous context.
	at com.linecorp.armeria.internal.common.RequestContextUtil.newIllegalContextPushingException(RequestContextUtil.java:100)
	at com.linecorp.armeria.server.ServiceRequestContext.push(ServiceRequestContext.java:237)
	at com.linecorp.armeria.internal.common.TraceRequestContextLeakTest.lambda$singleThreadContextLeak$3(TraceRequestContextLeakTest.java:107)
	...
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide a way to debug a context leak
6 participants