New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WebClient using default reactor-netty client leaks netty ByteBufs on timeout #22384
Comments
Thanks for the report. This has been tagged as Reactor Netty issue. Please follow reactor/reactor-netty#603. |
Thanks. I will follow that issue as you suggest. Interesting that the issue here is actually with reactor-netty, when the problem doesn't seem to reproduce when using their HttpClient directly - only when using reactor-netty indirectly via WebClient. Thanks for looking into it and forwarding appropriately! |
Are you sure about this? I see the exact opposite. When I run the tests with This is also supported by looking at the source code. The Reactor Netty ByteBufFlux#aggregate is accumulating and retains without provisions for cancellation. By contrast the StringDecoder does use So as far as I can see we're okay in the Spring Framework. /cc @violetagg |
@rstoyanchev I'm not able to reproduce the issue. Only these exceptions appear in the log files
I'm using Netty 4.1.34.Final and Reactor Netty 0.8.6.BUILD-SNAPSHOT |
I just took another look at this. I've upgraded the spring-boot version from 2.1.2.RELEASE to 2.1.3.RELEASE, and thus spring-framework 5.1.5.RELEASE and reactor-netty 0.8.5.RELEASE. It looks like I may have just gotten "lucky" in my initial runs of my test cases and only happened to see leak reports when running the one that uses the WebClient. I do still see leak messages emitted when running both test cases (but only sometimes). I ran the following 20 times (running just the test case which uses a WebClient which uses reactor-netty):
and here is the output from the 6 of those times where leak reports were emitted: I ran the following 20 times (running just the test case which uses reactor-netty directly):
and here is the output from the 4 of those times where leak reports were emitted: There is some variety to the leak reports. One general observation is that in the direct reactor-netty cases, the leak reports were always accompanied by IllegalReferenceCountExceptions like the one noted above. In contrast, none of the WebClient test runs which emitted leak reports contained this exception (though I did see this exception on a few runs which did not emit leak reports). I don't know whether this detail is a coincident or not. I've pushed a few minor changes to the test cases along with the version update noted above. |
@danielra did you notice the versions mentioned? Can you also try with:
|
Thank you for calling that out! I did miss updating that version. I have done so now (and pushed the change), and on my first attempt with the webClient test, I observed this leak report emitted via:
webclient-test-log-with-leak-updated-reactor_1.txt
reactor-netty-test-log-with-leak-updated-reactor_1.txt FWIW, the pattern of correlation with IllegalReferenceCountExceptions occurring when leak reports are emitted by the direct reactorNetty test (but not in the webClient test) persisted. |
Sorry, that was with the updated reactor-bom but Netty 4.1.33.Final instead of Netty 4.1.34.Final. I've updated the netty version as well and captured the following two leak reports:
webclient-test-log-with-leak-updated-reactor-and-netty_1.txt
reactor-netty-test-log-with-leak-updated-reactor-and-netty_1.txt |
@danielra Can you try now |
Created an issue for the exception that I observed when running the test with WebClient #22594 |
Thanks! I just tried a few runs of the test cases with the 0.8.6.BUILD-SNAPSHOT version of reactor-netty, and pushed that dependency change to the demo project. Here is example output from each test case from runs which emitted leak reports:
webclient-test-log-with-leak-updated-reactor-and-netty-and-reactor-netty_1.txt
reactor-netty-test-log-with-leak-updated-reactor-and-netty-and-reactor-netty_1.txt |
@danielra According to the log you are still using
|
@rstoyanchev I see the stack below, did you have something like this before?
|
Oh, sorry. I thought specifying the version as 0.8.6.BUILD-SNAPSHOT would be enough to pickup your recent change. I've now instead cloned the reactor-netty project and included the version built from the head of master (which does include the linked change) via an
reactor-netty-test-log-with-leak-updated-reactor-and-netty-and-reactor-netty_2.txt |
@violetagg with 0.8.6 snapshots I see this from
And this with
as well as this:
As for something like:
The only way I can explain in the given scenario, i.e. |
I have tried and failed to reproduce the issue with @Test
public void joinWithCancellation() {
Mono<DataBuffer> bufferFlux = DataBufferUtils.join(
Flux.interval(Duration.ofMillis(40)).take(50).map(aLong -> stringBuffer("foo" + aLong)));
StepVerifier.create(bufferFlux)
.thenAwait(Duration.ofMillis(ThreadLocalRandom.current().nextInt(81, 116)))
.thenCancel()
.verify();
} This is in DataBufferUtilsTests so at the end of the test there is a verifyAllocations check. @danielra since you have a reproducible test case, I suggest creating a new issue under https://github.com/reactor/reactor-netty. |
Here is the issue for Reactor Netty reactor/reactor-netty#700 |
I am using the latest version of Spring Boot (2.2.6) and even though this issue is supposed to have been fixed, I'm seeing the same memory leak being reported: "LEAK: ByteBuf.release() was not called before it's garbage-collected". Is there some place in the code below where I am supposed to be, somehow, calling ByteBuf.release(), even though it seems like Spring and Netty internal classes should be taking care of this? Or have I, otherwise, written the code improperly and should be doing it some other, better, way? Edit: I should mention, I found the following two "Hint" messages in the stacktrace....
|
@jkjom it may sound similar but in your scenario you're getting raw data from the
You can follow that recommendation or use |
@rstoyanchev Thanks for your help! I finally found something that worked to get rid of ByteBuf.release() memory leak messages....
|
Affects: 5.1.4.RELEASE
I see buffer leak reports from Netty's ResourceLeakDetector when a Mono.timeout cancels a subscription to the Mono returned from bodyToMono. I've created a demo project with a test case which seems to reliably reproduce the issue. A separate test case is included which uses the reactor-netty HttpClient directly instead of using it through the WebClient wrapper - and I have not seen any leak reports in this case with the current version of reactor-netty (v0.8.4.RELEASE).
Demo project with mentioned test cases can be found here:
https://github.com/danielra/buffer-leak-repro-reactor-netty
The specific test case which reproduces the leak can be seen here:
https://github.com/danielra/buffer-leak-repro-reactor-netty/blob/master/src/test/java/com/example/demo/DemoApplicationTests.java#L59
The tests can be run via
./gradlew clean test --debug
. Please note that the tests "pass" unconditionally, but buffer leak reports can be viewed in the console output.Here is an example of the leak report log messages I have observed from this test case:
webclient_buffer_leak_example_log.txt
The text was updated successfully, but these errors were encountered: