Spring WebFlux & Jetty, large responses are cut off due to prematurely closed connection #24128

scottjohnson · 2019-12-03T22:09:22Z

See demonstration repository here: https://github.com/opentable/webflux-large-response-spring

I. Description

Given a Spring WebFlux application running on Jetty server using Spring 5.1.6 / Spring Boot 2.1.5 and above:

When returning a response from the server that is sufficiently large*, the connection is closed prematurely and the response is cut off part-way through.

The client will only receive the bytes of the response prior to the cut off point. The server will emit a Jetty error, java.io.IOException: Closed while Pending/Unready.

2019-12-03 20:28:13.862  WARN 1 --- [ttp@6bb4dd34-15] org.eclipse.jetty.server.HttpOutput      : java.io.IOException: Closed while Pending/Unready

* What is a "sufficiently large" response to trigger the cut-off depends on various factors detailed below.

This repository (https://github.com/opentable/webflux-large-response-spring) is a demonstration of the issue that can be used to reproduce when running locally from IDE, in a local Docker container, in Docker container deployed to Heroku, or running on some other container resource cluster (e.g. Mesos).

See usage details in the Background & Analysis section below. In short:

$ curl -v localhost:8080/just?size=8000000
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /just?size=8000000 HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.54.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Date: Tue, 03 Dec 2019 21:56:03 GMT
< Content-Type: text/plain;charset=utf-8
< Content-Length: 8000000
< 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

(etc...)

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa* transfer closed with 7197652 bytes remaining to read
* stopped the pause stream!
* Closing connection 0
curl: (18) transfer closed with 7197652 bytes remaining to read

II. Affected Versions

Spring 5.1.6 / Spring Boot 2.1.5 and above (tested up to Spring 5.2.1/Boot 2.2.1)
Jetty 9.4.15.v20190215 and above (tested up to Jetty 9.4.24.v20191120)
Java 8 and above (tested up to Java 11)

Note that the issue is only impacted by the Spring/Boot version and not the Jetty or Java versions.

We tested various combinations of Spring, Spring Boot, Jetty, and Java versions and determined that the issue is introduced only with an update to Spring 5.1.6/Spring Boot 2.1.5, regardless of which Jetty or Java version is used.

The issue is not exhibited on Spring 5.1.5/Spring Boot 2.1.4, regardless of which Jetty or Java version is used.

Tests with different Spring and Jetty versions

Does the issue occur?

--

Boot 2.1.4, Jetty 9.4.15.v20190215: No (This is the default pairing for Spring Boot 2.1.4)

Boot 2.1.4, Jetty 9.4.16.v20190411: No

Boot 2.1.4, Jetty 9.4.17.v20190418: No

Boot 2.1.4, Jetty 9.4.18.v20190429: No

Boot 2.1.4, Jetty 9.4.19.v20190610: No

Boot 2.1.4, Jetty 9.4.24.v20191120: No

--

Boot 2.1.5, Jetty 9.4.15.v20190215: Yes

Boot 2.1.5, Jetty 9.4.16.v20190411: Yes

Boot 2.1.5, Jetty 9.4.17.v20190418: Yes

Boot 2.1.5, Jetty 9.4.18.v20190429: Yes (this is the default pairing for Spring Boot 2.1.5)

Boot 2.1.5, Jetty 9.4.19.v20190610: Yes

Boot 2.1.5, Jetty 9.4.24.v20191120: Yes

--

Boot 2.2.1, Jetty 9.4.22.v20191022: Yes (this is the default pairing for Spring Boot 2.2.1)

Boot 2.2.1, Jetty 9.4.24.v20191120: Yes

Conclusion: The specific version of Jetty doesn't matter. The issue occurs consistently with Spring 5.1.6/Boot 2.1.5 and up, regardless of the Jetty version used.

Tests with different servers (Jetty, Netty, Tomcat Undertow)

Does the issue occur?

Jetty server: Yes

Reactor Netty server: No

Tomcat Server: No

Undertow Server: No

Conclusion: The issue is produced only by a combination of Spring 5.1.6/Boot 2.1.5 with the Jetty server. We were able to make requests for significantly larger amounts of data using, e.g., Reactor Netty server without ever encountering the error.

III. Background & Analysis

Q: Is it a problem with WebClient? A: No

We originally encountered this issue with an API gateway proxy server built on Spring Webflux, that uses the WebFlux WebClient to proxy requests and responses from downstream services. Web apps consuming these APIs would encounter an issue with large JSON responses returned by the API gateway being cut-off part way through.

Our initial assumption was that the issue has to do with the WebClient client connector used, but this hypothesis was disproved in testing. Ultimately, what matters is that the response from the server is large, even if no downstream WebClient call is made.

The demo app contains an API /webclient?path="foo" which proxies a request to the URI foo and returns the result as an un-parsed String. When curling this API and proxying to a downstream URI that returns a very large response to the demo app, the demo app will attempt to return this to the caller. The server will close the connection prematurely and the client receives only a partial response.

Testing different WebClient client connectors

Does the issue occur?

Reactor-netty client: Yes

Jetty-reactive-httpclient: Yes

Conclusion: The client connector doesn't matter. The issue occurs consistently with either client connector, as long as the server component is Jetty.

Removing WebClient from the mix...

The demo app also contains an API /just?size=bar which simply generates a large static string of size number of characters and returns it in the server response using Mono.just. When curling this API with a sufficiently large size the server will close the connection prematurely and the client receives only a partial response.

Conclusion: The use of WebClient as part of the call stack doesn't matter. All that matters is returning a large response from the server.

What is a "sufficiently large" response?

We had a difficult time reproducing this issue at first because what is a "sufficiently large" response to trigger the premature connection close can vary by several orders of magnitude depending on the operating environment where the code is running.

We were able to reproduce the issue using this demo app in the following operating environments:

Running locally in the IDE (IntelliJ)
Running in a local Docker container in Docker Desktop
Running in a Docker container in Heroku
Running in a Docker container in a Mesos cluster in our organization's remote data center

The number of bytes of the response that are necessary to trigger the cutoff decreases as you go down this list.

In other words, when running in our Mesos cluster, we are able to consistently reproduce the issue when requesting a response of 78200 characters or more. But when running in Heroku it requires up to 10x the number of bytes (as our Mesos cluster), and when running locally (in local Docker or in IDE) it can require up to 100x the number of bytes to trigger the issue.

Something about our Mesos cluster or our organizational network lowers the threshold significantly, but it is still possible to trigger the issue in these other environments by making the response size sufficiently large.

One hypothesis is that varying threshold has something to do with the network itself (maybe network latency?) over which the request is being made, but we have yet been unable to empirically test this.

Incidentally the threshold to trigger the issue in our Mesos cluster is always exactly 78200 characters.

What about container resource provisioning?

Given that we know the exact threshold to trigger the issue in a particular operating environment (78200 characters in Mesos), we were able to test whether container resource provisioning has any impact on triggering the issue.

Test container resource sizing

Does the issue occur?

Tested with a provisioned container with 512m native memory and 0.1 cpus, and max heap of 256m: Yes, after 78200 characters

Tested with a provisioned container with 2048m native memory and 2.0 cpus, and max heap of 1024m: Yes, after 78200 characters

Conclusion: Container resource sizing does not appear to have any impact on whether or not the issue is triggered or any affect on what is the triggering threshold.

The text was updated successfully, but these errors were encountered:

scottjohnson · 2019-12-04T01:49:21Z

I believe this is fixed in #24050. We did a quick test with 5.2.2.BUILD-SNAPSHOT and the problem disappears. Going to do more thorough testing tomorrow. This is a case where the issue manifests when using HTTP/1.1 with large writes.

rstoyanchev · 2019-12-04T10:52:32Z

Thanks for the detailed report. This does sound like the same issue as #24050 although it is news for me that it did not manifest prior to 5.1.6. It's possible something else changed that altered the timing that in turn made it easier to demonstrate.

I am going to close this for now as duplicate but please do comment further. Also note that 5.2.2 and 5.1.12 were released yesterday so you can test with the released versions.

scottjohnson · 2019-12-04T19:39:35Z

Thanks @rstoyanchev. We tested with 5.1.12 and 5.2.2 release and it looks good.

rstoyanchev · 2019-12-05T10:44:10Z

Great, thanks for confirming.

vanniktech · 2020-10-10T16:24:26Z

I'm getting the same when running Spring Boot 2.3.4 with embedded Tomcat. Should this have been fixed as part of this issue as well?

bclozel · 2020-10-10T17:38:11Z

@vanniktech This sounds like a different problem. If you've got a sample project we can clone/download that reproduces the problem, please create a new issue.

vanniktech · 2020-10-11T10:15:47Z

After some digging it ended up being a configuration issue in NGINX which was truncating larger responses.

spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Dec 3, 2019

bclozel added the status: waiting-for-feedback We need additional information before we can continue label Dec 4, 2019

rstoyanchev closed this as completed Dec 4, 2019

rstoyanchev added status: duplicate A duplicate of another issue and removed status: waiting-for-feedback We need additional information before we can continue status: waiting-for-triage An issue we've not yet triaged or decided on labels Dec 4, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Spring WebFlux & Jetty, large responses are cut off due to prematurely closed connection #24128

Spring WebFlux & Jetty, large responses are cut off due to prematurely closed connection #24128

scottjohnson commented Dec 3, 2019 •

edited

scottjohnson commented Dec 4, 2019 •

edited

rstoyanchev commented Dec 4, 2019 •

edited

scottjohnson commented Dec 4, 2019

rstoyanchev commented Dec 5, 2019

vanniktech commented Oct 10, 2020

bclozel commented Oct 10, 2020

vanniktech commented Oct 11, 2020

Spring WebFlux & Jetty, large responses are cut off due to prematurely closed connection #24128

Spring WebFlux & Jetty, large responses are cut off due to prematurely closed connection #24128

Comments

scottjohnson commented Dec 3, 2019 • edited

I. Description

II. Affected Versions

Tests with different Spring and Jetty versions

Tests with different servers (Jetty, Netty, Tomcat Undertow)

III. Background & Analysis

Q: Is it a problem with WebClient? A: No

Testing different WebClient client connectors

Removing WebClient from the mix...

What is a "sufficiently large" response?

What about container resource provisioning?

Test container resource sizing

scottjohnson commented Dec 4, 2019 • edited

rstoyanchev commented Dec 4, 2019 • edited

scottjohnson commented Dec 4, 2019

rstoyanchev commented Dec 5, 2019

vanniktech commented Oct 10, 2020

bclozel commented Oct 10, 2020

vanniktech commented Oct 11, 2020

scottjohnson commented Dec 3, 2019 •

edited

scottjohnson commented Dec 4, 2019 •

edited

rstoyanchev commented Dec 4, 2019 •

edited