Connection prematurely closed BEFORE response #1296

Closed
martinkrc opened this issue Aug 31, 2020 · 7 comments
Labels: status/invalid (We don't feel this issue is valid)

Comments

@martinkrc

We're frequently seeing the "Connection prematurely closed BEFORE response" error in production. Most communication succeeds, but a portion of requests fail with this error.

Expected Behavior

When a connection is successfully used to exchange request/response with the server, Reactor/Netty should not cause failures.

Actual Behavior

As far as I can tell, the server sometimes initiates a connection close around the moment the client acquires the pooled channel. The connection still appears to be used successfully (?) for the request and response and is closed at the end, yet Reactor Netty reports "Connection prematurely closed BEFORE response".

Observing from the logs, this is the rough process:

  • Client creates a new connection with the server
  • Client sends an HTTP request
  • Client receives a response
  • Client releases the channel (for later usage)
  • Client acquires the channel for a new request
  • (as per tcpdump) Client receives a FIN segment from the server, meaning that the server closed the connection
  • Client sends an HTTP request via the channel
  • Client receives a response
  • (as per tcpdump) Client closes its side of the connection, sending a FIN segment to the server
  • Client fails with "Connection prematurely closed BEFORE response"

Connection log:
connection-lifecycle

TCP dump:
tcpdump

Steps to Reproduce

We're unable to reproduce the situation in our test environment; it happens only in production. However, I believe the attached log and tcpdump will help resolve the issue on the Reactor Netty side.

This is how we create the Spring WebClient used for communication:

        SslContext sslContext = NettySslContext.initialize(/*...*/);
        HttpClient httpClient = HttpClient
                .create(ConnectionProvider.elastic(
                        "abcd",
                        Duration.of(5000, ChronoUnit.MILLIS),   // max idle time
                        Duration.of(5000, ChronoUnit.MILLIS)))  // max life time
                .keepAlive(true)
                .secure(spec -> spec.sslContext(sslContext))
                .tcpConfiguration(tcpClient ->
                        tcpClient.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 15000));

        return WebClient.builder()
                .clientConnector(new ReactorClientHttpConnector(httpClient))
                .baseUrl("https://some.host.cz:4001")
                .exchangeStrategies(exchangeStrategies)
                .build();

Your Environment

  • Spring Boot 2.2.7.RELEASE
  • Spring Webflux 5.2.6.RELEASE
  • Reactor Netty 0.9.7.RELEASE
  • Reactor Core 3.3.5.RELEASE
  • Netty 4.1.49.Final
  • Java SE 1.8.0_161-b12
  • Linux
martinkrc added the status/need-triage and type/bug labels on Aug 31, 2020
@violetagg
Member

@martinkrc Is it possible to try the latest releases?

@martinkrc
Author

martinkrc commented Aug 31, 2020

> @martinkrc Is it possible to try the latest releases?

Do you recommend any specific version? The trouble is that we recently upgraded (from 0.8.8 to 0.9.7) hoping for improvements, but it didn't help. Since the issue can't be reproduced in our test environment, we don't want to spend release effort on another upgrade without a reason. Do you see any ticket fixed after 0.9.7 that could address the problem?

violetagg removed the status/need-triage label on Sep 1, 2020
violetagg self-assigned this on Sep 1, 2020
@violetagg
Member

@martinkrc I'm thinking about #1165 and #1183, although the root cause might be different in your case.

The current releases are Reactor Netty 0.9.11, Spring Boot 2.3.3, Spring Framework 5.2.8

@violetagg
Member

violetagg commented Sep 1, 2020

@martinkrc One additional question: is there any kind of timeout on the target server (keep-alive timeout, idle timeout, etc.) that closes the connection when it has not been used for a certain time?

(For example, if the target server is Tomcat with connectionTimeout=5s, keepAliveTimeout is automatically set to 5s as well by default.)

From the TCP dump I see that the target server might have such a timeout (5s?).

Screenshot 2020-09-01 at 8 59 03

If that's the case, configure the pool with a max idle time below 5s. You may also want to switch the pool's leasing strategy from FIFO to LIFO so that the most recently used connection is leased first. (We have fixes related to FIFO/LIFO leasing in 0.9.11.)
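
A minimal sketch of that pool configuration, assuming a Reactor Netty version where `ConnectionProvider.builder` and its `lifo()` option are available (recent 0.9.x releases). The 4-second idle time is an illustrative value chosen only to sit below the suspected 5s server timeout, which has not been confirmed in this thread. The class name is hypothetical, and the SSL and connect-timeout settings from the original snippet are left out for brevity.

```java
import java.time.Duration;

import reactor.netty.http.client.HttpClient;
import reactor.netty.resources.ConnectionProvider;

public class PooledClientSketch {

    // Assumption: the server closes idle connections after about 5 seconds,
    // as the TCP dump suggests.
    static HttpClient create() {
        ConnectionProvider provider = ConnectionProvider.builder("abcd")
                // Drop idle connections before the suspected 5s server timeout.
                .maxIdleTime(Duration.ofSeconds(4))
                // Lease the most recently used connection first.
                .lifo()
                .build();

        return HttpClient.create(provider);
    }
}
```

Note that the builder applies a bounded default for the maximum number of connections, unlike `ConnectionProvider.elastic`, so `maxConnections(...)` may need to be set explicitly to keep the previous behavior.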

violetagg added the for/user-attention label on Sep 1, 2020
@violetagg
Member

@martinkrc Were you able to try the latest versions?

@martinkrc
Author

@violetagg I need more time. I managed to reproduce the issue locally, but not reliably yet. The next step would be to try a newer version, but we don't have capacity for that at the moment.
Meanwhile, we're trying to find the answer to your additional question about a timeout on the target server.

@violetagg
Member

@martinkrc I'm closing this. If you have additional information, we can reopen it.

violetagg added the status/invalid label and removed the for/user-attention and type/bug labels on Sep 23, 2020