Incoming HTTP request not being fully read or closed when a constraint validator rejects the request #28818

Closed
davidreis97 opened this issue Oct 25, 2022 · 19 comments · Fixed by #29119
Labels
area/resteasy-reactive · kind/bug (Something isn't working) · triage/wontfix (This will not be worked on)

Comments

@davidreis97

davidreis97 commented Oct 25, 2022

Describe the bug

Machine: Apple M1 Pro
JVM: openjdk 11.0.2
Quarkus: 2.7.4.Final, 2.13.3.Final

There seems to be a race condition between reading the request from the TCP socket and running constraint validators.

When a request is rejected by a constraint validator (such as @Consumes) before Quarkus has had a chance to fully read and ACK the incoming request, Quarkus answers with the appropriate status code and stops reading the request from the TCP socket, but doesn't close the connection.

This behavior (keeping the connection open while no longer reading from the socket) leaves clients hanging once they have filled the TCP transmission window, waiting for it to be ACK'ed.

The hang may not be observed with smaller requests, since in those cases the request has likely been completely received and ACK'ed by Quarkus before the constraint validator fires.

Expected behavior

Quarkus should either read the incoming request fully or close the connection, so as to signal to the client that the request transmission should be over.

Actual behavior

Described in the previous section. Follow the steps below to reproduce.

How to Reproduce?

  1. Create a sample Quarkus application using quarkus create && cd code-with-quarkus
  2. Add the following endpoint:
@Path("/hello")
public class GreetingResource {

    @POST
    @Consumes({"image/jpeg", "image/png"})
    @Produces(MediaType.TEXT_PLAIN)
    public String hello(byte[] imageFile) {
        throw new NotAuthorizedException("");
    }
}
  3. Make a request with a large payload (3 MB is enough on my machine) that violates the @Consumes filter by sending a different Content-Type header. For your convenience, here's a Python snippet that'll do it:
import os
import requests

url = "http://localhost:8080/hello"

# ~3 MB body: large enough to fill the TCP transmission window
payload = b"\x00" + os.urandom(1024 * 1024 * 3) + b"\x00"
headers = {
    'Content-Type': 'image/gif'  # rejected by the endpoint's @Consumes
}

response = requests.request("POST", url, headers=headers, data=payload)

print(response.text)

At this point, the Python code should hang because the request stream is neither being fully read nor closed by Quarkus.
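
For what it's worth, the hang is not specific to Python/requests. A plain java.net client shows the same symptom; here's an illustrative sketch (class name and setup are mine, assuming the endpoint above is running on localhost:8080):

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class ReproClient {
    public static void main(String[] args) throws Exception {
        // 3 MB body, like the Python snippet above
        byte[] payload = new byte[3 * 1024 * 1024];

        HttpURLConnection conn =
                (HttpURLConnection) new URL("http://localhost:8080/hello").openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "image/gif"); // rejected by @Consumes
        conn.setDoOutput(true);
        // Stream the body instead of buffering it, so writes hit the socket directly
        conn.setFixedLengthStreamingMode(payload.length);

        try (OutputStream out = conn.getOutputStream()) {
            out.write(payload); // blocks here once the TCP transmission window fills
        }
        System.out.println(conn.getResponseCode());
    }
}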

Output of uname -a or ver

(macOS M1 Pro) Darwin .local 21.6.0 Darwin Kernel Version 21.6.0: Mon Aug 22 20:19:52 PDT 2022; root:xnu-8020.140.49~2/RELEASE_ARM64_T6000 arm64

Output of java -version

openjdk version "11.0.2" 2019-01-15
OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)

GraalVM version (if different from Java)

No response

Quarkus version or git rev

Reproduced in 2.7.4.Final and 2.13.3.Final

Build tool (ie. output of mvnw --version or gradlew --version)

Apache Maven 3.8.6
Maven home: /Users//.m2/wrapper/dists/apache-maven-3.8.6-bin/67568434/apache-maven-3.8.6
Java version: 11.0.2, vendor: Oracle Corporation, runtime: /Library/Java/JavaVirtualMachines/jdk-11.0.2.jdk/Contents/Home
Default locale: en_PT, platform encoding: UTF-8
OS name: "mac os x", version: "10.16", arch: "x86_64", family: "mac"

Additional information

No response

@davidreis97 davidreis97 added the kind/bug (Something isn't working) label Oct 25, 2022
@quarkus-bot quarkus-bot bot added the env/m1 (Impacts Apple M1 machines) label Oct 25, 2022
@quarkus-bot

quarkus-bot bot commented Oct 25, 2022

/cc @gastaldi

@gastaldi gastaldi added the area/core and triage/qe? (Issue could use a quality focused review to further harden it) labels Oct 25, 2022
@quarkus-bot

quarkus-bot bot commented Oct 25, 2022

/cc @mjurc, @rsvoboda

@geoand
Contributor

geoand commented Oct 25, 2022

Thanks for reporting.

Which RESTEasy extension are you using? quarkus-resteasy or quarkus-resteasy-reactive?

@davidreis97
Author

davidreis97 commented Oct 25, 2022

Hi, thanks for the help. We're using quarkus-resteasy-reactive

@geoand
Contributor

geoand commented Oct 25, 2022

Can you try with quarkus-resteasy-reactive please @davidreis97 ?

@davidreis97
Author

Sorry, I edited right after I commented; we're already using quarkus-resteasy-reactive

@geoand
Contributor

geoand commented Oct 25, 2022

Okay, thanks

@geoand
Contributor

geoand commented Oct 26, 2022

Looking at a Wireshark capture, it seems like the Python client is not sending the TCP FIN packet after receiving the HTTP 415 response.

It looks something like this:

[Screenshot: Wireshark capture (2022-10-26)]

So I am going to close this issue as invalid. Feel free, however, to comment and provide additional information if you still feel it's an issue.

@geoand geoand added the triage/invalid (This doesn't seem right) label and removed the env/m1 (Impacts Apple M1 machines) and triage/qe? (Issue could use a quality focused review to further harden it) labels Oct 26, 2022
@geoand geoand closed this as not planned Oct 26, 2022
@davidreis97
Author

davidreis97 commented Oct 26, 2022

@geoand Thank you for looking into this. However, this isn't a problem that happens only with a Python client; it also happens with Postman and with the Java RESTEasy client that Quarkus uses (the latter is actually where we first saw the issue).

I believe this happens because Quarkus leaves the TCP connection open and unread. Quarkus should either read the request that is being sent or close the connection so that the TCP transmission can complete.

@davidreis97
Author

davidreis97 commented Oct 27, 2022

@geoand Just to add, after sleeping on this: RFC 2616 (HTTP/1.1) clearly states that a server has the right to respond to a request before receiving it fully, and that a client should monitor for this response and stop transmitting the request body after receiving an error response.
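
For illustration, this is roughly the client behavior the RFC expects, sketched with a raw socket (hypothetical code, not taken from any of the clients mentioned here): stream the body in chunks and stop transmitting as soon as the server has answered.

import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class EarlyResponseAwareClient {
    public static void main(String[] args) throws Exception {
        byte[] body = new byte[3 * 1024 * 1024];
        try (Socket socket = new Socket("localhost", 8080)) {
            OutputStream out = socket.getOutputStream();
            InputStream in = socket.getInputStream();
            out.write(("POST /hello HTTP/1.1\r\n"
                    + "Host: localhost:8080\r\n"
                    + "Content-Type: image/gif\r\n"
                    + "Content-Length: " + body.length + "\r\n\r\n")
                    .getBytes(StandardCharsets.US_ASCII));
            for (int off = 0; off < body.length; off += 8192) {
                if (in.available() > 0) {
                    break; // the server already answered: stop sending the body
                }
                out.write(body, off, Math.min(8192, body.length - off));
            }
            // Print whatever response arrived
            byte[] buf = new byte[8192];
            int n = in.read(buf);
            if (n > 0) {
                System.out.println(new String(buf, 0, n, StandardCharsets.US_ASCII));
            }
        }
    }
}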

However, given that a lot of modern HTTP client implementations (including the one used by Quarkus for RESTEasy auto-generated HTTP clients) do not handle this properly, I believe the right solution here could be to fully read the request from the TCP socket even after responding to it. Closing the TCP connection from the server side after sending a response would also fix this issue, but it would break keep-alive connection reuse, so I'd steer away from that.
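
To make the two options concrete, here is a rough sketch at the plain Vert.x level (a hypothetical handler, not the actual RESTEasy Reactive code path):

import io.vertx.core.http.HttpServerRequest;

public class RejectionSketch {

    // Hypothetical failure path. Option 1 drains the remaining body so the
    // client's writes keep being ACK'ed, then answers; the connection stays
    // reusable for keep-alive.
    static void rejectUnsupportedMediaType(HttpServerRequest request) {
        request.handler(buffer -> { /* discard the rest of the body */ });
        request.endHandler(v -> request.response().setStatusCode(415).end());

        // Option 2 (commented out): answer immediately and close the
        // connection. This unblocks the client too, but defeats keep-alive.
        // request.response().setStatusCode(415).end();
        // request.connection().close();
    }
}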

@geoand
Contributor

geoand commented Oct 27, 2022

However, given that a lot of modern HTTP client implementations (including the one used by Quarkus for RESTEasy auto-generated HTTP clients) do not handle this properly

If you have a sample that shows this, I'd love to test it out.

@geoand
Contributor

geoand commented Oct 27, 2022

However, given that a lot of modern HTTP client implementations (including the one used by Quarkus for RESTEasy auto-generated HTTP clients) do not handle this properly, I believe the right solution here could be to fully read the request from the TCP socket even after responding to it. Closing the TCP connection from the server side after sending a response would also fix this issue, but it would break keep-alive connection reuse, so I'd steer away from that.

To be honest, this behavior is controlled by Vert.x, so @cescoffier and @vietj probably have a lot more to say about this

@davidreis97
Author

davidreis97 commented Oct 27, 2022

If you have a sample that shows this, I'd love to test it out.

@geoand Here's a quick project that'll help you repro it in 5 minutes. The README has a 3-step guide on what to do to trigger the issue.

https://github.com/davidreis97/QuarkusRepro28818

@cescoffier
Member

I think the problem is really in the client, not in the server. The server respects the specification. It cannot read the full request, as that would create an attack vector (sending large invalid requests). So the server is doing the right thing.

Maybe we should close the connection in this case; that would be the only thing we could add. However, clients should be able to react to that.

Does it also happen with the reactive REST client from Quarkus?

@davidreis97
Author

@cescoffier As I've reported in this comment, this happens with multiple (most?) REST clients, including the Quarkus Reactive REST Client. I've provided a test project to reproduce this here.
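
For reference, the reactive REST client side of the reproducer boils down to an interface along these lines (illustrative, the actual code is in the linked project; on the client side, @Consumes determines the outgoing Content-Type header):

import javax.ws.rs.Consumes;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import org.eclipse.microprofile.rest.client.inject.RegisterRestClient;

@Path("/hello")
@RegisterRestClient(baseUri = "http://localhost:8080")
public interface GreetingClient {

    @POST
    @Consumes("image/gif") // deliberately mismatched with the server's @Consumes
    String hello(byte[] imageFile);
}

Injecting this with @RestClient and calling hello(new byte[3 * 1024 * 1024]) hangs the same way the Python snippet does.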

@davidreis97
Author

davidreis97 commented Nov 2, 2022

@cescoffier @geoand Hi there, have you had the chance to read my previous comment? Should we reopen the ticket and set the proper labels to reflect the fact that this is still an issue that needs work?

@geoand
Contributor

geoand commented Nov 2, 2022

@Sgitario can you take a look at the Reactive REST Client issue mentioned here?

Sgitario added a commit to Sgitario/quarkus that referenced this issue Nov 8, 2022
@Sgitario
Contributor

Sgitario commented Nov 8, 2022

I could reproduce this issue and this is a tentative fix for it: #29119

@Sgitario Sgitario reopened this Nov 8, 2022
@quarkus-bot quarkus-bot bot removed the triage/invalid (This doesn't seem right) label Nov 8, 2022
@quarkus-bot quarkus-bot bot added this to the 2.15 - main milestone Nov 8, 2022
@gsmet gsmet modified the milestones: 2.15 - main, 2.14.1.Final Nov 10, 2022
gsmet pushed a commit to gsmet/quarkus that referenced this issue Nov 10, 2022
pedroh-pereira pushed a commit to pedroh-pereira/quarkus that referenced this issue Nov 14, 2022
@Sgitario
Contributor

Sgitario commented Dec 5, 2022

The fix for this issue is going to be reverted, as agreed in #29469 (comment).

@Sgitario Sgitario added the triage/wontfix (This will not be worked on) label Dec 5, 2022
@Sgitario Sgitario removed this from the 2.14.1.Final milestone Dec 5, 2022