Optimize FromRequest for Bytes #2583
base: main
If this works, I think we should upstream it / something like it to […]
All those buffers are copied only once into the […] Also the […]
Now that I think about it, we'll probably have the […] I say probably because it's not mandatory in HTTP 1.1 (the client can just send data and close the connection), it's not sent with chunked transfer (collecting to bytes is a bit dubious then, but just collecting into a growing vector sounds reasonable in that case), and it's completely optional in HTTP 2, which does not need it because of its framing. So using the header should avoid reallocating the buffer in most cases. According to RFC 9110: "A user agent SHOULD send Content-Length in a request when the method defines a meaning for enclosed content and it is not sending Transfer-Encoding," so I think it'll be fine to use the header in the happy path. However, when I tried to play around with this, something (probably hyper?) didn't pass the body to axum when the […]
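To make the idea above concrete, here is a minimal std-only sketch of using a declared length as a capacity hint. It models the body as an iterator of byte slices and uses `Vec<u8>` in place of `BytesMut`. The names `collect_body` and `BodyTooLarge` and the `limit` parameter are hypothetical, chosen for illustration; this is not the code in this PR and not an axum or hyper API.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub struct BodyTooLarge;

/// Collect body chunks into one contiguous buffer.
///
/// `content_length` stands in for the parsed Content-Length header (if any),
/// and `limit` for a maximum accepted body size. Both are assumptions of
/// this sketch.
pub fn collect_body<'a>(
    content_length: Option<usize>,
    limit: usize,
    chunks: impl IntoIterator<Item = &'a [u8]>,
) -> Result<Vec<u8>, BodyTooLarge> {
    // Use the declared length only as a hint, and never reserve more than
    // the limit, since the header is client-controlled.
    let capacity = content_length.unwrap_or(0).min(limit);
    let mut buf = Vec::with_capacity(capacity);

    for chunk in chunks {
        if buf.len() + chunk.len() > limit {
            return Err(BodyTooLarge);
        }
        // With an accurate hint this never reallocates; without one the
        // vector simply grows as needed.
        buf.extend_from_slice(chunk);
    }
    Ok(buf)
}
```

When the hint is accurate, each chunk is copied exactly once into the final allocation. When it is missing, the same code falls back to a growing vector.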
I believe that for HTTP 1, hyper requires the body to have either a content-length or chunked encoding. If both are missing, I'm not sure what happens, so it could just pass the body to axum. Thanks for looking into this!
80d84e6 to 1a43af4
For HTTP 1.1, if both are missing, hyper assumes that there is no body, which is the correct behavior per the RFC. If the client sends a body anyway, it's interpreted as a new request. So in the case of HTTP 1.1 it is in fact either […]

For HTTP 2, the content length may be missing, but clients should send it anyway. If they don't, I guess it's up to axum (or the user) to decide how to proceed: continue collecting with potentially unnecessary reallocations and byte copies; use the original approach, where a list of Bytes is first collected and then copied into one bigger allocation; or return 411 to request a content length, although I assume that's not something axum should do by default.
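As a rough illustration of the HTTP 1.1 rule described above, the following std-only sketch decides how a request body is framed from its headers. It is a simplified model, not hyper's implementation: `Framing`, `BadRequest`, `header`, and `request_framing` are hypothetical names, and details such as repeated or conflicting headers are ignored.

```rust
/// How the body of an HTTP/1.1 request is delimited (simplified model).
#[derive(Debug, PartialEq)]
pub enum Framing {
    /// Transfer-Encoding with `chunked` as the final coding.
    Chunked,
    /// A valid Content-Length header.
    Length(u64),
    /// Neither header present: the request has no body.
    NoBody,
}

/// Hypothetical error standing in for a 400 response.
#[derive(Debug, PartialEq)]
pub struct BadRequest;

/// Case-insensitive lookup of the first header with the given name.
fn header<'a>(headers: &[(&str, &'a str)], name: &str) -> Option<&'a str> {
    headers
        .iter()
        .find(|(k, _)| k.eq_ignore_ascii_case(name))
        .map(|(_, v)| v.trim())
}

pub fn request_framing(headers: &[(&str, &str)]) -> Result<Framing, BadRequest> {
    // Transfer-Encoding takes precedence over Content-Length.
    if let Some(te) = header(headers, "transfer-encoding") {
        let last = te.rsplit(',').next().unwrap_or("").trim();
        return if last.eq_ignore_ascii_case("chunked") {
            Ok(Framing::Chunked)
        } else {
            // In a request, the final coding must be chunked.
            Err(BadRequest)
        };
    }
    match header(headers, "content-length") {
        Some(v) => v.parse::<u64>().map(Framing::Length).map_err(|_| BadRequest),
        // No framing header at all: any bytes that follow belong to the
        // next request on the connection.
        None => Ok(Framing::NoBody),
    }
}
```

Only the `Length` case gives the collector an exact size up front. `Chunked` (and HTTP 2 without a content length) is where a growing buffer is needed.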
In the case of HTTP 1.1, what happens if the […]
If the users send fewer bytes and they do not collect the body, everything runs the same way as it usually does. If another request is sent on the same connection, hyper will assume that it is still the body of the preceding request.

If the user sends fewer bytes and wants to collect the body, their handler will never run, because the body can never be collected. The request will end with a timeout, or, if another request is sent on the same connection, that request will be treated as part of the body.

If the user sends more bytes, the first part, up to the declared content length, is considered to be the body of the request, and the rest is treated as a new request on the same connection, most likely resulting in a 400 since chances are it does not start with a valid HTTP request line. As far as I can tell, hyper cannot reject this since it cannot know that it is happening. For the same reason, axum cannot handle it in any meaningful way either.
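The mismatch cases above can be sketched with a tiny std-only model of a connection's byte stream. This is an illustration of the reasoning, not how hyper is implemented; `split_body` is a hypothetical helper.

```rust
/// Split the bytes that follow the request headers into the body and
/// whatever comes after it, based only on the declared content length.
///
/// Returns `None` when fewer bytes than declared have arrived so far, in
/// which case a server can only keep waiting (or time out).
pub fn split_body(after_headers: &[u8], content_length: usize) -> Option<(&[u8], &[u8])> {
    if after_headers.len() < content_length {
        // Too few bytes: the body is incomplete, and anything sent later on
        // this connection would be consumed as body.
        return None;
    }
    // Exactly `content_length` bytes are the body; the remainder is parsed
    // as the start of the next request on the connection.
    Some(after_headers.split_at(content_length))
}
```

The leftover slice is what would be handed to the request parser next, which is why surplus bytes usually end in a 400.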
Do you mean: "if the users send fewer bytes, and the application does not collect the body"?
So if user A sends the application fewer bytes than what's in the […]
Yes.
The only way I can think of that something like this could happen is if some application had a connection pool and two different tasks tried to send their requests one after the other, with one declaring the wrong content length. However, if one of them sends a request with a bad content-length, then that application has serious bugs and there is nothing axum or hyper can do; they must assume the content-length is correct.
I was thinking of an HTTP load balancer sharing a connection pool to the application, but looking at it again, it does not seem possible to reproduce.
I'm trying this PR, and it changes the behavior when using […] Edit: […]
When trying out this change with production-like load, I can confirm that the CPU usage issue is fixed.
This change seems to be sufficient to handle limits on body size: #2592
Thanks!
To fix the error in CI: #2596
@jplatte is there anything I can do to help you on this PR?
Soo.. The reason I haven't marked this PR as ready for review yet is that I hope it will be unnecessary. There has been a little bit of discussion on Discord in the hyper channel about it (as you know), and apparently Sean had some idea on how the […]
But in the meantime can we address the slow release with this?
94f545f to 3d32d0f
Yeah, I guess so.
… and introduce FromRequest for BytesMut. The more complex implementation is _currently_ faster, but will likely be simplified to use http-body-util helpers again once the performance of those catches up. Co-authored-by: Yann Simon <yann.simon@commercetools.com>
3d32d0f to ab2b9e7
Is the current PR still relevant after release […]
@dayvejones the update only optimizes the performance of […] So I think this should address the performance issues someone mentioned earlier, but not the issue mentioned in the description.
Closes #2548.