Optimize FromRequest for Bytes #2583

jplatte · 2024-02-09T21:46:26Z

… and introduce FromRequest for BytesMut.

Closes #2548.

jplatte · 2024-02-09T21:50:08Z

If this works, I think we should upstream it / something like it to http-body-util. Collected probably just isn't really optimal for when you ultimately want the output in one contiguous buffer, since it first collects all the individual chunks into many small buffers.

mladedav · 2024-02-09T23:47:01Z

Collected probably just isn't really optimal for when you ultimately want the output in one contiguous buffer, since it first collects all the individual chunks into many small buffers.

All those buffers are copied only once into the BytesMut where they are ultimately concatenated, same as the Bytes obtained from the frames here, right? This might be faster because the BytesMut can be moved while we wait for the next frame though. Or am I missing something?

Also the Collected variant knows how large the BytesMut backing array should be at creation time so it might actually save time because of those reallocations I think.

axum-core/src/extract/request_parts.rs

mladedav · 2024-02-10T23:06:41Z

Now that I think about it, we'll probably have the content-length header in this context more often than not. So we also know how large the allocation should be.

I say probably because it's not mandatory in HTTP 1.1 (the client can just send data and close the connection), it's not sent with chunked transfer (collecting to bytes is a bit dubious then but just collecting it to a growing vector sounds reasonable in that case) and it's completely optional in HTTP 2 as it does not need it because of its framing.

So using the header should make this not reallocate the buffer in most cases. According to RFC 9110: "A user agent SHOULD send Content-Length in a request when the method defines a meaning for enclosed content and it is not sending Transfer-Encoding," so I think it'll be fine to use the header in a happy path.

However, when I tried to play around with this, something (probably hyper?) didn't pass the body to axum when the content-length header was missing. I might also be doing something wrong so I'll try to isolate it with just hyper next and see what's what.

davidpdrsn · 2024-02-11T10:47:33Z

However, when I tried to play around with this, something (probably hyper?) didn't pass the body to axum when the content-length header was missing. I might also be doing something wrong so I'll try to isolate it with just hyper next and see what's what.

I believe for http 1 hyper requires the body to either have a content-length or chunked encoding. If both are missing I'm not sure what happens so it could just pass the body to axum.

Thanks for looking into this!

mladedav · 2024-02-12T22:30:55Z

For HTTP 1.1 if both are missing, it is assumed by hyper that there is no body, which is the correct behavior as per RFC. If the client sends body, it's interpreted as a new request. So in the case of HTTP 1.1 it is in fact either content-length present, transfer-encoding: chunked present or no body.

For HTTP 2 content length may be missing, but the clients should send it anyway. If they don't I guess it's up to axum (or the user) to decide if the collection should proceed with potential unnecessary reallocations and byte copies, the original approach where first a list of Bytes is collected and then it's copied into one bigger allocation, or it could return 411 to request content length, although I assume that's not something axum should do by default.

yanns · 2024-02-13T06:32:09Z

In case of HTTP 1.1, what happens if the content-length is present, and if the client sends a body with another length. Is this can be used as attack? Does hyper rejects this? Should axum handle this case?

mladedav · 2024-02-13T08:36:07Z

If the users send less bytes and they do not collect the body, everything runs the same way as it usually does. If another request is sent on the same connection, it will be assumed by hyper that it is still the body of the preceding request.

If the user sends less bytes and wants to collect the body, their handler will never run, because the body can never be collected. The request will end with a timeout, or if another request is sent on the same connection, it will be considered as part of the body.

If the user sends more bytes, the first part up to the declared content length is considered to be the body of the request and the rest will be considered a new request on the same connection, most likely resulting in 400 since chances are it does not start with valid HTTP line.

As far as I can tell, hyper cannot reject this since it cannot know that it is happening. For the same reason axum cannot handle it in any meaningful way either.

yanns · 2024-02-13T09:34:42Z

If the users send less bytes and they do not collect the body

Do you mean: "if the users send less bytes, and the application do not collect the body"?

If another request is sent on the same connection, it will be assumed by hyper that it is still the body of the preceding request.

So is user A sends the application less bytes than that's in the content-length, then if user B is using the same connection, it could be possible for user B to hijack the first request and sends some bytes into it?

mladedav · 2024-02-13T10:19:52Z

Do you mean: "if the users send less bytes, and the application do not collect the body"?

Yes.

if user B is using the same connection, it could be possible for user B to hijack the first request and sends some bytes into it?

The only way I can think of something like this could happen is if there would be a connection pool in some application and two different tasks tried to send their requests one after the other with one declaring wrong content length. However, if one of them sends a request with bad content-length, then that application has serious bugs and there is nothing axum or hyper can do, they must assume the content-length is correct.

yanns · 2024-02-13T10:39:48Z

if user B is using the same connection, it could be possible for user B to hijack the first request and sends some bytes into it?

The only way I can think of something like this could happen is if there would be a connection pool in some application and two different tasks tried to send their requests one after the other with one declaring wrong content length.

I was thinking of HTTP load balancer sharing a connection pool to the application, but by looking again at it, it does not seem possible to reproduce.
By searching around on internet, one possible attack is to use Transfer-Encoding and Content-Length at the same time. In that case Content-Length should be ignored.

yanns · 2024-02-13T17:37:49Z

I'm trying this PR, and it changes the behavior when using .layer(DefaultBodyLimit::max(...)). The requests are not rejected anymore.

Edit:
Instead of using req.into_body(), we should use req.into_limited_body().

yanns · 2024-02-13T17:55:28Z

When trying out this change with production-like load, I can confirm that the CPU usage issue is fixed.

yanns · 2024-02-15T13:44:33Z

This change seems to be sufficient to handle limits on body size: #2592

jplatte · 2024-02-15T21:10:13Z

Thanks!

yanns · 2024-02-16T08:50:02Z

To fix the error in CI: #2596

yanns · 2024-02-20T08:35:15Z

@jplatte is there anything I can do to help you on this PR?

jplatte · 2024-02-22T20:04:35Z

Soo.. The reason I haven't marked this PR as ready for review yet is that I hope it will be unnecessary. There has been a little bit of discussion on Discord in the hyper channel about it (as you know), and apparently Sean had some idea on how the http-body-util thing we're currently using could be optimized. I just didn't follow up on that yet because I've got lots of other things going on.

frederikhors · 2024-02-22T20:11:12Z

But in the meantime can we address the slow release with this?

jplatte · 2024-02-22T20:39:49Z

Yeah, I guess so.

… and introduce FromRequest for BytesMut. The more complex implementation is _currently_ faster, but will likely be simplified to use http-body-util helpers again once the performance of those catches up. Co-authored-by: Yann Simon <yann.simon@commercetools.com>

axum-core/src/extract/request_parts.rs

dayvejones · 2024-03-15T08:45:30Z

Is the current PR still relevant after release http-body-util v0.1.1?

mladedav · 2024-03-15T10:30:13Z

@dayvejones the update only optimizes the performance of Bytes, but this also exposes an extractor for BytesMut.

So I think this should address the performance issues someone mentioned earlier but not the issue that is mentioned in the description.

mladedav reviewed Feb 10, 2024

View reviewed changes

axum-core/src/extract/request_parts.rs Outdated Show resolved Hide resolved

jplatte force-pushed the jplatte/bytes-from-request branch from 80d84e6 to 1a43af4 Compare February 11, 2024 10:54

jplatte force-pushed the jplatte/bytes-from-request branch from 94f545f to 3d32d0f Compare February 22, 2024 20:39

jplatte marked this pull request as ready for review February 22, 2024 20:39

jplatte requested a review from davidpdrsn February 22, 2024 20:40

jplatte force-pushed the jplatte/bytes-from-request branch from 3d32d0f to ab2b9e7 Compare February 22, 2024 20:41

jplatte changed the title ~~WIP: Optimize FromRequest for Bytes~~ Optimize FromRequest for Bytes Feb 22, 2024

mladedav reviewed Feb 22, 2024

View reviewed changes

axum-core/src/extract/request_parts.rs Outdated Show resolved Hide resolved

Update axum-core/src/extract/request_parts.rs

4493713

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize FromRequest for Bytes #2583

Optimize FromRequest for Bytes #2583

jplatte commented Feb 9, 2024 •

edited

jplatte commented Feb 9, 2024

mladedav commented Feb 9, 2024

mladedav commented Feb 10, 2024

davidpdrsn commented Feb 11, 2024 •

edited

mladedav commented Feb 12, 2024

yanns commented Feb 13, 2024

mladedav commented Feb 13, 2024

yanns commented Feb 13, 2024

mladedav commented Feb 13, 2024 •

edited

yanns commented Feb 13, 2024

yanns commented Feb 13, 2024 •

edited

yanns commented Feb 13, 2024

yanns commented Feb 15, 2024

jplatte commented Feb 15, 2024

yanns commented Feb 16, 2024

yanns commented Feb 20, 2024

jplatte commented Feb 22, 2024 •

edited

frederikhors commented Feb 22, 2024

jplatte commented Feb 22, 2024

dayvejones commented Mar 15, 2024 •

edited

mladedav commented Mar 15, 2024

Optimize FromRequest for Bytes #2583

Are you sure you want to change the base?

Optimize FromRequest for Bytes #2583

Conversation

jplatte commented Feb 9, 2024 • edited

jplatte commented Feb 9, 2024

mladedav commented Feb 9, 2024

mladedav commented Feb 10, 2024

davidpdrsn commented Feb 11, 2024 • edited

mladedav commented Feb 12, 2024

yanns commented Feb 13, 2024

mladedav commented Feb 13, 2024

yanns commented Feb 13, 2024

mladedav commented Feb 13, 2024 • edited

yanns commented Feb 13, 2024

yanns commented Feb 13, 2024 • edited

yanns commented Feb 13, 2024

yanns commented Feb 15, 2024

jplatte commented Feb 15, 2024

yanns commented Feb 16, 2024

yanns commented Feb 20, 2024

jplatte commented Feb 22, 2024 • edited

frederikhors commented Feb 22, 2024

jplatte commented Feb 22, 2024

dayvejones commented Mar 15, 2024 • edited

mladedav commented Mar 15, 2024

jplatte commented Feb 9, 2024 •

edited

davidpdrsn commented Feb 11, 2024 •

edited

mladedav commented Feb 13, 2024 •

edited

yanns commented Feb 13, 2024 •

edited

jplatte commented Feb 22, 2024 •

edited

dayvejones commented Mar 15, 2024 •

edited