Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RequestSizeLimitMiddleware and RequestTimeoutMiddleware #2328

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

adriangb
Copy link
Member

@adriangb adriangb commented Nov 6, 2023

Closes #2155

@adriangb
Copy link
Member Author

adriangb commented Nov 6, 2023

TODOs:

  • Docs
  • What do we want to do for websockets? Lifespans?

@adriangb adriangb self-assigned this Nov 6, 2023
@adriangb adriangb requested a review from a team November 6, 2023 23:28
@adriangb adriangb added the feature New feature or request label Nov 6, 2023
@adriangb adriangb added this to the Version 1.0 milestone Nov 6, 2023
@adriangb
Copy link
Member Author

A couple of things to consider:

  • Dealing with SSE requests
  • Overriding instead of adding. As this PR currently stands you could set a 10s timeout for your entire app and a stricter timeout for individual routes, but not the other way around (a 1s timeout for all routes except some specific routes which get not timeout, e.g. because they stream data). To get that working I think we'd need to do the enforcing at the route level and have a middleware that simply sets some markers on the scope object to customize the limits. That would be more invasive of a change than this.

@Kludex
Copy link
Sponsor Member

Kludex commented Nov 14, 2023

Both middlewares can be on a third party packages, and we can close the issue by just pointing at them.

Can we separate the two middlewares in different PRs? The initial discussion was about the request size one.

@adriangb
Copy link
Member Author

Both middlewares can be on a third party packages, and we can close the issue by just pointing at them.

I disagree. It's an important feature to have readily available to all Starlette users, it doesn't require 3rd party packages, the complexity isn't massive and I believe we should make some limits the default in 1.0 like Django and other more mature frameworks do.

Can we separate the two middlewares in different PRs? The initial discussion was about the request size one.

Yes I'm happy to do that if it'd make it easier to discuss.

@alex-oleshkevich
Copy link
Member

for reference: my approach which works on per-route basis
#2175

@alex-oleshkevich
Copy link
Member

My main concern about middleware based approach is that it is very limited. Say, your global middleware will always have a large value. Because if just one route needs to handle 1gb uploads, the global limit will be 1gb.
Per route limits does not have this flaw, however it is still possible to read unlimited data in custom middlewares. But per-route approach solves the majority of cases.

I see it as app-level limit (the global one that covers all middlewares also) + option to override per a route. In result, we can get this example setup: 8mb global and 24mb for /user/photo-upload.

@adriangb
Copy link
Member Author

Yeah, I brought that up above.

I'll rework this to satisfy that use case, although it will be a good chunk more invasive as I said above.

@adriangb
Copy link
Member Author

Also sorry I did not consider your PR, I had lost track / forgotten about it.

@abersheeran
Copy link
Member

I have a small question, why not just read the request content-length to limit the size.

@adriangb
Copy link
Member Author

Clients can easily lie about that

@abersheeran
Copy link
Member

Clients can easily lie about that

But proxy server or asgi
server should not receive data more than content-length?

@adriangb
Copy link
Member Author

I'm not saying we shouldn't also limit the length to the reported content length, I'm just saying that it doesn't serve a security purpose.

@abersheeran
Copy link
Member

I'm not saying we shouldn't also limit the length to the reported content length, I'm just saying that it doesn't serve a security purpose.

Maybe my previous words caused some misunderstanding. What I meant was: the forward reverse proxy server or ASGI server will not pass data to the ASGI application that exceeds the content-length length. Therefore, forging a content-length length on the client side will only result in truncation when the server receives data.
For example, h11 used by Uvicorn will check whether the Content-Length in the Reader matches the actual size of the received data.

@adriangb
Copy link
Member Author

Then I’m confused as to what you are suggesting: are you saying that we should also enforce that limit, that that limit is already enforced elsewhere, and this PR (or #2174) is not needed, or something else?

Comment on lines +62 to +79
async def rcv() -> Message:
nonlocal total_size
message = await receive()
chunk_size = len(message.get("body", b""))
if self.max_chunk_size is not None and chunk_size > self.max_chunk_size:
raise ChunkTooLarge(
self.max_chunk_size
if self.include_limits_in_error_responses
else None
)
total_size += chunk_size
if self.max_request_size is not None and total_size > self.max_request_size:
raise RequestTooLarge(
self.max_request_size
if self.include_limits_in_error_responses
else None
)
return message
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I’m confused as to what you are suggesting: are you saying that we should also enforce that limit, that that limit is already enforced elsewhere, and this PR (or #2174) is not needed, or something else?

Would it be more concise and efficient to change here to judge the request content-length?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of the user-defined limits? Or in addition to?

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What @abersheeran meant is to check the Content-Length instead of having the logic of adding up chunk_sizes - which is actually what Django does: https://github.com/django/django/blob/6daf86058bb6fb922eb3fe3abae6f5c0e645020c/django/http/request.py#L323-L347.

Django also has this logic on the multipart parser: https://github.com/django/django/blob/594873befbbec13a2d9a048a361757dd3cf178da/django/http/multipartparser.py#L241-L248.

@Kludex
Copy link
Sponsor Member

Kludex commented Nov 17, 2023

What do we want to do for websockets?

  1. Tornado WebSockets have a limitation by default.
    1. https://www.tornadoweb.org/en/stable/releases/v4.5.0.html
    2. https://www.tornadoweb.org/en/stable/websocket.html#tornado.websocket.WebSocketHandler
  2. Uvicorn already limits the size of each WebSocket message via —ws-max-size configuration - it only applies to websockets.
    1. https://www.uvicorn.org/settings/#implementation.

I don't think WebSockets should be a point of concern here.

Lifespans?

Nothing to do for lifespans.

Dealing with SSE requests

No special logic is required for SSE.

Notes/Questions

  1. Is the argumentation of having resource limits on the ASGI application, and not ASGI server because you can apply on individual endpoints/have more granularity?
  2. We need to check the Content-Length (single body event) on the requests that are not Transfer-Encoding: chunked (streaming).

@adriangb
Copy link
Member Author

I don't think WebSockets should be a point of concern here.

The configurations offered by Uvicorn don't cover all of the use cases here. You can't configure it on a per route basis for starters.

No special logic is required for SSE.

What I mean is that if you install a blanket timeout or request size limit it might make sense to say "SSE requests don't have a limit". That would require special logic.

Is the argumentation of having resource limits on the ASGI application, and not ASGI server because you can apply on individual endpoints/have more granularity?

Yes.

We need to check the Content-Length (single body event) on the requests that are not Transfer-Encoding: chunked (streaming).

Agreed, we can check it and if it is larger than the limit error immediately without streaming anything. But we can't "trust" it if it's smaller.

@Kludex
Copy link
Sponsor Member

Kludex commented Nov 17, 2023

We can trust the content length header. The server errors otherwise.

@adriangb
Copy link
Member Author

adriangb commented Nov 17, 2023

But Starlette doesn't depend on the server. We shouldn't make assumptions like this about the server's behavior unless it's part of the ASGI spec, which to my knowledge this is not. Besides, the only advantage of trusting the content-length header when it says it's smaller than the user defined limit is that we do one less ASGI middleware wrapping on receive. It's not expensive to keep track of the actual streamed size, I don't think it justifies the potential footgun.

@Kludex Kludex closed this Jan 20, 2024
@Kludex Kludex reopened this Jan 20, 2024
@adriangb
Copy link
Member Author

@Kludex I’m not sure if it was on purpose or not but I noticed you closed #2175 and not this one. I don’t think there was any concensus on how to implement this feature, nor do I think this PR is intrinsically better in significant ways, so I just wanted to check what your intention was.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Limit max request size
4 participants