check max_content_length consistently #2620

Merged
merged 1 commit on Mar 14, 2023
Conversation

davidism (Member)

@davidism commented Mar 14, 2023

max_content_length is now checked consistently for anything that accesses request.stream, such as request.data, request.form, and request.json. The check is done in get_input_stream.
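For example, with this change something roughly like the following should raise the 413 no matter whether the body is read through data, form, or json (a sketch of the user-facing behaviour; the limit value is only illustrative and exact attribute access may vary slightly between versions):

```python
from werkzeug.exceptions import RequestEntityTooLarge
from werkzeug.test import EnvironBuilder
from werkzeug.wrappers import Request

builder = EnvironBuilder(
    method="POST",
    data=b"x" * 2048,  # body larger than the limit below
    content_type="application/octet-stream",
)
request = Request(builder.get_environ())
request.max_content_length = 1024  # illustrative application-set limit

try:
    request.get_data()  # reads through request.stream, so the limit is enforced there
except RequestEntityTooLarge:
    print("413 Content Too Large")
```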

If request.max_content_length is set and the request declares a larger Content-Length, a 413 Content Too Large error (RequestEntityTooLarge) is raised immediately when request.stream is accessed. If Content-Length is not set (a streaming request), max_content_length is passed to LimitedStream so that it can raise once the limit is actually reached.
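Conceptually, the check works along these lines (a simplified sketch, not the actual get_input_stream source; the LimitedStream arguments are assumed from the description in this PR):

```python
from werkzeug.exceptions import RequestEntityTooLarge
from werkzeug.wsgi import LimitedStream


def limited_input(environ, max_content_length=None):
    stream = environ["wsgi.input"]
    content_length = environ.get("CONTENT_LENGTH")
    content_length = int(content_length) if content_length else None

    if content_length is not None:
        # Declared length: fail immediately if it already exceeds the maximum.
        if max_content_length is not None and content_length > max_content_length:
            raise RequestEntityTooLarge()
        return LimitedStream(stream, content_length)

    if max_content_length is not None:
        # Streaming request: let the wrapper raise once the limit is actually hit.
        return LimitedStream(stream, max_content_length, is_max=True)

    return stream
```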

LimitedStream now implements io.RawIOBase instead of the more basic io.IOBase. Only readinto needs to be implemented; read, readline, and readlines have default C implementations built on top of it. The default readall requires an extra read just to detect EOF, so that method is overridden as well.
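As a generic illustration of that io.RawIOBase pattern (not Werkzeug's LimitedStream itself):

```python
import io


class CountingStream(io.RawIOBase):
    """Wraps a binary stream and counts the bytes handed out."""

    def __init__(self, wrapped):
        self._wrapped = wrapped
        self.bytes_read = 0

    def readable(self):
        return True

    def readinto(self, b):
        # read, readline, and readlines all funnel through this single method.
        data = self._wrapped.read(len(b))
        b[: len(data)] = data
        self.bytes_read += len(data)
        return len(data)

    def readall(self):
        # Overriding readall avoids RawIOBase's default loop, which needs one
        # extra readinto call returning 0 just to detect EOF.
        data = self._wrapped.read()
        self.bytes_read += len(data)
        return data


stream = CountingStream(io.BytesIO(b"a\nb\nc\n"))
print(stream.readline())  # b'a\n', provided by the base class via readinto
print(stream.read())      # b'b\nc\n', remaining bytes via readall
print(stream.bytes_read)  # 6
```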

LimitedStream has a new is_max parameter to distinguish whether the limit is a client-reported Content-Length or an application-set max_content_length. If the limit is a maximum, reading from an exhausted stream raises a 413 error instead of returning empty. In that case, reading empty data from the underlying stream does not raise a ClientDisconnected error, since there's no way to tell whether the client stopped sending because the body was done or because it really disconnected.
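A small sketch of that behavioural difference (constructor arguments are taken from the description above; check the actual LimitedStream API in your Werkzeug version):

```python
import io

from werkzeug.exceptions import RequestEntityTooLarge
from werkzeug.wsgi import LimitedStream

body = io.BytesIO(b"hello world")

# Limit from a client-reported Content-Length: an exhausted stream reads empty.
declared = LimitedStream(body, 5)
print(declared.read())  # b'hello'
print(declared.read())  # b'', the stream is simply done

# Limit from max_content_length: exhausting it means the client sent too much,
# so reading again raises 413.
body.seek(0)
capped = LimitedStream(body, 5, is_max=True)
print(capped.read())  # b'hello'
try:
    capped.read()
except RequestEntityTooLarge:
    print("413 Content Too Large")
```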

The readlines implementation now handles the size parameter correctly. It reads at least one line, and stops reading further lines once the size hint is reached while reading the current line, so more than size bytes may be returned. This is still safe: each read goes through readinto, so the overall limit still applies.
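For reference, this matches the size-hint semantics of the standard io classes, e.g.:

```python
import io

stream = io.BytesIO(b"first line\nsecond line\nthird line\n")
# With a hint of 5 bytes, at least one full line is read; the total already
# exceeds the hint after the first line, so reading stops there.
print(stream.readlines(5))  # [b'first line\n']
```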

fixes #1513
fixes #2620
fixes #690


It is not possible for the WSGI application to safely exhaust the input stream. That is up to the WSGI/HTTP server that handles the socket, or to the client, which decides what happens when the server closes the input stream. Therefore the client may show a "connection reset" failure rather than a 413 error message.

All "exhaust input" handling is removed from the application side of Werkzeug. It now always raises an error once Content-Length or max_content_length is reached, and will not read past that. However, the development server will now continue to read up to 10GB, 10MB at a time, up to 1000 reads, for 10 seconds. This allows the client to see the 413 error in all but extreme cases. The development server is already not appropriate for production, so adding this does not seem unreasonable.

The server is still not able to handle HTTP/1.1 keep-alive connections. It was too complicated to handle exhausting the input while not accidentally reading the next request line.

I really think we should take all these things we've been adding on top of http.server and push them back upstream, if anyone wants to work on that.

@davidism added this to the 2.3.0 milestone on Mar 14, 2023
Development

Successfully merging this pull request may close these issues.

check max_content_length more consistently
max_content_length is not enforced in all cases