
Remove maxsize arg from Queue constructor in BaseHTTPMiddleware #1028

Merged · 1 commit · Aug 13, 2020

Conversation

@erewok (Contributor) commented Aug 11, 2020

This PR reverts the Queue(maxsize=1) fix for BaseHTTPMiddleware middleware classes and streaming responses.

Some users were relying on the behavior where the response object would be fully evaluated by the time their BaseHTTPMiddleware-based middleware received it. (Of course, this was a problem for StreamingResponses passing through this middleware, because they were also being loaded fully into memory.)

The maxsize=1 fix released in version 0.13.7 explicitly prevented the response object from being fully evaluated until later await calls, but any users who discarded this response in favor of another one created inside their middleware were likely to see pending tasks accumulate.

The question now is which bug is more important to fix (the two problems are in conflict):

  • When users return a StreamingResponse with BaseHTTPMiddleware, their response is loaded entirely into memory, or
  • Pending tasks accumulate whenever users don't fully evaluate a response of any kind inside BaseHTTPMiddleware.

This PR moves to fix the second by reverting the maxsize change, but this also resurrects the first problem.
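For intuition, here is a minimal sketch of the trade-off; this is not Starlette's actual implementation, just a toy producer/consumer where produce_body stands in for the app pushing response chunks into call_next's queue:

import asyncio

async def produce_body(queue: asyncio.Queue) -> None:
    # Stands in for the app streaming response chunks into the queue.
    for chunk in (b"a" * 1024, b"b" * 1024, b"c" * 1024):
        # With an unbounded Queue(), put() returns immediately, so the
        # whole body is buffered in memory before anyone reads it. With
        # Queue(maxsize=1), put() suspends until the consumer calls get()
        # (backpressure), but if nobody ever drains the queue, this task
        # stays pending forever.
        await queue.put(chunk)
    await queue.put(None)  # sentinel marking the end of the body

async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue()  # the reverted, unbounded variant
    producer = asyncio.ensure_future(produce_body(queue))
    while True:
        chunk = await queue.get()
        if chunk is None:
            break
        print(f"consumed {len(chunk)} bytes")
    await producer

asyncio.run(main())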

See issue #1022 and issue #1012

@erewok (Contributor, Author) commented Aug 11, 2020

An alternative to this PR is to leave the fix in the codebase and instead change the docs to state explicitly that any Response users get in their middleware is unlikely to be fully evaluated, so they should never drop it and return an alternative response. It would need to be one of those loud, flashing-lights warnings.

Something like:

Note: the response you get in your dispatch method is not guaranteed to be fully evaluated. If you discard this response, you risk accumulating pending tasks! Here's an example:

from starlette.middleware.base import BaseHTTPMiddleware

class CustomHeaderMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        response = await call_next(request)
        if response.headers.get('Custom') == 'Example':
            # NEVER DO THIS: `response` is dropped before its body
            # stream has been consumed, leaving a pending task behind.
            return SomeOtherResponse()
        return response

etc.
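The docs could show the safe counterpart alongside it: mutate the response you were handed and return that same object, so its body stream is still consumed downstream. A sketch (the header name here is purely illustrative):

from starlette.middleware.base import BaseHTTPMiddleware

class CustomHeaderMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        response = await call_next(request)
        # Safe: annotate the response you were given and return that same
        # object, so its body stream is still consumed downstream.
        response.headers['X-Annotated'] = 'true'
        return response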

Indeed, one argument for changing the docs instead is that users can control whether or not they throw away the unevaluated response. By contrast, they don't have any control over the queue size in this call_next method...

Thoughts on that idea, @florimondmanca @JayH5?

@florimondmanca (Member) commented Aug 12, 2020

I think a healthier alternative might be to completely review how Starlette works with respect to structured concurrency.

We've had discussions about this in the context of background tasks, as well as this BaseHTTPMiddleware API, and I'm growing the intuition that there's some more general design work to do to address these issues.

Something may have to break, but it's not yet clear whether we actually need to break anything or just change the default recommendations.

For example, I think you mentioned starting to discourage the use of BaseHTTPMiddleware in favor of raw ASGI middleware (which very naturally allows async with context-managed blocks, since there's no return <something> statement in an ASGI callable). I think that's a possible alternative for fixing problem 2.
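For reference, a raw ASGI version of the header example might look like the following sketch (the header name is illustrative); note that no Response object is ever materialized, so body chunks flow straight through send and streaming responses are never buffered:

class CustomHeaderASGIMiddleware:
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        async def send_wrapper(message):
            if message["type"] == "http.response.start":
                # ASGI headers are raw (bytes, bytes) pairs; copy before
                # mutating in case the app reuses the message dict.
                headers = list(message.get("headers", []))
                headers.append((b"x-custom", b"example"))
                message = {**message, "headers": headers}
            await send(message)

        # Body chunks pass straight through send_wrapper as they are
        # produced; nothing is accumulated in this middleware.
        await self.app(scope, receive, send_wrapper)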

Either way, I'm somewhat in favor of keeping the status quo pre-0.13.7, i.e. accepting this PR and letting "BaseHTTPMiddleware has an issue with streaming" be a known limitation for now, with the workaround being "use a raw ASGI middleware instead".

Then we might need to start a wider conversation about structured concurrency principles applied to Starlette, aka "how do we ensure users don't shoot themselves in the foot by letting I/O resources leak unknowingly?".

@erewok (Contributor, Author) commented Aug 13, 2020

Sounds good, @florimondmanca. I agree with your comments here. I'll merge this and issue a new release soon.

@obataku commented Nov 11, 2020

> (Of course, this was a problem for StreamingResponses with this middleware, because they were also being loaded fully into memory.)

just for clarification's sake, @erewok, a BaseHTTPMiddleware impl would only load a StreamingResponse entirely into memory if it actively buffered the response's entire body_iterator and/or had a very slow upstream ASGI message consumer, yes?

@erewok (Contributor, Author) commented Nov 11, 2020

No, @obataku, any time BaseHTTPMiddleware and StreamingResponse are combined, the following will happen:

  • The entire (streaming) response will be evaluated in memory before being returned to the client.
  • Background tasks will run before the response is returned to the client.

The consumer does not affect this.
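This is easy to observe with a small app. The following sketch (the sleep only makes the effect visible) should show all five "generated chunk" lines printing before the first byte reaches the client once the no-op middleware is installed:

import asyncio

from starlette.applications import Starlette
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import StreamingResponse
from starlette.routing import Route

async def slow_numbers():
    for i in range(5):
        await asyncio.sleep(1)
        print(f"generated chunk {i}")  # with the middleware installed, all
        yield f"{i}\n"                 # five print before any byte is sent

async def stream(request):
    return StreamingResponse(slow_numbers(), media_type="text/plain")

class NoopMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        return await call_next(request)  # even a pass-through triggers it

app = Starlette(routes=[Route("/", stream)])
app.add_middleware(NoopMiddleware)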

@obataku commented Nov 11, 2020

@erewok I'm familiar with the background task behavior, but I'm very surprised that the entire StreamingResponse is always buffered in memory all at once when using BaseHTTPMiddleware; I thought the implementation of body_stream specifically avoided that. Thanks, I will look more closely.

@erewok (Contributor, Author) commented Nov 11, 2020

It is surprising behavior. My own recommendation is to avoid BaseHTTPMiddleware because it makes a promise that leads to trouble: it offers the entire response to the developer before that response has been sent back to the client.
