Streaming with BaseHTTPMiddleware: force background to run after response completes #1017
Conversation
Also separate streaming response in base http middleware from background
I wrote this for Python 3.8. I'll amend for 3.6 and 3.7.
Thanks, @JayH5, for the review!
Great digging, thanks! The solutions you've found make sense to me. Just nagging a bit in favor of clearly separating this into two dedicated PRs that we could review, merge, and keep in the history independently…
Good for me on the queue fix. Again if submitted separately we could merge it right away, but I'll leave that to your appreciation. :)
As for background tasks, one last item I'm not 100% sure about is that the middleware no longer guarantees that the app task will be awaited. Previously we always called `task.result()` (which AFAIK is essentially equivalent to doing `await task`). But now we don't always do it.

So for example, I'm wondering if asyncio wouldn't raise a "task was not awaited" warning in the case of a streaming response w/ a background task. Do you have thoughts? I don't know if we should then pass an additional background task to the constructed `StreamingResponse` that calls `await task`, to ensure that the task is properly awaited...
This is the part I spent the most time thinking about in looking at this function. One point is that in this new version, of course, we're relying on the fact that if there is an exception, it will surface through `task.result()`. I did test a scenario of streaming with a background task (if you look at the gist, you can see the examples I stole from the docs and the issue reported on this), but I never had it throw "task not awaited." I was also looking for cancelled coroutines. It still seems odd to me, though, to schedule a task and then never explicitly await it.

As for dealing more explicitly with the background, I think it would be pretty involved.

Because the changes here are small and because my own tests were successful, I got over my discomfort with not explicitly awaiting the task, but I sympathize with that perspective. I will keep trying to break this code.
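The "task was not awaited" concern discussed above can be reproduced in isolation with plain asyncio (a standalone sketch of asyncio's behavior, not Starlette code): if a scheduled task raises and nothing ever retrieves its result, asyncio logs "Task exception was never retrieved" at garbage collection, whereas calling `task.result()` consumes the exception.

```python
import asyncio

async def run_and_retrieve() -> str:
    async def failing_coro():
        raise RuntimeError("boom")

    task = asyncio.create_task(failing_coro())
    await asyncio.sleep(0)  # yield once so the task runs and fails
    try:
        # Retrieving the result consumes the exception; skipping this
        # line is exactly what triggers the "never retrieved" warning.
        task.result()
        return "no error"
    except RuntimeError as exc:
        return str(exc)

print(asyncio.run(run_and_retrieve()))  # prints "boom"
```

This is why `task.result()` (or an equivalent `await task`) matters even when the response itself has already been sent.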
@florimondmanca I think you're right about separating these: it would be good to continue testing the proposed changes included here around fixing the background behavior, but the other issue would be good to address immediately. I created a new PR with only the queue fix.
I have tested this code in a variety of scenarios and inspected the tasks remaining after constructing responses.
It's a fiddly area which is why I'm personally still a bit wary of merging...
Is there any way we can test that this resolves the "don't wait for background task to send the response" case?
@florimondmanca that's a fair request. I had intended to write a test that would demonstrate this behavior, but I had overlooked it. I have added a test now that will fail on master with the following:

```
    filepath = tmp_path / "background_test.txt"
    filepath.write_text("Test Start")
    response = client.get("/background-after-streaming?filepath={}".format(filepath))
    assert response.headers["Custom-Header"] == "Example"
    assert response.text == "1\n2\n3\n"
    with filepath.open() as fl:
>       assert fl.read() == "handler first"
E       AssertionError: assert 'background last' == 'handler first'
E         - handler first
E         + background last

tests/middleware/test_base.py:228: AssertionError
```
I'll also try to take a look at this tomorrow. Sounds like it's been a fiddly one, so it would be good to get another pair of eyes on it before it goes in.
I think this PR is going in the same direction as the fix we had to revert in version 0.13.8 (see issue #1022), so I think it's probably a non-starter at this point. In general, I am beginning to wonder if we shouldn't discourage usage of this particular middleware class while we try to come up with a new means for users to write middleware.
There ~~is one issue~~ are currently two issues with the `call_next` method inside `BaseHTTPMiddleware`:

1. ~~The queue is unbounded, which means that handlers returning `StreamingResponse`s will have their responses loaded fully into the queue.~~ (Fixed in Fix high memory usage when using BaseHTTPMiddleware middleware classes and streaming responses #1018)
2. The `StreamingResponse` that `BaseHTTPMiddleware` creates to wrap the handler's response is waiting on the completion from the handler to stop streaming, which requires that any background tasks must be run before the response can be completed.

This PR addresses the latter. Here's a gist which provides an example of running various tests against this branch.
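The unbounded-queue problem described above can be sketched with a plain `asyncio.Queue` (an illustration only; the middleware manages its own internal queue): without a `maxsize`, a fast producer buffers the entire body in memory, while `maxsize=1` forces the producer to wait for the consumer, providing backpressure.

```python
import asyncio

async def demo():
    unbounded = asyncio.Queue()
    bounded = asyncio.Queue(maxsize=1)

    # The unbounded queue happily buffers every chunk up front...
    for chunk in (b"1\n", b"2\n", b"3\n"):
        unbounded.put_nowait(chunk)

    # ...while the bounded queue holds at most one chunk at a time.
    bounded.put_nowait(b"1\n")
    try:
        bounded.put_nowait(b"2\n")  # raises: queue already holds one chunk
    except asyncio.QueueFull:
        pass  # a real producer would `await bounded.put(...)` here instead

    return unbounded.qsize(), bounded.qsize()

print(asyncio.run(demo()))  # prints (3, 1)
```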
Important Notes

- This change limits queue `maxsize` to 1 (thanks for the suggestion, @florimondmanca).
- The `body_stream` function now explicitly checks whether we are wrapping a `StreamingResponse`, and iterates if so or completes the response if not.
- The `None` added to the queue is only needed in case of exception, so we opt to complete the response and leave the `None` on the queue (this can be seen in the `queue._unfinished_tasks` attribute).
- We call `task.result()` only for raised exceptions.

Other Notes

- Added `task_done` calls so that the queue increments or decrements `_unfinished_tasks` as a sanity check. It's not required for this to work.
- Added a streaming and a broken-streaming test to `tests/middleware/test_base.py`.

Closes #919, #1012
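The `body_stream` branching described in the notes above can be sketched roughly as follows (the function name and `task_done` bookkeeping come from the notes; the queue layout, `None` sentinel handling, and demo harness are assumptions for illustration, not the actual Starlette implementation):

```python
import asyncio
from typing import AsyncIterator

async def body_stream(
    queue: "asyncio.Queue[bytes | None]", is_streaming: bool
) -> AsyncIterator[bytes]:
    if is_streaming:
        # Wrapping a streaming response: keep pulling chunks until the
        # None sentinel signals that the stream has finished.
        while True:
            chunk = await queue.get()
            queue.task_done()  # keeps queue._unfinished_tasks consistent
            if chunk is None:
                return
            yield chunk
    else:
        # Non-streaming response: a single chunk completes the response.
        chunk = await queue.get()
        queue.task_done()
        if chunk is not None:
            yield chunk

async def demo():
    queue = asyncio.Queue()
    for chunk in (b"1\n", b"2\n", b"3\n", None):
        queue.put_nowait(chunk)
    body = b""
    async for part in body_stream(queue, is_streaming=True):
        body += part
    return body

print(asyncio.run(demo()))  # prints b'1\n2\n3\n'
```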