Fix high memory usage when using BaseHTTPMiddleware middleware classes and streaming responses #1018
Conversation
Limit queue size to 1 to prevent loading entire streaming response into memory
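The core of the change is to bound the queue that `BaseHTTPMiddleware` uses to hand response chunks to the streaming consumer. A minimal sketch of the idea (not the actual diff) is below: with `maxsize=1`, the producer blocks on `put` until the consumer has drained the previous chunk, so at most one chunk is buffered at a time.

```python
import asyncio

# With maxsize=1, `put` blocks until the consumer takes the previous
# chunk, so the whole response body is never buffered in memory at once.
queue: asyncio.Queue = asyncio.Queue(maxsize=1)


async def producer() -> None:
    for n in range(100):
        await queue.put(f"chunk {n}".encode())  # blocks while the queue is full
    await queue.put(None)  # sentinel marking end of stream


async def consumer() -> None:
    while True:
        chunk = await queue.get()
        if chunk is None:
            break
        # Forward the chunk to the client here.


asyncio.run(asyncio.wait_for(asyncio.gather(producer(), consumer()), timeout=5))
```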
LGTM, is there any way we can test this?
@JayH5 Assuming we already have tests for streaming responses, the way I'd properly test this is with a memory test. I had done one for streaming in HTTPX, but we eventually dropped it since it required adding an extra test dependency (memory-profiler, I think?). I'd be open to having one for Starlette, though, since I don't think there's any other way to prevent high memory usage from coming back in the future. Edit: the dropped memory test in HTTPX can be seen here: encode/httpx@f092ecd
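For reference, the shape of such a test might be something like the sketch below; `consume_large_stream` is a hypothetical driver, and the memory threshold is made up for illustration:

```python
import asyncio

from memory_profiler import memory_usage  # assumes memory_profiler is installed


async def consume_large_stream() -> None:
    # Hypothetical driver: request a large StreamingResponse through the
    # middleware and iterate over it, discarding the chunks.
    ...


def run() -> None:
    asyncio.get_event_loop().run_until_complete(consume_large_stream())


def test_streaming_memory() -> None:
    # Sample resident memory (in MiB) every 10 ms while the stream runs.
    samples = memory_usage((run, (), {}), interval=0.01)
    # With the bounded queue, peak growth should stay far below the
    # total response size; the threshold here is illustrative.
    assert max(samples) - min(samples) < 10
```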
I haven't been able to get memory_profiler to work by decorating async functions in the (admittedly, two) times I've tried. Did you have that working on HTTPX? When I was working on this, I did a naive check of the queue's size and memory use instead. Here's an example of the chunk of code inside the middleware's `body_stream` generator, with debug prints added at the top of the loop:

```python
async def body_stream() -> typing.AsyncGenerator[bytes, None]:
    while True:
        # Debug instrumentation added for this experiment:
        print("Queue element count is now: ", queue.qsize())
        all_obj_getsizeof = sum(sys.getsizeof(elem) for elem in queue._queue)  # type: ignore
        print("Queue memory use should be: ", all_obj_getsizeof)
        # ...followed by the existing queue-draining logic:
        message = await queue.get()
        if message is None:
            break
        yield message.get("body", b"")
```

On master, this produces output like this:

```
...
Yield chunk: 92
Yield chunk: 93
Yield chunk: 94
Yield chunk: 95
Yield chunk: 96
Yield chunk: 97
Yield chunk: 98
Yield chunk: 99
INFO: 127.0.0.1:54974 - "GET /streaming-memory-test/100 HTTP/1.1" 200 OK
Queue element count is now: 102
Queue memory use should be: 23664
Queue element count is now: 101
Queue memory use should be: 23432
Queue element count is now: 100
Queue memory use should be: 23200
Queue element count is now: 99
Queue memory use should be: 22968
Queue element count is now: 98
...
```

Whereas on this branch, it produces output like this:

```
Yield chunk: 0
Yield chunk: 1
INFO: 127.0.0.1:54931 - "GET /streaming-memory-test/100 HTTP/1.1" 200 OK
Queue element count is now: 1
Queue memory use should be: 232
Queue element count is now: 0
Queue memory use should be: 0
Yield chunk: 2
Queue element count is now: 0
Queue memory use should be: 0
Yield chunk: 3
Queue element count is now: 0
Queue memory use should be: 0
Yield chunk: 4
Queue element count is now: 0
Queue memory use should be: 0
Yield chunk: 5
```

This is for an endpoint that looks like this:

```python
@app.route("/streaming-memory-test/{total_size:int}")
async def streaming_memory_test(request):
    total_size = request.path_params["total_size"]
    chunk_size = 1024
    # Scale small inputs up, e.g. 100 -> 100 KiB:
    while total_size < chunk_size:
        total_size *= chunk_size

    async def byte_stream():
        chunk_count = total_size // chunk_size
        remainder = total_size % chunk_size
        for n in range(chunk_count):
            print("Yield chunk: ", n)
            yield os.urandom(chunk_size)
        yield os.urandom(remainder)

    return StreamingResponse(byte_stream())
```

(which I wrote to simulate streaming a large file), and a request like this for a 100 KB file:

```
~/open_source
❯ curl http://localhost:8000/streaming-memory-test/100 --output sample_fl
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 100k 0 100k 0 0 19.5M 0 --:--:-- --:--:-- --:--:-- 19.5M
~/open_source
❯ ll -h sample_fl
-rw-r--r-- 1 erewok staff 100K Aug 3 07:55 sample_fl
```

Also, there's no test of …
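For what it's worth, a plain functional test of a StreamingResponse behind BaseHTTPMiddleware, separate from any memory assertions, might look something like the sketch below; the endpoint and middleware here are made up for illustration:

```python
from starlette.applications import Starlette
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import StreamingResponse
from starlette.testclient import TestClient


class PassthroughMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        # Pass the request straight through so the response is still streamed.
        return await call_next(request)


app = Starlette()
app.add_middleware(PassthroughMiddleware)


@app.route("/stream")
async def stream(request):
    async def numbers():
        for n in range(5):
            yield str(n).encode()

    return StreamingResponse(numbers())


def test_streaming_response_through_middleware():
    client = TestClient(app)
    response = client.get("/stream")
    assert response.status_code == 200
    assert response.content == b"01234"
```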
@erewok Yes, I remember that the memory test there was passing, and pretty reliably even! Locally and on CI. Tricky to get right, but I was fairly confident about it.
I'd like to look at the "pending task" issue before merging this; this is when the …
I am satisfied now: there was a bug in the handler I wrote to test this code. When I fixed that handler, the pending-task issue was no longer present. This PR is good to merge, in my opinion.
Sounds good!
LGTM too 🚀
Limit queue size to 1 to prevent loading entire streaming response into memory.
This small PR has been separated out from #1017 so that the changes in that PR can continue to be evaluated separately.
Closes #1012