Fix high memory usage when using BaseHTTPMiddleware middleware classes and streaming responses #1018

Merged — erewok merged 2 commits into master from limit_base_qsize on Aug 4, 2020

Conversation

@erewok (Contributor) commented Aug 3, 2020

Limit queue size to 1 to prevent loading entire streaming response into memory.

This small PR has been split out from #1017 so that the changes in that PR can continue to be evaluated separately.

Closes #1012
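For context on the change itself (the PR's original title was "BaseHTTPMiddleware: add maxsize arg to Queue constructor"): BaseHTTPMiddleware passes response messages from the wrapped app to the client through an asyncio queue, and bounding that queue applies backpressure. A minimal sketch of the idea, with illustrative names rather than the exact diff:

import asyncio

# Unbounded: a fast producer can enqueue the entire streaming body
# before the consumer has drained any of it.
queue: asyncio.Queue = asyncio.Queue()

# Bounded to one element: `await queue.put(message)` now suspends the
# producer until the consumer has taken the previous message, so at most
# one response chunk is buffered in memory at a time.
queue = asyncio.Queue(maxsize=1)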

Commit: Limit queue size to 1 to prevent loading entire streaming response into memory
@JayH5 (Member) left a comment:

LGTM, is there any way we can test this?

(Review thread on starlette/middleware/base.py — outdated, resolved)
@florimondmanca (Member) commented Aug 3, 2020

@JayH5 Assuming we already have tests for streaming responses, the way I'd properly test this is with a memory test. I had done that for streaming in HTTPX, but we eventually dropped it since it requires an extra test dependency (memory_profiler, I think?). I'd be open to having it in Starlette, though, since I don't think there's any other way to prevent high memory usage from regressing in the future.

Edit: the dropped memory test in HTTPX can be seen here: encode/httpx@f092ecd
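A sketch of what such a memory regression test might look like, assuming memory_profiler were added as a test dependency; stream_large_response is a hypothetical helper that would drive the app through BaseHTTPMiddleware and consume a multi-megabyte streaming body:

import asyncio

from memory_profiler import memory_usage


def test_streaming_memory_stays_bounded():
    def run():
        # Hypothetical helper: issues a request through the middleware
        # and consumes a large StreamingResponse end to end.
        asyncio.run(stream_large_response())

    # Sample current-process memory (in MiB) before and during the run.
    baseline = max(memory_usage(-1, interval=0.1, timeout=1))
    peak = max(memory_usage((run, (), {}), interval=0.1))

    # With the queue bounded, peak memory should not scale with the size
    # of the streamed body. The 10 MiB margin is illustrative.
    assert peak - baseline < 10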

@erewok (Contributor, author) commented Aug 3, 2020

I haven't been able to get memory_profiler to work by decorating async functions in the (admittedly, only two) times I've tried. Did you have that working in HTTPX, @florimondmanca? I can take a look at the memory test you had before to see if there are tips for me.

When I was working on this, I did a naive sys.getsizeof of all the items in the queue (asyncio.Queue wraps a deque, so you can iterate over it) and printed out the queue size to validate.

Here's an example of the chunk of code inside body_stream that I used to validate my assumptions:

    async def body_stream() -> typing.AsyncGenerator[bytes, None]:
        while True:
            print("Queue element count is now: ", queue.qsize())
            all_obj_getsizeof = sum(sys.getsizeof(elem) for elem in queue._queue)  # type: ignore
            print("Queue memory use should be: ", all_obj_getsizeof)

On master, this produces output like this:

...
Yield chunk:  92
Yield chunk:  93
Yield chunk:  94
Yield chunk:  95
Yield chunk:  96
Yield chunk:  97
Yield chunk:  98
Yield chunk:  99
INFO:     127.0.0.1:54974 - "GET /streaming-memory-test/100 HTTP/1.1" 200 OK
Queue element count is now:  102
Queue memory use should be:  23664
Queue element count is now:  101
Queue memory use should be:  23432
Queue element count is now:  100
Queue memory use should be:  23200
Queue element count is now:  99
Queue memory use should be:  22968
Queue element count is now:  98
...

Whereas on this branch, it produces output like this:

Yield chunk:  0
Yield chunk:  1
INFO:     127.0.0.1:54931 - "GET /streaming-memory-test/100 HTTP/1.1" 200 OK
Queue element count is now:  1
Queue memory use should be:  232
Queue element count is now:  0
Queue memory use should be:  0
Yield chunk:  2
Queue element count is now:  0
Queue memory use should be:  0
Yield chunk:  3
Queue element count is now:  0
Queue memory use should be:  0
Yield chunk:  4
Queue element count is now:  0
Queue memory use should be:  0
Yield chunk:  5

This is for an endpoint that looks like this:

import os

from starlette.applications import Starlette
from starlette.responses import StreamingResponse

# `app` is assumed to have the BaseHTTPMiddleware subclass under test installed.
app = Starlette()


@app.route("/streaming-memory-test/{total_size:int}")
async def streaming_memory_test(request):
    total_size = request.path_params["total_size"]
    chunk_size = 1024
    # Scale the requested size up to at least one chunk (e.g. 100 -> 100 KB).
    while total_size < chunk_size:
        total_size *= chunk_size

    async def byte_stream():
        chunk_count = total_size // chunk_size
        remainder = total_size % chunk_size
        for n in range(chunk_count):
            print("Yield chunk: ", n)
            yield os.urandom(chunk_size)
        yield os.urandom(remainder)

    return StreamingResponse(byte_stream())

(which I wrote to simulate streaming a large file), and a request like this for a 100 KB file:

~/open_source
❯ curl http://localhost:8000/streaming-memory-test/100 --output sample_fl
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  100k    0  100k    0     0  19.5M      0 --:--:-- --:--:-- --:--:-- 19.5M

~/open_source
❯ ll -h sample_fl
-rw-r--r--  1 erewok  staff   100K Aug  3 07:55 sample_fl

Also, there's no test of BaseHTTPMiddleware with a StreamingResponse in the codebase, so I added one in the other PR. I think it makes more sense there, because that PR changes the way streaming happens more than this one does. In truth, the tests I added could go in either PR; they should pass on master, on this branch, and on the other.

@florimondmanca (Member) commented

@erewok Yes, I remember that the memory test there was passing pretty reliably, both locally and on CI. It was tricky to get right, but I was fairly confident about it.

@erewok (Contributor, author) commented Aug 4, 2020

I'd like to look at the "pending task" issue before merging this: the case where the coroutine task doesn't get cancelled in the event of a client disconnect.
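
To illustrate the failure mode being referred to (a standalone sketch, not Starlette's actual middleware code): with a bounded queue, a producer blocked on put() can hang forever once its consumer goes away, so the task has to be cancelled explicitly on disconnect.

import asyncio


async def producer(queue: asyncio.Queue) -> None:
    while True:
        # With maxsize=1 this suspends until the consumer takes a chunk.
        # If the consumer is gone (client disconnected), it suspends forever.
        await queue.put(b"x" * 1024)


async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue(maxsize=1)
    task = asyncio.ensure_future(producer(queue))
    await queue.get()  # consume one chunk, then simulate a disconnect
    task.cancel()      # without this, the producer stays pending forever
    try:
        await task
    except asyncio.CancelledError:
        pass


asyncio.run(main())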

@erewok (Contributor, author) commented Aug 4, 2020

I am satisfied now: there was a bug in the handler I wrote to test this code. With that handler fixed, the pending task issue is no longer present. This PR is good to merge, in my opinion.

@florimondmanca (Member) left a comment:

Sounds good!

@florimondmanca changed the title from "BaseHTTPMiddleware: add maxsize arg to Queue constructor" to "Fix high memory usage when using BaseHTTPMiddleware middleware classes and streaming responses" on Aug 4, 2020
@JayH5 (Member) left a comment:

LGTM too 🚀

@erewok merged commit 3d77a1c into master on Aug 4, 2020
@erewok deleted the limit_base_qsize branch on August 4, 2020, 23:55
Linked issue (closed by this PR): Memory usage streaming large responses (#1012)