
WSGI middleware should stream request body, rather than loading it all at once. #371

Closed
peterlandry opened this issue Jun 11, 2019 · 8 comments · Fixed by #1825

Comments

@peterlandry

Hi! I'm deploying a Django app with uvicorn, running on k8s. Our containers were being killed, and I've found that when users upload large files, uvicorn's memory usage climbs and uploads slow to a crawl, eventually causing an OOM kill.

I'm not sure where this is happening yet. I suspected it could be related to django/channels#1251, but after a bit more digging I'm not so sure: I've tried running uvicorn in WSGI mode with channels completely removed from the Django install, and I get the same behavior. File uploads are loaded into memory (rather than streamed to disk, as they should be), and upload speeds slow to a crawl. The same app running on gunicorn works fine.

The file I'm testing with is about 470 MB.

@tomchristie
Member

So both channels and uvicorn's WSGI middleware consume the entire request body into memory rather than streaming it (due to some complexities of bridging from async on the frontend to WSGI's threaded concurrency).

See https://github.com/encode/uvicorn/blob/master/uvicorn/middleware/wsgi.py#L76-L83

Ideally we'd rework the WSGI middleware to provide proper streaming of request bodies.
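
For context, the linked code follows roughly this pattern (a simplified sketch, not the verbatim uvicorn source): the whole body is drained from the ASGI receive channel before the WSGI app is called, so a large upload is held entirely in memory for the duration of the request.

# Simplified sketch of the current behaviour (not the verbatim uvicorn source):
# drain the whole ASGI request body into memory before calling the WSGI app.
async def read_entire_body(receive) -> bytes:
    body = b""
    more_body = True
    while more_body:
        message = await receive()
        body += message.get("body", b"")  # naive bytes concatenation
        more_body = message.get("more_body", False)
    return body  # the full upload is now resident in memory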

@tomchristie tomchristie changed the title Large file upload memory leak? WSGI middleware should stream request body, rather than loading it all at once. Jun 12, 2019
@abersheeran
Member

Maybe using tempfile.SpooledTemporaryFile instead would be a simple and straightforward fix.
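
A minimal sketch of that idea (hypothetical, not an actual uvicorn patch): buffer the incoming body chunks into a tempfile.SpooledTemporaryFile, which stays in memory below a size threshold and spills to disk above it, then hand the file object to the WSGI app as wsgi.input.

import tempfile

# Hypothetical sketch: spool the request body to disk past a size threshold
# instead of holding it all in memory.
async def spool_body(receive, max_memory: int = 1024 * 1024):
    spool = tempfile.SpooledTemporaryFile(max_size=max_memory)
    more_body = True
    while more_body:
        message = await receive()
        spool.write(message.get("body", b""))
        more_body = message.get("more_body", False)
    spool.seek(0)
    return spool  # suitable for environ["wsgi.input"]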

@abersheeran
Member

@peterlandry @tomchristie What do you think of my newly submitted PR? It uses a bytearray to solve this.
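
For illustration, the idea is roughly the following (a sketch, not the actual diff): appending chunks to a bytearray is an amortized O(1) operation per chunk, whereas repeated bytes concatenation can copy the whole accumulated buffer on every chunk.

# Sketch of the idea behind the PR (not the actual diff): accumulate chunks
# in a bytearray instead of repeatedly concatenating immutable bytes objects.
async def read_body(receive) -> bytes:
    body = bytearray()
    more_body = True
    while more_body:
        message = await receive()
        body.extend(message.get("body", b""))
        more_body = message.get("more_body", False)
    return bytes(body)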

@Bluehorn

Bluehorn commented Feb 3, 2022

I was looking into this for @Flauschbaellchen and wrote a demo that illustrates how hard this problem hits. While working on this I ran into #1345 all the time, so I thought this would also aid in fixing that issue. Of course, today it wasn't reproducing anymore, as the newest release fixes that bug. 👍

Still, here is an example that can be used to compare the different ways to upload, in particular a2wsgi versus the middleware included with uvicorn. Instead of storing the uploads, I just compute a sha256 hash (a sketch of what such an app looks like follows the build commands below):

issue371_demo.zip

After unpacking, you can run this like this:

$ cd issue371_demo/
$ sudo docker build .
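
For reference, the WSGI side of such a demo could look roughly like this (a hypothetical sketch, not the code from the zip): read wsgi.input in fixed-size chunks and hash them, so the application itself never needs the whole upload in memory.

import hashlib

# Hypothetical sketch of a hashing upload endpoint (not the demo code from the zip):
# the body is read in chunks, so the app never holds the whole upload at once.
def application(environ, start_response):
    digest = hashlib.sha256()
    stream = environ["wsgi.input"]
    remaining = int(environ.get("CONTENT_LENGTH") or 0)
    while remaining > 0:
        chunk = stream.read(min(64 * 1024, remaining))
        if not chunk:
            break
        digest.update(chunk)
        remaining -= len(chunk)
    body = (digest.hexdigest() + "\n").encode()
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]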

When using the a2wsgi middleware for WSGI, 10 concurrent uploads using curl complete like this (uvicorn is run from gunicorn with 2 workers, and the WSGI middleware may use 2 threads):

Upload timings for wsgi/a2wsgi: [
1.3421823978424072, 1.4076666831970215, 1.4279963970184326, 1.448310375213623,
2.0521488189697266, 2.1042470932006836, 2.168593168258667, 2.179323434829712,
2.662322998046875, 2.7450664043426514].
Highest latency: 0.025380373001098633.

So essentially 4 uploads (2 processes × 2 threads) are processed concurrently. While the uploads are running, I issue ASGI requests concurrently to check the latency for processing ASGI requests; here they complete within 0.025 s in the worst case.

Compare this to using the wsgi middleware in uvicorn:

Upload timings for wsgi/uvicorn: [
33.341026306152344, 33.38411903381348, 33.76080870628357,
73.60798692703247, 73.64170670509338, 73.97475504875183,
74.0065565109253, 74.37995886802673, 74.40127992630005,
74.73946642875671].
Highest latency: 0.619253396987915.

No idea why the uploads complete at such irregular intervals.

And, of course, using only asgi, the uploads are processed concurrently and the latency is much better:

Upload timings for asgi: [
1.972560167312622, 2.0667288303375244, 2.0667288303375244, 2.0784835815429688,
2.5489165782928467, 2.6735644340515137, 2.724545955657959, 2.724545955657959,
2.735142707824707, 2.735142707824707].
Highest latency: 0.020055770874023438.

BTW: If you are adventurous, you can raise the size of the uploads. Note that the pull request #1329 actually made matters worse in my tests, because with many big uploads it took down my machine.

@tomchristie
Member

Simply replacing our previous naive byte concatenation with #1329 makes a massive difference here.

It's still plausible that an implementation that streams the request body through the WSGI adapter would be preferable, but that's not necessarily the case. Sometimes simple wins, just by virtue of being simple.

@Bluehorn

@tomchristie I agree that simple is preferable. But it seems you did not read my comment:

Note that the pull request #1329 actually made matters worse in my tests because it took down my machine for many big uploads.

I'd consider this a security issue, given that simply running a few big uploads will take down an application (or maybe just a single pod) by collecting all incoming data in memory before even handing it over to the application code.

@Kludex
Sponsor Member

Kludex commented Oct 27, 2022

The goal here is to replace the WSGIMiddleware we have with a2wsgi.WSGIMiddleware.

Implementation is available in #1303. If someone wants to take that over, it would be really helpful.
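
For anyone following along, the switch would look roughly like this on the user side (a sketch assuming a2wsgi's WSGIMiddleware entry point; the Django import path is just an example):

# Sketch: wrap an existing WSGI app with a2wsgi so uvicorn serves it as ASGI,
# which, per the comparison above, handles request bodies without buffering
# the whole upload up front.
from a2wsgi import WSGIMiddleware

from myproject.wsgi import application as django_wsgi_app  # hypothetical project path

app = WSGIMiddleware(django_wsgi_app)
# run with e.g.: uvicorn myproject.asgi_wrapper:app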

@Kludex Kludex added this to the Version 0.21.0 milestone Oct 29, 2022
@Kludex
Sponsor Member

Kludex commented Dec 24, 2022

Or... Option 1 from #1303 (comment).
