WSGI middleware should stream request body, rather than loading it all at once. #371
So both channels and uvicorn's WSGI middleware consume the entire request body into memory rather than streaming it (due to some complexities of bridging from async on the frontend to WSGI's threaded concurrency). See https://github.com/encode/uvicorn/blob/master/uvicorn/middleware/wsgi.py#L76-L83. Ideally we'd rework the WSGI middleware to provide proper streaming of request bodies.
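The buffering pattern being described can be sketched roughly like this (a simplified illustration, not the actual middleware code; the function name is mine):

```python
import asyncio

async def read_entire_body(receive):
    # Simplified illustration of the pattern described above: buffer the
    # whole ASGI request body in memory before handing it to the WSGI app.
    chunks = []
    more_body = True
    while more_body:
        message = await receive()
        chunks.append(message.get("body", b""))
        more_body = message.get("more_body", False)
    return b"".join(chunks)
```

A streaming version would instead hand each chunk to the WSGI thread through a queue as it arrives, so memory use stays bounded regardless of body size.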
Maybe using …
@peterlandry @tomchristie What do you think of my newly submitted PR? Use …
I was looking into this for @Flauschbaellchen and wrote a demo that illustrates how hard this problem hits. While working on this I ran into #1345 all the time, so I thought this would also aid in fixing that issue. Of course, today it wasn't reproducing anymore, as the newest release fixes that bug. 👍 Still, here is an example that can be used to compare the different ways to upload, comparing in particular a2wsgi with the middleware included with uvicorn. Instead of storing the uploads, I just compute a SHA-256 hash. After unpacking, you can run it like this:

```
$ cd issue371_demo/
$ sudo docker build .
```

When using the a2wsgi middleware for WSGI, 10 concurrent uploads using curl complete like this (uvicorn is run from gunicorn with 2 workers; the WSGI middleware may use 2 threads):
So basically 4 uploads (2 processes × 2 threads) are processed concurrently. While the uploads are running, I concurrently issue an ASGI request to check the latency for processing ASGI requests; here they complete after 0.025 s in the worst case. Compare this to using the WSGI middleware in uvicorn:
No idea why the uploads complete at such irregular intervals. And, of course, using pure ASGI, the uploads are processed concurrently and the latency is much better:
BTW: if you are adventurous, you can raise the size of the uploads. Note that pull request #1329 actually made matters worse in my tests, because with many big uploads it took down my machine.
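The demo's hashing approach might look roughly like this (a sketch under assumptions; the function name and chunk size are mine, not the actual issue371_demo code):

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # assumption: any constant-memory chunk size works

def application(environ, start_response):
    # Hash the request body chunk by chunk instead of storing it,
    # so per-request memory stays constant regardless of upload size.
    digest = hashlib.sha256()
    stream = environ["wsgi.input"]
    while True:
        chunk = stream.read(CHUNK_SIZE)
        if not chunk:
            break
        digest.update(chunk)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [digest.hexdigest().encode() + b"\n"]
```

Because the app only ever holds one chunk at a time, any memory growth observed during uploads comes from the middleware buffering, not from the application itself.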
Simply replacing our previous naive byte concatenation with #1329 makes a massive difference here. It's still plausible that an implementation that streams the request body through the WSGI adapter would be preferable, but it's not obvious that's the case. Sometimes simple wins, just by virtue of being simple.
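The difference between the naive concatenation and the replacement can be illustrated generically like this (a sketch of the general technique, not the actual #1329 diff):

```python
def accumulate_naive(chunks):
    # Repeated bytes concatenation copies the whole buffer on each
    # iteration, so total work grows quadratically with body size.
    body = b""
    for chunk in chunks:
        body += chunk
    return body

def accumulate_joined(chunks):
    # Collect chunks and join once at the end: a single final copy,
    # so total work grows linearly with body size.
    parts = []
    for chunk in chunks:
        parts.append(chunk)
    return b"".join(parts)
```

Both produce the same body, but for a multi-hundred-megabyte upload arriving in small chunks the quadratic copying in the first version dominates, which would explain the "massive difference" from switching approaches.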
@tomchristie I agree that simple is preferable. But it seems you did not read my comment:
I'd consider this a security issue, given that simply running a few big uploads will take down an application (or maybe just a single pod) by collecting all incoming data in memory before even handing it over to the application code.
The goal here is to replace the … An implementation is available in #1303. If someone wants to take that over, it would be really helpful.
Or... Option 1 from #1303 (comment). |
Hi! I'm deploying a Django app with uvicorn, running on k8s. Our containers were being killed, and I've found that when users upload large files, uvicorn's memory usage increases and it slows to a crawl, eventually causing an OOM.
I'm not sure where this is happening yet. I suspected it could be related to django/channels#1251, but after a bit more digging I'm not so sure. I've tried running uvicorn in WSGI mode with channels completely removed from the Django install, and I get the same behavior: file uploads are loaded into memory (rather than streamed to disk, as they should be), and upload speeds slow to a crawl. The same app running on gunicorn works fine.
The file I'm testing with is about 470 MB.
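A reproduction along these lines can be sketched as follows (the port and upload URL are placeholder assumptions, not from the original report; substitute the app's actual endpoint):

```shell
# Create a file of roughly the reported size (470 MB of zeros).
dd if=/dev/zero of=big.bin bs=1M count=470 2>/dev/null
# POST it while watching the worker's memory with e.g. `top` or `kubectl top`.
# `|| true` keeps the script going if no server is listening.
curl -s -o /dev/null --data-binary @big.bin http://localhost:8000/upload || true
```

With the buffering behavior described above, the worker's resident memory should grow by roughly the file size per in-flight upload, which matches the OOM kills seen on k8s.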