
Balancing long-running request between multiple workers / Large file uploads #2734

Closed
Flauschbaellchen opened this issue Jan 31, 2022 · 2 comments

Comments

@Flauschbaellchen commented Jan 31, 2022

Hi,

I'm trying to upload large files to my Django application.
The files are typically 500 MB to 2 GB.
While one client is uploading a file, other clients are blocked (their requests hang until they time out or are eventually processed).

I'm running gunicorn with --worker-class uvicorn.workers.UvicornWorker.
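For reference, the invocation looks roughly like this (the application path myproject.asgi:application is illustrative, not my exact module):

```
gunicorn myproject.asgi:application --workers 2 --worker-class uvicorn.workers.UvicornWorker
```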

  1. First scenario: 1 worker
    When starting gunicorn with a single worker (--workers 1), the following happens:
    Client A starts the upload.
    The file itself does not reach the upload handler immediately (which is a separate issue in uvicorn), but as soon as the upload handler is called, other requests, e.g. from client B, hang.
    This seems expected, as there is only one worker to deal with incoming requests.

  2. Second scenario: 2 workers (or more)
    When started with two or more workers (--workers 2), the same thing happens, but requests arriving while the file is being processed are distributed across the workers, either round-robin or sequentially.
    Requests that hit the first worker (which is processing the file) hang until the prior request has finished.
    Requests that hit the second worker are processed immediately.

As a result, other clients get timeout errors depending on which worker happens to serve their request.

My question:
Is there a way to optimize this behavior? (Or is this expected, and am I misunderstanding the issue?)

The first worker should "know" that it is already blocked by one request and cannot serve another one in parallel.
All other requests should then be put onto the second worker's queue.
Only if client A and client B upload files in parallel (or issue other long-running requests) should a third client's request be queued and processed by whichever worker finishes its task first.

Is there another way to handle large file uploads? Running with X workers only ever allows X uploads to be processed simultaneously, and setting WEB_CONCURRENCY or 2 * CPU_COUNT merely raises the number of parallel uploads, which does not seem desirable...
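To make the setup concrete, here is a minimal sketch of the kind of handler involved, written as a raw ASGI app rather than my actual Django code (names are illustrative). It streams the request body to a temporary file, awaiting each chunk; as long as the handler yields back to the event loop per chunk, a single worker can at least interleave other requests between chunks:

```python
import tempfile

async def app(scope, receive, send):
    # Raw ASGI HTTP handler: stream the upload to disk chunk by chunk.
    assert scope["type"] == "http"
    with tempfile.NamedTemporaryFile(delete=False) as out:
        more_body = True
        while more_body:
            message = await receive()  # yields to the event loop per chunk
            out.write(message.get("body", b""))  # short blocking disk write per chunk
            more_body = message.get("more_body", False)
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"upload stored\n"})
```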

Thanks for your time and help!

@benoitc (Owner) commented May 7, 2023

Late answer, but if you are using uvicorn, ensure the work is split off into a queue so that it won't lock the event loop behind it. Not sure what's the best way to do it, though :)
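A minimal sketch of that idea, assuming an asyncio-based handler and Python 3.9+ (process_upload is a hypothetical blocking function, not part of gunicorn or uvicorn):

```python
import asyncio

def process_upload(path):
    # Hypothetical CPU/disk-heavy work on the uploaded file,
    # e.g. checksumming or transcoding. Runs synchronously.
    ...

async def handle_upload(path):
    # Push the blocking work onto the default thread pool so the
    # event loop stays free to serve other requests on this worker.
    # (On Python < 3.9, use loop.run_in_executor(None, process_upload, path).)
    await asyncio.to_thread(process_upload, path)
```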

@benoitc (Owner) commented May 7, 2023

Closing as a stalled issue.

@benoitc closed this as not planned on May 7, 2023