Workers idle even though there's queued work #4501
Comments
I just tried updating to the latest version (2021.1.1) and I think it's even worse now? The worker now prints this out (I added timestamps to the prints):
You can see that it waits for the first job to finish completely before it picks up the second one, despite there being a free worker (i.e. the one that called secede).
@jrbourbeau could you migrate this ticket to dask/distributed?
@JohnEmhoff -- thanks for reporting this! I thought this might've been a problem I introduced when adding the […]. Do you have a sense of when this last worked as expected?
@gforsyth We were on dask 2.18.0 for a while and I thought everything was fine there, but it looks like I'm able to reproduce it on that version too? Maybe we didn't notice, or maybe I'm not properly installing the older versions.
An observation -- if the shorter of the two jobs is started first, then the "instantaneous job" fires as soon as the first job finishes (I've changed the times to 20 and 10 seconds, respectively):
I haven't gone very deep on secession and rejoining, but it looks like the scheduler is assigning the task to the first worker it has available, or that it thinks is available.
Thanks for looking into this -- do you think there's a reasonable workaround while this exists?
Haven't looked deeply into this yet, and there are differences due to seceding/long-running jobs, but a similar issue was reported in #4471.
I know your example is a simplified version of your actual workflow, but since this flow currently leaves workers idle, I would try adding more threads to each proc. That might increase GIL contention, but it also might make things run more smoothly.
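The over-provisioning suggestion above amounts to raising the worker's thread count at launch; for example (a sketch using the 2021-era `dask-worker` CLI flags; the address and values are illustrative):

```shell
# Four threads per proc instead of one, so a seceded slot is less likely
# to leave a whole proc idle (address and values are illustrative).
dask-worker tcp://127.0.0.1:8786 --nprocs 2 --nthreads 4
```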
Any more questions here @JohnEmhoff? 🙂
@jakirkham not especially, thanks for asking! It seems there isn't much of a work-around though beyond over-provisioning, which isn't great at scale. Is this considered a bug? I feel like this is a serious gotcha in scheduling. |
I am the author of the related issue, and I'm also forced to over-provision. Is there any direction on where to look for issues? I'm spending some time this week learning the scheduler so as to look into this and other issues I'm having.
What happened: We have a large-ish cluster (about 100 nodes), and recently when we submit a lot of jobs (in the thousands) we notice that about 60% of the cluster is idle. Generally, a job will spawn about 20 downstream sub-jobs; these are submitted from inside the worker, which calls secede / rejoin while it waits on those jobs. I'm fairly certain this use of secede / rejoin is related, as you can see in the reproduction below.
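The fan-out pattern described here can be sketched with distributed's secede/rejoin API roughly as follows (a minimal illustration, not the actual production code; `parent_job` and `subjob` are invented names):

```python
from distributed import get_client, secede, rejoin

def subjob(i):
    # Stand-in for one of the ~20 downstream sub-jobs.
    return i * 2

def parent_job(n=20):
    client = get_client()                  # client available inside the worker
    futures = client.map(subjob, range(n))
    secede()                               # free this worker slot while we wait
    results = client.gather(futures)
    rejoin()                               # re-acquire a slot before continuing
    return sum(results)
```

`secede()` tells the scheduler this thread no longer occupies a task slot, so another queued task should be able to run on the worker while `gather` blocks.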
What you expected to happen: The cluster uses all available resources
Minimal Complete Verifiable Example:
This requires running a scheduler, a worker with two procs, and then submitting jobs. Bear with me while I show all the pieces:
This is how I create the environment:
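A minimal sketch of such an environment (the address and port are illustrative; flag names follow the 2021-era `dask-worker` CLI, where `--nprocs` controlled the number of worker processes):

```shell
# Start a scheduler, then one worker made of two single-threaded procs.
dask-scheduler --port 8786 &
dask-worker tcp://127.0.0.1:8786 --nprocs 2 --nthreads 1
```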
...and this is the python file with the jobs and such. You can see the two operations are a timed job, which secedes while it waits on a sub-job and then rejoins, and an instantaneous job:
Finally, this is the script that will submit jobs that will show the issue we're running into:
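A rough, self-contained sketch of the reproduction described (function names and job bodies are assumptions; the sleep lengths are parameters so the sketch can be exercised quickly):

```python
import time
from distributed import Client, get_client, secede, rejoin

def job(seconds):
    """Timed job: submit a sleep sub-job, secede while waiting, then rejoin."""
    client = get_client()
    sub = client.submit(time.sleep, seconds, pure=False)
    secede()                                   # free this worker slot while waiting
    sub.result()
    print(f"Job done ({seconds}s); rejoining")
    rejoin()                                   # wait for a worker slot again
    return seconds

def instantaneous():
    """No-op job used to show when a free slot actually gets used."""
    return time.time()

def reproduce(client, long_s=120, short_s=60):
    # Submit a long job, a shorter job, then an instantaneous job.
    long_f = client.submit(job, long_s, pure=False)
    short_f = client.submit(job, short_s, pure=False)
    instant_f = client.submit(instantaneous, pure=False)
    return client.gather([long_f, short_f, instant_f])
```

Running `reproduce(Client("tcp://127.0.0.1:8786"))` against a scheduler with two single-threaded worker procs should show the reported symptom: `instantaneous` does not run when the 60 s job's slot frees up, but only after the 120 s job completes.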
This script submits a long job, a shorter job, and then just an instantaneous job to show that there's a scheduling problem. When the jobs are submitted the worker will print out:
The problem is on the line that says `Job done (60s); rejoining`. At this point there's one idle worker that could be running the instantaneous job, but it doesn't -- instead it waits on the 120s job. After the 120s job is done (about a minute later), that instantaneous job finally runs. Hence the worker is idle for about a minute.

Anything else we need to know?:
Sorry for the length; I don't think I can cut it down any more. If the problem isn't clear, let me know and I'll see if I can explain better.
Environment: