Did you ever get an answer?
-
I am trying to run a one-off task for a big migration using the `replicated-job` mode, but Swarm keeps scheduling the jobs on a single node. I have to run 500 jobs and I am running them in batches of 30; since I have 13 nodes, I was expecting ~2 jobs per node.
Because of this, one node has a load average of 30, while the others are around 0.5. This is how it looks in Swarmpit:
Am I correct to assume that Swarm should try to schedule these jobs so that resources are used evenly across all nodes? That is indeed what happens when services are created as `replicated`. I am running these jobs with only `"node.role == worker"` as a constraint, and with replicas, concurrency and maxreplicas all set to 1.

Also, a fix was introduced recently in #42741; maybe it is related?
Please let me know any other information that might be useful.
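
To make the expectation above concrete, here is a minimal sketch of the arithmetic for one batch, assuming an idealized even spread of 30 jobs over 13 nodes. `even_spread` is a hypothetical helper written only for illustration; it does not model Swarm's actual scheduler, which also takes constraints and resources into account.

```python
def even_spread(jobs: int, nodes: int) -> list[int]:
    """Return the number of jobs per node for an idealized even spread.

    Hypothetical helper for illustration only; not part of Docker or Swarm.
    """
    base, extra = divmod(jobs, nodes)
    # The first `extra` nodes take one job more than the rest.
    return [base + 1 if i < extra else base for i in range(nodes)]


spread = even_spread(30, 13)  # batch size and node count from the post
print(spread)
print("max per node:", max(spread), "min per node:", min(spread))
```

Under that assumption no node would run more than 3 jobs at once, which is far from the 30 jobs I am seeing on a single node.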