
User distribution should happen when new workers come in #1884

Closed
tyge68 opened this issue Sep 14, 2021 · 6 comments

tyge68 commented Sep 14, 2021

Currently, when more workers are added (dynamically, based on worker CPU load on Kubernetes) after swarming has finished, the new workers are detected as ready, but there is no redistribution of users onto the new workers to take some load off the other workers.

Only a manual re-swarm can force a redistribution of the users among the workers.

Describe the bug

The new workers are ready but not used

Expected behavior

If a new worker is added, the user load should automatically be redistributed across the workers.

Actual behavior

The new worker is ready but doesn't get any users/tasks to execute, while the other workers are still using all the CPU.

Steps to reproduce

Set up some workers and start a load test; wait until the target number of users is reached. Add a new worker to the pool and check its state: it is ready, but no users are assigned to it until a "re-swarm" is manually triggered.
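
For concreteness, here is a minimal locustfile of the kind this can be reproduced with (the class name and endpoint are illustrative, not taken from the report):

```python
# minimal_locustfile.py -- illustrative reproduction file (hypothetical names)
from locust import HttpUser, constant, task

class SteadyUser(HttpUser):
    wait_time = constant(1)  # hold a steady load once the plateau is reached

    @task
    def index(self):
        self.client.get("/")
```

Run the master with `locust --master -f minimal_locustfile.py`, attach a few `locust --worker --master-host <host>` processes, and ramp up to the target user count; a worker attached after that point shows up as ready but receives no users.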

Environment

  • OS:
  • Python version: 3.9
  • Locust version: 2.2.1
  • Locust command line that you ran:
  • Locust file contents (anonymized if necessary):
tyge68 added the bug label Sep 14, 2021

cyberw commented Sep 14, 2021

I'm +/-0 on such a change. Because redistribution after ramp-up has completed would involve stopping users (which is potentially a problem, or at least a source of confusion, for some tests), I'm not sure it is a good thing.

mboutet commented Sep 14, 2021

This is what's currently implemented in #1809: I only implemented the redistribution while ramping up/down. As described by @tyge68, this means that if workers are added or removed while the load test is at a plateau (i.e. steady state), the redistribution won't take place until the next ramp-up/down.

I'm in the same boat as @tyge68, and I know some others who would like this feature to be implemented. I think that, in order to satisfy both types of users (those who want it and those who don't), this should be configurable as a CLI flag and/or environment variable, disabled by default.
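
As a sketch of what such an opt-in could look like (the flag name mirrors the "enable rebalancing option" mentioned in the commit messages below; the exact name and parser wiring are assumptions, not Locust's actual argument-parsing code):

```python
# Illustrative sketch of an opt-in rebalancing flag (assumed name and wiring).
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--enable-rebalancing",
    action="store_true",
    default=False,  # disabled by default, as proposed above
    help="Redistribute users when workers join or leave outside a ramp-up/down",
)
options = parser.parse_args()
```

With `action="store_true"`, the behaviour stays off unless explicitly requested, which addresses the concern about unexpectedly stopping users mid-test.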

From an implementation standpoint, I already have a design idea: a background greenlet, similar to locust.runners.MasterRunner.heartbeat_worker, which periodically (configurable, defaulting to around 10 s) triggers a redistribution if needed and if a ramp-up/down is not already in progress.
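
A minimal sketch of that idea, assuming a runner that exposes `worker_count`, `state`, `target_user_count`, `spawn_rate`, and `start()` roughly like Locust's MasterRunner (this sketches the proposal, not the code that was eventually merged):

```python
# Sketch of the proposed rebalancing greenlet; attribute names are assumptions.
import gevent
from locust.runners import STATE_SPAWNING

def rebalancing_loop(runner, rebalance_interval=10.0):
    """Periodically redistribute users when the worker pool has changed."""
    known_worker_count = runner.worker_count
    while True:
        gevent.sleep(rebalance_interval)
        # Skip if a ramp-up/down is already redistributing users.
        if runner.state == STATE_SPAWNING:
            continue
        if runner.worker_count != known_worker_count:
            # Re-running start() recomputes the per-worker user distribution.
            runner.start(runner.target_user_count, runner.spawn_rate)
            known_worker_count = runner.worker_count

# Started alongside the master's other background greenlets, e.g.:
# gevent.spawn(rebalancing_loop, master_runner)
```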

tyge68 added a commit to tyge68/locust that referenced this issue Sep 15, 2021

tyge68 commented Sep 15, 2021

@cyberw @mboutet Note: I have created a PR. While checking the code, I saw there were already some tests related to this issue. Retesting locally, I noticed that it rebalances once for the first added worker (after ramp-up), but any worker added after that is "ignored" and so stays in the "ready" state.

The PR is based on what @mboutet suggested, which would definitely resolve it, but another solution might also fix it in that part of the code (1).

It is also interesting that a test on this topic already exists (2) in test_runners.py. Eventually, it would be nice to check that adding a few more workers a few milliseconds later still triggers a rebalance, as I observed.

(1) https://github.com/locustio/locust/blob/master/locust/runners.py#L861-L875
(2) https://github.com/locustio/locust/blob/master/locust/test/test_runners.py#L1771

tyge68 added a commit to tyge68/locust that referenced this issue Sep 15, 2021

mboutet commented Sep 15, 2021

Right now, the rebalancing only takes place during spawning.

Here's a simple diagram showing the current behaviour:
[diagram: locust-rebalancing]

> it rebalances once for the first added worker (after ramp-up), but any worker added after that is "ignored" and so stays in the "ready" state.

Normally, the add_worker and remove_worker "hooks" can handle multiple workers joining or going missing. This case is also unit tested in:

def test_add_two_workers_between_two_ramp_ups(self):

Of course, because the rebalancing only takes place during spawning, it might give the impression that it does not work otherwise. By implementing #1886 (comment), the rebalancing should take place immediately.

tyge68 added further commits to tyge68/locust that referenced this issue Sep 15, 2021, with the following squashed commit messages:

1. locustio#1884 User distribution should happen when new workers comes in
2. locustio#1884 reformatting
3. locustio#1884 Make rebalance interval configurable
4. locustio#1884 Adding integration test for rebalanced
5. locustio#1884 revert change
6. locustio#1886 Simplify solution
7. locustio#1886 adding missing dependency
8. locustio#1884 Adding back enable rebalancing option
9. locustio#1884 adding extra comment the test
10. locustio#1884 renaming as suggested
11. locustio#1884 update help description
12. locustio#1884 handling of missing workers to rebalance

tyge68 added a commit to tyge68/locust that referenced this issue Sep 15, 2021
tyge68 added a commit to tyge68/locust that referenced this issue Sep 15, 2021
tyge68 added a commit to tyge68/locust that referenced this issue Sep 20, 2021
cyberw added a commit that referenced this issue Sep 20, 2021
#1884 User distribution should happen when new workers comes in

cyberw commented Nov 14, 2021

This is fixed now, right?

mboutet commented Nov 15, 2021

I think so 😀

cyberw closed this as completed Nov 15, 2021