
User distribution should happen when new workers come in #1884

Closed
tyge68 opened this issue Sep 14, 2021 · 6 comments

tyge68 commented Sep 14, 2021

Currently, when more workers are added (dynamically, based on worker CPU load on Kubernetes) after swarming has finished, the new workers are detected as ready, but there is no redistribution of users onto the new workers to take some load off the other workers.

Only a manual re-swarm can force a redistribution of the users among the workers.

Describe the bug

The new workers are ready but not used

Expected behavior

If a new worker is added, the user load should automatically be redistributed across the workers.

Actual behavior

The new worker is ready but doesn't get any users/tasks to execute, while the other workers are still using all the CPU.

Steps to reproduce

Set up some workers and start a load test; wait until the target number of users is reached. Add a new worker to the pool and check its state: it is ready, but no users are assigned to it until a "re-swarm" is manually triggered.
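
For concreteness, here is a minimal locustfile of the kind this can be reproduced with (the class name and endpoint are illustrative, not taken from the report):

```python
# minimal_locustfile.py -- illustrative reproduction file (hypothetical names)
from locust import HttpUser, constant, task

class SteadyUser(HttpUser):
    wait_time = constant(1)  # hold a steady load once the plateau is reached

    @task
    def index(self):
        self.client.get("/")
```

Run the master with `locust --master -f minimal_locustfile.py`, attach a few `locust --worker --master-host <host>` processes, and ramp up to the target user count; a worker attached after that point shows up as ready but receives no users.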

Environment

  • OS:
  • Python version: 3.9
  • Locust version: 2.2.1
  • Locust command line that you ran:
  • Locust file contents (anonymized if necessary):
tyge68 added the bug label Sep 14, 2021

cyberw commented Sep 14, 2021

I'm +/-0 on such a change. Because redistribution after ramp-up has completed would involve stopping users (which is potentially a problem, or at least a source of confusion, for some tests), I'm not sure it is a good thing.

mboutet commented Sep 14, 2021

This is what's currently implemented in #1809: I only implemented the redistribution while ramping up/down. As described by @tyge68, this means that if workers are added or removed while the load test is at a plateau (i.e. steady state), the redistribution won't take place until the next ramp-up/down.

I'm in the same boat as @tyge68, and I know some others who would like this feature to be implemented. I think that, in order to satisfy both types of users (those who want it and those who don't), this should be configurable as a CLI flag and/or environment variable, disabled by default.
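
As a sketch of what such an opt-in could look like (the flag name mirrors the "enable rebalancing option" mentioned in the commit messages below; the exact name and parser wiring are assumptions, not Locust's actual argument-parsing code):

```python
# Illustrative sketch of an opt-in rebalancing flag (assumed name and wiring).
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--enable-rebalancing",
    action="store_true",
    default=False,  # disabled by default, as proposed above
    help="Redistribute users when workers join or leave outside a ramp-up/down",
)
options = parser.parse_args()
```

With `action="store_true"`, the behaviour stays off unless explicitly requested, which addresses the concern about unexpectedly stopping users mid-test.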

From an implementation standpoint, I already have a design idea: a background greenlet, similar to locust.runners.MasterRunner.heartbeat_worker, which periodically (configurable, defaulting to around 10 s) triggers a redistribution if needed and if a ramp-up/down is not already in progress.
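
A minimal sketch of that idea, assuming a runner that exposes `worker_count`, `state`, `target_user_count`, `spawn_rate`, and `start()` roughly like Locust's MasterRunner (this sketches the proposal, not the code that was eventually merged):

```python
# Sketch of the proposed rebalancing greenlet; attribute names are assumptions.
import gevent
from locust.runners import STATE_SPAWNING

def rebalancing_loop(runner, rebalance_interval=10.0):
    """Periodically redistribute users when the worker pool has changed."""
    known_worker_count = runner.worker_count
    while True:
        gevent.sleep(rebalance_interval)
        # Skip if a ramp-up/down is already redistributing users.
        if runner.state == STATE_SPAWNING:
            continue
        if runner.worker_count != known_worker_count:
            # Re-running start() recomputes the per-worker user distribution.
            runner.start(runner.target_user_count, runner.spawn_rate)
            known_worker_count = runner.worker_count

# Started alongside the master's other background greenlets, e.g.:
# gevent.spawn(rebalancing_loop, master_runner)
```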

tyge68 added a commit to tyge68/locust that referenced this issue Sep 15, 2021

tyge68 commented Sep 15, 2021

@cyberw @mboutet Note: I have created a PR. While checking the code, I saw there were already some tests related to this issue. Retesting locally, I noticed that it rebalances once for the first added worker (after ramp-up), but any worker added after that is "ignored" and so stays in the "ready" state.

The PR is based on what @mboutet suggested, which would definitely resolve it, but another solution might also fix it in that part of the code (1).

It is also interesting that a test on this topic already exists (2) in test_runners.py. Eventually, it would be nice to check that adding a few more workers a few milliseconds later still triggers a rebalance, as I observed.

(1) https://github.com/locustio/locust/blob/master/locust/runners.py#L861-L875
(2) https://github.com/locustio/locust/blob/master/locust/test/test_runners.py#L1771

tyge68 added a commit to tyge68/locust that referenced this issue Sep 15, 2021

mboutet commented Sep 15, 2021

Right now, the rebalancing only takes place during spawning.

Here's a simple diagram showing the current behaviour:
[diagram: locust-rebalancing]

> it rebalances once for the first added worker (after ramp-up), but any worker added after that is "ignored" and so stays in the "ready" state.

Normally, the add_worker and remove_worker "hooks" can handle multiple workers joining or going missing. This case is also unit tested in:

def test_add_two_workers_between_two_ramp_ups(self):

Of course, because the rebalancing only takes place during spawning, it might give the impression that it does not work otherwise. By implementing #1886 (comment), the rebalancing should take place immediately.

tyge68 added further commits to tyge68/locust that referenced this issue Sep 15, 2021, with the following squashed commit messages:

1. locustio#1884 User distribution should happen when new workers comes in
2. locustio#1884 reformatting
3. locustio#1884 Make rebalance interval configurable
4. locustio#1884 Adding integration test for rebalanced
5. locustio#1884 revert change
6. locustio#1886 Simplify solution
7. locustio#1886 adding missing dependency
8. locustio#1884 Adding back enable rebalancing option
9. locustio#1884 adding extra comment the test
10. locustio#1884 renaming as suggested
11. locustio#1884 update help description
12. locustio#1884 handling of missing workers to rebalance

tyge68 added a commit to tyge68/locust that referenced this issue Sep 15, 2021
tyge68 added a commit to tyge68/locust that referenced this issue Sep 15, 2021
tyge68 added a commit to tyge68/locust that referenced this issue Sep 20, 2021
cyberw added a commit that referenced this issue Sep 20, 2021
#1884 User distribution should happen when new workers comes in

cyberw commented Nov 14, 2021

This is fixed now, right?

mboutet commented Nov 15, 2021

I think so 😀

cyberw closed this as completed Nov 15, 2021