Change scheduler to use Lua-based script #5044
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This allows us to pop jobs atomically so that we remove redundant ZREMs
from multiple processes trying to process the same job.
When we (GitLab) upgraded to Sidekiq 6, we saw that the number of ZREM operations on our Redis server increased significantly. This had a negative effect on Redis CPU.
We tracked this down to 9b75467. That change fetches 100 jobs, then tries to remove them one at a time. This change makes sense for setups with a few number of processes since the maximum number of overlapping processes would be low.
In our case, we added logging and found out that ~44% of ZREM operations were redundant. We reverted to the Sidekiq 5 scheduler, but later on improved it further by introducing this patch.
It removed the redundant ZREM operations which restored Redis CPU back to Sidekiq 5 levels and reduced enqueue latency at the same time.
In some cases, the current Sidekiq 6 scheduler may be faster. Like when there's a single process and there are a lot of scheduled jobs. But I think it is safe to assume that as instances process more scheduled jobs, there would also be more processes. So overall, I think this is still an improvement over the current scheduler.