
fix(perf): improve delete_from_sorted_set speed #837

Closed
wants to merge 2 commits into from

Conversation

mhenrixon
Owner

@mhenrixon mhenrixon commented Feb 17, 2024

Closes #835

@mhenrixon mhenrixon self-assigned this Feb 17, 2024
@mhenrixon mhenrixon force-pushed the fix/perf-delete_from_sorted_set branch 3 times, most recently from fa49dbd to d9c0115 Compare February 17, 2024 06:49
@ezekg
Contributor

ezekg commented Feb 17, 2024

For delete_from_sorted_set, using ZSCAN still has the same time complexity of O(N) as the previous usage of ZRANGE (i.e. at worst case, when adding to the end of the queue — a typical workload when scheduling jobs into the future — it iterates the entire queue), which is where the problem lies. If the sorted set were, for example, 1,000,000 items instead of 100,000, the performance test would fail even worse. The entire purpose of my PR was to use ZRANGE BYSCORE to limit that search space so that we never iterate the entire queue, because iterating the entire queue won't ever scale. Right now, your solution to use ZSCAN still has the same fundamental problem: it doesn't scale; you just pushed the work to iterate the entire queue onto Redis.
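To make the complexity argument concrete, here is a minimal pure-Ruby sketch (no Redis, not the gem's actual code) contrasting a full scan with a score-bounded lookup. The `SortedSet` struct, the member names, and both method names are illustrative; the array of `[member, score]` pairs ordered by score mimics a ZSET, `delete_by_scan` stands in for iterating with ZSCAN/ZRANGE, and `delete_by_score` stands in for the ZRANGE BYSCORE idea of narrowing the search to one score bucket first:

```ruby
# Illustrative only: a "sorted set" as an array of [member, score] pairs
# kept ordered by score, mimicking Redis' ZSET ordering.
SortedSet = Struct.new(:entries) do
  # Full scan: O(N), like iterating the whole set with ZSCAN or ZRANGE.
  def delete_by_scan(member)
    idx = entries.index { |(m, _)| m == member }
    entries.delete_at(idx) if idx
  end

  # Score-bounded: binary-search to the score, then inspect only the
  # (usually tiny) run of entries with that score -- the ZRANGE BYSCORE idea.
  def delete_by_score(member, score)
    i = entries.bsearch_index { |(_, s)| s >= score }
    return nil if i.nil?
    i += 1 while i < entries.size && entries[i][1] == score && entries[i][0] != member
    entries.delete_at(i) if i < entries.size && entries[i][1] == score && entries[i][0] == member
  end
end

set = SortedSet.new((1..10_000).map { |n| ["job-#{n}", n.to_f] })
set.delete_by_score("job-9999", 9999.0) # touches ~1 entry after the bsearch
set.delete_by_scan("job-5")             # walks from the front of the set
```

Near the end of a large schedule queue (the typical case when scheduling into the future), the scan variant walks almost all N entries, while the score-bounded variant does O(log N) work regardless of where the member sits.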

Also, your tests are passing because you aren't running performance tests in CI. It fails locally for me:

bin/rspec --require spec_helper --tag perf --fail-fast spec/performance/unique_job_on_conflict_replace_spec.rb
# => Run options: include {:focus=>true, :perf=>true}
# 
#    Randomized with seed 34288
#    
#    UniqueJobOnConflictReplace
#      when schedule queue is large
#        locks and replaces quickly (FAILED - 1)
#    
#    Failures:
#    
#      1) UniqueJobOnConflictReplace when schedule queue is large locks and replaces quickly
#         Failure/Error:
#           expect do
#             Timeout.timeout(0.1) do
#               described_class.perform_in(2_592_000, 100_000, { "type" => "extremely unique" })
#             end
#           end.not_to raise_error
#    
#           expected no Exception, got #<Timeout::Error: execution expired> with backtrace:
#             # /home/zeke/.rvm/gems/ruby-3.2.2/gems/redis-client-0.20.0/lib/redis_client/ruby_connection/buffered_io.rb:139:in `wait_readable'

@ezekg
Contributor

ezekg commented Feb 17, 2024

More thoughts: for the worst case — where the string we're searching for doesn't exist in the set — ZSCAN performs better, since it's non-blocking vs an iterative ZRANGE over the set. But when the string does exist — and we have its score — using ZRANGE BYSCORE will be orders of magnitude faster because it divides the search space.

So, I guess I have 2 major questions that need answering to fully understand the problem:

  1. Do we know if the digest (a) MAY exist or that it (b) WILL exist in the sorted set?
  2. Is the score you mentioned you're already storing always accurate?

If the answer to 1 is (a) it MAY exist, then maybe we can try to find the item by score using ZRANGE BYSCORE first, but fall back to ZSCAN for the worst case where ZRANGE BYSCORE returned no results (indicating our score is stale, or the digest doesn't exist). If the answer to 1 is (b) it WILL exist, then ZRANGE BYSCORE will always be faster, given 2.
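That fallback flow could be sketched roughly as below. This is an assumption about how the pieces might fit together, not the gem's implementation: the method name `delete_from_sorted_set` is borrowed from the PR title, the command methods (`zrangebyscore`, `zscan_each`, `zrem`) follow redis-rb's API, and the `FakeZSet` stub exists only so the sketch runs without a live Redis server:

```ruby
# Hybrid strategy sketch (illustrative): try the score-bounded fast path
# first, fall back to a scan only when the stored score turns out stale.
def delete_from_sorted_set(redis, key, digest, score)
  # Fast path: bound the lookup to the stored score (ZRANGE BYSCORE idea).
  hit = redis.zrangebyscore(key, score, score).include?(digest)
  # Slow path: score was stale or missing, so scan for the digest.
  hit ||= redis.zscan_each(key, match: digest).any?
  redis.zrem(key, digest) if hit
end

# Minimal in-memory stand-in for the three commands, just to show the flow.
# A real caller would pass a redis-rb connection instead.
class FakeZSet
  def initialize
    @scores = {}
  end

  def zadd(_key, score, member)
    @scores[member] = score
  end

  def zrangebyscore(_key, min, max)
    @scores.select { |_, s| s.between?(min, max) }.keys
  end

  # Real ZSCAN matches a glob pattern; exact match is enough for the sketch.
  def zscan_each(_key, match:)
    @scores.keys.select { |m| m == match }.each
  end

  def zrem(_key, member)
    @scores.delete(member) ? 1 : 0
  end
end

r = FakeZSet.new
r.zadd("schedule", 123.0, "digest-a")
delete_from_sorted_set(r, "schedule", "digest-a", 999.0) # stale score, scan path
```

With an accurate score the scan never runs; with a stale one the behavior degrades to today's O(N) scan instead of failing, which matches the "MAY exist" answer above.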

@mhenrixon mhenrixon force-pushed the fix/perf-delete_from_sorted_set branch from d9c0115 to a25420b Compare February 18, 2024 07:06
@mhenrixon mhenrixon closed this Feb 21, 2024
@mhenrixon mhenrixon deleted the fix/perf-delete_from_sorted_set branch February 21, 2024 18:36