
Memoize process count in Poller #4882

Closed · jonhyman wants to merge 1 commit
Conversation

jonhyman (Contributor)
Avoids scanning the processes set and trying to clean up on every Poller#wait call.
jonhyman added a commit to braze-inc/sidekiq that referenced this pull request Apr 22, 2021
Poller#random_poll_interval currently grabs a process count, which SSCANs 'processes' and then HGETs all the results. This can be prohibitive. sidekiq#4882 may get merged to fix this in future versions.
mperham (Collaborator) commented Apr 22, 2021

I’m uncomfortable with either calling every time or only calling it once. The process count might change over time. Should we cache this for five minutes or some reasonable length?
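A time-based cache like that might look something like the following. This is a minimal sketch, not Sidekiq's actual code; `CachedProcessCount` and the fetcher block are hypothetical names, and the block stands in for the Redis scan of the 'processes' set.

```ruby
# Sketch: cache the expensive process count behind a five-minute TTL.
# The fetcher block is a hypothetical stand-in for the real Redis scan.
class CachedProcessCount
  TTL = 5 * 60 # seconds; refresh at most every five minutes

  def initialize(&fetcher)
    @fetcher = fetcher
    @cached = nil
    @cached_at = nil
  end

  def count
    now = Time.now.to_f
    if @cached.nil? || now - @cached_at > TTL
      @cached = @fetcher.call # the expensive SSCAN/HGET work happens here
      @cached_at = now
    end
    @cached
  end
end

fetches = 0
counter = CachedProcessCount.new { fetches += 1; 3 }
5.times { counter.count } # five reads within the TTL hit the fetcher only once
```

The tradeoff is that a newly started or stopped process is not reflected for up to five minutes, which matters only for the poll-interval averaging, not for correctness.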

jonhyman (Contributor, Author) commented Apr 22, 2021

I don't know the history of why you added the special case. To me it seems arbitrary -- why 10 processes, why not 20? -- given it's meant to help smooth out random numbers. Processes can run for a long time, so even with <10 processes running for 24h that's a lot of wait() calls, and I'd assume a case where you end up with N*M consistently messing up your averages is probably uncommon because of the rand and process jitter. That is, while you do provide an example of 2 processes sleeping, you will get some jitter from the fact that MRI Ruby has a GIL and your jobs are going to interfere with the times your Poller runs anyway. Over a short period of time the jitter will separate the processes so they won't start sleeping at the same time. I've written a quick example below.

Should we cache this for five minutes or some reasonable length?

Overall, if that's what you're most comfortable with, 5m would be fine by me and I'm happy to change the code.

# Fibonacci will simulate a job doing work
def fib(n)
  if n == 0 || n == 1
    return n
  end
  return fib(n - 1) + fib(n - 2)
end

# This simulates the poller, it sleeps for 5 seconds and we measure how long it takes between runs.
# Because there's work happening on another thread, the time difference between runs will vary slightly from 5 seconds.
t1 = Thread.new do
  while true
    if @et
      # prints the previous timestamp minus now, hence the negative values below
      puts((@et - Time.now).round(2))
    end
    @et = Time.now
    sleep(5)
  end
end

t2 = Thread.new do
  while true
    fib(Random.rand(60))
    sleep(2)
  end
end

YMMV, but here's the output from me running it for a few seconds. So if you have two processes here and they're doing some CPU-based work, the wait calls will diverge from each other within like a minute, and then you should get closer to your averages. Even if they both sleep for 8 seconds, they'll probably be offset by enough that you still hit your 5-second average on most runs over a long time period.

-5.0
-5.11
-5.0
-5.0
-5.0
-5.11
-5.11
-5.13
-5.12
-5.11
-5.21
-5.2
-5.12
-5.06
-5.1
-5.01
-5.02

mperham (Collaborator) commented Apr 22, 2021

What do you think about caching the value for 10 calls? That guarantees a 90% reduction in calls while still checking regularly.
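Refreshing on every Nth call instead of on a timer could be sketched like this. Again, the names (`ProcessCountMemo`, the fetcher block) are hypothetical and this is not the actual commit; the block stands in for the Redis scan.

```ruby
# Sketch: recompute the process count only on every 10th call,
# reusing the memoized value in between.
class ProcessCountMemo
  REFRESH_EVERY = 10

  def initialize(&fetcher)
    @fetcher = fetcher # hypothetical stand-in for the Redis 'processes' scan
    @calls = 0
    @count = nil
  end

  def process_count
    # Refresh on the first call and every 10th call thereafter.
    @count = @fetcher.call if (@calls % REFRESH_EVERY).zero?
    @calls += 1
    @count
  end
end

scans = 0
memo = ProcessCountMemo.new { scans += 1; 2 }
100.times { memo.process_count } # 100 calls trigger only 10 scans
```

Unlike the TTL variant, this bounds the Redis cost as a fixed fraction of poll cycles regardless of how the poll interval is configured.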

mperham (Collaborator) commented May 3, 2021

Ping?

@mperham mperham closed this in ed73ed0 May 3, 2021
jonhyman (Contributor, Author) commented Aug 17, 2021

Wow, sorry, I totally missed the replies; I went on PTO shortly after this. Apologies, but I think you ended up in a good spot with your commit reducing the calls.
