Non-actionable warnings about RTT #4851
Comments
Yeah. For deployments on Heroku there is no way for us to have a guaranteed AZ. See here. Maybe logging this as INFO would still suffice for those who want it? |
Ah, thanks. Hmm, the principle is fine :) Off the top of my head, I would like to keep the warnings but configure the threshold for our case. Making it possible to override RTT_WARNING_LEVEL via ENV var and/or sidekiq.yml would be great for us. |
Perhaps I shouldn't be taking one reading and WARNing based on it. I should be taking 3-5 readings over 30 seconds before logging anything; that would minimize log noise due to transient spikes. I avoid config switches as they add code complexity. |
+1 for that, @mperham. |
I guess that would work for us, too. Today's warnings, for example, are usually minutes up to an hour apart. |
I am also on Heroku and have just started to notice these in my logs. Some of the values are very high:
If I'm not mistaken, this is a 16 second ping (not full request) from my Sidekiq server to Redis? I have opened a support request with Heroku, as this is pretty bad. Would it be reasonable to correlate consistently high values with UPDATE: I averaged the RTT values in my logs over the past 24 hours and came up with |
@PhilCoggins That's awful. If you are seeing consistently poor performance, I would explain to Heroku Support about the poor latency and ask them to fail you over to a new Redis instance. Something is terribly wrong with that one. |
I've updated master to take 5 samples and only warn if all five samples are above the threshold. |
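For readers following along, the idea of warning only when all five samples exceed the threshold can be sketched roughly as below. This is a minimal illustration, not Sidekiq's actual implementation: the `RttMonitor` class, its method names, and the 50,000 microsecond threshold used in the usage lines are assumptions made for the example.

```ruby
# Hypothetical sketch; class and method names are illustrative,
# not Sidekiq internals.
class RttMonitor
  attr_reader :readings

  def initialize(threshold_us:, sample_size: 5)
    @threshold_us = threshold_us # assumed unit: microseconds
    @sample_size = sample_size
    @readings = []
  end

  # Record one RTT reading and return true if a warning is justified.
  def record(rtt_us)
    @readings << rtt_us
    # Keep only the most recent sample_size readings.
    @readings.shift if @readings.size > @sample_size
    warn?
  end

  # Warn only when the window is full and every reading is over the threshold,
  # so a single transient spike never triggers a log line.
  def warn?
    @readings.size == @sample_size && @readings.all? { |r| r > @threshold_us }
  end
end

monitor = RttMonitor.new(threshold_us: 50_000) # illustrative threshold
[60_000, 70_000, 80_000, 90_000, 100_000].each do |rtt|
  puts "RTT consistently high: #{monitor.readings.inspect}" if monitor.record(rtt)
end
```

With this shape, one low reading inside the window is enough to suppress the warning until five consecutive high readings accumulate again.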
Thanks everyone! |
@edmorley Can you explain more? Is there some aspect that makes this high priority? I have one other thing I'm still looking into but it's possible I can release later this week. |
@mperham Just that the message in 6.2.0 can be the result of a temporary false positive, rather than a consistently high RTT, and the new sampling approach will eliminate the noise from those. Customers open tickets with "sidekiq says there is a problem with my Redis instance", and after investigation there is no issue with the Redis instance, and the ping is typically low. |
Got it, thanks for the feedback. I need to remember that with great power comes great responsibility, sorry for the support noise. I will ship 6.2.1 tomorrow.
|
6.2.1 is out. |
Thank you :-) |
❤️ |
Looks like it's working :) Thanks! |
Ruby version: 2.7.2
Rails version: 6.1
Sidekiq / Pro / Enterprise version(s): 6.2.0
sidekiq.yml:
Hello!
I'm a little worried about the recently introduced warnings about RTT: #4824
I noticed this warning showing up in our logs several times a day, usually with values around this range:
However, usually the RTT is somewhere between 800 and 5000. The thing is, we're on Heroku and have basically no control over the Redis instance except the plan size (we have a medium-range "premium-5" instance). Sidekiq jobs seem to be handled reliably and speedily, no complaints.
So this warning is currently just noise to us. Is there a way to turn them off? Or am I missing something important here?
Thanks!
(BTW, love Sidekiq and your work, thanks :))