We recently upgraded to Sidekiq 7 and encountered ConnectionPool::TimeoutError exceptions in our workers after setting up a single-threaded capsule along with Sidekiq's rate limiting. For now, we've reverted to not using capsules, but I'd really like to get this working.
Tracking down the cause wasn't easy because the errors occurred sporadically and didn't seem to follow a pattern. However, after a lot of digging in the depths of Sidekiq's codebase, I finally managed to create a small setup (without Rails, for simplicity) to reproduce the exception:
I'm starting Sidekiq with sidekiq -r ./sidekiq.rb. Inside IRB, I can now schedule WorkerB which runs in the "single" capsule and then WorkerA which runs in the "default" capsule: (the order of execution is important)
$ irb -r ./sidekiq.rb
irb(main):001:0> WorkerB.perform_async('hello from worker b')
=> "c5360245000d1e1df50e9f79"
irb(main):002:0> WorkerA.perform_async('hello from worker a')
=> "2bf003b6f8214c6dffb90ff5"
The Sidekiq log shows that WorkerB runs fine but WorkerA raises the aforementioned exception, along with a long stack trace pointing to sidekiq-ent/limiter/concurrent.rb.
I could also get this exception when running WorkerA before WorkerB, but not as consistently (maybe a timing issue with the connection pool). In this case, WorkerB fails and the error message is "Waited 1 sec, 0/5 available".
I further tried to debug the problem and found that the redis_pool inside the Sidekiq::Limiter::Concurrent class was set to a size of 1 for both workers. Finally, I found the redis_pool method in Sidekiq::Limiter which looks like this:
If I understand the code correctly, this memoizes the Redis pool from Thread.current[:sidekiq_capsule], i.e. the pool of whichever capsule happens to use the limiter first. When removing the @redis ||= part, the above example works fine, i.e.
def self.redis_pool
  Sidekiq.redis_pool
end
With this change applied, the redis_pool.size inside Sidekiq::Limiter::Concurrent changed from 1 to 5 and back depending on the worker / queue (as it should?) and no exception occurred anymore.
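To illustrate the memoization effect in isolation, here is a minimal sketch that runs without Sidekiq. Everything in it is a stand-in I made up for the demonstration: FakePool, FakeSidekiq, the :fake_capsule thread key and both limiter modules are hypothetical and only mimic the behaviour described above (a per-capsule pool looked up via the current thread, with and without `@redis ||=`). It is not the actual Sidekiq Enterprise code.

```ruby
# Stand-in for a connection pool; only the size matters for this demo.
FakePool = Struct.new(:name, :size)

# Hypothetical stand-in for Sidekiq 7's per-capsule pool lookup.
module FakeSidekiq
  POOLS = {
    "default" => FakePool.new("default", 5),
    "single"  => FakePool.new("single", 1)
  }.freeze

  # Returns the pool that belongs to the capsule of the calling thread.
  def self.redis_pool
    POOLS.fetch(Thread.current[:fake_capsule])
  end
end

# Mimics the memoizing version: the first caller's pool sticks forever.
module MemoizedLimiter
  def self.redis_pool
    @redis ||= FakeSidekiq.redis_pool
  end
end

# Mimics the suggested change: look the pool up on every call.
module UnmemoizedLimiter
  def self.redis_pool
    FakeSidekiq.redis_pool
  end
end

# Runs the lookup on a fresh thread that pretends to belong to `capsule`.
def pool_size_seen_in(capsule, limiter)
  Thread.new do
    Thread.current[:fake_capsule] = capsule
    limiter.redis_pool.size
  end.value
end

# Same order as in the report: "single" first, then "default".
memoized = [
  pool_size_seen_in("single", MemoizedLimiter),
  pool_size_seen_in("default", MemoizedLimiter)
]
unmemoized = [
  pool_size_seen_in("single", UnmemoizedLimiter),
  pool_size_seen_in("default", UnmemoizedLimiter)
]

puts "memoized:   #{memoized.inspect}"   # [1, 1] -> "default" got the size-1 pool
puts "unmemoized: #{unmemoized.inspect}" # [1, 5] -> each capsule sees its own pool
```

Swapping the order of the two memoized calls yields [5, 5] instead, which matches the second failure mode where the pool of the "default" capsule leaks into "single".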
Although this fix "works", I'm not sure if this is the correct way to solve the issue. Besides, I didn't really understand why running a worker inside a wrong capsule / config would immediately cause a ConnectionPool::TimeoutError in the first place. Is the connection pool size synchronized with the number of Sidekiq threads? (maybe you can shed some light on the way connection pools are used in Sidekiq, just out of interest)
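My working assumption for the question above is that a pool which is smaller than the number of threads sharing it will time out as soon as the connections are held longer than the pool's timeout. The sketch below shows just that effect with a hand-written TinyPool; it is a made-up class for illustration, not the connection_pool gem and not what Sidekiq does internally, and I haven't confirmed that this is the exact path taken inside the limiter.

```ruby
# Minimal blocking pool with a checkout timeout (hypothetical, for illustration).
class TinyPool
  class TimeoutError < StandardError; end

  def initialize(size:, timeout:)
    @size = size
    @timeout = timeout
    @available = Array.new(size) { |i| "conn-#{i}" }
    @mutex = Mutex.new
    @returned = ConditionVariable.new
  end

  # Checks a connection out for the duration of the block.
  def with
    conn = checkout
    begin
      yield conn
    ensure
      checkin(conn)
    end
  end

  private

  # Waits up to @timeout seconds for a free connection, then gives up.
  def checkout
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + @timeout
    @mutex.synchronize do
      while @available.empty?
        remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
        if remaining <= 0
          raise TimeoutError, "Waited #{@timeout} sec, #{@available.size}/#{@size} available"
        end
        @returned.wait(@mutex, remaining)
      end
      @available.pop
    end
  end

  def checkin(conn)
    @mutex.synchronize do
      @available.push(conn)
      @returned.signal
    end
  end
end

# Starts `workers` threads that each hold a connection for `hold` seconds
# and reports how many succeeded and how many timed out.
def run_workers(pool, workers:, hold:)
  threads = Array.new(workers) do
    Thread.new do
      pool.with { sleep hold }
      :ok
    rescue TinyPool::TimeoutError
      :timeout
    end
  end
  threads.map(&:value).tally
end

# Five threads sharing a size-1 pool: one gets the connection, four time out.
undersized = run_workers(TinyPool.new(size: 1, timeout: 0.1), workers: 5, hold: 0.3)
# Five threads with a size-5 pool: nobody has to wait.
matched = run_workers(TinyPool.new(size: 5, timeout: 0.1), workers: 5, hold: 0.3)

puts "size 1: #{undersized.inspect}"
puts "size 5: #{matched.inspect}"
```

If the pool sizes I observed (1 and 5) are indeed tied to the thread count of each capsule, this would explain why handing the size-1 pool to the five threads of the "default" capsule fails so quickly.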
I hope you can fix this problem soon with the given information.
sos4nt changed the title from "ConnectionPool::TimeoutError exception when using limiter inside capsule (with fix suggestion)" to "ConnectionPool::TimeoutError when using limiter inside capsule (with fix suggestion)" on Jan 27, 2023
I think the proper fix is removing the = sign from ||=. With Sidekiq 6.x you can either provide a custom connection pool for limiter usage or you can reuse Sidekiq's global pool. Because that global pool is now per-capsule in Sidekiq 7.x, we can't memoize the pool anymore.
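A rough sketch of what dropping the = sign would mean, using stand-ins only: StubPool, StubSidekiq, StubLimiter, the :stub_capsule thread key and the `redis=` writer are all hypothetical names for this example and not the real Sidekiq Enterprise API. The point is that `@redis || ...` still honours an explicitly configured pool but no longer caches the per-capsule one.

```ruby
# Stand-in for a connection pool; only the size matters here.
StubPool = Struct.new(:name, :size)

# Hypothetical stand-in for the per-capsule pool lookup in Sidekiq 7.x.
module StubSidekiq
  POOLS = {
    "default" => StubPool.new("default", 5),
    "single"  => StubPool.new("single", 1)
  }.freeze

  def self.redis_pool
    POOLS.fetch(Thread.current[:stub_capsule])
  end
end

module StubLimiter
  class << self
    # Hypothetical hook for supplying a dedicated limiter pool.
    attr_writer :redis

    # `||` instead of `||=`: use the custom pool if one was set,
    # otherwise ask for the current capsule's pool on every call.
    def redis_pool
      @redis || StubSidekiq.redis_pool
    end
  end
end

# Looks up the limiter pool size from a thread belonging to `capsule`.
def limiter_pool_size_in(capsule)
  Thread.new do
    Thread.current[:stub_capsule] = capsule
    StubLimiter.redis_pool.size
  end.value
end

without_custom = [limiter_pool_size_in("single"), limiter_pool_size_in("default")]

StubLimiter.redis = StubPool.new("limiter", 10)
with_custom = [limiter_pool_size_in("single"), limiter_pool_size_in("default")]

puts "no custom pool: #{without_custom.inspect}" # [1, 5]
puts "custom pool:    #{with_custom.inspect}"    # [10, 10]
```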
Ruby version: 3.0.4
Rails version: n/a
Sidekiq / Pro / Enterprise version(s): 7.0.3
Possibly related to #5702, #5685, and #5684.