
Several Potential Enhancements #314

Closed
jakemack opened this issue Jul 25, 2012 · 4 comments

@jakemack
Contributor

I'm in the process of evaluating Sidekiq as a replacement for Resque in our production app. Nightly, we process ~1.3 million jobs on Heroku, so cutting down on the memory footprint will greatly reduce our costs for dynos. It looks promising so far, but there are a few niceties of Resque that seem to be missing. Despite that, I think we're going to try switching over.

That being said, I wanted to suggest a few new features that would help us greatly (and hopefully will be useful to other people as well). With these suggestions come the offer to implement most of them and submit them as pull requests. I just wanted to make sure my time isn't wasted developing something you didn't want to incorporate into the gem. Also, since you know the internals better than I do, I thought you might have suggestions on the architecture/implementation as I get further along in developing them.

The first thing should be very simple: I'd like to add a polling feature to the Sidekiq UI, much like Resque's, to avoid having to refresh every couple of seconds while monitoring queues and workers.

The second feature comes directly from the volume of jobs we process and how we queue them up. We currently enqueue jobs one at a time, as that is the only interface Resque offers, and enqueuing over a million jobs sequentially takes a while. I'd like to add a method to Sidekiq::Worker along the lines of Sidekiq::Worker.perform_many_async(*args), where you pass in an array of parameter sets for the worker's perform method. For example, if TonsOfWork#perform(number) is your perform method, you'd call TonsOfWork.perform_many_async([1, 2, 3, 4]), which would insert four jobs into Redis, each with the class TonsOfWork, one for each element of the array. If your perform method takes multiple parameters, you would pass an array of arrays (TonsOfWork.perform_many_async([[1, 'work'], [2, 'play'], [3, 'sleep'], [4, 'eat']])). Redis' RPUSH command accepts multiple values in one call as of 2.4, so it should be a relatively simple addition. I haven't checked which Redis versions the redis gem supports, so we may need a version check for this path that falls back to adding the jobs in separate calls.
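To make the idea concrete, here's a rough sketch of what I have in mind. All names here (BulkEnqueue, build_payloads, the payload fields) are hypothetical, not Sidekiq's actual internals; the point is the single RPUSH round trip:

```ruby
require 'json'
require 'securerandom'

# Hypothetical sketch of the proposed bulk enqueue. Redis 2.4+ lets
# RPUSH take many values in one command, so N jobs cost one round trip.
module BulkEnqueue
  # Build one Sidekiq-style JSON payload per entry in args_list.
  # A bare value becomes a one-element args array; an array is used as-is.
  def self.build_payloads(klass, args_list, queue = 'default')
    args_list.map do |args|
      JSON.generate(
        'class' => klass,
        'args'  => args.is_a?(Array) ? args : [args],
        'jid'   => SecureRandom.hex(12),
        'queue' => queue
      )
    end
  end

  # Push every payload in a single RPUSH instead of args_list.size calls.
  def self.perform_many_async(redis, klass, args_list, queue = 'default')
    payloads = build_payloads(klass, args_list, queue)
    redis.rpush("queue:#{queue}", payloads) # redis-rb accepts an array here
    payloads.size
  end
end
```

So TonsOfWork.perform_many_async([1, 2, 3, 4]) would reduce to one RPUSH with four payloads rather than four separate network calls.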

The third change would be an option for changing how queues are handled. In our application, we don't care about queue starvation. In fact, we have a queue that we only want to run if no other queues have any jobs in them. So, I'd like to implement a strict queue option that always pulls from the highest priority (first declared, like Resque perhaps?) queue first, then proceeds to the next and the next. It could just be a command line switch such as sidekiq --enable_strict_queue.
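The fetch logic I'm picturing is roughly this (an illustrative sketch, not Sidekiq's actual fetch loop; FakeRedis is a tiny in-memory stand-in for the one Redis call used):

```ruby
# Minimal stand-in for Redis so the sketch is self-contained.
class FakeRedis
  def initialize(data)
    @data = data
  end

  def lpop(key)
    list = @data[key]
    list && list.shift
  end
end

# Strict ordering: always pop from the first declared queue that has
# work, so a later queue only runs when every earlier queue is empty.
def next_job(redis, queues)
  queues.each do |q|
    job = redis.lpop("queue:#{q}")
    return [q, job] if job
  end
  nil
end
```

With queues declared as %w[high low], a job in 'low' is only picked up once 'high' is completely drained, which is exactly the starvation behavior we want.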

Lastly, I've branched resque-scheduler and integrated the common Heroku autoscaling code into it so that it runs at all times instead of only during job queuing/processing. With Sidekiq, it seems that the middleware is a perfect place for this code to run, though I haven't looked into middleware much yet. This one I could write as a separate gem; I just wanted to get your thoughts on it. It would be simple code to check the number of jobs queued and the number of workers running, plus some decision process on whether to call into the Heroku API and scale workers up or down. Does it sound like a good candidate for Sidekiq middleware? And if so, would it be client middleware or server middleware?

Thanks a lot for taking the time to read through this post as it's gotten a little long. I'm very excited to start using Sidekiq and chop our dyno count down by an order of magnitude.

@mperham
Collaborator

mperham commented Jul 25, 2012

  1. Fantastic.
  2. I've seen this once before. I would question why only one thing is queueing up all these jobs. Why not use workers to enqueue the jobs: fan out the enqueueing work to 100 sidekiq workers? Another concern is overloading the perform_* call with different characteristics. Do I need to provide a perform_many_at(timestamp, *many) call too?
  3. I'd be fine with a "strictly ordered queues" flag.
  4. My gut says you want your monitoring to run separately from the thing being monitored/controlled. I'm not convinced this is appropriate for middleware. We wrote a custom http endpoint which just returns a tiny JSON payload with our total queue backlog for alerting with Pingdom. You could write a simple script which starts or stops sidekiq instances based on the backlog size.
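The core of such a standalone script is just a backlog-to-worker-count policy; the Heroku API call itself is out of scope here, and the function name and thresholds below are made-up illustrations:

```ruby
# Map total queue backlog to a desired worker count: at least one
# worker, one extra worker per jobs_per_worker jobs, capped at max.
# The 10k/20 numbers are arbitrary placeholders, not recommendations.
def desired_workers(backlog, jobs_per_worker = 10_000, max = 20)
  [[(backlog.to_f / jobs_per_worker).ceil, 1].max, max].min
end
```

A cron-style loop could poll the backlog, compute this, and call the Heroku API only when the desired count differs from the current one.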

@jakemack
Contributor Author

Cool, I'll get started on 1 and 3. Your point for 4 makes perfect sense as that's essentially what my modified version of resque-scheduler is doing, since it's a separate process.

As for 2, my issue with splitting the enqueuing up between workers is that it just adds an extra layer of complexity to enqueuing any job I want a lot of. What do you do once splitting it between 100 workers is no longer fast enough? Do you just create another job that queues up jobs that queue up the actual jobs? You could just throw more workers at it, but that's not particularly efficient, especially when Redis already supports adding many things at once. Another issue with splitting it up is you now need to use all of those connections to Redis to add more jobs rather than to grab and process jobs. I could see that becoming an issue when you have a connection limit with a service like RedisToGo. On the flip side, I'm also curious what performance would be like sending one massive request to RedisToGo. I'm thinking it would be faster than making and destroying 100 connections per worker, but I really don't know.

As for the perform_many_at/_in calls, I'd imagine it would be nice to have them as well in order to keep the API consistent. I'm not sure what performance would look like when Sidekiq is trying to pull 500k jobs from Redis all scheduled for the same instant, but I'd imagine they wouldn't all manage to make it into Redis at the same time (unless I also upgraded the Poller class to push all messages for each queue in one Redis call rather than many, which might be a useful upgrade in and of itself). Looking through the code, it doesn't seem like it would be much different from the code I'd add for the basic perform_many_async call.
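For the scheduled variant, the same batching trick applies to the sorted set: ZADD also accepts multiple score/member pairs in one call as of Redis 2.4. This is a hypothetical sketch (the helper name and payload shape are assumptions, only loosely mirroring Sidekiq's scheduled-job layout):

```ruby
require 'json'

# Build one [score, member] pair per job for a single multi-pair ZADD
# into the scheduled set, analogous to perform_many_async's one RPUSH.
def build_schedule_pairs(klass, timestamp, args_list)
  args_list.map do |args|
    payload = JSON.generate('class' => klass,
                            'args'  => args.is_a?(Array) ? args : [args])
    [timestamp.to_f, payload]
  end
end

# redis.zadd('schedule', pairs) would then insert them in one round
# trip; redis-rb accepts an array of [score, member] pairs.
```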

Anyway, we can mull over 2 a little more while I work on the other things. Queuing up serially is what we're doing now with Resque and it works, it's just not as efficient as I'd like it to be. So, it's not the end of the world (yet) if it's a feature we don't end up building, but I'd definitely like to explore the possibility more until I'm fully convinced one way or the other.

@mperham mperham closed this as completed Aug 4, 2012
@GAV1N

GAV1N commented Sep 26, 2012

Any further thoughts on the strict queue option?

@GAV1N

GAV1N commented Sep 26, 2012

Whoops, just found that it's addressed here: #319.
