Skip to content
This repository has been archived by the owner on Feb 7, 2023. It is now read-only.

Commit

Permalink
Change default lock timeout to 30sec. It's a more sensible default fo…
Browse files Browse the repository at this point in the history
…r the following reasons:

* A lot of people complain about jobs being double processed currently. So there must be a lot of poorly written job processor code out there that stalls the event loop :). Or folks, like me, forget that we're running our code on tiny instances in the cloud where the CPU is so limited that a tiny bit of JS work will max the CPU. A 30sec timeout would give a bit more buffer. At least until we figure out a generic solution like OptimalBits#488.
* An expired lock (due to event loop stalling) is quite fatal now that we check the lock prior to moving the job to the completed or failed (previously we would still move it if there wasn't a lock). So if a long running job (let's say 2min) stalls the event loop for even just 5sec, it means that the job can never complete at that point. It might still continue processing, but another worker would have likely picked it up as a stalled job and processed it again. Or if it doesn't happen to get picked up as stalled, when it finally completes it still won't be moved to completed because it lost the lock at one time.
* The tradeoff is that it will take longer for jobs to be considered 'stalled'. So instead waiting max 5sec to find out if a job was stalled, we'd wait max 30sec. I think this is generally OK and that most people aren't running jobs that are that time-sensitive. Actual stalled jobs [due to process crashes] should be extremely rare anyways.

This also sets the stalledInterval to 30sec since it doesn't do much good to have it run more frequently than the lock timeout. It's also slightly expensive to run in Redis as it iterates all jobs in the 'active' queue (see moveUnlockedJobsToWait.lua), so it'd be nice to run this less often anyways.
  • Loading branch information
bradvogel committed Apr 23, 2017
1 parent 58d981f commit 95072a8
Show file tree
Hide file tree
Showing 2 changed files with 7 additions and 6 deletions.
4 changes: 2 additions & 2 deletions README.md
Expand Up @@ -419,9 +419,9 @@ __Arguments__

// Advanced settings (see below)
settings?: QueueSettings {
lockDuration?: number = 5000,
lockDuration?: number = 30000,
lockRenewTime?: number = lockDuration / 2,
stalledInterval?: number = 5000,
stalledInterval?: number = 30000,
maxStalledCount?: number = 1,
guardInterval?: number = 5000,
retryProcessDelay?: number = 5000,
Expand Down
9 changes: 5 additions & 4 deletions lib/queue.js
Expand Up @@ -60,8 +60,9 @@ var MAX_TIMEOUT_MS = Math.pow(2, 31) - 1; // 32 bit signed
// Advanced settings
settings?: QueueSettings {
lockDuration?: number = 5000,
stalledInterval?: number = 5000,
lockDuration?: number = 30000,
lockRenewTime?: number = lockDuration / 2,
stalledInterval?: number = 30000,
maxStalledCount?: number = 1, // The maximum number of times a job can be recovered from the 'stalled' state
guardInterval?: number = 5000,
retryProcessDelay?: number = 5000
Expand Down Expand Up @@ -140,8 +141,8 @@ var Queue = function Queue(name, url, opts){
this.retrieving = 0;

this.settings = _.defaults(opts.settings, {
lockDuration: 5000,
stalledInterval: 5000,
lockDuration: 30000,
stalledInterval: 30000,
maxStalledCount: 1,
guardInterval: 5000,
retryProcessDelay: 5000
Expand Down

0 comments on commit 95072a8

Please sign in to comment.