[Feature] Implement Time to live (TTL) for jobs #479

manast · 2017-03-30T09:43:11Z

When adding jobs to the queue, it should be possible to define a maximum TTL. If the job takes more time to be processed than the TTL, it would be automatically failed.
A few things to consider:

It may be useful for the job to signal that it is still working properly, to get a TTL extension.
When a job is failed by force, the queue should notify the worker so that it has a chance to shut down gracefully; otherwise we may end up with a worker that is busy forever.
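
To make the two considerations above concrete, here is a minimal sketch, not a design for Bull itself. `TtlWatchdog`, `extend`, `onExpire` and `check` are hypothetical names invented for illustration. The idea is that a job extends its own deadline while it is making progress, and registered listeners are told once when the deadline passes so the worker can clean up.

```typescript
// Hypothetical helper for illustration only; none of these names exist in Bull.
type ExpireListener = () => void;

class TtlWatchdog {
  private readonly ttlMs: number;
  private expiresAt: number;
  private expired = false;
  private listeners: ExpireListener[] = [];

  constructor(ttlMs: number, now: number = Date.now()) {
    this.ttlMs = ttlMs;
    this.expiresAt = now + ttlMs;
  }

  // Called by the job to signal that it is still working; pushes the deadline out.
  // Returns false if the TTL has already expired and can no longer be extended.
  extend(now: number = Date.now()): boolean {
    if (this.expired) return false;
    this.expiresAt = now + this.ttlMs;
    return true;
  }

  // Lets the worker register cleanup logic for a graceful shutdown.
  onExpire(listener: ExpireListener): void {
    this.listeners.push(listener);
  }

  // Called periodically by whoever enforces the TTL.
  // Notifies listeners exactly once, the first time the deadline has passed.
  check(now: number = Date.now()): boolean {
    if (!this.expired && now >= this.expiresAt) {
      this.expired = true;
      for (const listener of this.listeners) listener();
    }
    return this.expired;
  }
}
```

In this sketch the enforcing side would call `check()` on a timer and fail the job when it returns `true`, while the job calls `extend()` from its processing loop.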

manast · 2017-03-31T10:15:48Z

Note that we currently have the timeout option, but it does not address the two issues mentioned above.

DevBrent · 2017-04-04T16:06:25Z

If there is any progress on notifying the worker from the queue for a graceful shutdown, I'd like to hear about that in #484 for graceful shutdown within stalled jobs.

pvraj · 2017-04-06T13:06:10Z

Hello,
Until this ticket is completed, is there a way you suggest I can prevent jobs that exceed the timeout from still being processed? I tried Bull because the TTL functionality was not working in Kue. Thank you!

manast · 2017-04-06T13:45:45Z

Not really. In fact, this TTL functionality could work really well once #488 is ready, since it will allow us to actually kill the process that has exceeded the TTL, truly forcing it to stop working. These are fairly high-priority items, so expect them to be released soon.

adamreisnz · 2021-08-03T05:09:51Z

Hello, it has been 4+ years since this ticket was opened. Is there any progress on its implementation?

#488 appears to be implemented, which was supposedly a prerequisite for this feature?

sinasalek · 2021-10-21T11:30:23Z

It's not the best solution, but for now you can check the job status on every loop iteration in the process and abort if its status is set to failed. So to stop a job, you can just set its status to failed.
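
A rough sketch of that workaround, under some assumptions: `JobStatusSource` and `processInSteps` are hypothetical names made up for this example, and the work is assumed to be divisible into steps. The only thing the sketch needs from the job is an `isFailed()` method that resolves to a boolean.

```typescript
// Hypothetical minimal view of a job: only the status check is needed here.
interface JobStatusSource {
  isFailed(): Promise<boolean>;
}

// Processes items one by one, polling the job status before each step.
// Throws as soon as the job is found to be marked as failed, so the
// remaining items are never touched.
async function processInSteps<T>(
  job: JobStatusSource,
  items: T[],
  handle: (item: T) => Promise<void>
): Promise<number> {
  let done = 0;
  for (const item of items) {
    if (await job.isFailed()) {
      throw new Error(`aborted after ${done} items: job was marked as failed`);
    }
    await handle(item);
    done += 1;
  }
  return done;
}
```

The check only runs between steps, so a single step that hangs is not interrupted by this approach.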

elucidsoft · 2021-12-02T21:21:03Z

So I just added logic to handle this. All I did was add a timestamp to jobs; then, before the workers do anything else, they check the timestamp. If it's older than 120 seconds (in our case), they cancel the job. Sounds like a stupid solution, right? But the workers make short work of getting rid of all the stale jobs in a queue this way, so they can begin working on valid jobs. It takes seconds to clear out thousands of stale jobs.
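
That check might look roughly like the sketch below. The `enqueuedAt` field and the `isStale` and `handleIfFresh` helpers are hypothetical names chosen for illustration; the 120-second limit is the value mentioned above.

```typescript
// Assumed payload shape: the producer stamps each job when it is added.
interface StampedPayload {
  enqueuedAt: number; // milliseconds since the Unix epoch
}

const MAX_AGE_MS = 120 * 1000;

// True when the job has been waiting for longer than the allowed age.
function isStale(
  payload: StampedPayload,
  maxAgeMs: number = MAX_AGE_MS,
  now: number = Date.now()
): boolean {
  return now - payload.enqueuedAt > maxAgeMs;
}

// Runs the work only for fresh jobs; stale jobs are discarded and null is returned.
function handleIfFresh<T extends StampedPayload, R>(
  payload: T,
  work: (payload: T) => R,
  maxAgeMs: number = MAX_AGE_MS,
  now: number = Date.now()
): R | null {
  if (isStale(payload, maxAgeMs, now)) return null;
  return work(payload);
}
```

Because the check runs before any work starts, it only filters out jobs that waited too long in the queue; it does not interrupt a job that is already running.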

adamreisnz · 2021-12-20T02:22:18Z

@sinasalek

> It's not the best solution, but for now you can check the job status on every loop iteration in the process and abort if its status is set to failed. So to stop a job, you can just set its status to failed.

If this is done externally (i.e. outside the scope of the worker doing the job), say in a separate script, do you know whether the worker will actually stop or abort the job once the external script has set it to failed?

If the worker keeps waiting for the job to finish (and you have a stalled job because the worker is stuck), this solution would not work for that case.

@elucidsoft

> So I just added logic to handle this, all I did was add a timestamp to jobs, then in the workers before they do anything else, they check the timestamp

Is this to prevent jobs that have failed previously from being picked up if they are older than 120 seconds?
I assume this won't work for the scenario where a job runs for the first time and runs for more than 120 seconds?

@manast is there any progress on an official TTL solution for the new version of BullMQ? Happy to bounce around implementation ideas if there is a design proposal that needs refinement.

manast added the enhancement label Mar 30, 2017

manast mentioned this issue Mar 30, 2017

Removing active/running jobs #473

Closed

manast changed the title ~~New feature. Implement Time to live (TTL) for jobs~~ Feature] Implement Time to live (TTL) for jobs Mar 31, 2017

manast changed the title ~~Feature] Implement Time to live (TTL) for jobs~~ [Feature] Implement Time to live (TTL) for jobs Mar 31, 2017

manast mentioned this issue Apr 4, 2017

timeout option not works as expected #486

Closed

manast mentioned this issue Sep 19, 2018

Ability not to start job after delay #1055

Closed

This was referenced Aug 3, 2021

Add a way to expire jobs taskforcesh/bullmq#301

Open

timeout still working? taskforcesh/bullmq#136

Closed

cincodenada mentioned this issue Nov 4, 2021

Add more detail about timeouts and startDate to the reference #2199

Merged
