Related to issue #402, we have some long-running tasks that may last for hours. Currently, if a worker fails, the task is only retried after the `in_progress_key` expires, and that expiry is based on `max_timeout` (see arq/worker.py line 264 at 9109c2e), so the retry can be many hours away.

A big enhancement would be to lower the default `self.in_progress_timeout_s` to something short, like 10 seconds, and have the worker extend the `in_progress_key` expiry on every heartbeat, adding a few seconds each time. Jobs would then be retried promptly when a worker dies, rather than waiting out a long timeout.

This would be incredibly helpful for handling worker failures on long-running tasks.
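This isn't part of arq today, but here's a minimal sketch of what the heartbeat could look like, assuming redis-py's asyncio client and arq's `arq:in-progress:` key prefix (the helper names, intervals, and `run_with_heartbeat` wrapper are hypothetical, not arq API):

```python
import asyncio
from typing import Coroutine

from redis.asyncio import Redis

# arq's in-progress key prefix; the exact key layout may vary by version
IN_PROGRESS_PREFIX = "arq:in-progress:"


async def heartbeat(
    redis: Redis, job_id: str, interval_s: float = 5.0, grace_s: float = 10.0
) -> None:
    """Periodically extend the in-progress key's TTL while the job runs.

    Cancel this task when the job finishes; the key then expires within
    interval_s + grace_s, so a crashed worker frees the job quickly.
    """
    key = IN_PROGRESS_PREFIX + job_id
    while True:
        # PEXPIRE returns 0 if the key no longer exists (job done or aborted)
        if not await redis.pexpire(key, int((interval_s + grace_s) * 1000)):
            return
        await asyncio.sleep(interval_s)


async def run_with_heartbeat(redis: Redis, job_id: str, job: Coroutine) -> None:
    """Run a job while a background task keeps its in-progress key alive."""
    hb = asyncio.create_task(heartbeat(redis, job_id))
    try:
        await job
    finally:
        hb.cancel()
```

With a 5-second heartbeat and a 10-second grace period, a job held by a crashed worker becomes retryable within roughly 15 seconds instead of hours, and the TTL never needs to reference the job's `max_timeout` at all.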