Checklist

I have checked the issues list for similar or identical bug reports.
I have checked the pull requests list for existing proposed fixes.
I have checked the commit log to find out if the bug was already fixed in the main branch.
I have included all related issues and possible duplicate issues in this issue (If there are none, check this box anyway).
Related Issues and Possible Duplicates
Related Issues
Continuous memory leak #4843 - inappropriate usage of eta and countdown can lead to increasing worker RAM usage, which may be mistakenly recognized as a memory leak

Possible Duplicates

Description
eta and countdown parameters can be easily overused, especially when using Redis as the broker.
When a task with those parameters is sent to the queue, the worker grabs it immediately but doesn't actually execute it until the eta/countdown condition passes. Until then, the task is stored in memory. Obviously, one such task won't cause a problem, but in our production environment we ended up in a situation where hundreds of thousands of them accumulated in the worker, causing RAM usage to increase by 30GB over a period of a few days. This behavior is documented, but very deep in the documentation, so I guess not many people have reached that page.
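For illustration, a minimal sketch of the two call styles in question (the task, its argument, and the broker URL are made up):

```python
from datetime import datetime, timedelta, timezone

from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")

@app.task
def send_reminder(user_id):
    ...

# countdown: run roughly 5 minutes from now; the worker that receives
# the message holds it in memory for those 5 minutes.
send_reminder.apply_async(args=[42], countdown=300)

# eta: run at an absolute time; with a 3-day ETA the message sits in
# worker RAM (and, with Redis, keeps being redelivered) until then.
send_reminder.apply_async(
    args=[42],
    eta=datetime.now(timezone.utc) + timedelta(days=3),
)
```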
The situation can get even worse because of the visibility_timeout setting (1 hour by default), which causes those tasks to be redelivered. If the eta/countdown points to a distant future, tasks will be redelivered every visibility_timeout period.
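For context, this is the setting in question (continuing the snippet above; 3600 seconds is the documented default):

```python
# Redis transport option controlling redelivery: messages not
# acknowledged within this window are delivered again.
app.conf.broker_transport_options = {"visibility_timeout": 3600}
```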
The documentation suggests increasing the visibility_timeout to match the longest possible ETA in the application:
"So you have to increase the visibility timeout to match the time of the longest ETA you're planning to use"
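Concretely, that advice amounts to something like this (7 days as a hypothetical longest ETA in the application):

```python
# Match visibility_timeout to the longest ETA used anywhere in the app.
app.conf.broker_transport_options = {"visibility_timeout": 7 * 24 * 3600}
```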
IMO, that's not a good idea, as it can have a negative effect on reliability. If a worker fails, its dropped tasks won't get redelivered until visibility_timeout passes. If that's set to a high value, it may impact user experience (e.g. an email is sent after x days when its content is no longer relevant, data important to the user is processed with x days of delay, etc.).
Suggestions
I'd suggest strongly discouraging the use of the eta/countdown parameters. In my opinion, they should only be used with very low values (seconds, minutes). Instead, alternatives could be suggested, e.g. database-backed celery beat, as sketched below.
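A rough sketch of what I mean (fetch_due_rows and mark_dispatched are hypothetical stand-ins for your own storage layer): persist the due time, then have a frequent beat task enqueue whatever has become due, so nothing sits in worker RAM for days.

```python
from datetime import datetime, timezone

from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")

# Run the dispatcher every minute instead of parking long-ETA
# messages on a worker.
app.conf.beat_schedule = {
    "dispatch-due-work": {
        "task": "tasks.dispatch_due_work",
        "schedule": 60.0,
    },
}

@app.task
def send_reminder(user_id):
    ...

@app.task
def dispatch_due_work():
    # Placeholder query, e.g. SELECT ... WHERE due_at <= now() AND NOT dispatched
    for row in fetch_due_rows(datetime.now(timezone.utc)):
        send_reminder.delay(row.user_id)  # enqueued with no eta
        mark_dispatched(row)
```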
In Caveats, I wouldn't present increasing visibility_timeout as a feasible solution; instead, if a task needs a longer ETA, I'd point to the alternatives mentioned above.
Optionally, it would be nice to add a warning section with all the implications I've mentioned above.
Hey @norbertcyran 👋,
Thank you for opening an issue. We will get back to you as soon as we can.
Also, check out our Open Collective and consider backing us - every little helps!
We also offer priority support for our sponsors.
If you require immediate assistance please consider sponsoring us.
Lots of credit to the author of this article, who explained the problem perfectly: https://engineering.instawork.com/celery-eta-tasks-demystified-424b836e4e94