Replies: 39 comments 35 replies
-
Facing this exact issue.
-
I experience the same issue with
The redis service would restart, but the celery worker wouldn't consume messages from redis afterwards.
-
Same issue with:
-
Same issue with:
-
Same issue with: although the worker continued to consume tasks for ~20 minutes after redis reconnected, then stopped.
-
Same issue:
-
I also want to know: is there any negative impact to running celery with heartbeat/gossip/mingle disabled? Any ideas? Thanks!
-
Same issue here
-
Had the same issue, but after some random period of time those tasks get consumed. Sometimes it takes a long time (even hours), but they finally start to show up.
-
We had the same issue. For now we're "working around it" with --without-mingle and --without-gossip (I did not use --without-heartbeat), and the problem seems to be resolved. Hopefully we won't run into new issues because we deactivated both features.
-
Same issue here, using a redis cluster
-
Same here on
-
Same issue here with
-
Same issue here
-
I added a step-by-step process to reproduce the bug; please let me know if I can provide anything else to help get this resolved. Thanks a lot!
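(For anyone who wants to try this locally, here is a minimal sketch of the kind of setup the reports in this thread describe. The module name, task, and Redis URLs are illustrative assumptions, not the exact reproduction steps from the comment above:)

```python
# tasks.py -- minimal illustrative setup (hypothetical names) for observing
# the behaviour reported in this thread: start a worker against a local
# Redis, restart Redis, then check whether new tasks are still consumed.
from celery import Celery

app = Celery(
    'repro',
    broker='redis://localhost:6379/0',
    backend='redis://localhost:6379/1',
)

@app.task
def ping():
    return 'pong'
```

Run the worker with `celery -A tasks worker -l info`, restart the Redis server, then call `ping.delay()` and watch whether the worker ever logs the task.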
-
Same issue but I'm using RabbitMQ as a broker and Redis as the backend:
-
Same issue:
-
Same issue: any idea when a fix will be available?
-
Hi, I'm new to celery and experiencing a similar issue using celery+redis. Not sure if this helps, but lowering the visibility_timeout to a low number seems to help; by default the visibility_timeout is set to 3600. I still need to test it a bit more, but maybe someone more experienced with celery and redis can try it out. Thanks.
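For reference, the visibility timeout is a Redis transport option set through broker_transport_options. A minimal sketch of lowering it (300 is an illustrative value; 3600 is the documented default):

```python
# celeryconfig.py -- sketch: lower the Redis visibility timeout.
# With the Redis transport, delivered-but-unacknowledged tasks are
# redelivered after this many seconds (default 3600), which may explain
# tasks "showing up" roughly an hour later as reported above.
broker_transport_options = {'visibility_timeout': 300}  # illustrative value
```

Note the Celery docs warn that the timeout should exceed the ETA/countdown of your longest-scheduled task, or such tasks may be executed twice.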
-
Thanks for the very helpful comment, BTW!
-
Same issue, any update on a fix?
-
Same issue here. We use celery in three of our projects, and all of them are hitting this unexpected error. We will try replacing redis with rabbitmq; if that doesn't work, we'll have to reconsider using celery in production.
-
Same issue; we switched from redis to rabbitmq a few weeks ago and still see it.
-
Yes, time to shake the tree on this issue. It's been open for 2 years now with no fix. But I suppose moving Redis to an independent instance that is stable and rarely restarted would be a middle ground.
-
As I mentioned in a comment above (and in a few other places), there is indeed a lot of work going on behind the scenes. Yesterday we reached a significant milestone in the "behind the scenes" effort to improve our testing infrastructure. All in all, I want to clarify that we took a step aside so we can take many more steps forward: our focus is building the infrastructure that will let us deal with many years of issues, not just one at a time, by giving the community the simplicity it needs to contribute. Once our new testing infrastructure is finalized, we'll be able to focus on the Celery v5.4 release, which I hope will also include a fix for this issue, as I'd rather fix the bug than just reduce the official support.
-
Just to confirm: if the Redis server runs outside a typical Docker Compose setup and is reasonably static and rarely restarted, would that largely avoid this issue? I mainly hit it when a main application that includes the Redis instance in its docker-compose file is restarted, usually because I'm updating the code (every few weeks is common). When I do this, two other containers running as celery workers from separate docker-compose files get disconnected on the Redis restart and fail to reconnect until I do a complete restart of those containers (which I commonly forget to do). So technically, if I rarely restart Redis, I should avoid the problem most of the time. If that's the case, it should be a recommendation in the docs until the bug is fixed.
-
The tree has been shaken!
-
Celery v5.4.0rc1 is ready for testing!
-
Same problem with
This is my config:

```python
import configparser

config = configparser.ConfigParser()
config.read('config.ini')
CELERY_CONFIG = config['celery']

broker_url = CELERY_CONFIG['broker']
result_backend = CELERY_CONFIG['backend']
worker_cancel_long_running_tasks_on_connection_loss = True
broker_connection_retry_on_startup = True
# FIXME: work-around for sudden connection drop on Redis.
# Track this problem on:
# - https://groups.google.com/g/celery-users/c/6yF34oA30Ys
# - https://github.com/celery/celery/discussions/7276
broker_connection_max_retries = None
broker_pool_limit = None
worker_deduplicate_successful_tasks = True
worker_concurrency = 1
worker_prefetch_multiplier = 1
worker_state_db = "state.db"
worker_send_task_events = True
worker_pool = 'prefork'
task_time_limit = 3600
```

My worker start command is:
Running
After my worker stopped consuming tasks, I tried to
It seems like the worker couldn't close the connection and got stuck there. I haven't tried the
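(Not from the report above, but one way to probe whether a stuck worker still answers remote-control commands is a ping over the broker. A sketch, assuming your app instance is importable; the import path is hypothetical:)

```python
# check_worker.py -- sketch: ask running workers to reply over the broker.
from myproject.celery import app  # hypothetical import path

# inspect().ping() broadcasts a ping and collects replies for `timeout`
# seconds; a worker wedged in its broker connection typically never replies.
replies = app.control.inspect(timeout=5).ping()
print(replies or 'no workers replied')
```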
-
I am also experiencing this issue. Simple setup: Celery worker, one Redis database as broker, one as results backend, both on the same Redis instance. When restarting Redis, meaning it comes back up within seconds or even less, this is logged:
Sometimes tasks 'work' again, but more often they don't; whether they do or don't seems random. When tasks are received, nothing is logged. Contrary to some other messages in this thread, Celery shuts down gracefully when stopping it. What I've tried: Set
... so it's no surprise it doesn't work, as restarting the single Redis instance causes the broker to become unavailable too. Set keepalive on the socket, according to the configuration at #7276 (reply in thread). That doesn't help either.
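(For readers who can't follow the link, the socket keepalive configuration referred to there looks roughly like the following. This is a sketch: the timing values are illustrative, and the TCP_KEEP* constants assume Linux:)

```python
import socket

# celeryconfig.py -- sketch: enable TCP keepalive on the Redis broker
# connection so half-dead connections are noticed sooner. Kombu's Redis
# transport passes these options through to redis-py.
broker_transport_options = {
    'socket_keepalive': True,
    'socket_keepalive_options': {
        socket.TCP_KEEPIDLE: 60,   # idle seconds before the first probe
        socket.TCP_KEEPINTVL: 10,  # seconds between probes
        socket.TCP_KEEPCNT: 6,     # failed probes before dropping the link
    },
}
```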
-
I am experiencing an issue with celery==5.2.3 that I did not experience with celery 4.4.7, which I recently migrated from.
I am using redis (5.0.9) as the message broker. When I manually restart redis, the celery worker from time to time (after the redis restart) stops consuming tasks indefinitely. Celery beat is able to publish tasks to the broker without any problem after redis restarts. Once I force a restart of the worker, it picks up all the tasks beat scheduled in the meantime.
Only if I run the celery 5 worker without heartbeat/gossip/mingle does this not happen; then I can restart redis without the worker ceasing to consume tasks after it reconnects.
I am running the worker with the following options to "make it work":

```shell
celery -A proj worker -l info --without-heartbeat --without-gossip --without-mingle
```

When I try running celery with rabbitmq as the message broker and with mingle/gossip/heartbeat enabled, I cannot reproduce the bug (it only happens with redis). But for my scenario I need to keep using redis.
I have 2 questions:
Logs prior to when it gets stuck: I waited half an hour and tasks (periodic tasks are scheduled every 5 minutes) were not consumed by the worker, then I hit ctrl+c. There are no logs when it stops consuming messages; it just "freezes":
Celery report