From b278fadcc817993537ed523e869d3f246869c7ed Mon Sep 17 00:00:00 2001
From: Chris Lee
Date: Wed, 23 Dec 2020 12:04:17 -0500
Subject: [PATCH] Indico 5.0.1rc1 (#2)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* Remove defaults for unsupported Python runtimes.
* Remove obsolete test.
* Doc pytest plugin (#6289)
* update to new pytest name
* doc pytest plugin
* trim heading to the length of the new pytest name
* add warning against use of sort key on dynamodb table, closes #6332
* Remove celery.five and bump vine dep (#6338)
* improv: Replace `five.values` with `dict.values`
* improv: Use `time.monotonic()` in kombu tests

  Also in the docs where it is used to demonstrate `memcache` timeouts.

* rm: Delete `celery.five`

  `vine.five` is no longer present in `vine >= 5`.

* triv: Remove refs to `celery.five` in docs, &c
* build: Bump `vine` dependency to 5.0+
* Wheels are no longer universal.
* Remove failing before_install step.
* Update changelog.
* Bump version: 5.0.0rc2 → 5.0.0rc3
* Fix release date.
* Remove unused import.
* Correctly skip these tests when the relevant dependency is missing.
* Expose retry_policy for Redis result backend

  Rather than adding a new top-level config option, I have used a new key
  in the already existing setting `result_backend_transport_options`.

  Closes #6166

* Update changelog for 4.3.1.
* fix typo (#6346)
* Travis CI: Test Python 3.9 release candidate 1 (#6328)
* Travis CI: Test Python 3.9 release candidate 1
* fixup! Travis CI: matrix --> jobs
* fixup! Fix indentation error
* fixup! tox.ini: 3.9 --> 3.9-dev
* Fix test failure in Python 3.9RC1.

  Co-authored-by: Omer Katz

* Fix the broken celery upgrade settings command.
* Fix celery migrate settings options.
* Remove Riak result backend settings.
* Rephrase to mention that 3.5 is also EOL.
* Add a note about the removal of the Riak result backend.
* Fix examples of starting a worker in comments (#6331)
* Remove deprecated function from app.log.Logging.
* Migration guide.
* Document breaking changes for the CLI in the What's New document.
* Add a "port code to Python 3" migration step.
* Update supported Python versions in the introduction document.
* Update bash completion.
* Add note about new shell completion.
* Update daemonization docs.
* Remove amqp backend. (#6360)

  Fixes #6356.

* Warn when deprecated settings are used (#6353)
* Warn when deprecated settings are used.
* Mention deprecation in docs.
* Refer to the right place in the documentation.
* Complete What's New.
* Add wall of contributors.
* Update codename.
* Fix alt text.
* isort.
* PyPy 3.7 is currently in alpha. No need for that sentence.
* Mention the new pytest-celery plugin.
* Mention retry policy for the redis result backend.
* Fix phrasing.
* Mention ordered group results are now the default.
* pyupgrade.
* Complete release notes.
* Bump version: 5.0.0rc3 → 5.0.0
* Happify linters.
* Specify utf-8 as the encoding for log files.

  Fixes #5144.

* Fixed some typos in readme
* Fix custom headers propagation for protocol 1 hybrid messages
* Retry after race during schema creation in database backend (#6298)
* Retry after race during schema creation in database backend

  Fixes #6296

  This race condition does not commonly present, since the schema creation
  only needs to happen once per database. It's more likely to appear in
  e.g. a test suite that uses a new database each time.

  For context of the sleep times I chose, the schema creation takes
  ~50 ms on my laptop.
  I did a simulated test run of 50 concurrent calls to MetaData.create_all
  repeated 200 times and the number of retries was:

  - 0 retries: 8717x
  - 1 retry: 1279x
  - 2 retries: 4x

* Add test for prepare_models retry error condition
* Add name to contributors
* Update daemonizing.rst

  Fix daemonizing documentation for issue #6363 to put `multi` before `-A`

* Revert "Update daemonizing.rst" (#6376)

  This reverts commit 96ec6db611f86f44a99f58d107c484dc011110ce.

* bugfix: when the config sets result_expires = 0, chord.get will hang. (#6373)
* bugfix: when the config sets result_expires = 0, chord.get will hang.

  `EXPIRE key 0` will delete a key in redis, then the chord will never
  get the result.

  fix: https://github.com/celery/celery/issues/5237

* test: add test case for expiry when the config is set to zero.
* Display a custom error message whenever an attempt to use -A or --app
  as a sub-command option was made.

  Fixes #6363

* Remove test dependencies for Python 2.7.
* Restore the celery worker --without-{gossip,mingle,heartbeat} flags (#6365)

  In the previously used argparse arguments framework, these three options
  were used as flags.

  Since 5.0.0, they are options which need to take an argument (whose only
  sensible value would be "true"). The error message coming up is also
  (very) hard to understand, when running the celery worker command with
  an odd number of flags:

    Error: Unable to parse extra configuration from command line.
    Reason: not enough values to unpack (expected 2, got 1)

  When the celery worker is run with an even number of flags, the last one
  is considered as an argument of the previous one, which is a subtle bug.

* Provide clearer error messages when app fails to load.
* fix pytest plugin registration documentation (#6387)
* fix pytest plugin registration documentation
* Update docs/userguide/testing.rst

  Co-authored-by: Thomas Grainger
  Co-authored-by: Omer Katz

* Contains a workaround for the capitalized configuration issue (#6385)
* Contains a workaround for the capitalized configuration issue
* Update celery/apps/worker.py

  Co-authored-by: Omer Katz

* Update celery/apps/worker.py

  Co-authored-by: Omer Katz
  Co-authored-by: Omer Katz

* Remove old explanation regarding `absolute_import` (#6390)

  Resolves #6389.

* Update canvas.rst (#6392)
* Update canvas.rst

  Tiny fixes.

* Update docs/userguide/canvas.rst

  Co-authored-by: Omer Katz
  Co-authored-by: Omer Katz

* Remove duplicate words from docs (#6398)

  Remove the duplicate usage of “required” in documentation (specifically,
  `introduction.rst`).

* Allow lowercase log levels. (#6396)

  Fixes #6395.

* Detach now correctly passes options with more than one word. (#6394)

  When specifying options such as `-E` the detached worker should receive
  the `--task-events` option. Instead it got the `--task_events` option
  which doesn't exist and therefore silently failed.

  This fixes #6362.

* The celery multi command now works as expected. (#6388)
* Contains the missed change requested by @thedrow
* Added some celery configuration examples.
* fixed loglevel info->INFO in docs
* return list instead of set in CommaSeparatedList

  The _broadcast method of kombu's Mailbox does not support sets:
  https://github.com/celery/kombu/blob/7b2578b19ba4b1989b722f6f6e7efee2a1a4d86a/kombu/pidbox.py#L319

* Rewrite detaching logic (#6401)
* Rewrite detaching logic.
* Ignore empty arguments.
* Ensure the SystemD services are up to date.
* fix: Pass back real result for single task chains

  When chains are delayed, they are first frozen as part of preparation
  which causes the sub-tasks to also be frozen.
  Afterward, the final (0th, since we reverse the tasks/result order when
  freezing) result object from the freezing process would be passed back
  to the caller. This caused problems in signaling completion of groups
  contained in chains because the group relies on a promise which is
  fulfilled by a barrier linked to each of its applied subtasks. By
  constructing two `GroupResult` objects (one during freezing, one when
  the chain sub-tasks are applied), this resulted in there being two
  promises, only one of which would actually be fulfilled by the group
  subtasks.

  This change ensures that in the special case where a chain has a single
  task, we pass back the result object constructed when the task was
  actually applied. When that single child is a group which does not get
  unrolled (i.e. contains more than one child itself), this ensures that
  we pass back a `GroupResult` object which will actually be fulfilled.
  The caller can then await the result confidently!

* fix: Retain `group_id` when tasks get re-frozen

  When a group task which is part of a chain was to be delayed by
  `trace_task()`, it would be reconstructed from the serialized request.
  Normally, this sets the `group_id` of encapsulated tasks to the ID of
  the group being instantiated. However, in the specific situation of a
  group that is the last task in a chain which contributes to the
  completion of a chord, it is essential that the group ID of the top-most
  group is used instead. This top-most group ID is used by the redis
  backend to track the completions of "final elements" of a chord in the
  `on_chord_part_return()` implementation. By overwriting the group ID
  which was already set in the `options` dictionaries of the child tasks
  being deserialized, the chord accounting done by the redis backend
  would be made inaccurate and chords would never complete.

  This change alters how options are overridden for signatures to ensure
  that if a `group_id` has already been set, it cannot be overridden.
  Since group ID should be generally opaque to users, this should not be
  disruptive.

* fix: Count chord "final elements" correctly

  This change amends the implementation of `chord.__length_hint__()` to
  ensure that all child task types are correctly counted. Specifically:

  * all sub-tasks of a group are counted recursively
  * the final task of a chain is counted recursively
  * the body of a chord is counted recursively
  * all other simple signatures count as a single "final element"

  There is also a deserialisation step if a `dict` is seen while counting
  the final elements in a chord, however this should become less important
  with the merge of #6342 which ensures that tasks are recursively
  deserialized by `.from_dict()`.

* test: Add more integration tests for groups

  These tests are intended to show that group unrolling should be
  respected in various ways by all backends. They should make it more
  clear what behaviour we should be expecting from nested canvas
  components and ensure that all the implementations (mostly relevant to
  chords and `on_chord_part_return()` code) behave sensibly.

* test: Fix old markings for chord tests
* fix: Make KV-store backends respect chord size

  This avoids an issue where the `on_chord_part_return()` implementation
  would check the length of the result of a chain ending in a nested
  group. This would manifest in behaviour where a worker would be blocked
  waiting for the result object it holds to complete since it would
  attempt to `.join()` the result object.
  In situations with plenty of workers, this wouldn't really cause any
  noticeable issue apart from some latency or unpredictable failures, but
  in concurrency-constrained situations like the integration tests, it
  causes deadlocks.

  We know from previous commits in this series that chord completion is
  more complex than just waiting for a direct child, so we correct the
  `size` value in `BaseKeyValueStoreBackend.on_chord_part_return()` to
  respect the `chord_size` value from the request, falling back to the
  length of the `deps` if that value is missing for some reason (this is
  necessary to keep a number of the tests happy, but it's not clear to me
  if that will ever be the case in real-life situations).

* fix: Retain chord header result structure in Redis

  This change fixes the chord result flattening issue which manifested
  when using the Redis backend due to its deliberate throwing away of
  information about the header result structure. Rather than assuming
  that all results which contribute to the finalisation of a chord should
  be siblings, this change checks if any are complex (i.e.
  `GroupResult`s) and falls back to behaviour similar to that implemented
  in the `KeyValueStoreBackend` which restores the original `GroupResult`
  object and `join()`s it.

  We retain the original behaviour which is billed as an optimisation in
  f09b041. We could behave better in the complex header result case by
  not bothering to stash the results of contributing tasks under the `.j`
  zset since we won't be using them, but without checking for the presence
  of the complex group result on every `on_chord_part_return()` call, we
  can't be sure that we won't need those stashed results later on. This
  would be an opportunity for optimisation in future if we were to use an
  `EVAL` to only do the `zadd()` if the group result key doesn't exist.
  However, avoiding the result encoding work in `on_chord_part_return()`
  would be more complicated. For now, it's not worth the brainpower.

  This change also slightly refactors the redis backend unit tests to
  make it easier to build fixtures and hit both the complex and simple
  result structure cases.

* Update obsolete --loglevel argument values in docs
* Set logfile, not loglevel.
* Mention removed deprecated modules in the release notes.

  Fixes #6406.

* Copy __annotations__ when creating tasks

  This will allow getting type hints.

  Fixes #6186.

* test: Improve chord body group index freezing test

  Add more elements to the body so we can verify that the `group_index`
  counts up from 0 as expected. This change adds the `pytest-subtests`
  package as a test dependency so we can define partially independent
  subtests within test functions.

* test: Use all() for subtask checks in canvas tests

  When we expect all of the tasks in some iterable to meet a conditional,
  we should make that clear by using `all(condition for ...)`.

* test: Add more tests for `from_dict()` variants

  Notably, this exposed the bug tracked in #6341 where groups are not
  deeply deserialized by `group.from_dict()`.

* fix: Ensure group tasks are deeply deserialised

  Fixes #6341

* Fix `celery shell` command
* predefined_queues_urls -> predefined_queues
* Update changelog.
* Bump version: 5.0.0 → 5.0.1
* [Fix #6361] Fixing documentation for RabbitMQ task_queue_ha_policy
* Fix _autodiscover_tasks_from_fixups function
* fixup! Fix _autodiscover_tasks_from_fixups function
* Correct configuration item: CELERY_RESULT_EXPIRES

  Related issue:
  https://github.com/celery/celery/issues/4050
  https://github.com/celery/celery/issues/4050#issuecomment-524626647

* Flush worker prints, notably the banner

  In some cases (kubernetes, root) the banner is only printed at the end
  of the process execution, instead of at the beginning.

* [Fix #6361] Remove RabbitMQ ha_policy from queue
* ci: Fix TOXENV for pypy3 unit tests

  Fixes #6409

* ci: Move Python 3.9 test base from dev to release
* docs: fix celery beat settings
* move to travis-ci.com
* fix: Ensure default fairness maps to `SCHED_FAIR` (#6447)

  Fixes #6386

* Preserve callbacks when replacing a task with a chain (#6189)
* Preserve callbacks when replacing a task with a chain.
* Preserve callbacks when replacing a task with a chain.
* Added tests.
* Update celery/app/task.py

  Co-authored-by: maybe-sybr <58414429+maybe-sybr@users.noreply.github.com>

* Mark test as flaky.
* Fix race condition in CI.
* fix: Run linked tasks in original slot for replace

  This change alters the handling of linked tasks for chains which are
  used as the argument to a `.replace()` call for a task which itself has
  a chain of signatures to call once it completes. We ensure that the
  linked callback is not only retained but also called at the appropriate
  point in the newly reconstructed chain comprised of tasks from both the
  replacement chain and the tail of the encapsulating chain of the task
  being replaced. We amend some tests to validate this behaviour better
  and ensure that call/errbacks behave as expected if the encapsulating
  chain has either set. One test is marked with an `xfail` since errbacks
  of encapsulating chains are not currently called as expected due to
  some ambiguity in when an errback of a replaced task should be dropped
  or not (#6441).

  Co-authored-by: Asif Saif Uddin
  Co-authored-by: maybe-sybr <58414429+maybe-sybr@users.noreply.github.com>

* Fix minor documentation omission (#6453)

  Co-authored-by: Lewis Kabui

* Fix max_retries override on self.retry (#6436)
* Fix max_retries override
* Fix max_retries override
* Fix max_retries override
* Update exceptions.py typo
* Update autoretry.py typo
* Update task.py

  Prevent exception unpacking for tasks without autoretry_for

* Update test_tasks.py

  Unit test

* Update test_tasks.py

  Added a new test

* Update autoretry.py

  Fix for explicit raise in tasks

* Update test_tasks.py
* Update autoretry.py
* Update task.py
* Update exceptions.py
* Update task.py
* Happify linter.
* Raise proper error when replacing with an empty chain. (#6452)

  Fixes #6451.

* Update changelog.
* Bump version: 5.0.1 → 5.0.2
* Update daemonizing.rst

  Improved systemd documentation for auto-start of the service, and
  mention the possibility of depending on the RabbitMQ service. Also add
  Restart=always to the Celery Beat example.

* Update celerybeat.service
* Fix old celery beat variables

  The change made 5 days ago in 7c3da03a07882ca86b801ad78dd509a67cba60af
  is faulty; the correct celery beat variables start with `CELERYBEAT`,
  not `CELERY_BEAT`.

* Fix formatting.
* Fix formatting.
* fix: Make `--workdir` eager for early handling (#6457)

  This change makes the `--workdir` option an eager one which `click`
  will process early for us, before any of the others. At the same time,
  we add a callback which ensures that the `chdir()` is run during
  handling of the argument so that all subsequent actions (e.g. app
  loading) occur in the specified working directory (a short sketch of
  this click pattern follows below).

  Fixes #6445

* Fix example.

  Fixes #6459.
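As an illustration of the `--workdir` change above, here is a minimal
sketch of click's eager-option pattern. The command and callback names
are hypothetical, not Celery's actual implementation:

```
import os

import click


def _handle_workdir(ctx, param, value):
    # Eager parameters are parsed before all other options, so the
    # chdir() happens before -A/--app triggers any app loading.
    if value is not None:
        os.chdir(value)
    return value


@click.command()
@click.option('--workdir', is_eager=True, callback=_handle_workdir)
@click.option('-A', '--app', default=None)
def cli(workdir, app):
    click.echo(f'loading {app!r} from {os.getcwd()}')
```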
* When using the MongoDB backend, don't clean up if result_expires is 0
  or None. (#6462)

  Fixes #6450.

* Add missing space (#6468)
* Fix passing queues into purge command (#6469)

  In the current version, calling
  `celery --app my.celery_app purge -Q queue_name` fails with the
  following trace:

  ```
  names = (queues or set(app.amqp.queues.keys())) - exclude_queues
  TypeError: unsupported operand type(s) for -: 'list' and 'list'
  ```

  because the code expects a set and `queues` is actually a list. Here is
  a fix (a sketch of the set coercion follows at the end of this section).

* Change donations sidebar to direct users to OpenCollective.
* Added pytest to extras.

  Missed in 9a6c2923e859b6993227605610255bd632c1ae68.

* Restore app.start() and app.worker_main() (#6481)
* Restore `app.start()` and `app.worker_main()`.
* Update celery/app/base.py

  Co-authored-by: maybe-sybr <58414429+maybe-sybr@users.noreply.github.com>

* Fix spelling error.

  Co-authored-by: maybe-sybr <58414429+maybe-sybr@users.noreply.github.com>

* fix: `node_format()` logfile before detaching

  Fixes #6426

* Multithreaded backend (#6416)
* Cache backend to thread local storage instead of global variable
* Cache oid to thread local storage instead of global variable
* Improve code returning thread_local data
* Move thread local storage to Celery class, introduced thread_oid and
  added unittests
* Remove python2 compatibility code
* Restore ability to extend the CLI with new sub-commands.
* Adjust documentation to demonstrate how to introduce sub-command
  plugins in 5.x.

  Fixes #6439.

* autopep8 & isort.
* Linters now run using Python 3.9.
* Fix apply_async() in Calling Tasks userguide
* Fix dead links in contributing guide (#6506)
* Fix inconsistency in documentation for `link_error` (#6505)
* Make documentation of link_error consistent

  Fixes #4099

* Fix undefined variable in example
* Add to contributors list
* Update testing.rst (#6507)

  Use double backticks for some code examples, so that quotes don't get
  converted into smart-quotes.

  https://github.com/celery/celery/issues/6497

* Don't upgrade click to 8.x since click-repl doesn't support it yet.

  Fixes #6511.

  Upstream issue: https://github.com/click-contrib/click-repl/issues/72

* Update documentation on changes to custom CLI options in 5.0.

  Fixes #6380.

* update step to install homebrew
* redis: Support Sentinel with SSL

  Use the SentinelManagedSSLConnection when SSL is enabled for the
  transport. The redis-py project doesn't have a connection class for
  SSL+Sentinel yet, so create a class in redis.py to add that
  functionality.

* Revert "redis: Support Sentinel with SSL" (#6518)

  This reverts commit 18a0963ed36f87b8fb884ad27cfc2b7f1ca9f53c.

* Reintroduce support for custom preload options (#6516)
* Restore preload options.

  Fixes #6307.

* Document breaking changes for preload options in 5.0.

  Fixes #6379.

* Changelog for 5.0.3.
* Bump version: 5.0.2 → 5.0.3
* Added integration tests for calling a task (#6523)
* DummyClient of cache+memory:// backend now shares state between
  threads (#6524)
* isort.
* Update changelog.
* Bump version: 5.0.3 → 5.0.4
* Change deprecated from collections import Mapping/MutableMapping to
  from collections.abc ... (#6532)
* fix #6047
* Fix type error in S3 backend (#6537)
* Convert key from bytes to str
* Add unit test for S3 delete of key with type bytes
* events.py: Remove duplicate decorator in wrong place (#6543)

  `@handle_preload_options` was specified twice as a decorator of
  `events`, once at the top (wrong) and once at the bottom (right). This
  fixes the `celery events` command and also `celery --help`.

* Update changelog.
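For the purge fix described above, a minimal sketch of the set coercion
(the helper name is hypothetical; the actual patch is not shown here):

```
def queue_names(app, queues=None, exclude_queues=None):
    # Click passes repeatable -Q/-X values through as lists; coerce both
    # sides to sets so the difference operator is well-defined.
    queues = set(queues) if queues else set(app.amqp.queues.keys())
    exclude_queues = set(exclude_queues or [])
    return queues - exclude_queues
```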
* Bump version: 5.0.4 → 5.0.5
* ADD: indico additions - trails
* FIX: remove dev.txt dependencies
* ADD: handle revoke failures
* ADD: trailer_request support and better drain resolution
* ADD: merge options was overriding link_error values
* PATCH: DLX and reject behaviour
* FIX: amqp dependencies

Co-authored-by: Omer Katz
Co-authored-by: Thomas Grainger
Co-authored-by: Martin Paulus
Co-authored-by: maybe-sybr <58414429+maybe-sybr@users.noreply.github.com>
Co-authored-by: Ash Berlin-Taylor
Co-authored-by: qiaocc
Co-authored-by: Christian Clauss
Co-authored-by: Weiliang Li
Co-authored-by: Akash Agrawal
Co-authored-by: Michal Kuffa
Co-authored-by: Frazer McLean
Co-authored-by: Maarten Fonville
Co-authored-by: laixintao
Co-authored-by: Nicolas Dandrimont
Co-authored-by: Bas ten Berge
Co-authored-by: Zvi Baratz
Co-authored-by: Justinas Petuchovas
Co-authored-by: bastb
Co-authored-by: Artem Bernatskyi
Co-authored-by: ZubAnt
Co-authored-by: Lewis Kabui
Co-authored-by: David Pärsson
Co-authored-by: Anthony Lukach
Co-authored-by: Safwan Rahman
Co-authored-by: Stepan Henek
Co-authored-by: KexZh
Co-authored-by: Thomas Riccardi
Co-authored-by: Egor Sergeevich Poderiagin
Co-authored-by: Asif Saif Uddin (Auvi)
Co-authored-by: Lewis M. Kabui <13940255+lewisemm@users.noreply.github.com>
Co-authored-by: Ixiodor
Co-authored-by: Mathieu Rollet
Co-authored-by: Mike DePalatis
Co-authored-by: partizan
Co-authored-by: Nick Pope
Co-authored-by: Matus Valo
Co-authored-by: Matus Valo
Co-authored-by: henribru <6639509+henribru@users.noreply.github.com>
Co-authored-by: Stuart Axon
Co-authored-by: Sonya Chhabra
Co-authored-by: AbdealiJK
Co-authored-by: František Zatloukal
Co-authored-by: elonzh
Co-authored-by: Sven Koitka
Co-authored-by: Arnon Yaari
---
 .vscode/settings.json           |  2 +-
 celery/app/amqp.py              |  5 ++++-
 celery/app/base.py              |  4 ++--
 celery/app/task.py              |  4 ++++
 celery/app/trace.py             |  1 -
 celery/backends/asynchronous.py | 12 +++++++----
 celery/backends/base.py         | 25 +++++++++++++++++------
 celery/backends/redis.py        | 20 +++++++++++++++++--
 celery/canvas.py                | 30 ++++++++++++++++++----------
 celery/result.py                | 24 +++++++++++++++++++++-
 celery/states.py                |  5 +++--
 celery/worker/request.py        | 35 ++++++++++++++++++++++++---------
 requirements/dev.txt            |  5 -----
 13 files changed, 128 insertions(+), 44 deletions(-)
 delete mode 100644 requirements/dev.txt

diff --git a/.vscode/settings.json b/.vscode/settings.json
index 97924983a3..d3def91314 100644
--- a/.vscode/settings.json
+++ b/.vscode/settings.json
@@ -1,3 +1,3 @@
 {
     "editor.formatOnSave": false
-}
\ No newline at end of file
+}
diff --git a/celery/app/amqp.py b/celery/app/amqp.py
index 1a0454e9a9..10d99cb907 100644
--- a/celery/app/amqp.py
+++ b/celery/app/amqp.py
@@ -279,7 +279,7 @@ def TaskConsumer(self, channel, queues=None, accept=None, **kw):
     def as_task_v2(self, task_id, name, args=None, kwargs=None,
                    countdown=None, eta=None, group_id=None, group_index=None,
-                   expires=None, retries=0, chord=None,
+                   trailer_request=None, expires=None, retries=0, chord=None,
                    callbacks=None, errbacks=None, reply_to=None,
                    time_limit=None, soft_time_limit=None,
                    create_sent_event=False, root_id=None, parent_id=None,
@@ -336,6 +336,7 @@ def as_task_v2(self, task_id, name, args=None, kwargs=None,
             'expires': expires,
             'group': group_id,
             'group_index': group_index,
+            'trailer_request': trailer_request,
             'retries': retries,
             'timelimit': [time_limit, soft_time_limit],
             'root_id': root_id,
@@ -371,6 +372,7 @@ def as_task_v1(self, task_id, name, args=None, kwargs=None,
                    countdown=None, eta=None, group_id=None, group_index=None,
+                   trailer_request=None,
                    expires=None, retries=0, chord=None,
                    callbacks=None, errbacks=None, reply_to=None,
                    time_limit=None, soft_time_limit=None,
@@ -415,6 +417,7 @@ def as_task_v1(self, task_id, name, args=None, kwargs=None,
             'kwargs': kwargs,
             'group': group_id,
             'group_index': group_index,
+            'trailer_request': trailer_request,
             'retries': retries,
             'eta': eta,
             'expires': expires,
diff --git a/celery/app/base.py b/celery/app/base.py
index 27e5b610ca..5de58b15b1 100644
--- a/celery/app/base.py
+++ b/celery/app/base.py
@@ -689,7 +689,7 @@ def send_task(self, name, args=None, kwargs=None, countdown=None,
                   router=None, result_cls=None, expires=None,
                   publisher=None, link=None, link_error=None,
                   add_to_parent=True, group_id=None, group_index=None,
-                  retries=0, chord=None,
+                  trailer_request=None, retries=0, chord=None,
                   reply_to=None, time_limit=None, soft_time_limit=None,
                   root_id=None, parent_id=None, route_name=None,
                   shadow=None, chain=None, task_type=None, **options):
@@ -730,7 +730,7 @@ def send_task(self, name, args=None, kwargs=None, countdown=None,
         message = amqp.create_task_message(
             task_id, name, args, kwargs, countdown, eta, group_id, group_index,
-            expires, retries, chord,
+            trailer_request, expires, retries, chord,
             maybe_list(link), maybe_list(link_error),
             reply_to or self.thread_oid, time_limit, soft_time_limit,
             self.conf.task_send_sent_event,
diff --git a/celery/app/task.py b/celery/app/task.py
index 2265ebb9e6..1156ebfaec 100644
--- a/celery/app/task.py
+++ b/celery/app/task.py
@@ -80,6 +80,7 @@ class Context:
     taskset = None   # compat alias to group
     group = None
     group_index = None
+    trailer_request = []
     chord = None
     chain = None
     utc = None
@@ -114,6 +115,7 @@ def as_execution_options(self):
             'parent_id': self.parent_id,
             'group_id': self.group,
             'group_index': self.group_index,
+            'trailer_request': self.trailer_request or [],
             'chord': self.chord,
             'chain': self.chain,
             'link': self.callbacks,
@@ -907,6 +909,7 @@ def replace(self, sig):
                 chord=chord,
                 group_id=self.request.group,
                 group_index=self.request.group_index,
+                trailer_request=self.request.trailer_request,
                 root_id=self.request.root_id,
             )
             sig.freeze(self.request.id)
@@ -934,6 +937,7 @@ def add_to_chord(self, sig, lazy=False):
         sig.set(
             group_id=self.request.group,
             group_index=self.request.group_index,
+            trailer_request=self.request.trailer_request,
             chord=self.request.chord,
             root_id=self.request.root_id,
         )
diff --git a/celery/app/trace.py b/celery/app/trace.py
index f9b8c83e6e..3b004d2d75 100644
--- a/celery/app/trace.py
+++ b/celery/app/trace.py
@@ -208,7 +208,6 @@ def handle_failure(self, task, req, store_errors=True, call_errbacks=True):
         einfo = ExceptionInfo()
         einfo.exception = get_pickleable_exception(einfo.exception)
         einfo.type = get_pickleable_etype(einfo.type)
-
         task.backend.mark_as_failure(
             req.id, exc, einfo.traceback,
             request=req, store_result=store_errors,
diff --git a/celery/backends/asynchronous.py b/celery/backends/asynchronous.py
index 32475d5eaa..9a530235d8 100644
--- a/celery/backends/asynchronous.py
+++ b/celery/backends/asynchronous.py
@@ -5,7 +5,7 @@
 from collections import deque
 from queue import Empty
 from time import sleep
-from weakref import WeakKeyDictionary
+from weakref import WeakKeyDictionary, WeakSet

 from kombu.utils.compat import detect_environment
@@ -173,7 +173,10 @@ def _maybe_resolve_from_buffer(self, result):
     def _add_pending_result(self, task_id, result, weak=False):
         concrete, weak_ = self._pending_results
         if task_id not in weak_ and result.id not in concrete:
-            (weak_ if weak else concrete)[task_id] = result
+            ref = (weak_ if weak else concrete)
+            results = ref.get(task_id, WeakSet() if weak else set())
+            results.add(result)
+            ref[task_id] = results
             self.result_consumer.consume_from(task_id)

     def add_pending_results(self, results, weak=False):
@@ -292,13 +295,14 @@ def on_state_change(self, meta, message):
         if meta['status'] in states.READY_STATES:
             task_id = meta['task_id']
             try:
-                result = self._get_pending_result(task_id)
+                results = self._get_pending_result(task_id)
             except KeyError:
                 # send to buffer in case we received this result
                 # before it was added to _pending_results.
                 self._pending_messages.put(task_id, meta)
             else:
-                result._maybe_set_cache(meta)
+                for result in results:
+                    result._maybe_set_cache(meta)
             buckets = self.buckets
             try:
                 # remove bucket for this result, since it's fulfilled
diff --git a/celery/backends/base.py b/celery/backends/base.py
index 1aac2a0fc9..37c26dc4be 100644
--- a/celery/backends/base.py
+++ b/celery/backends/base.py
@@ -63,6 +63,12 @@
 Result backends that supports chords: Redis, Database, Memcached, and more.
 """

+trailer_request_obj = namedtuple(
+    "trailer_request",
+    ("id", "group", "errbacks", "chord", "trailer_request", "group_index"),
+    defaults=(None, ) * 6
+)
+

 def unpickle_backend(cls, args, kwargs):
     """Return an unpickled backend."""
@@ -130,7 +136,7 @@ def __init__(self, app,
         self.base_sleep_between_retries_ms = conf.get('result_backend_base_sleep_between_retries_ms', 10)
         self.max_retries = conf.get('result_backend_max_retries', float("inf"))

-        self._pending_results = pending_results_t({}, WeakValueDictionary())
+        self._pending_results = pending_results_t({}, {})
         self._pending_messages = BufferMap(MESSAGE_BUFFER_MAX)
         self.url = url
@@ -164,6 +170,14 @@ def mark_as_failure(self, task_id, exc,
             self.store_result(task_id, exc, state,
                               traceback=traceback, request=request)
         if request:
+            if request.trailer_request:
+                self.mark_as_failure(
+                    request.trailer_request["id"], exc, traceback=traceback,
+                    store_result=store_result, call_errbacks=call_errbacks,
+                    request=trailer_request_obj(**request.trailer_request),
+                    state=state
+                )
+
             if request.chord:
                 self.on_chord_part_return(request, state, exc)
             if call_errbacks and request.errbacks:
@@ -218,11 +232,10 @@ def _call_task_errbacks(self, request, exc, traceback):
     def mark_as_revoked(self, task_id, reason='',
                         request=None, store_result=True, state=states.REVOKED):
         exc = TaskRevokedError(reason)
-        if store_result:
-            self.store_result(task_id, exc, state,
-                              traceback=None, request=request)
-        if request and request.chord:
-            self.on_chord_part_return(request, state, exc)
+
+        return self.mark_as_failure(
+            task_id, exc, request=request, store_result=store_result, state=state
+        )

     def mark_as_retry(self, task_id, exc, traceback=None,
                       request=None, store_result=True, state=states.RETRY):
diff --git a/celery/backends/redis.py b/celery/backends/redis.py
index dd3677f569..7a96bdbd66 100644
--- a/celery/backends/redis.py
+++ b/celery/backends/redis.py
@@ -1,4 +1,5 @@
 """Redis result store backend."""
+import uuid
 import time
 from contextlib import contextmanager
 from functools import partial
@@ -12,7 +13,7 @@
 from celery import states
 from celery._state import task_join_will_block
 from celery.canvas import maybe_signature
-from celery.exceptions import ChordError, ImproperlyConfigured
+from celery.exceptions import ChordError, ImproperlyConfigured, TaskRevokedError
 from celery.result import GroupResult, allow_join_result
 from celery.utils.functional import dictfilter
 from celery.utils.log import get_logger
@@ -157,7 +158,8 @@ def drain_events(self, timeout=None):
     def consume_from(self, task_id):
         if self._pubsub is None:
             return self.start(task_id)
-        self._consume_from(task_id)
+        else:
+            self._consume_from(task_id)

     def _consume_from(self, task_id):
         key = self._get_key_for_task(task_id)
@@ -269,6 +271,7 @@ def __init__(self, host=None, port=None, db=None, password=None,
         self.connection_errors, self.channel_errors = (
             get_redis_error_classes() if get_redis_error_classes
             else ((), ()))
+
         self.result_consumer = self.ResultConsumer(
             self, self.app, self.accept,
             self._pending_results, self._pending_messages,
@@ -397,6 +400,10 @@ def _unpack_chord_result(self, tup, decode,
         _, tid, state, retval = decode(tup)
         if state in EXCEPTION_STATES:
             retval = self.exception_to_python(retval)
+
+        if isinstance(retval, TaskRevokedError):
+            raise retval
+
         if state in PROPAGATE_STATES:
             raise ChordError(f'Dependency {tid} raised {retval!r}')
         return retval
@@ -484,6 +491,15 @@ def on_chord_part_return(self, request, state, result,
                     resl = [unpack(tup, decode) for tup in resl]
                 try:
                     callback.delay(resl)
+                except TaskRevokedError as exc:
+                    logger.exception(
+                        'Group %r task was revoked: %r', request.group, exc)
+                    if callback.id is None:
+                        callback.id = str(uuid.uuid4())
+                    return self.chord_error_from_stack(
+                        callback,
+                        exc
+                    )
                 except Exception as exc:  # pylint: disable=broad-except
                     logger.exception(
                         'Chord callback for %r raised: %r', request.group, exc)
diff --git a/celery/canvas.py b/celery/canvas.py
index a4de76428d..25fd111ab4 100644
--- a/celery/canvas.py
+++ b/celery/canvas.py
@@ -238,8 +238,15 @@ def _merge(self, args=None, kwargs=None, options=None, force=False):
             })
         else:
             new_options = self.options
+
+        new_options = new_options if new_options else {}
+        new_options["link_error"] = (
+            new_options.get("link_error", []) + new_options.pop("link_error", [])
+        )
+
         if self.immutable and not force:
             return (self.args, self.kwargs, new_options)
+
         return (tuple(args) + tuple(self.args) if args else self.args,
                 dict(self.kwargs, **kwargs) if kwargs else self.kwargs,
                 new_options)
@@ -274,7 +281,7 @@ def clone(self, args=None, kwargs=None, **opts):
     partial = clone

     def freeze(self, _id=None, group_id=None, chord=None,
-               root_id=None, parent_id=None, group_index=None):
+               root_id=None, parent_id=None, group_index=None, trailer_request=None):
         """Finalize the signature by adding a concrete task id.

         The task won't be called and you shouldn't call the signature
@@ -303,6 +310,8 @@ def freeze(self, _id=None, group_id=None, chord=None,
             opts['chord'] = chord
         if group_index is not None:
             opts['group_index'] = group_index
+        if trailer_request is not None:
+            opts['trailer_request'] = trailer_request
         # pylint: disable=too-many-function-args
         #   Borks on this, as it's a property.
         return self.AsyncResult(tid)
@@ -686,13 +695,13 @@ def run(self, args=None, kwargs=None, group_id=None, chord=None,
             return results_from_prepare[0]

     def freeze(self, _id=None, group_id=None, chord=None,
-               root_id=None, parent_id=None, group_index=None):
+               root_id=None, parent_id=None, group_index=None, trailer_request=None):
         # pylint: disable=redefined-outer-name
         #   XXX chord is also a class in outer scope.
         _, results = self._frozen = self.prepare_steps(
             self.args, self.kwargs, self.tasks, root_id, parent_id, None,
             self.app, _id, group_id, chord, clone=False,
-            group_index=group_index,
+            group_index=group_index, trailer_request=trailer_request
         )
         return results[0]
@@ -700,7 +709,7 @@ def prepare_steps(self, args, kwargs, tasks,
                       root_id=None, parent_id=None, link_error=None, app=None,
                       last_task_id=None, group_id=None, chord_body=None,
                       clone=True, from_dict=Signature.from_dict,
-                      group_index=None):
+                      group_index=None, trailer_request=None):
         app = app or self.app
         # use chain message field for protocol 2 and later.
         # this avoids pickle blowing the stack on the recursion
@@ -777,7 +786,7 @@ def prepare_steps(self, args, kwargs, tasks,
                     res = task.freeze(
                         last_task_id,
                         root_id=root_id, group_id=group_id, chord=chord_body,
-                        group_index=group_index,
+                        group_index=group_index, trailer_request=trailer_request,
                     )
                 else:
                     res = task.freeze(root_id=root_id)
@@ -1088,8 +1097,7 @@ def apply_async(self, args=None, kwargs=None, add_to_parent=True,
         if link is not None:
             raise TypeError('Cannot add link to group: use a chord')
         if link_error is not None:
-            raise TypeError(
-                'Cannot add link to group: do that on individual tasks')
+            link_error = None
         app = self.app
         if app.conf.task_always_eager:
             return self.apply(args, kwargs, **options)
@@ -1205,7 +1213,7 @@ def _freeze_gid(self, options):
         return options, group_id, options.get('root_id')

     def freeze(self, _id=None, group_id=None, chord=None,
-               root_id=None, parent_id=None, group_index=None):
+               root_id=None, parent_id=None, group_index=None, trailer_request=None):
         # pylint: disable=redefined-outer-name
         #   XXX chord is also a class in outer scope.
         opts = self.options
@@ -1219,6 +1227,8 @@ def freeze(self, _id=None, group_id=None, chord=None,
             opts['chord'] = chord
         if group_index is not None:
             opts['group_index'] = group_index
+        if trailer_request is not None:
+            opts['trailer_request'] = trailer_request
         root_id = opts.setdefault('root_id', root_id)
         parent_id = opts.setdefault('parent_id', parent_id)
         new_tasks = []
@@ -1328,7 +1338,7 @@ def __call__(self, body=None, **options):
         return self.apply_async((), {'body': body} if body else {}, **options)

     def freeze(self, _id=None, group_id=None, chord=None,
-               root_id=None, parent_id=None, group_index=None):
+               root_id=None, parent_id=None, group_index=None, trailer_request=None):
         # pylint: disable=redefined-outer-name
         #   XXX chord is also a class in outer scope.
         if not isinstance(self.tasks, group):
@@ -1337,7 +1347,7 @@ def freeze(self, _id=None, group_id=None, chord=None,
                                       parent_id=parent_id, root_id=root_id,
                                       chord=self.body)
         body_result = self.body.freeze(
             _id, root_id=root_id, chord=chord, group_id=group_id,
-            group_index=group_index)
+            group_index=group_index, trailer_request=trailer_request)
         # we need to link the body result back to the group result,
         # but the body may actually be a chain,
         # so find the first result without a parent
diff --git a/celery/result.py b/celery/result.py
index 0c10d58e86..7334045564 100644
--- a/celery/result.py
+++ b/celery/result.py
@@ -554,6 +554,14 @@ def _on_ready(self):
         if self.backend.is_async:
             self.on_ready()

+    def collect(self, **kwargs):
+        for task in self.results:
+            task_results = list(task.collect(**kwargs))
+            if isinstance(task, ResultSet):
+                yield task_results
+            else:
+                yield task_results[-1]
+
     def remove(self, result):
         """Remove result from the set; it must be a member.
@@ -666,7 +674,8 @@ def __getitem__(self, index):

     def get(self, timeout=None, propagate=True,
             interval=0.5, callback=None, no_ack=True, on_message=None,
-            disable_sync_subtasks=True, on_interval=None):
+            disable_sync_subtasks=True, on_interval=None, **kwargs):
+        # PATCH: added kwargs for more generalized interface
         """See :meth:`join`.

         This is here for API compatibility with :class:`AsyncResult`,
@@ -949,6 +958,14 @@ def as_tuple(self):
     def children(self):
         return self.results

+    @property
+    def state(self):
+        for child in self.children:
+            if (child.state in states.EXCEPTION_STATES
+                    or child.state in states.UNREADY_STATES):
+                break
+        return child.state
+
     @classmethod
     def restore(cls, id, backend=None, app=None):
         """Restore previously saved group result."""
@@ -1065,3 +1082,8 @@ def result_from_tuple(r, app=None):
             return Result(id, parent=parent)
     return r
+
+
+def get_exception_in_callback(task_id: str) -> Exception:
+    with allow_join_result():
+        return AsyncResult(task_id).get(propagate=False)
\ No newline at end of file
diff --git a/celery/states.py b/celery/states.py
index e807ed4822..375ca72b9f 100644
--- a/celery/states.py
+++ b/celery/states.py
@@ -140,12 +140,13 @@ def __le__(self, other):
 #: Task is waiting for retry.
 RETRY = 'RETRY'
 IGNORED = 'IGNORED'
+TRAILED = 'TRAILED'

 READY_STATES = frozenset({SUCCESS, FAILURE, REVOKED})
-UNREADY_STATES = frozenset({PENDING, RECEIVED, STARTED, REJECTED, RETRY})
+UNREADY_STATES = frozenset({PENDING, RECEIVED, STARTED, REJECTED, RETRY, TRAILED})
 EXCEPTION_STATES = frozenset({RETRY, FAILURE, REVOKED})
 PROPAGATE_STATES = frozenset({FAILURE, REVOKED})

 ALL_STATES = frozenset({
-    PENDING, RECEIVED, STARTED, SUCCESS, FAILURE, RETRY, REVOKED,
+    PENDING, RECEIVED, STARTED, SUCCESS, FAILURE, RETRY, REVOKED, TRAILED
 })
diff --git a/celery/worker/request.py b/celery/worker/request.py
index 81c3387d98..40c9cc4a28 100644
--- a/celery/worker/request.py
+++ b/celery/worker/request.py
@@ -3,6 +3,7 @@
 This module defines the :class:`Request` class, that specifies
 how tasks are executed.
 """
+import os
 import logging
 import sys
 from datetime import datetime
@@ -35,6 +36,8 @@

 IS_PYPY = hasattr(sys, 'pypy_version_info')

+REJECT_TO_HIGH_MEMORY = os.getenv("REJECT_TO_HIGH_MEMORY")
+
 logger = get_logger(__name__)
 debug, info, warn, error = (logger.debug, logger.info,
                             logger.warning, logger.error)
@@ -75,7 +78,7 @@ class Request:

     if not IS_PYPY:  # pragma: no cover
         __slots__ = (
-            '_app', '_type', 'name', 'id', '_root_id', '_parent_id',
+            '_app', '_type', 'name', 'id', '_root_id', '_parent_id', '_trailer_request',
             '_on_ack', '_body', '_hostname', '_eventer', '_connection_errors',
             '_task', '_eta', '_expires', '_request_dict', '_on_reject', '_utc',
             '_content_type', '_content_encoding', '_argsrepr', '_kwargsrepr',
@@ -485,7 +488,8 @@ def on_failure(self, exc_info, send_failed_event=True, return_ok=False):
         """Handler called if the task raised an exception."""
         task_ready(self)
         if isinstance(exc_info.exception, MemoryError):
-            raise MemoryError(f'Process got: {exc_info.exception}')
+            if not REJECT_TO_HIGH_MEMORY or not self.task.acks_late:
+                raise MemoryError(f'Process got: {exc_info.exception}')
         elif isinstance(exc_info.exception, Reject):
             return self.reject(requeue=exc_info.exception.requeue)
         elif isinstance(exc_info.exception, Ignore):
@@ -497,17 +501,25 @@ def on_failure(self, exc_info, send_failed_event=True, return_ok=False):
             return self.on_retry(exc_info)

         # (acks_late) acknowledge after result stored.
-        requeue = False
         if self.task.acks_late:
             reject = (
                 self.task.reject_on_worker_lost and
-                isinstance(exc, WorkerLostError)
+                isinstance(exc, (WorkerLostError, MemoryError, Terminated))
             )
             ack = self.task.acks_on_failure_or_timeout
             if reject:
-                requeue = True
-                self.reject(requeue=requeue)
-                send_failed_event = False
+                if REJECT_TO_HIGH_MEMORY:
+                    # If we have a higher memory queue, reject without retry
+                    self.reject(requeue=False)
+                    # Don't send a failure event
+                    send_failed_event = False
+                    return
+                else:
+                    send_failed_event = True
+                    return_ok = False
+                    # Acknowledge the message so it doesn't get retried
+                    # and can be marked as complete
+                    self.acknowledge()
             elif ack:
                 self.acknowledge()
             else:
@@ -521,8 +533,8 @@ def on_failure(self, exc_info, send_failed_event=True, return_ok=False):
             self._announce_revoked(
                 'terminated', True, str(exc), False)
             send_failed_event = False  # already sent revoked event
-        elif not requeue and (isinstance(exc, WorkerLostError) or not return_ok):
-            # only mark as failure if task has not been requeued
+        elif not return_ok:
+            # We do not ever want to retry failed tasks unless worker lost or terminated
             self.task.backend.mark_as_failure(
                 self.id, exc, request=self._context,
                 store_result=self.store_errors,
@@ -626,6 +638,11 @@ def group_index(self):
         # used by backend.on_chord_part_return to order return values in group
         return self._request_dict.get('group_index')

+    @cached_property
+    def trailer_request(self):
+        # used by backend.on_chord_part_return to order return values in group
+        return self._request_dict.get('trailer_request') or []
+

 def create_request_cls(base, task, pool, hostname, eventer,
                        ref=ref, revoked_tasks=revoked_tasks,
diff --git a/requirements/dev.txt b/requirements/dev.txt
deleted file mode 100644
index 9712c15a2e..0000000000
--- a/requirements/dev.txt
+++ /dev/null
@@ -1,5 +0,0 @@
-pytz>dev
-git+https://github.com/celery/kombu.git
-git+https://github.com/celery/py-amqp.git
-git+https://github.com/celery/billiard.git
-vine==1.3.0
\ No newline at end of file