Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Django task with transaction atomic, return None instead of the task UUID #8976

Closed
6 of 19 tasks
FraCata00 opened this issue Apr 24, 2024 · 15 comments · Fixed by #8984
Closed
6 of 19 tasks

New Django task with transaction atomic, return None instead of the task UUID #8976

FraCata00 opened this issue Apr 24, 2024 · 15 comments · Fixed by #8984

Comments

@FraCata00
Copy link
Contributor

FraCata00 commented Apr 24, 2024

Checklist

  • I have verified that the issue exists against the main branch of Celery.
  • This has already been asked to the discussions forum first.
  • I have read the relevant section in the
    contribution guide
    on reporting bugs.
  • I have checked the issues list
    for similar or identical bug reports.
  • I have checked the pull requests list
    for existing proposed fixes.
  • I have checked the commit log
    to find out if the bug was already fixed in the main branch.
  • I have included all related issues and possible duplicate issues
    in this issue (If there are none, check this box anyway).
  • I have tried to reproduce the issue with pytest-celery and added the reproduction script below.

Mandatory Debugging Information

  • I have included the output of celery -A proj report in the issue.
    (if you are not able to do this, then at least specify the Celery
    version affected).
  • I have verified that the issue exists against the main branch of Celery.
  • I have included the contents of pip freeze in the issue.
  • I have included all the versions of all the external dependencies required
    to reproduce this bug.

Optional Debugging Information

  • I have tried reproducing the issue on more than one Python version
    and/or implementation.
  • I have tried reproducing the issue on more than one message broker and/or
    result backend.
  • I have tried reproducing the issue on more than one version of the message
    broker and/or result backend.
  • I have tried reproducing the issue on more than one operating system.
  • I have tried reproducing the issue on more than one workers pool.
  • I have tried reproducing the issue with autoscaling, retries,
    ETA/Countdown & rate limits disabled.
  • I have tried reproducing the issue after downgrading
    and/or upgrading Celery and its dependencies.

Related Issues and Possible Duplicates

Related Issues

  • None

Possible Duplicates

  • None

Environment & Settings

Celery version:
5.4.0

celery report Output:

Steps to Reproduce

Required Dependencies

  • Minimal Python Version: N/A or Unknown
  • Minimal Celery Version:5.4.0
  • Minimal Kombu Version: N/A or Unknown
  • Minimal Broker Version: N/A or Unknown
  • Minimal Result Backend Version: N/A or Unknown
  • Minimal OS and/or Kernel Version: N/A or Unknown
  • Minimal Broker Client Version: N/A or Unknown
  • Minimal Result Backend Client Version: N/A or Unknown

Python Packages

pip freeze Output:

Other Dependencies

N/A

Minimally Reproducible Test Case

Expected Behavior

Return the correct task uuid like before the release 5.4.0

Actual Behavior

Issue ⚠️

`AttributeError: 'NoneType' object has no attribute 'id'

@app.task
def example_task(id):
       try:
            obj = Model.objects.get(id=id)
            obj.number += 1
            obj.save(update_fields=["number"])
       except ObjectDoesNotExist as e:
            return "str(e)"


task = example_task.delay_on_commit(example_model_instance.id)
example_model_instance.task_uuid = task.id

Literally basic minimal task 😄 , but the delay_on_commit not return the task uuid for AsyncResult, and in my case django raise excpetion (rightly) AttributeError -> maybe the task with delay_on_commit return None?

docs:

class DjangoTask(Task):
    """
    Extend the base :class:`~celery.app.task.Task` for Django.

    Provide a nicer API to trigger tasks at the end of the DB transaction.
    """

    def delay_on_commit(self, *args, **kwargs):
        """Call :meth:`~celery.app.task.Task.delay` with Django's ``on_commit()``."""
        return transaction.on_commit(functools.partial(self.delay, *args, **kwargs))

    def apply_async_on_commit(self, *args, **kwargs):
        """Call :meth:`~celery.app.task.Task.apply_async` with Django's ``on_commit()``."""
        return transaction.on_commit(functools.partial(self.apply_async, *args, **kwargs))
@FraCata00 FraCata00 changed the title New Django tash with transaction atomic, not return the task_id New Django task with transaction atomic, not return the task_id, return None Apr 24, 2024
@FraCata00 FraCata00 changed the title New Django task with transaction atomic, not return the task_id, return None New Django task with transaction atomic, return None instead of the task UUID Apr 24, 2024
@Nusnus
Copy link
Member

Nusnus commented Apr 24, 2024

@browniebroke

@Nusnus
Copy link
Member

Nusnus commented Apr 24, 2024

[x] I have tried to reproduce the issue with pytest-celery and added the reproduction script below.

Where is it?

@FraCata00
Copy link
Contributor Author

[x] I have tried to reproduce the issue with pytest-celery and added the reproduction script below.

Where is it?

oh no sorry... my fault

@FraCata00
Copy link
Contributor Author

FraCata00 commented Apr 24, 2024

I think that django transaction.on_commit should return None, but this is the source of django transaction,py
transaction.py

if use the decorator instead the on_commit function?
delay and apply_async like before but:

@transaction.atomic
def delay(.....)

@transaction.atomic
def apply_async(.....)

@browniebroke
Copy link
Contributor

browniebroke commented Apr 24, 2024

Yes, that's expected behaviour, transaction.on_commit(...) delays the execution of the wrapped function to a later stage, when the Django DB transaction finishes, so you won't get a task UUID until then. If you need access to the task UUID, you should use delay or apply_async, NOT the ..._on_commit variants. While I didn't forsee this issue, I'm glad we picked entirely new method names, which means there is an escape hatch for you. We should document this caveat better, though.

Your solution to use @transaction.atomic decorator does not do the same thing. It will open either a new transaction, or a savepoint (if you're already in a transaction), which may or may not have the same effect. Consider this case, for example:

def create_user(request):
    with transaction.atomic():
        user = User.objects.create(first_name="bob", last_name="doe")
        send_welcome_email.delay_on_commit(user.pk)
        # later in the transaction, do something that cause the transaction to be rolledback
        # e.g. set the email to a value that already exists, raising IntegrityError
        user.email = "bob.doe@example.com"
        user.save() # <- raises IntegrityError, and rollback the whole transaction

When the whole transaction is rolled back, the whole operation failed, and user is NOT created in the DB, so we shouldn't send the welcome email. Since the transaction wasn't commited, the task never triggered (and no UUID was generated).

Now if we replace the delay_on_commit by a call to delay with the @transaction.atomic, then the task will be queued in a nested transaction, and we will try to send welcome email (although that will probably crash, since the user wasn't actually persistend in the DB).

@FraCata00
Copy link
Contributor Author

FraCata00 commented Apr 24, 2024

I confirm that delay_on_commit return None:

In [9]: @app.task
   ...: def sum_number(n1, n2):
   ...:     print(n1+n2)
   ...: 

In [10]: task = sum_number.delay(1,2)

In [11]: task
Out[11]: <AsyncResult: 3a4138ec-35d7-4d46-808f-3f73a1dfee53>

In [12]: task = sum_number.delay_on_commit(1,2)

In [13]: task

In [14]: assert task is None

In [15]: task is None
Out[15]: True

@FraCata00
Copy link
Contributor Author

Yes, that's expected behaviour, transaction.on_commit(...) delays the execution of the wrapped function to a later stage, when the Django DB transaction finishes, so you won't get a task UUID until then. If you need access to the task UUID, you should use delay or apply_async, NOT the ..._on_commit variants. While I didn't forsee this issue, I'm glad we picked entirely new method names, which means there is an escape hatch for you. We should document this caveat better, though.

Your solution to use @transaction.atomic decorator does not do the same thing. It will open either a new transaction, or a savepoint (if you're already in a transaction), which may or may not have the same effect. Consider this case, for example:

def create_user(request):
    with transaction.atomic():
        user = User.objects.create(first_name="bob", last_name="doe")
        send_welcome_email.delay_on_commit(user.pk)
        # later in the transaction, do something that cause the transaction to be rolledback
        # e.g. set the email to a value that already exists, raising IntegrityError
        user.email = "bob.doe@example.com"
        user.save() # <- raises IntegrityError, and rollback the whole transaction

When the whole transaction is rolled back, the whole operation failed, and user is NOT created in the DB, so we shouldn't send the welcome email. Since the transaction wasn't commited, the task never triggered (and no UUID was generated).

Now if we replace the delay_on_commit by call todelaywith the@transaction.atomic`, then the task will be queued in a nested transaction, and we will try to send welcome email (although that will probably crash, since the user wasn't actually persistend in the DB).

Ohh correctly...so if I need the task UUID just use mandatory the delay or apply_async call task method, correctly?

@browniebroke
Copy link
Contributor

so if I need the task UUID just use mandatory the delay or apply_async call task method, correctly?

Correct, yes

@FraCata00
Copy link
Contributor Author

FraCata00 commented Apr 24, 2024

we can develop the way to wait for the end of the transaction and return the task UUID?
or it's a bad idea?

@browniebroke
Copy link
Contributor

browniebroke commented Apr 24, 2024

That kind of defeat the whole purpose of the on_commit utility, which is a mechanism to delay execution and move on without waiting. Here is the source:

https://github.com/django/django/blob/ec8552417df51df8482df61b8ad78a7002634011/django/db/transaction.py#L129-L134

Which essentially calls this:

https://github.com/django/django/blob/ec8552417df51df8482df61b8ad78a7002634011/django/db/backends/base/base.py#L727-L750

The function is NOT executed at this point, it's basically added to a list of callbacks (line 732):

self.run_on_commit.append((set(self.savepoint_ids), func, robust))

This list is read and executed in run_and_clear_commit_hooks (defined below):

https://github.com/django/django/blob/ec8552417df51df8482df61b8ad78a7002634011/django/db/backends/base/base.py#L752C9-L769

That may happen in a completely different context that where you trigger the task.

@browniebroke
Copy link
Contributor

Let me know if anything is unclear, it would be good to incorporate explanations to lift any doubts you may have in the Celery docs...

@FraCata00
Copy link
Contributor Author

FraCata00 commented Apr 24, 2024

That kind of defeat the whole purpose of the on_commit utility, which is a mechanism to delay execution and move on without waiting. Here is the source:

https://github.com/django/django/blob/ec8552417df51df8482df61b8ad78a7002634011/django/db/transaction.py#L129-L134

Which essentially calls this:

https://github.com/django/django/blob/ec8552417df51df8482df61b8ad78a7002634011/django/db/backends/base/base.py#L727-L750

The function is NOT executed at this point, it's basically added to a list of callbacks (line 732):

self.run_on_commit.append((set(self.savepoint_ids), func, robust))

This list is read and executed in run_and_clear_commit_hooks (defined below):

https://github.com/django/django/blob/ec8552417df51df8482df61b8ad78a7002634011/django/db/backends/base/base.py#L752C9-L769

That may happen in a completely different context that where you trigger the task.

So... it's a way for not waiting the finish of the transaction, correctly?

In this while, iterate and call func to execute for every function to execute on_commit
https://github.com/django/django/blob/ec8552417df51df8482df61b8ad78a7002634011/django/db/backends/base/base.py#L756-L767

@browniebroke
Copy link
Contributor

Let me know if that helps: #8984

@Nusnus
Copy link
Member

Nusnus commented Apr 28, 2024

Let me know if that helps: #8984

LGTM - Check out my comments in the PR.

@FraCata00
Copy link
Contributor Author

Let me know if that helps: #8984

Sure!! 👍🏻

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants