Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[8.x] Delay pushing jobs to queue until database transactions are committed #35266

Closed
wants to merge 13 commits into from

Conversation

themsaid
Copy link
Member

@themsaid themsaid commented Nov 18, 2020

DB::transaction(function(){
    $user = User::create(...);

    SendWelcomeEmail::dispatch($user);
});

When running the code above, the SendWelcomeEmail job may get dispatched and picked by a worker before the transaction is committed. This will lead to errors since the user model won't exist when the job runs.

This PR is an attempt to capture queue dispatches, store them in a local cache, and only perform the dispatch when all transactions has been committed. Given the example above, the SendWelcomeEmail won't get dispatched to the queue until the transaction is committed.

To enable this behaviour, you need to set the after_commits configuration value to true in the connection settings inside the queue.php config file. You can also use the afterTransactions method when dispatching the job:

DB::transaction(function(){
    $user = User::create(...);

    SendWelcomeEmail::dispatch($user)->afterTransactions();
});

In the case of rollbacks, the jobs will still get pushed to the queue. But in that case the errors that will happen when the job runs will make sense. This PR's only purpose is to delay dispatching until the transactions are committed or rolledback.

You can add a dispatchAfterTransactions public property to mailables, notifications, listeners, and broadcastable events to achieve the same behaviour.

@mfn
Copy link
Contributor

mfn commented Nov 18, 2020

This touches the problem of "transactional events", very often neglected in frameworks => props!

However, this should be optional and may not be desired by default. There might be job (e.g. for logging something?) which are independent of whether the transaction will be rolled back or not.

There are other approaches like using a dedicated marker interface for opt-in.

Please see mfn#1 for an attempt though I lacked the time to move forward but technically it's a working PR. Your input would be highly appreciated.

See also:

@themsaid
Copy link
Member Author

However, this should be optional and may not be desired by default. There might be job (e.g. for logging something?) which are independent of whether the transaction will be rolled back or not.

Those jobs will still run even after a roll back. Unless an unhandled exception was thrown.

@paras-malhotra
Copy link
Contributor

paras-malhotra commented Nov 18, 2020

Interesting approach. Rails has been lingering on solving this issue for quite some time and I think they've decided to go with 3 options (no PR yet though, only a discussion):

  1. Raise an exception when jobs are enqueued inside DB transactions (default option)
  2. Defer: On commit, enqueue and on rollback, drop the job
  3. Forcefully enqueue

TBH the rollback job dropping seems wierd though and this PR seems to be a better approach imo. I do think that we should have an option to forcefully enqueue the job.

@themsaid
Copy link
Member Author

I think giving the people the option to opt in is a good idea. Could be a config value inside queue.php. Will wait for @taylorotwell feedback first.

@mfn
Copy link
Contributor

mfn commented Nov 18, 2020

Those jobs will still run even after a roll back. Unless an unhandled exception was thrown.

Ah, thank you => important piece I missed and it's unrelated to what I suggested => apologies.

@graemlourens
Copy link

graemlourens commented Nov 18, 2020

Just for clarification: Why is it even necessary in @themsaid 's initial example in this conversation to put the job dispatching in the transaction? Isn't it enough to put it AFTER the transaction...? In that case the job will only be dispatched if the transaction succeeded, in case of rollback that transaction would throw an exception and no job would be dispatched

Or probably i'm missing the obvious?

@paras-malhotra
Copy link
Contributor

@graemlourens the issue is that it's easy to forget that the job needs to be dispatched after the transaction. That's what this PR intends to solve.

@mfn
Copy link
Contributor

mfn commented Nov 18, 2020

What @paras-malhotra said, but also your business logic might be agnostic and not know, and should not care, whether a transaction surrounds it or not.

That's why basically years ago I started to use https://github.com/fntneves/laravel-transactional-events in my code. Only dispatch job after the transaction commits successfully and (in most cases) discard jobs if the transaction fails: you don't want to send an email invitation if something breaks later in the transaction cycle and reverts it.

@graemlourens
Copy link

Thank you both for the clarification. Makes sense and the added benefit of protecting developers is a large one that i absolutely did not initially see.

@taylorotwell
Copy link
Member

@paras-malhotra do you have a link to the related Rails discussion?

@paras-malhotra
Copy link
Contributor

@taylorotwell here they are: rails/rails#26045 and rails/rails#26103

@taylorotwell
Copy link
Member

So it sounds like we need a good summary of the types of behavior we want to support and probably the various things will need to be specified at the transaction level because you may want to different behavior in different parts of your code.

Regarding a default of throwing exceptions anytime a job is enqueued within a transaction. I'm undecided on this. This is a pretty big breaking change and I'm not sure I have the appetite to take on a breaking change like that right now.

I can definitely see the benefit of discarding jobs on transaction rollback or errors but again you would maybe want to be able to specify that the job should be queued anyways?

Probably want a way to force a job to queue immediately even within a transaction?

@mfn
Copy link
Contributor

mfn commented Nov 20, 2020

I can definitely see the benefit of discarding jobs on transaction rollback or errors but again you would maybe want to be able to specify that the job should be queued anyways?

Probably want a way to force a job to queue immediately even within a transaction?

Please take a moment to look at mfn#1 , the gist is:

  • regular jobs are dispatched as is => no change
  • if you want to have jobs to adhere to a possible transaction they are wrapped in, tag the job with an explicit interface (e.g. TransactionalEvent in above PR)
    • if no transaction is active, it behaves like a regular job dispatch

Note: I'm not pushing my PR, otherwise I would have submitted it already.

I see the value of making this "job dispatch should be transaction aware" without having a dedicate marker interface, aka:

  • dispatch(…) => as is
    and additionally have something like
  • dispatchTransaction()
    => if a transaction is active, don't dispatch on rollback and only dispatch on commit
    => if no transaction active, behave like dispatch()
    => complexity factory: needs to take nested transaction into account and unwind appropriately (my PR does this, but has some further package dependencies to achieve this, much like the original https://github.com/fntneves/laravel-transactional-events )

But I firmly believe the "transaction awareness" needs to a conscious decision by the developer.

Note: this also means theoretically dispatch() could be made transaction aware by default and something like dispatchAlways() could be introduced, i.e. I'm not saying it has to be "opt-in"; could also be "opt-out" because there's quite some value of having transaction aware jobs.

@paras-malhotra
Copy link
Contributor

I'd suggest:

  1. dispatch(..) to auto-dispatch after commit and drop on rollback.
  2. forceDispatch(...) to forcefully dispatch immediately regardless of transactions.
  3. alwaysDispatch(...) to auto-dispatch after commit or rollback.

Regardless of whether we throw an exception or not, this will always be a breaking change unless we have dispatch dispatch immediately (which beats the purpose of forgetting to dispatch outside of the transaction).

@alsma
Copy link

alsma commented Nov 20, 2020

Just want to share my pain related to this issue and transactional events:
We have a lot of business logic that can be invoked by user request as well as by some system workflows.
For instance user can confirm email by opening link from email in browser or by connecting social network profile
(see gist https://gist.github.com/alsma/4d71b7252832b7c93e2eafc656ba6669)
We have naming convention that event classes which name ends in ing supposed to be raised inside database transaction and event classes which name ends in ed supposed to be raised outside database transaction
So in case if we have multiple nested transactions(in my project I have up to 4-5 nesting levels) it's quite hard to carry those events to top level and raise them there. This exposes internal stuff of one modules to another and braking SRP.

Anyone have similar issues ? Any approach recommended ?

@alsma
Copy link

alsma commented Nov 20, 2020

One more thing to mention: common pitfail to raise event that broadcasted to client via websocket inside DB transaction.

  1. transaction could fail due to error
  2. transaction could be retried multiple times due to deadlocks

@themsaid themsaid marked this pull request as draft November 20, 2020 16:13
@johnylemon
Copy link
Contributor

agree with @alsma

I use events a lot. Queued listeners are doing their job and eventually dispatch another jobs or fire events

So what about that case?

DB::transaction(function(){
    
    // (some logic here)

    // this event should be launched regardless of the end result
    // and listener for that will do some calculations and dispatches job that will analyze something
    event(new SomeAttemptEvent); 
     
    // more logic here

    $someModel = Model::create([]); // maybe fires another (queued) event on `creating`/`created` event 
    
    // more logic here

    // should be fired only if everything was ok
    // maybe listener for that dispatches another job
    // this line won't be reached if something failed before
    // no need to care about that
    event(new SomethingFinishedSuccessfully); 
});

It will be a little bit confusing which of these events and jobs should be fired and which should be not.

@mfn
Copy link
Contributor

mfn commented Nov 21, 2020

@johnylemon events are always handled sync, aren't they?

So I guess in your case, the Listener would actually by a mediator which uses the database-transaction-aware dispatch version and dispatch the actual job.

OTOH https://github.com/fntneves/laravel-transactional-events does support "transaction events"

@johnylemon
Copy link
Contributor

@mfn nope, events are not handled synchronously everytime

@mfn
Copy link
Contributor

mfn commented Nov 24, 2020

IDK, using a config option feels very wrong to me; but maybe that's just me.

@alsma
Copy link

alsma commented Nov 24, 2020

Guys I did some small research and want to share my thoughts on this.
It seems that this PR and transactional events module fix only particular cases, more likely we need to be more generic and allow to schedule execution of any arbitrary code after transaction commit. In this case we will allow developer to rise events and dispatch jobs after transaction commit as well.

API can be something like following:

DB::transaction(function() {
    $user = User::register();
    // ... some logic goes here
    event(new UserRegistering($user)); // event listeners will be executed in scope of current DB transaction
   
   DB::onceCommitted(function () {
        event(new UserRegistered($user)); // event listeners will be executed after DB transaction commit
        dispatch(new FinishRegistration($user)); // job will be dispatched after DB transaction commit
   });
});

@themsaid
Copy link
Member Author

@alsma I have that in mind actually but want to test it with queued jobs first since it's the most common cause of problems. Once we have something stable, we can apply the same concepts to a generic function/interface that can be used with any code.

So I think it's better if we keep the conversation in this PR only around queued jobs.

@themsaid themsaid marked this pull request as ready for review November 24, 2020 15:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants