Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async execution of transactions results in deadlock #327

Closed
goteguru opened this issue Apr 27, 2021 · 0 comments · Fixed by #328
Closed

Async execution of transactions results in deadlock #327

goteguru opened this issue Apr 27, 2021 · 0 comments · Fixed by #328

Comments

@goteguru
Copy link
Contributor

goteguru commented Apr 27, 2021

Concurrent execution of transactions deadlocks.

Minimal example:

async def test_task(db): 
     async with db.transaction():
            await db.fetch_one("SELECT 1")

async with Database(database_url) as database:
     await database.fetch_one("SELECT 1")
     tasks = [test_task(database) for i in range(4)]
     await asyncio.gather(*tasks)

This is especially troublesome, because if we remove the 6th line (or move it to the end of the function) everything works ok. Very confusing for the user.

Please notice it's a very common use-case if somebody is using an async library. The user might want to send multiple mails in parallel (and record the result) or pull remote resources asynchronously and do something in the database.

Detailed analysis:

Connections are stateful in most (probably all) database engines. Databases try to hide the connection by use of contexvars, which is created per Task. The Database object creates the Connection on the fly if not exits. Unfortunately if we do anything before starting a transaction, the connection will be bound and all subsequent transactions will be executed using that connection compromising the state if executed concurrently.

This is not a problem if the connections are hierarchical. The library has a transaction buffer to trace this. Unfortunately this buffer causes trouble if the transactions are executed parallel in different Tasks. If task "A" started first, then task "B", but task "A" finishes before task "B" (which is possible) the transaction boundaries will overlap and the module deadlocks with exception (sqlite) or unhandled (postgres).

Possible workaround is to wrap everything in Tasks like:

...
async with Database(database_url) as database:
     await asyncio.create_task(database.fetch_one("SELECT 1"))
     tasks = [test_task(database) for i in range(4)]
     await asyncio.gather(*tasks)

However it is slightly counter-intuitive and even useless if we already executed anything before several levels higher in the callstack (ie. having a connection).

One intuitive solution could be copying the context for every new root transactions (ie. creating new connection for every root transaction). See the proposed fix in the pull request #328 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant