
🐛 Allow exit code for dependencies with yield to always execute, by removing capacity limiter for them, to e.g. allow closing DB connections without deadlocks #5122

Merged
merged 8 commits into master on Sep 4, 2022

Conversation

adriangb
Contributor

@adriangb adriangb commented Jul 11, 2022

@codecov

codecov bot commented Jul 11, 2022

Codecov Report

Merging #5122 (c7e3263) into master (e35df68) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##            master     #5122    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files          532       539     +7     
  Lines        13684     13906   +222     
==========================================
+ Hits         13684     13906   +222     
Impacted Files Coverage Δ
fastapi/__init__.py 100.00% <100.00%> (ø)
fastapi/concurrency.py 100.00% <100.00%> (ø)
tests/main.py 100.00% <0.00%> (ø)
fastapi/utils.py 100.00% <0.00%> (ø)
fastapi/routing.py 100.00% <0.00%> (ø)
fastapi/encoders.py 100.00% <0.00%> (ø)
tests/test_query.py 100.00% <0.00%> (ø)
fastapi/websockets.py 100.00% <0.00%> (ø)
fastapi/openapi/docs.py 100.00% <0.00%> (ø)
tests/test_ws_router.py 100.00% <0.00%> (ø)
... and 16 more


@github-actions
Contributor

📝 Docs preview for commit c507e76 at: https://62cb8e56fb08263a0b2a8795--fastapi.netlify.app

@Kludex
Sponsor Collaborator

Kludex commented Jul 11, 2022

If a new CapacityLimiter is created every time, then the token count can be 1... can't it?

The inf would make sense for a global limiter... isn't that the case?

@adriangb
Contributor Author

Yeah I guess it can be 1. Or we can make it a global. Either way.
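
To illustrate the idea being discussed, here's a rough sketch (illustrative names, not the exact diff in this PR): the sync exit code of the dependency's context manager runs in the threadpool with its own fresh single-token CapacityLimiter, so it no longer has to wait for the default threadpool tokens that other requests may be holding:

from contextlib import AbstractContextManager
from typing import Any

import anyio
from anyio import CapacityLimiter

async def run_exit_in_thread(cm: AbstractContextManager[Any]) -> None:
    # A fresh limiter per call: a single token is enough, since nothing else
    # shares this limiter. Creating it here, inside a running async context,
    # also avoids the issue that AnyIO needs to know the async backend when
    # the limiter is created (discussed later in this thread).
    exit_limiter = CapacityLimiter(1)
    await anyio.to_thread.run_sync(cm.__exit__, None, None, None, limiter=exit_limiter)

The alternative would be a single shared limiter with a large (or infinite) token count, which is what the rest of this thread weighs against the per-call version.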

@github-actions
Contributor

📝 Docs preview for commit 0892b02 at: https://62cbaec8c51dc34d83b10e22--fastapi.netlify.app

@github-actions
Contributor

📝 Docs preview for commit d822171 at: https://62ccd4afa2dc031bf1f8e7af--fastapi.netlify.app

@github-actions
Contributor

📝 Docs preview for commit 3557911 at: https://62cf1d1ed35ee72b45e8b61f--fastapi.netlify.app

@github-actions
Contributor

📝 Docs preview for commit c53e0f6 at: https://62cf1e6ca2b9a32bbabee427--fastapi.netlify.app

@github-actions
Contributor

📝 Docs preview for commit 3b73873 at: https://62d43b8f94f05f7127e81a5e--fastapi.netlify.app

@tiangolo
Owner

tiangolo commented Sep 4, 2022

Niceee! 🤓 🚀

I tested this with the example code in: #3205

I used wrk and ran it a few times in each case:

$ wrk -c 100 -t 100 --timeout 60 "http://127.0.0.1:8000/"

Without the fix

Without the fix, I first got:

Running 10s test @ http://127.0.0.1:8000/
  100 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.40s     2.04s    8.05s    63.64%
    Req/Sec     0.00      0.00     0.00    100.00%
  22 requests in 10.09s, 3.05KB read
Requests/sec:      2.18
Transfer/sec:     309.54B

And then the second run I got:

Running 10s test @ http://127.0.0.1:8000/
  100 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     0.00us    0.00us   0.00us    -nan%
    Req/Sec     0.00      0.00     0.00      -nan%
  0 requests in 10.08s, 0.00B read
Requests/sec:      0.00
Transfer/sec:       0.00B

...this seems to reflect the deadlock. 🎉 (I'm celebrating being able to see something broken, it's sometimes hard to see the broken part 😅 )

After that, I Ctrl+C'd the Uvicorn process, and it hung there, at:

^C
INFO:     Shutting down
INFO:     Waiting for background tasks to complete. (CTRL+C to force quit)

After 7 minutes I force-killed the process. So, indeed, deadlock.

With the fix, outside

With the fix, putting a single global CapacityLimiter outside of the function.

First I got:

Running 10s test @ http://127.0.0.1:8000/
  100 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.97s     3.07s   10.02s    67.50%
    Req/Sec     0.00      0.00     0.00    100.00%
  40 requests in 10.09s, 5.55KB read
Requests/sec:      3.96
Transfer/sec:     562.78B

For the second run I got:

Running 10s test @ http://127.0.0.1:8000/
  100 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     6.31s     2.45s    9.64s    73.08%
    Req/Sec     0.00      0.00     0.00    100.00%
  26 requests in 10.09s, 3.61KB read
Requests/sec:      2.58
Transfer/sec:     365.96B

Third run:

Running 10s test @ http://127.0.0.1:8000/
  100 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     6.90s     2.76s    9.58s    83.33%
    Req/Sec     0.00      0.00     0.00    100.00%
  12 requests in 10.08s, 1.66KB read
Requests/sec:      1.19
Transfer/sec:     169.06B

Not sure why it keeps reducing the number of requests. It seems something might still be wrong, but I'm not sure what. 😔

After that, and after I wrote this, I Ctrl+C'd the Uvicorn process and it finished immediately. So it seems it handled the deadlock.

I ran the whole thing again and Ctrl+C'd the Uvicorn process immediately after the 3 runs, and then it waited for a bit at:

^C
INFO:     Shutting down
INFO:     Waiting for background tasks to complete. (CTRL+C to force quit)

But after 50 seconds it terminated normally.

With the fix, local

Keeping the local CapacityLimiter created every time, as in this PR.

The first run I got:

Running 10s test @ http://127.0.0.1:8000/
  100 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.71s     2.90s   10.06s    65.00%
    Req/Sec     0.00      0.00     0.00    100.00%
  40 requests in 10.08s, 5.55KB read
Requests/sec:      3.97
Transfer/sec:     563.34B

The second run I got:

Running 10s test @ http://127.0.0.1:8000/
  100 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.26s     2.50s    8.48s    80.00%
    Req/Sec     0.00      0.00     0.00    100.00%
  5 requests in 10.09s, 710.00B read
Requests/sec:      0.50
Transfer/sec:      70.37B

And the third run I got:

Running 10s test @ http://127.0.0.1:8000/
  100 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.52s     2.99s    9.27s    50.00%
    Req/Sec     0.00      0.00     0.00    100.00%
  4 requests in 10.10s, 568.00B read
Requests/sec:      0.40
Transfer/sec:      56.25B

The results with the local CapacityLimiter are similar to the ones with the global one. The 5 or 4 requests are comparable to some of the other tests I ran with the global CapacityLimiter. It's in the same ballpark.

Terminating the process also worked as expected instead of hanging there in a deadlock.

For some reason, it still handles fewer requests on each run. Maybe it's just that it tries to read the file every time and there's a bottleneck in that, just in the file system. I'm not sure about that.


Checking the AnyIO docs I see that:

Capacity limiters are like semaphores except that a single borrower (the current task by default) can only hold a single token at a time.

Ref: https://anyio.readthedocs.io/en/stable/synchronization.html

So I'm not sure if it could be problematic if the same thread worker tried to acquire the limiter again, and if that could block against the other thing running on the same thread that already holds the limiter token.

Until we figure this out, I'll leave the local CapacityLimiter (as in this PR) to be safe (although in my local tests it didn't seem to make a difference, I'm not sure why).

But anyway, it seems that this clearly improves the situation.

I wish there were a way to test this in pytest. But I'm not sure how to replicate the whole problem and situation in a feasible way (not involving a whole setup for "benchmarks" with ab, Locust, or wrk).

For now, I'll merge this, but if anyone has any idea about how this could be tested, to have checks in CI making sure the problem doesn't happen again, it would be awesome.

Thanks @adriangb for all the investigation and work! 🙇 🤓 🍰

@tiangolo tiangolo changed the title remove thread capacity limit for running sync context manager __exit__ 🐛 Allow exit code for dependencies with yield to always execute, by removing capacity limiter for them, to e.g. allow closing DB connections without deadlocks Sep 4, 2022
@adriangb
Contributor Author

adriangb commented Sep 4, 2022

I wish there were a way to test this in pytest. But I'm not sure how to replicate the whole problem and situation in a feasible way (not involving a whole setup for "benchmarks" with ab, Locust, or wrk)

It might be reproducible by talking pure-ASGI to the FastAPI app and launching multiple tasks using a TaskGroup? Might be hard to get the timing right though.

You could try setting the capacity limit to float("inf") instead of 1.

I think the slowdowns might come from acquiring database connections. There's likely some semaphore there as well.
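
For reference on that last point, SQLAlchemy's default QueuePool already behaves like a semaphore. A rough sketch of the knobs involved (the values shown are the library defaults; the URL is just a placeholder):

from sqlalchemy import create_engine

# QueuePool caps concurrent connections: requests beyond
# pool_size + max_overflow wait up to pool_timeout seconds for a connection.
engine = create_engine(
    "postgresql://user:password@localhost/db",  # placeholder URL
    pool_size=5,       # default number of persistent connections
    max_overflow=10,   # default number of extra burst connections
    pool_timeout=30,   # default seconds to wait before raising a timeout error
)

So under heavy concurrency the benchmark could be queuing on the connection pool even when the threadpool itself is fine.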

@tiangolo
Owner

tiangolo commented Sep 4, 2022

Hehe I was writing this and you commented before me 😅


Ah, wait, I didn't read @Kludex's comment properly. I didn't realize it could be an inf float. That probably makes sense, and we could avoid creating a new object every time.


I just did that and updated the branch. 🚀


And yeah, maybe multiple ASGI calls in a task group could work. But I'm not sure how to make it fail instead of hang; if it just hangs, it's going to be more difficult to test. 🤔

@tiangolo
Owner

tiangolo commented Sep 4, 2022

Hehehe, the tests have spoken... it seems we need to use the local variable after all. Otherwise AnyIO gets confused about what the async backend is. 😂

…nc context to create the object

This reverts commit 09dc61a.
@github-actions
Contributor

github-actions bot commented Sep 4, 2022

📝 Docs preview for commit 09dc61a at: https://6314efcd67c58a208eaa7d3e--fastapi.netlify.app

@Kludex
Sponsor Collaborator

Kludex commented Sep 4, 2022

wait

@adriangb
Contributor Author

adriangb commented Sep 4, 2022

But I'm not sure how to make it fail instead of hang; if it just hangs, it's going to be more difficult to test

I think anyio.fail_after() should do the trick, right? But if we were to put this in a test, it might be good to figure out how to reproduce it without SQLAlchemy.
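
A rough sketch of what such a test could look like, combining the pure-ASGI + TaskGroup idea with anyio.fail_after so a deadlock turns into a test failure instead of a hang (everything here is hypothetical: the app, the dependency, and the use of httpx and the AnyIO pytest plugin are assumptions, not part of this PR):

import time

import anyio
import httpx
import pytest
from fastapi import Depends, FastAPI

app = FastAPI()

def blocking_dependency():
    time.sleep(0.05)  # keep a threadpool worker busy on the way in
    yield "value"
    time.sleep(0.05)  # the exit code also needs a worker; this is where the old code could deadlock

@app.get("/")
def read_root(value: str = Depends(blocking_dependency)):
    return {"value": value}

@pytest.mark.anyio
async def test_exit_code_does_not_deadlock():
    transport = httpx.ASGITransport(app=app)
    async with httpx.AsyncClient(transport=transport, base_url="http://test") as client:
        with anyio.fail_after(30):  # a deadlock becomes a TimeoutError instead of hanging forever
            async with anyio.create_task_group() as tg:
                for _ in range(100):  # more concurrent requests than the 40 default threadpool tokens
                    tg.start_soon(client.get, "/")

Getting the timing right so it reliably hits the deadlock window would probably still be the hard part.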

@github-actions
Contributor

github-actions bot commented Sep 4, 2022

📝 Docs preview for commit c7e3263 at: https://6314f0dd00ed39272fd435c8--fastapi.netlify.app

@Kludex
Sponsor Collaborator

Kludex commented Sep 4, 2022

Hehehe, the tests have spoken... it seems we need to use the local variable after all. Otherwise AnyIO gets confused about what the async backend is. 😂

https://github.com/agronholm/anyio/blob/07b49f76597f353185c825e2acd1b1b9fe274727/src/anyio/_backends/_asyncio.py#L2068

https://anyio.readthedocs.io/en/stable/api.html?highlight=runvar#anyio.lowlevel.RunVar
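
For reference, the pattern those links point at is roughly this: a RunVar holds one limiter per event loop, created lazily the first time it's needed, so AnyIO already knows which backend is running (a sketch of the pattern, not something in this PR; the names are illustrative):

from anyio import CapacityLimiter
from anyio.lowlevel import RunVar

_exit_limiter: RunVar[CapacityLimiter] = RunVar("_exit_limiter")

def get_exit_limiter() -> CapacityLimiter:
    # Must be called from within a running event loop; each loop gets its own
    # limiter, mirroring how AnyIO manages its default thread limiter.
    try:
        return _exit_limiter.get()
    except LookupError:
        limiter = CapacityLimiter(1)
        _exit_limiter.set(limiter)
        return limiter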

@tiangolo
Owner

tiangolo commented Sep 4, 2022

@adriangb

I think anyio.fail_after() should do the trick, right? But if we were to put this in a test, it might be good to figure out how to reproduce it without SQLAlchemy.

Totally agree with that! Not sure how, but that would be great indeed.

@Kludex

Thanks for the links and pointers! (also on Discord)

Copying here for completeness, the issue about CapacityLimiters in Starlette: encode/starlette#1724

The snippet to update the token count of the default CapacityLimiter is in encode/starlette#1724 (comment), and a version for FastAPI is in: https://gist.github.com/Kludex/f889f02bb81adf3a8d63176ad4d8bcb1
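
For completeness, I believe the kind of snippet those links describe boils down to raising the token count of the default threadpool limiter at startup, something like this (a sketch; 100 is an arbitrary value and the exact snippet in the gist may differ):

import anyio
from fastapi import FastAPI

app = FastAPI()

@app.on_event("startup")
async def adjust_default_thread_limiter() -> None:
    # The default AnyIO threadpool limiter has 40 tokens; raise it so more
    # sync dependencies and endpoints can run in threads at the same time.
    anyio.to_thread.current_default_thread_limiter().total_tokens = 100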


In conclusion, for now I'll merge this with the local capacity limiter. Maybe at some point Starlette or FastAPI can have a way to customize different capacity limiters for different things (not sure, I haven't thought much about it) and then this could be set there.

And about the option of using a RunVar (a sort of pseudo-contextvar) for a per-event-loop capacity limiter: for now I'll leave the local variable, as it's simpler and we can be more certain the implementation works well. The RunVar has some cost involved in dealing with these pseudo-contextvars, which might offset the cost of creating the local object in the function, and the implementation would certainly be more complex. So I'd rather keep this one, to be a bit more certain, and avoid the risk of introducing other complex errors (contextvars are my nemesis 😂).

Thanks again you both for all the help! 🙇 🍰

@nickswiss

This has been closed, but I am not really sure about the final solution for running synchronous SQLAlchemy with FastAPI. In my case, I am using FastAPI with SQLAlchemy and Oracle, which does not have a trusted async library.
If it is possible to use SQLAlchemy synchronously, what is the correct approach?

@tiangolo
Owner

@nickswiss I'm not sure I understand the question, or whether there's something you're not able to do. If you still have questions or problems, it's probably better to ask in a new GitHub issue; the template will guide you to structure the question in a way that will make it easier to help you, with an example I can run, etc.
