Add startup/shutdown dependencies and dependency caching lifespan control #3516

Closed
wants to merge 37 commits into from

adriangb
Contributor

@adriangb adriangb commented Jul 11, 2021

Issues this addresses

To address these issues, this PR provides for more advanced dependency injection concepts.
In particular:

  • Gives users the ability to set a cache scope so that values can be cached for the app's lifetime instead of just a single request.
  • Gives users control over when teardown runs for dependencies with yield: during app shutdown, after the response is sent (the current default), or right before the response is sent (FEATURE - yield dependencies exit code before response #2697).

Finally, this PR gives lifespan events (startup/shutdown) access to dependency injection.

These changes are presented together for cohesiveness (they aren't very useful individually), but they can easily be split into multiple smaller PRs for easier review.

Here's a small motivating example:

from asyncpg import create_pool, Connection, Pool
from fastapi import Depends, FastAPI
from pydantic import BaseSettings


class Config(BaseSettings):
    dsn: str


def get_config() -> Config:
    return Config()


async def get_connection_pool(cfg: Config = Depends(get_config)) -> Pool:
    async with create_pool(dsn=cfg.dsn) as pool:
        yield pool


async def get_connection(pool: Pool = Depends(get_connection_pool, lifetime="app", cache="app")) -> Connection:
    async with pool.acquire() as conn:
        yield conn


async def check_db_connection(conn: Connection = Depends(get_connection)) -> None:
    # fetchval returns the first column of the first row (execute returns a status string)
    assert (await conn.fetchval("SELECT 1;")) == 1


app = FastAPI(on_startup=[check_db_connection])


@app.get("/")
async def root(conn: Connection = Depends(get_connection)) -> int:
    return await conn.fetchval("SELECT 2;")  # or some other real query

@adriangb
Contributor Author

@Faylixe @sm-Fifteen, based on your input in the linked issues, I'd appreciate if you could glance over this and give any feedback. Thanks!

@adriangb
Contributor Author

adriangb commented Jul 11, 2021

This even supports sync and async yield dependencies, so that you can easily do startup/shutdown handling

Implementing this requires:

  1. Creating an AsyncContextStack that the app lifetime scoped dependencies can use.
  2. Binding that AsyncContextStack to app startup/shutdown.

(1) is pretty straightforward.
(2) is a bit trickier, mainly because Starlette now supports:

  • app.on_event("startup")(...)
  • Starlette(on_startup=[....])
  • Starlette(lifespan=...)

And if you set any on_startup events, using Starlette(lifespan=...) will raise an error.
So you have to choose one or the other.
FastAPI currently does not support Starlette(lifespan=...), so I guess we can use on_startup/on_shutdown, but if FastAPI ever does want to support Starlette(lifespan=...), that would require modifying this implementation.
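
Steps (1) and (2) above can be sketched without any framework. This is a minimal illustration, not the PR's implementation: `AppScope` and `fake_pool` are hypothetical names, and the two methods stand in for handlers that would be registered through on_startup/on_shutdown.

```python
import asyncio
import contextlib
from typing import AsyncIterator, List, Optional

log: List[str] = []


@contextlib.asynccontextmanager
async def fake_pool() -> AsyncIterator[str]:
    # Stand-in for an app-scoped resource such as a connection pool.
    log.append("pool opened")
    try:
        yield "pool"
    finally:
        log.append("pool closed")


class AppScope:
    """Hypothetical holder for the app-lifetime exit stack."""

    def __init__(self) -> None:
        self.stack: Optional[contextlib.AsyncExitStack] = None

    async def startup(self) -> None:
        # Would be registered as an on_startup handler: open the stack.
        self.stack = contextlib.AsyncExitStack()
        await self.stack.__aenter__()

    async def shutdown(self) -> None:
        # Would be registered as an on_shutdown handler: unwind everything
        # that was entered on the stack, in reverse order.
        if self.stack is not None:
            await self.stack.aclose()
            self.stack = None


async def main() -> None:
    scope = AppScope()
    await scope.startup()
    assert scope.stack is not None
    # An app-lifetime dependency is entered once and torn down at shutdown.
    pool = await scope.stack.enter_async_context(fake_pool())
    log.append(f"using {pool}")
    await scope.shutdown()


asyncio.run(main())
```

Because the stack is opened in one handler and closed in another, this shape only works if nothing else needs to own the lifespan, which is the trade-off described above.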

@adriangb adriangb changed the title Add support for app lifetime dependencies Add support for app lifespan dependencies Jul 11, 2021
@sm-Fifteen
Contributor

sm-Fifteen commented Jul 12, 2021

I'm not personally sold on having the dependency's lifespan be specified on the caller's end instead of the dependency's end. On one hand, it means any dependency marked as a lifetime dep anywhere causes all of its transitive dependencies to be run and cached as lifetime deps at the same time, which may not be what the user intended. It's also unclear to me what happens if the same function ends up called both as a request-scoped and a lifetime-scoped dependency at different places throughout the application. Would the request-scoped dep short-circuit to using the already available lifetime dep or would it create a separate instance of it? What if the lifetime dep hasn't been initialized yet?

The main difference between regular startup events and this system is that this system is lazy: nothing is run until the first request that needs it comes in.

Or more generally, this is not DI for startup/shutdown events (which is what the above linked issues were asking for) but rather startup/shutdown events for the DI system. Maybe it's not what we wanted, but perhaps it's just as good?

More generally, I'm not sure of the benefit of lazily-loaded shared resources compared to having dependencies context-managed by the lifetime protocol (the main reason I can think of being parametrized dependencies, which weren't the sort of thing I was considering using for lifetime deps in the first place).

In most real world scenarios, the first request will come from a load balancer, Kubernetes, etc to a healthcheck endpoint, and the app won't even take outside traffic until it consistently is returning 200.
If one wanted to, they could make a Startup dependency that collects all lifetime dependencies and binds them to the healthcheck to ensure that they get initialized before any user requests:

I can't speak for most FastAPI users, but the sort of setup I'm using (just a plain reverse proxy for a small API server connected to a bunch of heterogeneous databases) wouldn't fit under that scenario.

Really, my main use case for lifetime deps when I opened #617 was to not have active object handles littering the global namespace outside of functions (see #726 (comment), and imagine something like that, but minus the underscored global variables). I figured those could just be wrapped in CM-like functions and annotated for FastAPI to know that those functions, when used as dependencies, are not supposed to be re-run, but to have the eagerly singleton'd result (hopefully a thread-safe result, I don't think that can reliably be checked by FastAPI, so it's up to the user to make sure) returned instead.

EDIT: One of my 617 ideas that I eventually left out actually had a context manager yielding in a loop instead of having FastAPI handle the caching, and would keep looping until forced to shut down by a signal, but I had a gut feeling that this would require quite a bit of Python black magic to work with the signal coming from above when StopIteration is an exception that's supposed to bubble up from below, and I had a feeling that a loop might not work well when multiple threads/coroutines need concurrent access to that resource.

It also generally makes sense to me that your server should fail to startup if the database connection your application requires to run is unreachable, or if your core config file can't be found, which is why I would have imagined these as not being lazy, but we might just have different use cases for these.

@adriangb
Contributor Author

adriangb commented Jul 12, 2021

Firstly, thank you for the extensive and detailed feedback!

we might just have different use cases for these

This is very possible; I may have been shortsighted with my use cases, but I'll try to answer some of the good questions you bring up.

It's also unclear to me what happens if the same function ends up called both as a request-scoped and a lifetime-scoped dependency at different places throughout the application

Would the request-scoped dep short-circuit to using the already available lifetime dep or would it create a separate instance of it?

I think this is an important question. To me it boils down to "is an app lifespan dep == to the same dep but with a request lifespan?". Let me know if this interpretation is incorrect.
We can make it work either way, I think we'd just have to add a tuple element to the cache key to make them unequal.
In other words, we can have either of these two work:

class Config:
    ...


def get_config() -> Config:
    return Config()


@app.get("/")
def root(cfg1 = Depends(get_config, lifetime="app"), cfg2 = Depends(get_config)):
    # note that this would only apply to the first call
    # subsequent calls would depend on the resolution of the below discussion regarding cache keys
    assert cfg1 is cfg2  # option 1, current implementation
    assert cfg1 is not cfg2  # option 2, adding the `lifetime` parameter as a cache key element

If we add the lifetime value to the cache key (here), we should get option 2. Otherwise we get option 1.
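
To make the two options concrete, here is a toy cache that is not FastAPI's actual cache: `resolve` and the `key_includes_lifetime` flag are hypothetical, and the tuple key is a simplified stand-in for the real cache key.

```python
from typing import Any, Callable, Dict, Tuple


class Config:
    ...


def get_config() -> Config:
    return Config()


def resolve(
    cache: Dict[Tuple[Any, ...], Any],
    call: Callable[[], Any],
    lifetime: str,
    key_includes_lifetime: bool,
) -> Any:
    # Option 2 adds the lifetime to the key, so "app" and "request" uses of
    # the same callable no longer collide in the cache.
    key = (call, lifetime) if key_includes_lifetime else (call,)
    if key not in cache:
        cache[key] = call()
    return cache[key]


# Option 1: lifetime is not part of the key, both lookups share one value.
cache1: Dict[Tuple[Any, ...], Any] = {}
option1_shared = (
    resolve(cache1, get_config, "app", False)
    is resolve(cache1, get_config, "request", False)
)

# Option 2: lifetime is part of the key, each lifetime gets its own value.
cache2: Dict[Tuple[Any, ...], Any] = {}
option2_shared = (
    resolve(cache2, get_config, "app", True)
    is resolve(cache2, get_config, "request", True)
)
```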

On one hand, it means any dependency marked as a lifetime dep anywhere causes all of its transitive dependencies to be run and cached as lifetime deps at the same time, which may not be what the user intended

The intention is that only the top level (let's call it root) dependency gets marked as a lifetime dependency and cached as such. If a transitive dependency (get_config in this example) is used by both a lifetime dependency and a non-lifetime dependency it will only be "shared" the first call:

class Config:
    ...


def get_config() -> Config:
    return Config()


def lifetime_dep(cfg: Config = Depends(get_config)):
    return cfg

def request_dep(cfg: Config = Depends(get_config)):
    return cfg

calls = 0

@app.get("/")
def root(cfg1 = Depends(lifetime_dep, lifetime="app"), cfg2 = Depends(request_dep)):
    global calls
    calls += 1
    if calls == 1:
        assert cfg1 is cfg2
    else:
        assert cfg1 is not cfg2

In this case, get_config is used as a request lifetime sub dependency from both lifetime_dep and request_dep.
So it should definitely be the same for both.
Thus in the first call, cfg1 is the same as cfg2.
But in subsequent calls, cfg1 is always the same object, since it is coming from lifetime_dep which is cached.
cfg2 is regenerated on every call.
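
The behavior described above can be reproduced with a toy resolver. This is only a sketch under simplifying assumptions: dependencies receive a hypothetical `solve` callable instead of using `Depends`, and the app cache is a plain module-level dict.

```python
from typing import Any, Callable, Dict, Tuple


class Config:
    ...


# Toy dependencies: each takes the solver so it can request sub-dependencies.
def get_config(solve: Callable[..., Any]) -> Config:
    return Config()


def lifetime_dep(solve: Callable[..., Any]) -> Config:
    return solve(get_config)


def request_dep(solve: Callable[..., Any]) -> Config:
    return solve(get_config)


app_cache: Dict[Callable[..., Any], Any] = {}


def handle_request() -> Tuple[Config, Config]:
    request_cache: Dict[Callable[..., Any], Any] = {}  # fresh for every request

    def solve(call: Callable[..., Any], lifetime: str = "request") -> Any:
        if call in app_cache:
            return app_cache[call]
        if call not in request_cache:
            request_cache[call] = call(solve)
        value = request_cache[call]
        if lifetime == "app":
            # Only the root dependency marked "app" is stored app-wide;
            # its sub-dependencies stay request-scoped.
            app_cache[call] = value
        return value

    cfg1 = solve(lifetime_dep, lifetime="app")
    cfg2 = solve(request_dep)
    return cfg1, cfg2


first = handle_request()
second = handle_request()
```

On the first request both values come from the same request-scoped `get_config` call; on the second, `cfg1` is served from the app cache while `get_config` runs again for `cfg2`.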

I'm not sure of the benefit of lazily-loaded shared resources compared to having dependencies context-managed by the lifetime protocol

I agree, it would be nice to have the opposite (DI for the startup/shutdown system) as well. Like you say, it often makes sense to not even start the application unless the database, etc. are able to connect. But maybe there is a use case for both?

just a plain reverse proxy for a small API server

As far as I know, even simple reverse proxies like NGINX have healthcheck functionality.

In your use case, how do you know a deploy was successful if you don't have some sort of healthcheck / readiness check?

I'm not personally sold on having the dependency's lifespan be specified on the caller's end instead of the dependency's end

This is not necessarily a solution, but maybe something like the following can be used to clean up things:

# in db.py or something
def get_db_connection(cfg: Config = Depends(get_config)) -> Connection:
    # do some stuff with cfg to get a database connection
    ...

def DBConnection() -> Connection:
    return Depends(get_db_connection, lifetime="app")

# in main.py
@app.get("/")
def root(conn: Connection = DBConnection()):
    ...

Now the caller doesn't have to specify Depends(..., lifetime="app").
This does make overrides a bit uglier though because you have to know to override get_db_connection and not DBConnection.
Perhaps there are cleaner alternatives involving a hashable class implementing __call__.
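
One possible shape for that idea, as a rough sketch only: `AppScoped`, `overrides` and `resolve` are hypothetical names, and the plain dict merely imitates an override table keyed by the dependency callable.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class AppScoped:
    """Hypothetical wrapper that records the lifetime at the definition site."""

    call: Callable[..., Any]
    lifetime: str = "app"

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return self.call(*args, **kwargs)


def get_db_connection() -> str:
    return "real connection"


# Frozen dataclasses are hashable and compare by field values, so the wrapper
# itself can be used as the key when registering an override.
DBConnection = AppScoped(get_db_connection)

overrides: Dict[Callable[..., Any], Callable[..., Any]] = {}
overrides[DBConnection] = lambda: "fake connection"


def resolve(dep: Callable[..., Any]) -> Any:
    # Use the override when one is registered, otherwise call the dependency.
    return overrides.get(dep, dep)()
```

With this shape the user overrides `DBConnection` directly, and an equal wrapper built elsewhere around the same function resolves to the same override.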

@Faylixe

Faylixe commented Jul 13, 2021

I'm not personally sold on having the dependency's lifespan be specified on the caller's end instead of the dependency's end.

+1 on this, not a big fan as well.

Now the caller doesn't have to specify Depends(..., lifetime="app").

In that case, if dependency behavior should be defined by the user I would prefer having a factory function handling this:

def AppDepends(*args, **kwargs) -> Any:
    return Depends(*args, **kwargs, lifespan="app")

This can then be used like a traditional one:

@app.get("/")
def root(cfg1 = AppDepends(get_config), cfg2 = Depends(get_config)):
    ...

Otherwise, thanks a lot for working on this. It's not an easy one, and I am really looking forward to this feature <3

@adriangb
Contributor Author

adriangb commented Jul 13, 2021

Yeah, I totally understand on:

  • Wanting this to be tied to startup/shutdown
  • Wanting to specify the lifetime on the dependency side

The main reason I'm not doing it like that is that:

  • There is no DI integration into startup/shutdown
  • The DI system currently relies on specifying things on the caller's side (eg. use_cache)

And so doing it this way allowed me to do it in relatively few lines of code, with no major refactors, while still satisfying all of my needs.
I think trying to hit either of those other two features would require more extensive refactors (although maybe I'm just not familiar enough with the codebase).

I do think there's some use for both, I can see scenarios where it is desirable to lazily initialize lifetime dependencies.

I would prefer having a factory function handling this

That does seem like a cleaner short term solution. We could also do this in the FastAPI codebase, something similar is already being done for param_functions.Security. To go one step further, we could not even have the lifetime parameter in param_functions.Depends and instead set it up like this:

def Depends(  # unchanged
    dependency: Optional[Callable[..., Any]] = None, *, use_cache: bool = True
) -> Any:
    return params.Depends(dependency=dependency, use_cache=use_cache)

def AppDepends(
    dependency: Optional[Callable[..., Any]] = None, *, use_cache: bool = True
) -> Any:
    return params.Depends(dependency=dependency, use_cache=use_cache, lifetime="app")

@graingert
Contributor

graingert commented Jul 19, 2021

hello! I'd highly recommend using the Starlette(lifespan=...) interface for this so people can create anyio.TaskGroups in their lifespan dependencies.

To support this, lifespan was changed in 0.16.0 to only support async context manager factories, e.g. functions decorated with @contextlib.asynccontextmanager (or the contextlib2 equivalent).

Another consequence is that the yield statement can now raise an instance of anyio.get_cancelled_exc_class(), e.g.:

@contextlib.asynccontextmanager
async def TaskGroup() -> AsyncGenerator[anyio.abc.TaskGroup, None]:
    async with anyio.create_task_group() as tg:
        yield tg  #  <- this might throw asyncio.CancelledError or trio.CancelledError!
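
The practical consequence is that teardown code after the yield needs try/finally. The following sketch uses plain asyncio rather than anyio, so it only illustrates the cancellation-at-yield behavior, not task groups; `resource` and `worker` are made-up names.

```python
import asyncio
import contextlib
from typing import AsyncIterator, List

events: List[str] = []


@contextlib.asynccontextmanager
async def resource() -> AsyncIterator[str]:
    events.append("setup")
    try:
        yield "handle"
    except asyncio.CancelledError:
        # The cancellation surfaces here, at the yield statement.
        events.append("cancelled at yield")
        raise
    finally:
        # Runs whether the body finished normally or was cancelled.
        events.append("teardown")


async def worker() -> None:
    async with resource():
        await asyncio.sleep(3600)


async def main() -> None:
    task = asyncio.create_task(worker())
    await asyncio.sleep(0)  # let the worker enter the context manager
    task.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await task


asyncio.run(main())
```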

@adriangb
Contributor Author

adriangb commented Jul 19, 2021

100% agreed, thank you for the suggestion. Unfortunately there are a couple blockers to this:

The first is that FastAPI does not support that yet (in particular, FastAPI's router class doesn't have a public lifespan parameter). And so adding those keyword arguments to Router, etc. would have been out of scope for this PR. It may be possible to skirt around adding any public parameters, I'd have to do some testing.

The second, bigger problem is that Starlette can't have both Starlette(on_startup=[...]) and Starlette(lifespan=...) at the same time. If we add a lifespan argument we would disable users' existing startup/shutdown hooks, which would break a lot of code.

@sm-Fifteen
Contributor

The second, bigger problem is that Starlette can't have both Starlette(on_startup=[...]) and Starlette(lifespan=...) at the same time. If we add a lifespan argument we would disable users' existing startup/shutdown hooks, which would break a lot of code.

The startup and shutdown hooks are now merely shimmed by the default Starlette lifespan context manager (see encode/starlette#799). There's nothing stopping FastAPI from adding a custom lifespan handler that takes care of lifespan dependencies and also shims the old startup and shutdown events.

@adriangb
Contributor Author

@graingert I pushed a version using Starlette(lifespan=...) in
629e465
@sm-Fifteen you're right, thanks for the recommendation. I ended up with this shim:

fastapi/fastapi/routing.py

Lines 454 to 479 in 629e465

self.lifespan_dependencies = {}

@contextlib.asynccontextmanager
async def dep_stack_cm() -> AsyncGenerator:
    if AsyncExitStack:
        async with AsyncExitStack() as self.lifespan_astack:
            yield
    else:
        self.lifespan_astack = None
        yield
    self.lifespan_dependencies = {}

async def lifespan_context(app: Any) -> AsyncGenerator:
    async with dep_stack_cm():
        await self.startup()
        yield
        await self.shutdown()

super().__init__(
    routes=routes,  # type: ignore # in Starlette
    redirect_slashes=redirect_slashes,
    default=default,  # type: ignore # in Starlette
    on_startup=on_startup,  # type: ignore # in Starlette
    on_shutdown=on_shutdown,  # type: ignore # in Starlette
    lifespan=lifespan_context,
)
I think the only tricky bit left is that whenever FastAPI enables FastAPI(lifespan=...) we'll have to handle adding the user's lifespan into this glue lifespan (I suppose checking if it is a context manager or generator, etc.).
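
A rough sketch of what that glue could look like, assuming the user's lifespan is already an async context manager factory (the detection of generators versus context managers mentioned above is left out). `make_glue_lifespan` and `user_lifespan` are hypothetical names, not part of FastAPI or Starlette.

```python
import asyncio
import contextlib
from typing import Any, AsyncContextManager, AsyncIterator, Callable, List, Optional

log: List[str] = []

LifespanFactory = Callable[[Any], AsyncContextManager[None]]


@contextlib.asynccontextmanager
async def user_lifespan(app: Any) -> AsyncIterator[None]:
    log.append("user startup")
    yield
    log.append("user shutdown")


def make_glue_lifespan(user: Optional[LifespanFactory]) -> LifespanFactory:
    @contextlib.asynccontextmanager
    async def glue(app: Any) -> AsyncIterator[None]:
        async with contextlib.AsyncExitStack() as dep_stack:
            # The dependency stack is opened first so it is closed last.
            log.append("dependency stack opened")
            dep_stack.callback(log.append, "dependency stack closed")
            if user is not None:
                # The user's lifespan runs inside the dependency stack.
                await dep_stack.enter_async_context(user(app))
            yield

    return glue


async def main() -> None:
    lifespan = make_glue_lifespan(user_lifespan)
    async with lifespan(None):
        log.append("app running")


asyncio.run(main())
```

Entering the user's lifespan on the same exit stack means it shuts down before app-scoped dependencies are torn down, which seems like the safer ordering if user code relies on those dependencies.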

adriangb and others added 2 commits July 19, 2021 13:16
Co-authored-by: Thomas Grainger <tagrain@gmail.com>
@graingert
Contributor

I think it's worth waiting for the Starlette 0.16.0 upgrade in this pr because so much changed about how lifespan works

@adriangb
Contributor Author

Are there more changes beyond having to wrap the lifespan func in @contextlib.asynccontextmanager?

Do you know if FastAPI has a plan / upgrade path for Starlette versions?

@GabeMedrash

Yeah, that's a good point: there's a decent amount of flexibility built into the combination of cache and cache_lifespan as you've implemented here, and your comparisons look about right to me on first glance, though I'm sure there are "details" (there always are with Spring).

Of course, the challenge is making sure that it's clear/explicit how to accomplish a task given the API. I'm just some random guy on the internet--no need to give me too much attention--but, I tend to find, e.g., scope="prototype" a little clearer than cache=False, cache_lifespan="request". Perhaps though something like scope="prototype" (and the like) is ultimately syntactic sugar over these more low-level args.

@adriangb
Contributor Author

adriangb commented Jul 22, 2021

I think you're making a very good point in that scope (in the Spring sense) is syntactic sugar on top of a combination of how long the value is stored (cache in our case) and when the bean is created/destroyed (lifespan/lifetime in our case).

I think the nice thing is that because of how generator dependencies and the cache are implemented here (in particular, that the dependency need not be the same object as the value it yields), we have full control over both aspects independently. This got me thinking about all of the options we have that users might want. I'm going to leave aside creation of dependencies since that is governed by when they are first needed (which simplifies understanding of startup/shutdown events: they just move the dependency creation to happen before the first request).

Dependency Lifespans

Request

This is the current default / only option.
Once the endpoint function runs, the dependency is destroyed in the background.

App

The original proposal in this PR.
The dependency gets destroyed when the app is destroyed.

Endpoint

The dependency gets destroyed when the endpoint finishes running (i.e. immediately before the request is done processing). Notably, this would allow the dependency to raise HTTPExceptions while being torn down. This has already been proposed and implemented in #2697.

None/ Background

The dependency gets destroyed in the background immediately after its value is retrieved.
This could be useful if, for example, you retrieve a scalar value (or something that's not the dependency's object itself) and need to do some teardown, but the endpoint doesn't depend on the teardown itself; it only needs the value.
Instead of waiting until the endpoint finishes running to start the teardown (which is the behavior with the current / Request lifespan), the teardown starts immediately after the value is retrieved, running concurrently in the background with the endpoint itself.

Caching

True / Request

This is the current default behavior.
The dependency's returned value is cached for the lifetime of the request.

False

This is the current optional behavior.
Every time this dependency is requested, it is re-created.

App

Proposed in this PR.
The value of the dependency is cached for the lifetime/lifespan of the App (note that the dependency itself could be torn down immediately, eg. if using lifespan="background").
This acts as a default for request cached dependencies: if an app-cached value exists, it is returned and the dependency is not re-constructed.
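
The three caching modes and the "app value acts as a default" rule can be sketched as a single lookup function. `CacheScope` and `lookup` are hypothetical names, and the two dicts stand in for the app-level and per-request caches.

```python
import enum
import itertools
from typing import Any, Callable, Dict


class CacheScope(enum.Enum):
    none = "none"        # cache=False: always re-create
    request = "request"  # cache=True: cache for one request
    app = "app"          # cache for the app's lifetime


def lookup(
    call: Callable[[], Any],
    scope: CacheScope,
    app_cache: Dict[Callable[[], Any], Any],
    request_cache: Dict[Callable[[], Any], Any],
) -> Any:
    if scope is CacheScope.none:
        return call()
    if call in app_cache:
        # An app-cached value is used even when request caching was asked for.
        return app_cache[call]
    if scope is CacheScope.app:
        app_cache[call] = call()
        return app_cache[call]
    if call not in request_cache:
        request_cache[call] = call()
    return request_cache[call]


counter = itertools.count()


def make_token() -> int:
    return next(counter)


app_cache: Dict[Callable[[], Any], Any] = {}
req1: Dict[Callable[[], Any], Any] = {}
req2: Dict[Callable[[], Any], Any] = {}

a = lookup(make_token, CacheScope.request, app_cache, req1)  # created
b = lookup(make_token, CacheScope.request, app_cache, req1)  # request cache hit
c = lookup(make_token, CacheScope.none, app_cache, req1)     # always re-created
d = lookup(make_token, CacheScope.app, app_cache, req1)      # created, stored app-wide
e = lookup(make_token, CacheScope.request, app_cache, req2)  # app value wins
```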

Combinations

I think that by combining these, all major use cases can be covered.

Some of these (like cache=False, lifespan="app") I don't see much use in, but there could be use cases. Hiding these combinations, either by prohibiting them or by aliasing them into a scope, might be more confusing than clearly documenting the behavior and letting users decide what makes sense for them.

Implementation

I did a quick draft implementation and it's not hard to do, it just means keeping around 3 AsyncExitStacks and 2 caches instead of just 1 of each.
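
To illustrate the three exit stacks (the caches are omitted here), a framework-free sketch might look like the following. The nesting order is an assumption chosen to match the lifespans described above, and `dep` and `handle_request` are made-up names.

```python
import asyncio
import contextlib
from typing import AsyncIterator, List

log: List[str] = []


@contextlib.asynccontextmanager
async def dep(name: str) -> AsyncIterator[str]:
    log.append(f"setup {name}")
    try:
        yield name
    finally:
        log.append(f"teardown {name}")


async def handle_request(app_stack: contextlib.AsyncExitStack) -> None:
    async with contextlib.AsyncExitStack() as request_stack:
        async with contextlib.AsyncExitStack() as endpoint_stack:
            stacks = {
                "app": app_stack,
                "request": request_stack,
                "endpoint": endpoint_stack,
            }
            # Each dependency is entered on the stack matching its lifespan.
            for lifespan in ("app", "request", "endpoint"):
                await stacks[lifespan].enter_async_context(dep(lifespan))
            log.append("endpoint ran")
        # Endpoint-scoped teardown has run; the response goes out next.
        log.append("response sent")
    # Request-scoped teardown has run once the response was sent.


async def main() -> None:
    async with contextlib.AsyncExitStack() as app_stack:
        await handle_request(app_stack)
    # App-scoped teardown runs here, at shutdown.


asyncio.run(main())
```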

@adriangb
Contributor Author

adriangb commented Jul 23, 2021

I pushed a series of commits implementing the options proposed in #3516 (comment).

I am thinking that startup/shutdown dependencies should also be split out into another PR, which would greatly reduce the diff size, but I'll keep it in here for now since I think it's a strong use case for the other changes.

The only one I didn't implement is lifetime="background" because the implementation is dissimilar to the others and requires a bit more technical thought; that can be a future PR / feature.

I'd like to acknowledge @zoliknemet for the excellent implementation in #2697, which I partially borrowed to implement lifetime="endpoint".

Additionally:

  • The implementation is now using enums, as proposed in #3516 (comment). cache={True,False} gets converted to the Enum names if passed, making the change backwards compatible (although it might break static analysis tools unless we add a Union[...] type hint).
  • Any startup/shutdown dependencies that depend on Request, Response, etc. are caught and a user-friendly error is given.
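
The backwards-compatible conversion mentioned in the first bullet could look roughly like this. The enum and function names are hypothetical; only the mapping of True to request caching and False to no caching follows the discussion above.

```python
import enum
from typing import Union


class CacheScope(enum.Enum):
    none = "none"
    request = "request"
    app = "app"


def normalize_cache(cache: Union[bool, str, CacheScope]) -> CacheScope:
    # Accept the legacy booleans as well as the new enum (or its string value).
    if isinstance(cache, CacheScope):
        return cache
    if cache is True:
        return CacheScope.request
    if cache is False:
        return CacheScope.none
    return CacheScope(cache)  # raises ValueError for unknown strings
```

The Union[...] annotation is what keeps existing cache=True / cache=False call sites valid for static analysis tools.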

Comment on lines 534 to 543
dependant.request_param_name,
dependant.query_params,
dependant.header_params,
dependant.cookie_params,
dependant.body_params,
dependant.request_param_name,
dependant.websocket_param_name,
dependant.http_connection_param_name,
dependant.response_param_name,
dependant.path
Contributor Author

I hope this is a sensible set of exclusions. I dunno what to do with the security dependencies.

@adriangb
Contributor Author

adriangb commented Jul 23, 2021

Some more doc edits. I've gone with editing the existing examples referenced in #3516 (comment) instead of duplicating them.

@adriangb
Contributor Author

adriangb commented Jul 30, 2021

In case any of you are interested, I made a toy repo where I've been prototyping what it would look like to generalize the DI system to be less request/response coupled and to support the features proposed in this PR.

Here's an example of how it might integrate into FastAPI internally (the goal is near zero user facing changes): https://github.com/adriangb/di/

I was also able to get some other nice features working:

  • Anyio compatibility
  • Pretty good speedups for highly nested / branched dependencies by parallelizing execution using anyio task groups (try executing the comparisons/{anydep,fastapi}_nested.py files)
  • Full typing support (i.e. Depends tells mypy that it returns the type that the callable you gave it is annotated with instead of Any)
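
As an illustration of the typing point only (this is not taken from the linked repo), a generic signature is enough for a type checker to infer the dependency's return type. `DependsMarker` is a hypothetical name.

```python
from typing import Callable, Generic, TypeVar, cast

T = TypeVar("T")


class DependsMarker(Generic[T]):
    """Hypothetical marker object stored as the parameter default."""

    def __init__(self, call: Callable[..., T]) -> None:
        self.call = call


def Depends(call: Callable[..., T]) -> T:
    # The cast deliberately tells the type checker that the default has the
    # dependency's return type; at runtime it is still just a marker.
    return cast(T, DependsMarker(call))


class Config:
    ...


def get_config() -> Config:
    return Config()


def root(cfg: Config = Depends(get_config)) -> Config:
    return cfg


marker = root.__defaults__[0]
```

With this signature, annotating `cfg` as `str` instead of `Config` would be flagged by a type checker, whereas a `Depends` that returns `Any` accepts any annotation.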

@adriangb
Contributor Author

adriangb commented Sep 28, 2021

I'm going to close this in favor of discussion in #3641

Also, the library referenced in #3516 (comment) is now available on PyPI: https://pypi.org/project/di/

6 participants