Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Further develop startup and shutdown events #617

Open
sm-Fifteen opened this issue Oct 13, 2019 · 46 comments
Open

Further develop startup and shutdown events #617

sm-Fifteen opened this issue Oct 13, 2019 · 46 comments
Labels
feature New feature or request reviewed

Comments

@sm-Fifteen
Copy link
Contributor

sm-Fifteen commented Oct 13, 2019

While the documentationn for FastAPI is in general extremely solid, there's a weakpoint that I feel hints at some underdevelopped feature within the framework, and that's startup and shutdown events. They are briefly mentionned (separately) with the startup event in particular being demonstrated like this :

items = {}


@app.on_event("startup")
async def startup_event():
    items["foo"] = {"name": "Fighters"}
    items["bar"] = {"name": "Tenders"}


@app.get("/items/{item_id}")
async def read_items(item_id: str):
    return items[item_id]

...which could very well be written like this:

items = {
    "foo": {"name": "Fighters"},
    "bar": {"name": "Tenders"}
}

@app.get("/items/{item_id}")
async def read_items(item_id: str):
    return items[item_id]

...and therefore makes the feature look useless. The example for shutdown instead uses logging as an example, which makes it look like this would be the primary purposes for those events, while in reality, it's not.

Is your feature request related to a problem? Please describe.
The problem is that, throughout the entire documentation, things like database connections are created in the global scope, at module import. While this would be fine in a regular Python application, this has a number of problems, especially with objects that have a side-effect outside the code itself, like database connections. To demonstrate this, I've made a test structure that creates a lock file when initialized and deletes it when garbage collected.

Using it like this:

from fastapi import FastAPI
from lock import FileLock

app = FastAPI()
lock = FileLock("fastapi")

@app.get("/")
async def root():
    return {"message": "Hello World"}

...does not work and the lock is not deleted before shutdown (I was actually expecting it to be closed properly, like SQLAlchemy does with its connections, but clearly there's a lot of extra magic going on with SQLAlchemy that I don't even come close to understanding). This is also extremely apparent when using the --reload option on Uvicorn, bcause the lock is also not released when the modules are reloaded, causing the import to fail and the server to crash. This would be one thing, but I've had a similar incident occur some time ago when, while developping in reload mode, I've actually managed to take up every connection on my PostgreSQL server because of that problem, since while SQLAlchemy is smart enough to cleanup on exit where my FileLock cannot, the same does not happen when hot-reloading code.

So that would be one thing; the documentation should probably go into more details about what those startup and shutdown events are for (the Starlette documentation is a little more concrete about this, but no working code is given to illustrate this) and that should also be woven with the chapters about databases and such to make sure people don't miss it.

Except... That's not super ergonomic, now, is it?

from fastapi import FastAPI, Depends
from some_db_module import Connection

app = FastAPI()
_db_conn: Connection

@app.on_event("startup")
def take_lock():
	global _db_conn
	_db_conn = Connection("mydb:///")

@app.on_event("shutdown")
def release_lock():
	global _db_conn
	_db_conn.close()

def get_db_conn():
	return _db_conn

@app.get("/")
async def root(conn: Connection = Depends(get_db_conn)):
	pass

This is basically just a context manager split into two halves, linked together by a global variable. A context manager that will be entered and exited when the ASGI lifetime protocol notifies that the application has started and stopped. A context manager whose only job will be to initialize a resource to be either held while the application is running or used as an injectable dependency. Surely there's a cleaner way to do this.

Describe the solution you'd like
I've been meaning to file this bug for a few weeks now, but what finally got me to do it is the release of FastAPI 0.42 (Good job, everyone!), which has context managers as dependencies as one of its main new features. Not only that, but the examples being given are pretty much all database-related, except connections are opened and closed for each call of each route instead of being polled like SQLAlchemy (and I assume encode's async database module too). Ideally, events should be replaced with something like this, but where the dependencies are pre-initialized instead of being created on the spot. Maybe by having context managers that are started and stopped based on startup and shutdown events and yield "factory functions" that could in turn be called during dependency injection to get the object that needs to be passed.

Something along those lines:

from fastapi import FastAPI, Depends
from sqlalchemy import create_engine

app = FastAPI()

@app.lifetime_dependency
def get_db_conn():
	conn_pool = create_engine("mydb:///")

	# yield a function that closes around db_conn
	# and returns it as a dependency when called
	yield lambda: conn_pool

	conn_pool.close()

@app.get("/")
async def root(conn: Connection = Depends(get_db_conn)):
	pass

Additional context
Not sure where else to mention it, but I've ran into cases where the shutdown event does not get called before exiting, namely when using the VSCode debugger on Windows and stopping or restarting the application via the debugger's controls (haven't tried this on Linux yet). This apparently kills the thread without any sort of cleanup being performed and leaves all database connections open (and possibly unable to timeout, since DBAPI appears to suggest that all queries to be executed as part of a transaction, which most drivers do, and mid-transaction timeout is disabled by default on at least PostgreSQL). I don't think there is any way that could be fixed, though it should probably be mentionned somewhere, either there or in debugging part of the documentation.

@sm-Fifteen sm-Fifteen added the feature New feature or request label Oct 13, 2019
@sm-Fifteen
Copy link
Contributor Author

@dmontagu: You were involved in the context manager dependencies, and that looked like a pretty complex addition. I'm not super familiar with the internal FastAPI codebase (though I might try my hand at this) but does that look like something that would be feasible given how dependencies currently work?

@dmontagu
Copy link
Collaborator

@sm-Fifteen The startup/shutdown as they currently exist are taken almost without modification to starlette.

I think it would be "easy" to add support for context managers, but it would mean going from having nothing to maintain for this feature, to having something to maintain. I'm not sure if that's preferable for fastapi.

I agree that a context manager and/or exit stack could make more sense for this than the separate events as it currently exists. I think it is worth opening an issue in starlette about it. (Though I wouldn't be surprised if they eschew it due to the somewhat esoteric api.)

For what it's worth, I typically store any state necessary for startup/shutdown directly on the app instance (in app.state). That makes it a little nicer to work with than a global variable.

@sm-Fifteen
Copy link
Contributor Author

The startup/shutdown as they currently exist are taken almost without modification to starlette.

Yeah, I've noticed, they certainly look out of place and the documentation barely skimming over it doesn't help.

I think it would be "easy" to add support for context managers, but it would mean going from having nothing to maintain for this feature, to having something to maintain. I'm not sure if that's preferable for fastapi.

I agree that a context manager and/or exit stack could make more sense for this than the separate events as it currently exists. I think it is worth opening an issue in starlette about it. (Though I wouldn't be surprised if they eschew it due to the somewhat esoteric api.)

I doubt Starlette would accept something of the sort, given how the Depends mechanism is a FastAPI addition, and having that upstream wouldn't really fit Starlette's goals, as far as I can tell. As for the maintenance burden, I suppose that would depend on just how much extra code this would represent and the test coverage for it.

The FastAPI tutorial glosses over that part (no mention of app.state, though it was introduced just a month ago; keeping all DB objects in global scope) because everything used in the examples make this work "by magic" (SQLite is all local, SQLAlchemy closes all connections on garbage collection, etc.), with the exception of Async Databases where it's suddenly a requirement that you would be unlikely to notice if not specifically looking for it.

FastAPI could probably use a proper mechanism for this sort of thing to advertize in that tutorial. ;)

@tiangolo
Copy link
Owner

Thanks @dmontagu for the help here!

About the events not working as you expected while debugging, that would be better in a Starlette issue. But it's possible that as you are using --reload, it can actually be a Uvicorn issue, or even an ASGI spec issue.

About having the extra mechanism with context managers, it looks nice, but that would mean adding all that as an extra feature, and would probably require quite a lot of extra code to support just that. And I'm not sure it would really solve something that can't be done otherwise to justify all the extra effort.

I think the best would be to first check if your problem is related to one of those things?

The best way would be to have a minimal self-contained example to demonstrate the error.

This issue slightly touches a lot of different things, if there's something else not covered here, could you please create a new issue for just that one thing? And please include a small self-contained example to show the error/misbehavior you're seeing.

@sm-Fifteen
Copy link
Contributor Author

sm-Fifteen commented Oct 30, 2019

I think the best would be to first check if your problem is related to one of those things?

The best way would be to have a minimal self-contained example to demonstrate the error.

This issue slightly touches a lot of different things, if there's something else not covered here, could you please create a new issue for just that one thing? And please include a small self-contained example to show the error/misbehavior you're seeing.

The error itself isn't the problem I'm looking to fix, it's just a symptom of how we currently manage (or at least advise in the documentation) lifetime-scoped dependencies. That is, we simply initialize them in the global scope with the assumption that everything outside functions can only run once and will be destroyed automatically on server shutdown. Those assumptions are incorrect (especially the latter in async python) and I presume that's what the lifespan protocol was most likely designed to address.

About the events not working as you expected while debugging, that would be better in a Starlette issue. But it's possible that as you are using --reload, it can actually be a Uvicorn issue, or even an ASGI spec issue.

No, --reload is certainly an unusual way to run code, but I wouldn't say it's a Uvicorn issue or a problem with the ASGI spec, since uvicorn does the right thing by firing the shutdown event before reloading the code, calling the startup event again and only then starting the application again. The problem is the initialization that's performed during code reload, as in my example.

from fastapi import FastAPI
from lock import FileLock

app = FastAPI()
# Create a lock to avoid two instances of the app running at the same time
lock = FileLock("fastapi")

@app.get("/")
async def root():
    return {"message": "Hello World"}

This is not really any different from a module running initialization code in global scope outside of a if __name__ == '__main__' scope: it's usually fine but is going to lead to various kinds of issues further down the line in larger projects.

About having the extra mechanism with context managers, it looks nice, but that would mean adding all that as an extra feature, and would probably require quite a lot of extra code to support just that. And I'm not sure it would really solve something that can't be done otherwise to justify all the extra effort.

The context manager approach was more of a proposition of what a more ergonomic way of tying this with the dependency system could look like, though I get why that might prove tricky to implement. it's just that, looking at the proposed use cases for CMs as dependencies involve database connections and sessions, and I'm not sure creating and removing them for each request is the prefered way to do this.

Judging by how SQLAlchemy does connection pooling on its own, I figure it's generally better to create them once when starting your application (which most examples in the tutorial seem to be doing) and then closing them before the server shuts down. SQLAlchemy handles closing its connections on application shutdown because of magic:tm:, but most libraries shouldn't be expected to be this smart about cleaning up, and the current event-based mechanism from Starlette doesn't mesh that well with dependencies (see the _db_conn example in my original post).

The tutorial kind of shows this in the one instance where cleanup has to be handled by application code:

@app.on_event("startup")
async def startup():
await database.connect()
@app.on_event("shutdown")
async def shutdown():
await database.disconnect()

I'd actually say this is a bit cleaner than it would be in practice because database is taken from the global scope, instead of being passed as a dependency.

EDIT: When I say "Those assumptions are incorrect (especially [that everything outside functions ... will be destroyed automatically] in async python)", I'm referring to __del__ being generally unreliable in async Python, which has been raised as a problem for async iterators and generators but are a problem with async code finalization in general.

@dmontagu
Copy link
Collaborator

dmontagu commented Oct 30, 2019

@sm-Fifteen Thanks for driving an interesting discussion here!


Since you can store the database on the app instance (connected or not), I think it actually doesn't have to be that much uglier in practice; from my code:

# (server.state is set up in a separate function,
#  including putting a databases.Database instance on its state attribute)
def add_database_events(server: FastAPI) -> None:
    @server.on_event("startup")
    async def connect_to_database() -> None:
        database = server.state.database
        if not database.is_connected:
            await database.connect()

    @server.on_event("shutdown")
    async def shutdown() -> None:
        database = server.state.database
        if database.is_connected:
            await database.disconnect()

I think a clean approach like this (at least, in my opinion 😅) can be adapted for most lifetime-scoped resources.

Obviously this approach isn't shown in the tutorial, but I'm not sure its helpful to have the tutorial show what it would look like in a clean, large scale implementation, since it adds a nontrivial amount of cognitive burden.


In general, I think there's an important tradeoff to be made between having an easy-to-follow tutorial, and only showing examples with "production-worthy code organization".

I think there's probably room for more references with more-scalably-organized code, but I'm personally inclined to have those live a little bit separately from the tutorial, either in some sort of "advanced" top-level docs section (that doesn't currently exist), or just in blog articles (linked in the docs or otherwise). And given how much content already exists inside the tutorial, and the high testing-coverage bar the docs currently meet, I'd lean toward it existing outside the main fastapi repo if only to keep the project easier to maintain.

Thoughts? (Especially in relation to this issue, specifically)

@sm-Fifteen
Copy link
Contributor Author

Since you can store the database on the app instance (connected or not), I think it actually doesn't have to be that much uglier in practice; from my code:

I personally like the idea of having databases be passed as dependencies instead of living in some kind of arbitrary app state pseudo-dict (fits more easily with strict typing, allows for mocking during tests, error can be thrown from the dependency if the DB isn't configured, etc.), but it's still a worthy alternative.

In general, I think there's an important tradeoff to be made between having an easy-to-follow tutorial, and only showing examples with "production-worthy code organization".

The tutorial highlights penty of more advanced functions, and I'm personally glad it does, even the ones I personally don't use. I would say that a section on how to properly setup ressources that are expected to live for the lifetime of the application so that they're cleaned up on shutdown and on reload is a good practice anyway and should be brought up at least alongside databases. I get that this isn't something you usually see other servers talk about in their doc, but Flask and Django are WSGI, so they have no lifetime-scope to speak of, only global imports and the hope that cleanup will take care of itself on shutdown, with everything else being request-scoped (which would fit with WSGI's one-request-one-call philosophy). For ASGI frameworks, Quart advises using it much like in your example, Sanic uses a global variable initialized with events, like in the "unergonomic" example in my OP, and Channels doesn't seem to support lifespan events at all for the time being.

It's not about showing people how to make their code production-worthy, it's about showing people how to make things right so the apps they make won't break because of expected behavior (like uvicorn --reload doing a hot reload on their code while they were holding a DB connection, so it does not get closed) further down the line. With a name like "FastAPI", database interaction is probably more than just a corner case (we do have 3 separate chapters in the doc related to different kinds of databases 🤔), so it seems pretty well justified to me to at the very least show how to do this right in the doc and explain why it should be done like this, if not by having specific mechanisms to make this sort of thing as simple and elegant to use as the rest of the framework.

@euri10
Copy link
Contributor

euri10 commented Oct 31, 2019

something like this small dummy counter could be an example that could help @sm-Fifteen ?

import asyncio
import logging

import uvicorn
from fastapi import FastAPI

app = FastAPI()

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

if not logger.hasHandlers():
    sh = logging.StreamHandler()
    fmt = logging.Formatter(fmt="%(asctime)s %(name)-12s %(levelname)-8s %(message)s")
    sh.setFormatter(fmt)
    logger.addHandler(sh)


class Service(object):
    def __init__(self):
        self.counter = 0

    def startup(self):
        logger.debug("service startup")
        asyncio.create_task(self.tick())

    def shutdown(self):
        logger.debug("service shutdown")

    async def tick(self) -> None:
        while True:
            self.counter += 1
            await asyncio.sleep(5)
            logger.debug(f"counter is at: {self.counter}")


@app.on_event("startup")
async def startup():
    service = Service()
    service.startup()
    app.state.service = service
    logger.debug('end startup lifespan')


@app.on_event("shutdown")
async def shutdown():
    service = app.state.service
    service.shutdown()
    logger.debug('end shutdown lifespan')


if __name__ == '__main__':
    uvicorn.run("617_events:app")

@sm-Fifteen
Copy link
Contributor Author

something like this small dummy counter could be an example that could help @sm-Fifteen ?

import asyncio
import logging

import uvicorn
from fastapi import FastAPI

app = FastAPI()

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

if not logger.hasHandlers():
    sh = logging.StreamHandler()
    fmt = logging.Formatter(fmt="%(asctime)s %(name)-12s %(levelname)-8s %(message)s")
    sh.setFormatter(fmt)
    logger.addHandler(sh)


class Service(object):
    def __init__(self):
        self.counter = 0

    def startup(self):
        logger.debug("service startup")
        asyncio.create_task(self.tick())

    def shutdown(self):
        logger.debug("service shutdown")

    async def tick(self) -> None:
        while True:
            self.counter += 1
            await asyncio.sleep(5)
            logger.debug(f"counter is at: {self.counter}")

Hmm, no, it doesn't really address the issue, I'm afraid. For starters, you never explicitly shutdown your task on app shutdown, you only log that you did, so running your app with uvicorn --reload and then triggering a reload would likely just spawn additional counters each time until the logs become unreadable. Not shutting down your infinite loop when exiting probably won't have too many side effects, given its simplicity, but I wouldn't expect that to hold with anything that has actual side-effects outside the code, like a DB connection or an open file.

@euri10
Copy link
Contributor

euri10 commented Oct 31, 2019

well them it's just a matter of cancelling the task in the shutdown,

import asyncio
import logging

from fastapi import FastAPI

import uvicorn

app = FastAPI()

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

if not logger.hasHandlers():
    sh = logging.StreamHandler()
    fmt = logging.Formatter(fmt="%(asctime)s %(name)-12s %(levelname)-8s %(message)s")
    sh.setFormatter(fmt)
    logger.addHandler(sh)


class Service(object):
    def __init__(self):
        self.counter = 0

    def startup(self):
        logger.debug("service startup")
        self.task = asyncio.create_task(self.tick())

    def shutdown(self):
        self.task.cancel()
        logger.debug("service shutdown")

    async def tick(self) -> None:
        while True:
            self.counter += 1
            await asyncio.sleep(1)
            logger.debug(f"counter is at: {self.counter}")


@app.on_event("startup")
async def startup():
    service = Service()
    service.startup()
    app.state.service = service
    logger.debug('end startup lifespan')


@app.on_event("shutdown")
async def shutdown():
    service = app.state.service
    service.shutdown()
    logger.debug('end shutdown lifespn')


if __name__ == '__main__':
    uvicorn.run("617_events:app", reload=True)

then you see on a reload it starts back at 1...

Connected to pydev debugger (build 192.6817.19)
email-validator not installed, email fields will be treated as str.
To install, run: pip install email-validator
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO:     Started reloader process [7568]
email-validator not installed, email fields will be treated as str.
To install, run: pip install email-validator
INFO:     Started server process [7595]
INFO:     Waiting for application startup.
DEBUG:    service startup
DEBUG:    end startup lifespan
INFO:     Application startup complete.
DEBUG:    counter is at: 1
DEBUG:    counter is at: 2
DEBUG:    counter is at: 3
DEBUG:    counter is at: 4
WARNING:  Detected file change in '617_events.py'. Reloading...
INFO:     Shutting down
INFO:     Waiting for application shutdown.
DEBUG:    service shutdown
DEBUG:    end shutdown lifespn
INFO:     Application shutdown complete.
INFO:     Finished server process [7595]
email-validator not installed, email fields will be treated as str.
To install, run: pip install email-validator
INFO:     Started server process [7751]
INFO:     Waiting for application startup.
DEBUG:    service startup
DEBUG:    end startup lifespan
INFO:     Application startup complete.
DEBUG:    counter is at: 1
DEBUG:    counter is at: 2
DEBUG:    counter is at: 3
INFO:     Shutting down
INFO:     Waiting for application shutdown.
DEBUG:    service shutdow
DEBUG:    end shutdown lifespn
INFO:     Application shutdown complete.
INFO:     Finished server process [7751]
INFO:     Stopping reloader process [7568]

Process finished with exit code 0

@euri10
Copy link
Contributor

euri10 commented Oct 31, 2019

you might also like this read https://medium.com/@rob.blackbourn/asgi-event-loop-gotcha-76da9715e36d but maybe you already know it

@sm-Fifteen
Copy link
Contributor Author

well them it's just a matter of cancelling the task in the shutdown,

Forgetting to cleanup when it's not guaranteed that cleanup will be done for you is a pretty serious, if sometimes hard to catch error. PEP533 is pretty clear when it says that WSGI and async Python can't rely on the GC for ressource cleanup and instead make use of context managers to ensure that cleanup will be performed as expected (that's not the main discussion point of that PEP, but it's a pretty important point in the motivation section).

The Lifespan protocol was always designed with the idea that it should be used for cleanup; it's really just two halves of a context manager. If those two parts could be welded wold put back together, I feel like we'd have a much more solid mechanism for managing lifetime-scoped ressources than the current one. Personally, I don't think it should be valid to do any ressource initialisation in a module's body, it should only occur within the lifetime protocol's purview.

Then we'd acquire those ressources using dependency injection, because Dependency Injection is one honking great idea -- let's do more of it!

@sm-Fifteen
Copy link
Contributor Author

(bumping the issue)

Besides improving the ergonomics for lifetime-scoped dependencies by taking advantage of the lifespan protocol (which I'd ultimately consider a nice-to-have, as much as I would like something like this as a framework-level feature), the part that really bothers me is opening files, sockets and other ressources in the global space, where the only thing keeping them from leaking is a garbage collection process that can just skip over a variable for reasons that are usually very tricky to find and debug.

It works with CPython most of the time, but as I've mentionned earlier, it doesn't with objects that have an async lifetime (there's no __adel__ finalizer) or with code reloading, and it appears like this wouldn't work on pypy either.

The garbage collectors used or implemented by PyPy are not based on reference counting, so the objects are not freed instantly when they are no longer reachable. The most obvious effect of this is that files (and sockets, etc) are not promptly closed when they go out of scope. For files that are opened for writing, data can be left sitting in their output buffers for a while, making the on-disk file appear empty or truncated. Moreover, you might reach your OS’s limit on the number of concurrently opened files.

[...]

[Objects being finalized once they become unreachable] has clearly been described as a side-effect of the implementation and not a language design decision: programs relying on this are basically bogus. It would be a too strong restriction to try to enforce CPython’s behavior in a language spec, given that it has no chance to be adopted by Jython or IronPython (or any other port of Python to Java or .NET).

[...]

Last note: CPython tries to do a gc.collect() automatically when the program finishes; not PyPy. (It is possible in both CPython and PyPy to design a case where several gc.collect() are needed before all objects die. This makes CPython’s approach only work “most of the time” anyway.)

I know that pypy support is probably not on the roadmap (though maybe it should, Falcon on top of pypy is 350% faster than the Cython version, compared to the pure Python version, that's 560% faster, so that's another way FastAPI could live up to its name), but I figure that's just another reason why explicit ressource finalization and/or use of context managers should at least be encouraged in the documentation.

I just want to make sure there's actual support for this. I'd make a pull request updating the doc myself, but I just want to see if there's any agreement over what pattern should be reccomanded for this or if the current dependency mechanism could/should be modified to facilitate this.

@dmontagu
Copy link
Collaborator

dmontagu commented Dec 6, 2019

opening files, sockets and other ressources in the global space, where the only thing keeping them from leaking is a garbage collection process that can just skip over a variable for reasons that are usually very tricky to find and debug.

Maybe I'm being dumb, but can you give me an example of a situation where you'd want such a thing? I'm having a hard time imagining a scenario where the right solution is to leave a file open for the lifetime of the app, but I'm not saying it's unreasonable.


I know that pypy support is probably not on the roadmap

I think many people are using FastAPI for ML/data science applications, with numpy and other c extensions playing an important role. So I think for many of us pypy support is less interesting / offers less benefits (I personally make use of a number of C extensions).

I haven't tested it in a while, but if I recall correctly, when I last tried several months ago, pydantic compiled with cython was also faster in its benchmark by a significant factor (at least >10%, if I recall) over PyPy. (But maybe cythonized pydantic and pypy fastapi would be fastest?)

@sm-Fifteen
Copy link
Contributor Author

can you give me an example of a situation where you'd want such a thing? I'm having a hard time imagining a scenario where the right solution is to leave a file open for the lifetime of the app, but I'm not saying it's unreasonable.

Some kind of logfile, a database connection (transactions won't be committed or rolled back if the server doesn't know the client hung up), a connection pool of any kind (closing TCP connections and websockets is an active operation) like what aiohttp uses, there certainly are a few examples. Granted, files aren't as concerning as sockets, in that case, since there are less use cases as to why you would want to leave the same file open for the application's entire lifetime.

I think many people are using FastAPI for ML/data science applications, with numpy and other c extensions playing an important role. So I think for many of us pypy support is less interesting / offers less benefits (I personally make use of a number of C extensions).

Well, I'm personally using it as a lightweight database frontend for report generation on half a dozen different data sources, so C extensions don't matter as much to me, but I guess that just goes to show that different people have different use cases where different optimisations may have a different impact. Mypy is notoriously bad at CFFI (through no fault of their own, the CPython API just happens to be designed primarily for use with CPython) and uvicorn uses uvloop as its async event loop (a drop-in replacement for asyncio written in C), so that could explain the slowdown you experienced, though I've made no such test myself.

@dmontagu
Copy link
Collaborator

dmontagu commented Jan 9, 2020

@sm-Fifteen getting back to the main topic of this issue, a recent starlette PR might be relevant: encode/starlette#785

@sm-Fifteen
Copy link
Contributor Author

sm-Fifteen commented Jan 9, 2020

Thank you for bringing this to my attention, @dmontagu, this is pretty much exactly what I was saying we needed, both in this issue and in #726.

@tiangolo
Copy link
Owner

So, it seems, in the end, we'll get context managers for lifespan events from Starlette... 🎉

@sm-Fifteen
Copy link
Contributor Author

So, it seems, in the end, we'll get context managers for lifespan events from Starlette... 🎉

Well, I still think we should wrap those in our dependency injection system, but it will certainly make those easier to maintain.

@sm-Fifteen
Copy link
Contributor Author

sm-Fifteen commented Feb 17, 2020

Before I forget, we would have to specifically mention in the doc that such lifetime-scoped dependencies are unsafe to use with non-thread-safe ressources, in case FastAPI ends up running part of the endpoint in a threadpool. This means SQLAlchemy sessions are not safe to use (though you probably don't want to handle sessions as lifetime deps, that's what context manager deps are for), and Engines (connection pools) are safe to pass across threads, but a connection checked out from one such engine may not be, which could actually cause issues with cached dependencies if those were to be executed in parallel (#639) now that I'm thinking about it. Likewise, bare sqlite3 connection objects are safe to share between threads if the threading mode of the underlying library is SERIALIZED (which is the default), psycopgsql has thread-safe connections, and so on.

>>> import sqlite3
>>> db = sqlite3.connect(':memory:')
>>> for row in db.execute("pragma COMPILE_OPTIONS"):
...     print(row)
...
('COMPILER=msvc-1916',)
('ENABLE_FTS4',)
('ENABLE_FTS5',)
('THREADSAFE=1',)

Users should be advised to verify that the objects they are instanciating to be used by multiple tasks at the same time in their respective documentation, be they registered as global variables or lifetime dependencies. DBAPI mandates the presence of a theadsafety constant, which users may want to check (though I now notice pysqlite/sqlite3 advertises its connections as being non-thread-safe despite the underlying SQLite library saying they are), maybe even assert at runtime if using raw dbapi connections.

I'll stress that this is not a new constraint, since accessing a global variable from multiple threads poses the exact same risks, but a section on lifetime deps would be a good place to mention this in the documentation.

@sm-Fifteen
Copy link
Contributor Author

The implementation from encode/starlette#799 just landed in the upstream master. No release as of the time I'm writing this, but enough to start designing around it.

Like I've said before, I still believe we should deprecate raw startup and shutdown handlers in favor of wrapping the new generator method in our dependency system. Due to how Starlette is implementing it, some care might need to be taken to avoid breaking existing code using events (the on_startup and on_shutdown parameters to Starlette() cannot be used at the same time as the new lifespan parameter, and event handlers declared via the deprecated annotations will simply not fire).

@jykae
Copy link

jykae commented Aug 6, 2020

A bit off-topic, but would it be possible to define custom events for FastAPI ? I can create separate issue proposal about that if you think it's appropriate and possible.

@mdgilene
Copy link

@sm-Fifteen Looks like this has been merged into Starlette as of 0.13.6.

@sm-Fifteen
Copy link
Contributor Author

A bit off-topic, but would it be possible to define custom events for FastAPI ? I can create separate issue proposal about that if you think it's appropriate and possible.

@jykae: No, these are event handlers for ASGI events that aren't part of the HTTP flow (so they're outside of what makes sense for a route to handle), namely the ASGI lifespan startup and shutdown events. You can't implement custom events with that system, and it probably wouldn't make sense to.

@sm-Fifteen Looks like this has been merged into Starlette as of 0.13.6.

@mdgilene: The changelogs seem to indicate 0.13.5, actually (released on July 17), but thanks for the reminder!

@austin1howard
Copy link

it's a little unclear to me...is @tiangolo and any other FastAPI contributors on board with adding this functionality? Especially now that Starlette has their lifespan construct, wrapping that into the Depends construct in FastAPI seems very convenient. If this is something that the community is wanting, I wouldn't mind taking a crack at it.

@michaeloliverx
Copy link
Contributor

@spate141 This is the pattern I have been using to avoid storing any global variables. This code won't run as is but you can get the gist of what is going on. Key takeaways are

  • using functools.partial to pass the app instance to the startup/shutdown tasks
  • storing any arbitrary state on the app.state
  • using the fastapi dependency injection system using Depends which means easier testing later 😄
import dataclasses
from functools import partial

from pydantic import BaseSettings

from fastapi import Depends
from fastapi.applications import FastAPI
from fastapi.requests import Request
from fastapi.routing import APIRouter


class Config(BaseSettings):
    id: str
    password: str


@dataclasses.dataclass
class ClassifyUtil:
    id: str
    password: str

    async def expensive_initialization(self):
        pass

    async def classify_item(self):
        pass


async def startup(app: FastAPI):
    config: Config = app.state.config
    classify_util = ClassifyUtil(config.id, config.password)
    await classify_util.expensive_initialization()
    app.state.classify_util = classify_util


async def shutdown(app: FastAPI):
    pass


router = APIRouter()


async def get_classify_util(request: Request) -> ClassifyUtil:
    if not hasattr(request.app.state, "classify_util"):
        raise AttributeError("classify_util attribute not set on app state")

    classify_util: ClassifyUtil = request.app.state.classify_util
    return classify_util


@router.get("/classify/{my_item}")
async def classify_route(classify_util: ClassifyUtil = Depends(get_classify_util)):
    return await classify_util.classify_item(my_item)


def create_app(config: Config) -> FastAPI:
    config = config or Config()
    app = FastAPI()
    app.state.config = config
    app.add_event_handler(event_type="startup", func=partial(startup, app=app))
    app.add_event_handler(event_type="shutdown", func=partial(shutdown, app=app))
    return app

@spate141
Copy link

@michael0liver Thank you for the response!

One more question; How do I pass id, password which I'm getting from argparse.ArgumentParser() to the create_app() in main() method and run the server using python tmp.py? I was able to run this from __main__ when my app was outside create_app() method.

>> tmp.py

import uvicorn
import argparse
import dataclasses
from pathlib import Path
from fastapi import FastAPI
from functools import partial
from pydantic import BaseSettings

class Config(BaseSettings):
    id: str
    password: str

@dataclasses.dataclass
class ClassifyUtil:
    id: str
    password: str

    async def expensive_initialization(self):
        pass

    async def classify_item(self):
        pass
    
async def startup(app: FastAPI):
    config: Config = app.state.config
    classify_util = ClassifyUtil(config.id, config.password)
    await classify_util.expensive_initialization()
    app.state.classify_util = classify_util

def create_app(config: Config) -> FastAPI:
    config = config or Config()
    app = FastAPI()
    app.state.config = config
    app.add_event_handler(event_type="startup", func=partial(startup, app=app))
    return app

def main():
    parser = argparse.ArgumentParser(description='Start the FastAPI Server.')
    parser.add_argument("--id", dest="id", default='some_id')
    parser.add_argument("--password", dest="password", default='some_password')
    options = parser.parse_args()

    #########
    # ISSUE # 
    #########
    app = create_app(Config(id=options.id, password=options.password))
    uvicorn.run(f"{Path(__file__).stem}:app")

if __name__ == "__main__":
    main()
ERROR:    Error loading ASGI app. Attribute "app" not found in module "tmp".

@spate141
Copy link

spate141 commented May 13, 2021

@michael0liver

Ah, it was my mistake! Fixed it with directly passing app to the unicorn.run().

app = create_app(Config(id=options.id, password=options.password))
uvicorn.run(app)

EDIT:

  • Seems like I can't pass the workers, reload arguments to the uvicorn.run().
app = create_app(Config(id=options.id, password=options.password))
uvicorn.run(app, workers=4, reload=False)
WARNING:    You must pass the application as an import string to enable 'reload' or 'workers'.

Thoughts?

@michaeloliverx
Copy link
Contributor

I usually run uvicorn from the command line and pass uvicorn specific arguments that way. For my application I use environment variables along with pydantic's BaseSettings. So the Config class will read from the environment when its instantiated and look for an ID and PASSWORD keys.

tmp.py:

# tmp.py


from dataclasses import dataclass
from functools import partial
from typing import Optional

from fastapi import FastAPI
from pydantic import BaseSettings


class Config(BaseSettings):
    id: str
    password: str


@dataclass
class ClassifyUtil:
    id: str
    password: str

    async def expensive_initialization(self):
        class_name = self.__class__.__name__
        print(f"Initialising {class_name}")


async def startup(app: FastAPI):
    config: Config = app.state.config
    classify_util = ClassifyUtil(config.id, config.password)
    await classify_util.expensive_initialization()
    app.state.classify_util = classify_util


def create_app(config: Optional[Config] = None) -> FastAPI:
    if config is None:
        config = Config()  # Calling `Config()` will read `id` and `password` from the environment
    app = FastAPI()
    app.state.config = config
    app.add_event_handler(event_type="startup", func=partial(startup, app=app))
    return app


app = create_app()
ID=12345 password=very-secret-pass poetry run uvicorn tmp:app --reload

If you try and start the application without the required environment variables then you can an error:

    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "./tmp.py", line 43, in <module>
    app = create_app()
  File "./tmp.py", line 36, in create_app
    config = Config()  # Calling `Config()` will read `id` and `password` from the environment
  File "pydantic/env_settings.py", line 36, in pydantic.env_settings.BaseSettings.__init__
  File "pydantic/main.py", line 406, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 2 validation errors for Config
id
  field required (type=value_error.missing)
password
  field required (type=value_error.missing)

@spate141
Copy link

@michael0liver Thank you that was very helpful! I ended up passing id, password through environment variables and/or with .env file as mentioned here.

As far as the running the app through main() method; this was my workaround:

tmp.py

....
....

app = create_app()

def main():
    parser.add_argument("--host", dest="host", default='0.0.0.0')
    parser.add_argument("--port", dest="port", default=6006)
    options = parser.parse_args()

    uvicorn.run(
        f"{Path(__file__).stem}:app",
        host=options.host, port=int(options.port),
        workers=cpu_count(), reload=False
    )

if __name__ == "__main__":
    main()
# this will start the fastAPI app server
>> python tmp.py

@bharathanr
Copy link

After reading through this topic, and other related issues #2943 #2944, #4147, I believe that it is currently impossible to use Startlette's lifespan within FastAPI.

I also went through #3516 and #3641 but I am not sure of whether that solves the lifetime scoped dependency problem.

I am left with the following questions:

  1. How to use Startlette's lifespan in FastAPI if it is available?
  2. If FastAPI does not support Starlette's lifespan, when might it be supported?
  3. Does using the lifespan protocol through Starlette's lifespan guarantee that the objects will be cleaned up, given the difficulties seen with the debugger on VSCode, uvicorn --reload, and the general unreliability of cleanup for async code?

@sm-Fifteen I would appreciate if you could weigh in with your thoughts.

@sm-Fifteen
Copy link
Contributor Author

@bharathanr

  1. How to use Startlette's lifespan in FastAPI if it is available?

FastAPI doesn't have any facilities to integrate Starlette's lifespan CM to its DI system.

What you might be able to do is FastAPI(on_startup=None, on_shutdown=None, lifespan=my_lifespan_cm), since FastAPI will pass unrecognized kwargs to Starlette instead. Lifespan handlers don't seem to be in Starlette's documentation as of now, but it's just a generator (sync or async, Starlette handles both) that receives the instantiated application and must never yield more than once.

It would probably look something like this.

def my_lifespan_cm(app: FastAPI):
    my_db_engine = sqla.create_engine("...")
    app.state.my_db_engine = my_db_engine
    yield
    assert app.state.my_db_engine == my_db_engine # Why not? Just to be sure.
    my_db_engine.dispose()

There are also still issues with sub-mounts not receiving lifetime events that you'll want to be aware of.

  1. If FastAPI does not support Starlette's lifespan, when might it be supported?

As for when it might be supported, your guess is as good as mine.

  1. Does using the lifespan protocol through Starlette's lifespan guarantee that the objects will be cleaned up, given the difficulties seen with the debugger on VSCode, uvicorn --reload, and the general unreliability of cleanup for async code?

No, it wouldn't guarantee such a thing. The lifetime protocol is only reliable if one "remembers" to run the shutdown phase at the end. Last I checked, the VSCode debugger's stop button (pressing Ctrl-C in the terminal pane still works) didn't even allow for proper shutdown of resources cleaned up at garbage collection, so most likely it won't even bother unrolling pending generators, async or not. Uvicorn's --reload option should work fine, though.

@kontsaki
Copy link

Hello,

What you might be able to do is FastAPI(on_startup=None, on_shutdown=None, lifespan=my_lifespan_cm), since FastAPI will pass unrecognized kwargs to Starlette instead.

this doesn't seem to be the case with the **extra arguments

@StephenCarboni
Copy link

Sorry for the potentially unhelpful comment but what is holding this up?

@tiangolo tiangolo changed the title [FEATURE] Further develop startup and shutdown events Further develop startup and shutdown events Feb 24, 2023
@Kludex
Copy link
Sponsor Collaborator

Kludex commented Mar 4, 2023

We'll implement this in Starlette and Uvicorn: django/asgiref#354

@tiangolo
Copy link
Owner

tiangolo commented Mar 7, 2023

Thank you all!

This will be available in FastAPI 0.93.0, released in the next few hours. 🎉

The new docs are here: https://fastapi.tiangolo.com/advanced/events/

Sorry for the long delay! 🙈 I wanted to personally address each issue/PR and they piled up through time, but now I'm checking each one in order.

@sm-Fifteen
Copy link
Contributor Author

sm-Fifteen commented Mar 7, 2023

@tiangolo: Having the CM be accessible from user code is certainly going to be a huge improvement, but are there any plans of integrating it more seamlessly into the dependency injection framework like my original post suggested? To me, integrading dependency injection with lifetimes was always the point, and makes it so users don't need to store their initialized ressources anywhere or think about where they are being stored.

@app.lifetime_dependency
def get_db_conn():
	conn_pool = create_engine("mydb:///")
	yield conn_pool
	conn_pool.close()
	
@app.get("/")
async def root(conn: SQLAEngine = Depends(get_db_conn)):
	pass

The new doc may also have to be adjusted soon, in light of the ASGI lifespan state changes in Starlette that would now recommend storing state in the ASGI scope. Abstracting the lifetime CM behind special lifetime-scoped dependencies, like with my initial suggestion, would make it so that user code doesn't need to be modified to account for the change in storage location (from module scope/app.state to request.state). A lifespan CM is fine for Starlette, given what they are trying to be, but FastAPI is in a position to make something much more ergonomic, I believe.

(Note to bystanders: There is an ongoing discussion on the Starlette side at encode/starlette#2067)

@github-actions github-actions bot removed the answered label Mar 7, 2023
@tiangolo
Copy link
Owner

tiangolo commented Mar 7, 2023

Yep @sm-Fifteen, yep to all, haha.

But I have to do it gradually, starting with support for lifespan, which is what I just released, to get the basics out the door, and then keep improving on that.

Also because making more granular releases helps alleviate when complex changes could break some corner cases, people can upgrade up to what still works and get as many benefits as they can before migrating anything else that might have broken their (corner) use cases.

But I definitely have plans for both things you mention. In fact, I wanted to release this now to prepare the ground for the new lifespan state that's coming.

@GuillaumeDesforges
Copy link

Until development of the dependency injection, I have this somewhat working.

app.py
import dotenv
import os

from contextlib import asynccontextmanager
from fastapi import FastAPI, Form, Request
from pydantic import BaseModel
from sqlalchemy import Connection, Engine, create_engine
from sqlalchemy.orm import DeclarativeBase, mapped_column, Mapped
from sqlalchemy.orm import Session
from starlette.applications import Starlette
from typing import AsyncGenerator, TypedDict, cast
from uuid import UUID, uuid4
from uuid import UUID, uuid4


# CONFIG =====================================================================
class Config(BaseModel):
    debug: bool
    database_url: str


def load_config():
    dotenv.load_dotenv(override=True)

    debug = bool(os.getenv(key="API_DEBUG", default=True))
    database_url = os.getenv(
        key="API_DATABASE_URL",
        default="sqlite:///:memory:?check_same_thread=false",
    )
    
    return Config(
        debug=debug,
        database_url=database_url,
    )


# ORM ========================================================================
class Base(DeclarativeBase):
    pass


class Post(Base):
    __tablename__ = "post"

    id: Mapped[UUID] = mapped_column(primary_key=True, default=uuid4)
    title: Mapped[str] = mapped_column(nullable=False)


# STATEFUL LIFESPAN ==========================================================
class State(TypedDict):
    db: Connection | Engine


@asynccontextmanager
async def lifespan(app: Starlette) -> AsyncGenerator[State, None]:
    extra = app.__dict__["extra"]
    config: Config = extra["config"]

    engine = create_engine(url=config.database_url, echo=config.debug)
    connection: Connection | None = None

    if config.debug:
        Base.metadata.create_all(bind=engine)
        connection = engine.connect()

    yield {
        "db": connection or engine,
    }

    if connection is not None:
        connection.close()

    return


def get_sessionable(request: Request) -> Connection | Engine:
    # get "sessionable" database object from app state
    state: State = cast(State, request.state._state)
    connection: Connection | Engine = state["db"]
    return connection


# APP ========================================================================
config = load_config()
app = FastAPI(
    debug=config.debug,
    config=config,
    lifespan=lifespan,
)

@app.post("/post")
def post_post(
    request: Request,
    post_title: str = Form(),
):
    sessionable = get_sessionable(request)
    with Session(sessionable) as session:
        db_post = Post()
        db_post.title = post_title
        session.add(db_post)
        session.commit()

I use the lifespan state to store a connection to a database and use it in my API routes.
I'm very much looking forward to a dependency injection mechanism to simplify it and have automatic typing (instead of those ugly casting).

@martimors
Copy link

Eagerly awaiting this feature. It was available in Flask (current_app) so would be a big incentive for people to adopt FastAPI!

It would be nice to be able to do something like this

MyDep = Annotated[AbstractMyDep, Depends(get_mydep)]

@asynccontextmanager
async def lifespan(_: FastAPI, mydep: MyDep) -> AbstractAsyncContextManager:
    logger.info("Starting API...")
    await mydep.load_config()
    yield
    logger.info("Stopping API...")

The way we currently solve this is declaring mydep in the global scope, which requires all kinds of ugly hacks for tests.

@matez0
Copy link

matez0 commented Jan 2, 2024

Hi. Would it be useful to implement or integrate a similar construct, like the one in fastapi-lifespan-manager?
That could help a lot in adding test-related lifespan configuration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request reviewed
Projects
None yet
Development

No branches or pull requests