FastAPI is blocking long-running requests when using asyncio calls #8842
-
Do you have an example of how to handle long-running processing without blocking the main thread? We have an AI workload whose processing takes from 30 s to 5 min depending on the input, and the results should be returned to the front end for the user to analyze. Can you advise, or provide an example of, how to handle the call without blocking the main thread/uvloop?
Replies: 7 comments 3 replies
-
Here are the three main cases. The first one (`async_will_block`) is what you want to avoid.

```python
from fastapi import FastAPI
from time import sleep
from asyncio import sleep as async_sleep

app = FastAPI()

# Blocking call in an async route.
# Async routes run on the main thread and are expected
# never to block for any significant period of time.
# sleep() is blocking, so the main thread will stall.
@app.get("/async_will_block")
async def async_will_block():
    sleep(10)
    return []

# Blocking call in a sync route.
# Sync routes are run in a separate thread from a threadpool,
# so any blocking will not affect the main thread.
@app.get("/sync_no_block")
def sync_no_block():
    sleep(10)
    return []

# Awaiting coroutines in async routes.
# Awaiting an async function causes it to yield the main thread
# while it waits for an operation to complete, so it does not block the thread.
# asyncio.sleep(), unlike time.sleep(), is an async function, so it can be awaited.
@app.get("/async_no_block")
async def async_no_block():
    await async_sleep(10)
    return []
```

Depending on how you're running your AI workload, you might run into issues with Python threads competing with each other for access to the GIL, which is how the CPython interpreter ensures its memory can't be corrupted by two Python threads manipulating the same memory structures at the same time. This means two Python threads cannot run Python code at the same time (though most numpy number-crunching operations are fair game, since they release the GIL). This is important because it means your main thread could still get stalled by another thread that's doing its calculations in Python. All Python frameworks share this issue, not just FastAPI, but it's still a thing to consider when running long loops.

Issue #1224 also comes to mind. If you have very large data payloads that you're encoding in JSON, they could be stalling the encoder, which cannot release the GIL since it needs access to Python data structures to generate its output.
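The difference between the blocking and non-blocking sleeps is easy to observe outside FastAPI. A minimal timing sketch (not from the thread above, timings are approximate): five tasks that `await asyncio.sleep()` overlap and finish together, while five tasks that call `time.sleep()` serialize because each one stalls the event loop.

```python
import asyncio
import time

async def blocking():
    time.sleep(0.2)   # holds the event loop hostage for 0.2 s

async def yielding():
    await asyncio.sleep(0.2)   # yields to other tasks while waiting

async def timed(coros):
    # Run the coroutines concurrently and report the wall-clock time.
    t0 = time.perf_counter()
    await asyncio.gather(*coros)
    return time.perf_counter() - t0

# Five yielding tasks overlap: total is roughly 0.2 s.
concurrent = asyncio.run(timed([yielding() for _ in range(5)]))
# Five blocking tasks run back to back: total is roughly 1.0 s.
serial = asyncio.run(timed([blocking() for _ in range(5)]))
```
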
-
1. Configure uvicorn (hypercorn, ...) for a long response time, otherwise it will time out.
2. Put the task in `run_in_threadpool`:

```python
from starlette.concurrency import run_in_threadpool

@app.get("/long_answer")
async def long_answer():
    rst = await run_in_threadpool(my_model.function_b, arg_1, arg_2)
    return rst
```

`run_in_threadpool` is based on threads; if you want to use another process for that, you have to do it yourself. -> encode/starlette#1094

Please close your issue if my answer is clear, thank you.
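The same threadpool offload can be sketched with just the standard library, no Starlette needed, via `loop.run_in_executor`. This is a minimal stand-alone sketch: `blocking_model` is a hypothetical placeholder for the real model call, not anything from the thread above.

```python
import asyncio
import time

def blocking_model(x, y):
    # Hypothetical stand-in for a slow, blocking model call.
    time.sleep(0.1)
    return x + y

async def main():
    loop = asyncio.get_running_loop()
    # Offload to the default ThreadPoolExecutor (executor=None)
    # so the event loop stays free while the call runs.
    result = await loop.run_in_executor(None, blocking_model, 2, 3)
    return result

print(asyncio.run(main()))  # → 5
```

`run_in_threadpool` is essentially a convenience wrapper around this pattern that also preserves context variables across the thread boundary.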
-
Thanks, that helps.
-
I don't understand 100% why I should be doing this if I'm doing the following, as suggested by @sm-Fifteen. What is the benefit of running this in a thread pool? Could anyone clarify? Thanks in advance 🙏
-
There isn't a benefit. One could argue that there is even a cost to it (threads are expensive to manage). Note that when you have a long-running task and you don't want it to block the main thread, you should run it in a separate process. Python only executes one thread at a time, so if your thread is not releasing the GIL, you are still blocking the main thread.
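A minimal sketch of the separate-process advice, using the standard library's `ProcessPoolExecutor` with `run_in_executor`. The function names here are illustrative, and `cpu_heavy` is a deliberately GIL-bound pure-Python loop that would stall any thread in the same process:

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n):
    # A pure-Python loop holds the GIL the whole time,
    # so it belongs in another process, not another thread.
    total = 0
    for i in range(n):
        total += i * i
    return total

async def main():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        # The event loop stays responsive while the child
        # process does the number crunching.
        return await loop.run_in_executor(pool, cpu_heavy, 10_000)

if __name__ == "__main__":
    print(asyncio.run(main()))  # → 333283335000
```

Note that the worker function must be importable (defined at module top level) so it can be pickled and sent to the child process.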
-
@JarroVGIT Is it possible to make the "long-running task" asynchronous, if I neither want to put it into a separate thread nor want it to block the main thread? How can we do that?
-
Well, that depends on what kind of task it is. Is it CPU-bound (e.g. lots of calculations)? Then, as noted above, you could run it in a separate process.