
Master process should restart expired workers. #517

Open
gnat opened this issue Dec 10, 2019 · 16 comments

Comments

@gnat

gnat commented Dec 10, 2019

Uvicorn stays alive when all workers are dead. Is this intended behaviour? Could someone explain the benefit of this?

Uvicorn does not seem to restart individual workers if they are dead. Therefore, shouldn't Uvicorn itself exit when there are zero workers left?

A container orchestrator could automatically reload Uvicorn on exit, which would be awesome.

@tomchristie
Member

Uvicorn does not seem to restart individual workers if they are dead. Therefore, shouldn't Uvicorn itself exit when there are zero workers left?

Really what we'd like to do here is have the master process restart the child processes.

@tomchristie tomchristie changed the title Uvicorn stays alive when all workers are dead. Master process should restart expired workers. Dec 10, 2019
@gnat
Author

gnat commented Dec 10, 2019

Either sounds good!

Also would Uvicorn restarting workers obsolete the usage of Gunicorn + Uvicorn workers? Or is Gunicorn also providing additional value that I'm not aware of?

@tomchristie
Member

@gnat For many users, yes.

I think we'd probably want to spin the GunicornWorker out into a separately managed package at that point, and in general just recommend running uvicorn directly.

Gunicorn gives you some more advanced options wrt. signal handling and restarts, but most users probably don't actually need that.

@sposs

sposs commented Jun 22, 2020

Hi,
For me the status of this issue is problematic: on one hand, the docs tell users to run Uvicorn directly (which I did) and advertise the --limit-max-requests option (which I used for safety); on the other hand, the master process does not restart stopped workers, and the recommendation here is to use Gunicorn instead. Which is it? Shouldn't the docs mention that --limit-max-requests does not actually restart the stopped processes? In that case, a little more detail is needed on how to restart them manually. Alternatively, the docs could say this option should not be used, and that Gunicorn should be used instead to provide the same functionality.

@sposs

sposs commented Jun 22, 2020

Actually, I should specify that I used supervisor to run the master process, with the following conf

[program:checklist_server]
command=uvicorn --workers 4 backend.asgi:application --limit-max-requests 1000
user=service
autostart=true
autorestart=true
redirect_stderr=true
stopasgroup=true
killasgroup=true

and if the child processes are stopped, as the master still runs, the service is not restarted...

@diwu1989

diwu1989 commented Oct 2, 2020

The current behavior is very non-obvious and not what gunicorn users expect.
Please change this to auto-restart.

@cb109

cb109 commented Apr 22, 2021

This behaviour is the opposite of what I would expect to happen. The --limit-max-requests mechanism seems helpful for avoiding workers going stale, e.g. from memory leaks. But what good is a uvicorn parent process with all worker processes dead and never restarted? This seems like a misconception that should be fixed so that workers restart automatically after they hit the request limit.

If that's not feasible or out of scope I'd vote to remove the flag altogether honestly and document how to achieve the same with running under gunicorn.

@Kludex
Sponsor Member

Kludex commented Sep 28, 2021

I'll be working on this.

@Kludex Kludex added this to the Version 1.0 milestone Sep 28, 2021
@Kludex
Sponsor Member

Kludex commented May 15, 2022

I have worked on this. The PR was ready (#1205), but there was not enough motivation to review, and I lost the strength to continue with that PR.

Gunicorn is the way to go here.

@Kludex Kludex closed this as completed May 15, 2022
@gnat
Author

gnat commented May 15, 2022

With all due respect @Kludex you're awesome and we highly appreciate your effort on this, but this is not your ticket to close.

If you cannot continue your effort, this issue should go back to unassigned and/or be taken out of the 1.0 milestone, but far too many uvicorn users are encountering this and it still must be documented as an issue.

@tomchristie thoughts?

@Kludex
Sponsor Member

Kludex commented May 16, 2022

We already document that gunicorn should be used in production when using multiple workers.

A PR is welcome to make it clear that the master process doesn't restart dead workers.

When I say I lost strength, I mean for trying to convince others, not for the effort itself...

@Kludex Kludex removed this from the Version 1.0 milestone May 16, 2022
@gnat
Author

gnat commented May 16, 2022

Sorry to hear that.

We already document that gunicorn should be used in production if using multiple workers.

No offense intended, but based on how this was originally triaged, it sounds like uvicorn wasn't supposed to remain a single-worker test server into 1.0.

Before the solution was transformed into a full process manager, many folks would be quite happy with:

  1. Workers are dead? (Or % of)
  2. Gracefully exit Uvicorn so it can be restarted.

This is dead simple and would address the immediate issue. This would be acceptable to anyone running Uvicorn under:

  • systemd
  • docker
  • bash script that auto restarts uvicorn

This is the 90% solution, would be KISS, and we can revisit a full blown process manager in the future. We could go into 1.0 with confidence.
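gnat's two-step proposal could be sketched roughly as follows, using only the standard library. This is an illustrative sketch, not uvicorn's actual code: `subprocess` stands in for real uvicorn workers, and the `spawn_worker`/`supervise` names are hypothetical.

```python
# Sketch of the proposal: the master watches its workers and exits once
# all of them are dead, so an orchestrator (systemd, docker, a bash loop)
# can restart the whole service.
import subprocess
import sys
import time


def spawn_worker(worker_id: int) -> subprocess.Popen:
    """Stand-in for a uvicorn worker process; exits after a short lifetime."""
    return subprocess.Popen(
        [sys.executable, "-c", f"import time; time.sleep({0.1 * worker_id})"]
    )


def supervise() -> int:
    procs = [spawn_worker(i) for i in range(4)]
    # Poll until every worker has exited. (A real master would reap on
    # SIGCHLD rather than busy-poll, and could also apply a "% dead" rule.)
    while any(p.poll() is None for p in procs):
        time.sleep(0.05)
    # All workers are dead: return non-zero so the orchestrator restarts us.
    return 1
```

Under systemd this pairs with `Restart=always`; under docker, with a restart policy on the container.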

@Kludex
Sponsor Member

Kludex commented May 16, 2022

How is that "dead simple", and how can it not be solved using gunicorn? Single-worker test server?

You keep mentioning 1.0, but I was the one who added this issue to that milestone... There is no one else supporting this to achieve that milestone.

I'll even go further...

The way I see it, we have three possible paths:

  • deprecate workers feature, and motivate the gunicorn usage.
  • further develop workers feature (this issue).
  • don't do anything, document better.

As we discussed, cc @euri10 ..

@mrob95

mrob95 commented Feb 25, 2024

Hi, I just hit this in production.

Due to a misconfigured caching policy our uvicorn workers were hitting memory limits and getting killed, then never recovering. Once the workers that were started on start-up ran out, the app hung indefinitely.

An easy repro for this is to put the following into a main.py:

# pip install fastapi uvicorn

import os
import signal

from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def die():
    # SIGKILL this worker; the master never restarts it.
    os.kill(os.getpid(), signal.SIGKILL)

Then run:

uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4

When the server gets hit, one of the workers dies. Once we run out of workers the server hangs:

while true; do echo "Sending request..."; curl http://localhost:8000; done
Sending request...
curl: (52) Empty reply from server
Sending request...
curl: (52) Empty reply from server
Sending request...
curl: (52) Empty reply from server
Sending request...
curl: (52) Empty reply from server
Sending request...
... (hangs indefinitely)

There are no indications in the server logs that anything went wrong.

I have fixed this problem by using gunicorn as a process manager as recommended here:

https://fastapi.tiangolo.com/deployment/server-workers/#gunicorn-with-uvicorn-workers

I'm surprised that a bug like this would be left in a web server implementation for 4+ years. Hanging without servicing requests or exiting is pretty much the worst thing a server can do.

If a robust fix is complicated to implement, I think it would be good to at least emphasise in the start-up logs (as the flask debug server does) that uvicorn alone is not intended for production usage.

@gnat
Author

gnat commented Feb 25, 2024

@mrob95 the lines you need in your Uvicorn to do the job: #1942

I'm surprised that a bug like this would be left in a web server implementation for 4+ years. Hanging without servicing requests or exiting is pretty much the worst thing a server can do.

Yup, it's bad.

@silverjam

[..]

  • deprecate workers feature, and motivate the gunicorn usage.
    [..]
  • don't do anything, document better.

FWIW we got led down this path too by a desire to use the --limit-concurrency feature to make sure we kept memory usage under control in our service and eventually ran into the issue with workers hanging mentioned by @mrob95 because we were also using --limit-max-requests.

I would recommend the following path forward to mitigate the issue:

  • Deprecate --workers and warn (in code and docs) that --limit-max-requests should not be used in combination with it (or explicitly disallow the combination of --limit-max-requests and --workers).
  • Document more explicitly a recommended approach for accessing Uvicorn specific options via Gunicorn (such as the concurrency limits) -- e.g. something like this StackOverflow post or whatever is more appropriate/idiomatic.
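The second bullet could look roughly like the following sketch. Assumptions to verify against your installed version: `LimitedUvicornWorker` and the `myworkers` module name are hypothetical, and `CONFIG_KWARGS` is the class attribute `uvicorn.workers.UvicornWorker` uses to pass extra options through to uvicorn's config (newer uvicorn releases move this worker into a separate package).

```python
# myworkers.py -- hypothetical module: expose uvicorn-specific options
# (e.g. limit_concurrency) when running under gunicorn, by subclassing
# the UvicornWorker class that gunicorn loads via -k.
from uvicorn.workers import UvicornWorker


class LimitedUvicornWorker(UvicornWorker):
    # Merge our limit into the kwargs the worker hands to uvicorn's Config.
    CONFIG_KWARGS = {**UvicornWorker.CONFIG_KWARGS, "limit_concurrency": 50}

# Run under gunicorn, which does restart dead workers and handles
# max-requests recycling itself:
#   gunicorn main:app -w 4 -k myworkers.LimitedUvicornWorker --max-requests 1000
```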

8 participants