
When receiving a SIGTERM supervisors should terminate their processes before joining them #1069

Merged
merged 1 commit into encode:master on Jul 30, 2021

Conversation

sgsabbage
Contributor

Fixes #852

Calling process.terminate() sends a SIGTERM to the child processes, which should handle it appropriately. That said, I have a very naive understanding of multiprocessing, so this may not be the right resolution.
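
For illustration, here is a minimal, self-contained sketch of the idea (not the actual diff in this PR; the names are illustrative): joining a child that was never signalled blocks forever, whereas calling terminate() first delivers a SIGTERM the child can handle, so the join returns.

import multiprocessing
import signal
import time


def worker() -> None:
    # Stand-in for a uvicorn worker: install a SIGTERM handler and loop
    # until it fires, roughly like a server setting a should_exit flag.
    should_exit = False

    def handle_term(sig, frame):
        nonlocal should_exit
        should_exit = True

    signal.signal(signal.SIGTERM, handle_term)
    while not should_exit:
        time.sleep(0.1)


def shutdown(processes):
    # The gist of the fix: send SIGTERM to every child *before* joining.
    # Without the terminate() calls, join() waits on children that were
    # never told to stop, and the supervisor hangs.
    for process in processes:
        process.terminate()  # delivers SIGTERM on POSIX platforms
    for process in processes:
        process.join()


if __name__ == "__main__":
    procs = [multiprocessing.Process(target=worker) for _ in range(2)]
    for p in procs:
        p.start()
    time.sleep(0.5)
    shutdown(procs)  # returns promptly instead of blocking indefinitely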

@euri10
Member

euri10 commented Jun 7, 2021

OK, I tested the PR and, as I expected, it doesn't fix it.
To see it for yourself, run
uvicorn app:app --workers=2
then in another terminal run kill -15 ppid_of_parent_process
and it's still running.

@euri10 euri10 closed this Jun 7, 2021
@sgsabbage
Contributor Author

@euri10 Out of interest, what environment are you running? I'm on Ubuntu 20.04.

I just tried this and it worked for me with the PR and didn't work without it:

Without:

$ uvicorn test:app --port 8001 --workers 2
INFO:     Uvicorn running on http://127.0.0.1:8001 (Press CTRL+C to quit)
INFO:     Started parent process [703]
INFO:     Started server process [707]
INFO:     Waiting for application startup.
INFO:     Started server process [706]
INFO:     Waiting for application startup.
INFO:     ASGI 'lifespan' protocol appears unsupported.
INFO:     Application startup complete.
INFO:     ASGI 'lifespan' protocol appears unsupported.
INFO:     Application startup complete.

$ kill -15 703

Nothing happened, and I had to close it with CTRL-C.

With:

$ uvicorn test:app --port 8001 --workers 2
INFO:     Uvicorn running on http://127.0.0.1:8001 (Press CTRL+C to quit)
INFO:     Started parent process [710]
INFO:     Started server process [714]
INFO:     Waiting for application startup.
INFO:     Started server process [713]
INFO:     Waiting for application startup.
INFO:     ASGI 'lifespan' protocol appears unsupported.
INFO:     Application startup complete.
INFO:     ASGI 'lifespan' protocol appears unsupported.
INFO:     Application startup complete.

$ kill -15 710

INFO:     Shutting down
INFO:     Finished server process [713]
INFO:     Shutting down
INFO:     Finished server process [714]
INFO:     Stopping parent process [710]

I'd be interested to see if this is down to different environments handling signals differently.

I also tested spinning up a Docker container and using docker stop on the container. With the PR in, the container shuts down gracefully. Without it, the container hangs and then gets killed.

@euri10
Member

euri10 commented Jun 7, 2021

This is weird, I'm on Debian.

@euri10 euri10 reopened this Jun 7, 2021
@sgsabbage
Contributor Author

I will try and spin up a fresh Debian environment today to see if there's any strangeness in my setup that I've missed.

@euri10
Member

euri10 commented Jun 7, 2021

I think I may have fucked up the way I sent the 15 signal...
It does indeed work fine using the ppid advertised in our logger, but I was using fzf and for some reason the PID that comes up first is not the ppid when you type uvicorn.

[screenshot: fzf process listing for the uvicorn processes]

this would be awesome

@euri10
Member

euri10 commented Jun 7, 2021

we'd also need to test the behaviour with gunicorn

@sgsabbage
Contributor Author

we'd also need to test the behaviour with gunicorn

I've started looking at testing it with gunicorn as well, but as far as I can find, most of the documentation suggests that gunicorn takes the place of the supervisors in this case: the recommended invocation is gunicorn -k uvicorn.workers.UvicornWorker, which to the best of my knowledge uses neither the reload nor the multiprocessing supervisor.

We're entering areas that I don't have any knowledge of beyond scanning the code, though, so if there are specific deployment scenarios where gunicorn uses either of the supervisors, I'm more than happy to preemptively check them out in case there are any issues!
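
For reference, the deployment pattern referred to above usually looks something like the line below (the module path example_app:app is just a placeholder). In that mode gunicorn's own arbiter forks and signals the worker processes, so neither of uvicorn's supervisors is involved.

$ gunicorn example_app:app -k uvicorn.workers.UvicornWorker --workers 2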

@gmeans

gmeans commented Jun 17, 2021

I found this PR due to an issue with --reload not shutting down with a keyboard interrupt (Ctrl+C). The end result was the Python process still running and my shell hanging.

I installed this branch and this change solved my issue. I'm on macOS 11.4 (Big Sur).

Poetry Dependencies

[tool.poetry.dependencies]
python = "^3.8"
fastapi = "^0.65.2"
python-dotenv = "^0.17.1"
psycopg2-binary = "^2.8.6"
alembic = "^1.6.5"
SQLAlchemy = "^1.4.17"
gunicorn = "^20.1.0"
uvicorn = {git = "https://github.com/sgsabbage/uvicorn.git", branch = "process_terminate", extras = ["standard"]}

Result:

Command in the script that's run: berglas exec -- uvicorn app.server:app --reload

[screenshot: terminal output of the reload shutdown]

This seems to reliably shut down the Python processes, but I do see that "os: process already finished" error regularly. Sometimes I see it output multiple times, but I'm assuming that's a thread thing.

Hope this helps, as I'd really like to see this resolved soon. Let me know if I can help with any other testing.

Thanks!

@euri10
Member

euri10 commented Jun 21, 2021

This seems to reliably shut down the Python processes, but I do see that "os: process already finished" error regularly. Sometimes I see it output multiple times, but I'm assuming that's a thread thing.

Hope this helps, as I'd really like to see this resolved soon. Let me know if I can help with any other testing.

OK, this is interesting @gmeans. It seems like a macOS thing, but I'm not sure; I was not able to reproduce it on Linux. Can you try to reproduce it with --log-level=trace and see if that yields something interesting in your logs? I'd be interested in seeing where that "os: process already finished" happens.

@euri10
Member

euri10 commented Jun 22, 2021

Seems like I can reproduce it in this draft PR, so at least I've got an idea of what's going on:
https://github.com/encode/uvicorn/pull/1090/checks?check_run_id=2884640907#step:5:45

Member

@euri10 euri10 left a comment


Let's roll with this; we can still look at the macOS "issue" afterwards if needed, as this PR alone seems to solve the aforementioned issue! Thanks @sgsabbage

@euri10 euri10 merged commit de53c23 into encode:master Jul 30, 2021
@Asday
Contributor

Asday commented Jul 30, 2021

!!!

When might I be able to see this released on PyPI?

@HansBrende
Contributor

@sgsabbage I believe this implementation was faulty... the BaseReload should not call process.terminate() on shutdown (after it has already received the SIGINT signal that caused the shutdown in the first place). The reason: the child process now receives BOTH a SIGINT followed by a SIGTERM, which has the unfortunate side effect of a very ungraceful shutdown, since the server does a force shutdown after the second signal it receives. FWIW I'm on macOS Catalina (10.15.7).

More info: https://stackoverflow.com/a/31907519/2599133
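
A rough sketch of the handler pattern being described (simplified and assumed from the step-by-step trace below, not copied from uvicorn's source): the first signal requests a graceful shutdown, and any subsequent signal escalates to a forced one, which is why the reloader's extra terminate() makes the shutdown ungraceful.

import signal


class Server:
    # Minimal stand-in for the server-side signal handling described above.
    def __init__(self) -> None:
        self.should_exit = False
        self.force_exit = False

    def handle_exit(self, sig, frame) -> None:
        if self.should_exit:
            # Second signal (here: the SIGTERM sent by the reloader right
            # after the user's SIGINT) -> force an immediate shutdown.
            self.force_exit = True
        else:
            # First signal -> request a graceful shutdown.
            self.should_exit = True


server = Server()
signal.signal(signal.SIGINT, server.handle_exit)
signal.signal(signal.SIGTERM, server.handle_exit)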

Here's an example of what happens:

INFO:     Started reloader process [522] using statreload
INFO:     Started server process [524]
INFO:     Waiting for application startup.
INFO:     Application startup complete.

Now...

  1. I send a SIGINT via ^C
  2. uvicorn.server.Server.handle_exit (child process) receives the SIGINT -> sets self.should_exit to True
  3. uvicorn.supervisors.basereload.BaseReload.signal_handler receives the SIGINT -> calls self.process.terminate()
  4. uvicorn.server.Server.handle_exit (child process) receives the SIGTERM -> sets self.force_exit to True
INFO:     Shutting down
INFO:     Finished server process [524]
ERROR:    Exception in 'lifespan' protocol
Traceback (most recent call last):
  File "/Users/hansbrende/miniconda3/envs/facade-api/lib/python3.8/site-packages/starlette/exceptions.py", line 58, in __call__
    await self.app(scope, receive, send)
  File "/Users/hansbrende/miniconda3/envs/facade-api/lib/python3.8/site-packages/starlette/routing.py", line 569, in __call__
    await self.lifespan(scope, receive, send)
  File "/Users/hansbrende/miniconda3/envs/facade-api/lib/python3.8/site-packages/starlette/routing.py", line 544, in lifespan
    await receive()
  File "/Users/hansbrende/miniconda3/envs/facade-api/lib/python3.8/site-packages/uvicorn/lifespan/on.py", line 135, in receive
    return await self.receive_queue.get()
  File "/Users/hansbrende/miniconda3/envs/facade-api/lib/python3.8/asyncio/queues.py", line 163, in get
    await getter
asyncio.exceptions.CancelledError
INFO:     Stopping reloader process [522]

@Kludex
Sponsor Member

Kludex commented Nov 24, 2021

More on #1165

Kludex pushed a commit to sephioh/uvicorn that referenced this pull request Oct 29, 2022
Successfully merging this pull request may close these issues.

Sending SIGTERM to parent process when running with --workers hangs indefinitely
6 participants