Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix uvicorn gunicorn worker class restarts when USR1 is issued #1565

Merged
merged 2 commits into from Aug 26, 2022

Conversation

wonjoonSeol-WS
Copy link
Contributor

@wonjoonSeol-WS wonjoonSeol-WS commented Jul 11, 2022

Copied from my original post #1559 (comment)

Steps to reproduce

Minimal Fastapi code

import logging
import signal
import time
from logging import handlers

from fastapi import FastAPI
from starlette.responses import PlainTextResponse

logging.basicConfig(level=logging.DEBUG, filename='./log.txt')
logger = logging.getLogger()
log_handler = logging.handlers.WatchedFileHandler('./log.txt')
logger.addHandler(log_handler)

app = FastAPI(title="uvicorn-test")

logger = logging.getLogger(__name__)


@app.on_event("startup")
async def on_startup():
    # signal.signal(signal.SIGUSR1, signal.SIG_IGN)
    ctime = time.ctime()
    logger.debug(f"uvicorn-test start up {ctime}")


@app.on_event("shutdown")
async def on_shutdown():
    ctime = time.ctime()
    logger.debug(f"uvicorn-test shutdown {ctime}")


@app.get("/health_check", response_class=PlainTextResponse)
def health_check():
    ctime = time.ctime()
    logger.debug(f"logging test {ctime}")

Run with gunicorn

gunicorn -w 1 -k uvicorn.workers.UvicornWorker --pid /tmp/log/gunicorn.pid main:app

Set up logging

while [ 0 ]; do
    curl 'http://localhost:8000/health_check'
    echo
    sleep 1
done

Set up logrotate

/etc/logrotate.d/uvicorn-test

{file_location.txt}
{
    daily
    dateext
    rotate 30
    missingok
    notifempty
    create 0644 user group
    compress
    delaycompress
    postrotate
        kill -USR1 $(cat /tmp/log/gunicorn.pid)
    endscript
}

Send USR1 signal

kill -USR1 $(cat /tmp/log/gunicorn.pid)
or force logrotate

sudo logrotate --force /etc/logrotate.d/uvicorn-test

We can see the process is terminated:

~/w/uvicorn_mvp ❯ gunicorn -w 1 -k uvicorn.workers.UvicornWorker --pid /tmp/log/gunicorn.pid main:app
[2022-07-11 02:41:14 +0900] [22563] [INFO] Starting gunicorn 20.1.0
[2022-07-11 02:41:14 +0900] [22563] [INFO] Listening at: http://127.0.0.1:8000 (22563)
[2022-07-11 02:41:14 +0900] [22563] [INFO] Using worker: uvicorn.workers.UvicornWorker 
[2022-07-11 02:41:14 +0900] [22565] [INFO] Booting worker with pid: 22565
[2022-07-11 02:41:15 +0900] [22565] [INFO] Started server process [22565]
[2022-07-11 02:41:15 +0900] [22565] [INFO] Waiting for application startup.
[2022-07-11 02:41:15 +0900] [22565] [INFO] Application startup complete.


[2022-07-11 02:45:32 +0900] [22563] [INFO] Handling signal: usr1
[2022-07-11 02:45:32 +0900] [22563] [WARNING] Worker with pid 22565 was terminated due to signal 30 <<-
[2022-07-11 02:45:32 +0900] [23506] [INFO] Booting worker with pid: 23506
[2022-07-11 02:45:32 +0900] [23506] [INFO] Started server process [23506]
[2022-07-11 02:45:32 +0900] [23506] [INFO] Waiting for application startup.
[2022-07-11 02:45:32 +0900] [23506] [INFO] Application startup complete.

Proposal: Change uvicorn/worker.py to the following code:

    def init_signals(self) -> None:
        # Reset signals so Gunicorn doesn't swallow subprocess return codes
        # other signals are set up by Server.install_signal_handlers()
        # See: https://github.com/encode/uvicorn/issues/894
        for s in self.SIGNALS:
            signal.signal(s, signal.SIG_DFL)

        # Remove me if SIGUSR1 is handled in server.py
        signal.signal(signal.SIGUSR1, self.handle_usr1)
        # Don't let SIGUSR1 disturb active requests by interrupting system calls
        signal.siginterrupt(signal.SIGUSR1, False)

Re-run test

~/w/uvicorn_mvp ❯ gunicorn -w 1 -k uvicorn.workers.UvicornWorker --pid /tmp/log/gunicorn.pid main:app
[2022-07-11 02:46:42 +0900] [24005] [INFO] Starting gunicorn 20.1.0
[2022-07-11 02:46:42 +0900] [24005] [INFO] Listening at: http://127.0.0.1:8000 (24005)
[2022-07-11 02:46:42 +0900] [24005] [INFO] Using worker: uvicorn.workers.UvicornWorker
[2022-07-11 02:46:42 +0900] [24006] [INFO] Booting worker with pid: 24006
[2022-07-11 02:46:42 +0900] [24006] [INFO] Started server process [24006]
[2022-07-11 02:46:42 +0900] [24006] [INFO] Waiting for application startup.
[2022-07-11 02:46:42 +0900] [24006] [INFO] Application startup complete.
[2022-07-11 02:46:46 +0900] [24005] [INFO] Handling signal: usr1

Process is no longer terminated, logrotates fine and log.txt is still appended after the file rotation.
image

tail -F log.txt
logging test Mon Jul 11 02:46:42 2022
logging test Mon Jul 11 02:46:43 2022
logging test Mon Jul 11 02:46:44 2022
logging test Mon Jul 11 02:46:45 2022
logging test Mon Jul 11 02:46:46 2022 <<- SIGUSR1
logging test Mon Jul 11 02:46:47 2022
logging test Mon Jul 11 02:46:48 2022

Clearly this change only impacts Uvicorn's gunicorn worker class. i.e. kill -USR1 pid after running it with uvicorn only
(uvicorn --host 0.0.0.0 --port 8000 main:app), the worker is still terminated even with the work above.

But USR1: Reopen the log files behaviour is only documented on Gunicorn and undocumented on Uvicorn master (yet) so this could be expected behaviour(?). Otherwise we would need to create an async version of handle_usr1 method in server.py.

Btw, we can't simply fully copy gunicorn's init_signals method as suggested by yinkh in #896 (comment)

As the below signals are not compatible with asyncio.

        signal.signal(signal.SIGQUIT, self.handle_quit)
        signal.signal(signal.SIGWINCH, self.handle_winch)
        signal.signal(signal.SIGABRT, self.handle_abort)

We would end up with errors like below.

~/w/uvicorn_mvp ❯ gunicorn -w 1 -k uvicorn.workers.UvicornWorker main:app
[2022-07-10 23:49:12 +0900] [63819] [INFO] Starting gunicorn 20.1.0
[2022-07-10 23:49:12 +0900] [63819] [INFO] Listening at: http://127.0.0.1:8000 (63819)
[2022-07-10 23:49:12 +0900] [63819] [INFO] Using worker: uvicorn.workers.UvicornWorker
[2022-07-10 23:49:12 +0900] [63820] [INFO] Booting worker with pid: 63820
[2022-07-10 23:49:12 +0900] [63820] [INFO] Started server process [63820]
[2022-07-10 23:49:12 +0900] [63820] [INFO] Waiting for application startup.
[2022-07-10 23:49:12 +0900] [63820] [INFO] Application startup complete.

[2022-07-10 23:49:59 +0900] [63820] [ERROR] Traceback (most recent call last):
  File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/base_events.py", line 628, in run_until_complete
    self.run_forever()
  File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/base_events.py", line 595, in run_forever
    self._run_once()
  File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/base_events.py", line 1845, in _run_once
    event_list = self._selector.select(timeout)
  File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/selectors.py", line 562, in select
    kev_list = self._selector.control(None, max_ev, timeout)
  File "/Users/-/workspace/uvicorn_mvp/.venv/lib/python3.10/site-packages/gunicorn/workers/base.py", line 198, in handle_quit
    sys.exit(0)
SystemExit: 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/-/workspace/uvicorn_mvp/.venv/lib/python3.10/site-packages/starlette/routing.py", line 638, in lifespan
    await receive()
  File "/Users/-/workspace/uvicorn_mvp/.venv/lib/python3.10/site-packages/uvicorn/lifespan/on.py", line 137, in receive
    return await self.receive_queue.get()
  File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/queues.py", line 159, in get
    await getter
asyncio.exceptions.CancelledError

[2022-07-10 23:49:59 +0900] [63820] [INFO] Worker exiting (pid: 63820)
[2022-07-10 23:49:59 +0900] [64039] [INFO] Booting worker with pid: 64039
[2022-07-10 23:49:59 +0900] [64039] [INFO] Started server process [64039]
[2022-07-10 23:49:59 +0900] [64039] [INFO] Waiting for application startup.
[2022-07-10 23:49:59 +0900] [64039] [INFO] Application startup complete.
^C[2022-07-10 23:50:49 +0900] [63819] [INFO] Handling signal: int
[2022-07-10 23:50:49 +0900] [64039] [ERROR] Traceback (most recent call last):
  File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/base_events.py", line 628, in run_until_complete
    self.run_forever()
  File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/base_events.py", line 595, in run_forever
    self._run_once()
  File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/asyncio/base_events.py", line 1845, in _run_once
    event_list = self._selector.select(timeout)
  File "/Users/-/.pyenv/versions/3.10.2/lib/python3.10/selectors.py", line 562, in select
    kev_list = self._selector.control(None, max_ev, timeout)
  File "/Users/-/workspace/uvicorn_mvp/.venv/lib/python3.10/site-packages/gunicorn/workers/base.py", line 198, in handle_quit
    sys.exit(0)
SystemExit: 0
...

Testing other signals such as TTIN, TTOU, QUIT, INT, TERM on gunicorn's signal handling page (https://docs.gunicorn.org/en/stable/signals.html) seem to work fine.

I have no idea if these two lines from gunicorn worker base class https://github.com/benoitc/gunicorn/blob/027f04b4b4aee4f50b980a7158add0feaf4c1b29/gunicorn/workers/base.py#L185
are needed though. It seem to work fine without it. So I excluded it for now:

        if hasattr(signal, 'set_wakeup_fd'):
            signal.set_wakeup_fd(self.PIPE[1])

Thanks.

@wonjoonSeol-WS
Copy link
Contributor Author

wonjoonSeol-WS commented Jul 11, 2022

kindly requesting a review from our last modifier @tiangolo 🙏 (original modification ref issue page: #894)

Copy link
Sponsor Member

@Kludex Kludex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't test locally, but the lines here are the same as the parent class, so it should be fine.
Ref.: https://github.com/benoitc/gunicorn/blob/027f04b4b4aee4f50b980a7158add0feaf4c1b29/gunicorn/workers/base.py#L167-L182

I'll test locally later today in case @euri10 doesn't do it first. 👀

@Kludex Kludex added this to the Version 0.19.0 milestone Jul 13, 2022
@@ -72,6 +72,11 @@ def init_signals(self) -> None:
for s in self.SIGNALS:
signal.signal(s, signal.SIG_DFL)

# Remove me if SIGUSR1 is handled in server.py
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't get this comment here... This doesn't affect uvicorn standalone. 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was referring to the comment from tiangolo.

This is because the UvicornWorker, which inherits from the base Gunicorn worker, declares a method init_signals() (overriding the parent method) but doesn't do anything. I suspect it's because the signal handlers are declared in the Server.install_signal_handlers() with compatibility with asyncio.

But you are right, I was thinking that install_signal_handlers() in server.py could call init_signal in future. But this is unlikely as init_signal (esp signal.signal) isn't compatible with asyncio. I will remove this comment. 03a57af

Copy link
Sponsor Member

@Kludex Kludex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @wonjoonSeol-WS 🙏

@Kludex Kludex merged commit e4e65b9 into encode:master Aug 26, 2022
Kludex pushed a commit to sephioh/uvicorn that referenced this pull request Oct 29, 2022
…e#1565)

* Fix uvicorn gunicorn worker class restarts when USR1 is issued

* Remove comment
Kludex pushed a commit that referenced this pull request Oct 29, 2022
* Fix uvicorn gunicorn worker class restarts when USR1 is issued

* Remove comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants