Request interrupted during graceful restarting due to the signal #1337

Closed
cadina opened this issue Aug 25, 2016 · 7 comments

@cadina

cadina commented Aug 25, 2016

@xiaost's workaround is to block signals before processing a request and unblock after finishing the request.

Since this is a very common problem, is there any side effect that prevents Gunicorn from handling the signal like this by default?
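
For reference, a minimal sketch of that workaround, assuming Python 3.3+ where signal.pthread_sigmask is available (handle_request and the echo body are hypothetical stand-ins for the worker's request handling, not Gunicorn's API):

    import signal

    SHUTDOWN_SIGNALS = {signal.SIGTERM, signal.SIGINT}

    def handle_request(conn):
        # Block the shutdown signals so they stay pending instead of
        # interrupting the request that is currently in flight.
        signal.pthread_sigmask(signal.SIG_BLOCK, SHUTDOWN_SIGNALS)
        try:
            data = conn.recv(1024)   # read the request
            conn.sendall(data)       # write the response (an echo, for the sketch)
        finally:
            # Unblocking delivers any pending SIGTERM/SIGINT here, only after
            # the response has been fully written.
            signal.pthread_sigmask(signal.SIG_UNBLOCK, SHUTDOWN_SIGNALS)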

@xiaost

xiaost commented Aug 25, 2016

gunicorn calls signal.siginterrupt, which ensures syscalls are restarted.
This does not work when the socket timeout is > 0 (calls that wait with a timeout are not restarted even with SA_RESTART), and solutions to the EINTR issue seem to be platform-specific.
Rather than sending SIGTERM to the worker, a better approach would be to use a tmpfile (setting an exit byte) for graceful shutdown.
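
To make the tmpfile suggestion concrete, a generic sketch (not Gunicorn code) of signaling shutdown through a shared control file instead of SIGTERM; the worker only checks the file between requests, so an in-flight request is never interrupted. The function names, the ctl_path parameter and the 1-second poll interval are made up for illustration:

    import os
    import socket

    # Arbiter side: request a graceful shutdown by writing an "exit byte"
    # to a control file shared with the worker.
    def request_shutdown(ctl_path):
        with open(ctl_path, "wb") as f:
            f.write(b"\x01")

    # Worker side: check the control file between requests, so the request
    # currently being handled is never cut short by a signal.
    def serve(listener, handle_request, ctl_path):
        listener.settimeout(1.0)  # wake up periodically to check the file
        while not (os.path.exists(ctl_path) and os.path.getsize(ctl_path) > 0):
            try:
                conn, _ = listener.accept()
            except socket.timeout:
                continue          # no request; re-check the exit byte
            handle_request(conn)  # runs to completion before re-checking
            conn.close()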

@tilgovi
Collaborator

tilgovi commented Sep 22, 2016

If someone can suggest an action here, we can try to make a change, but I am unable to decipher what this issue is asking for.

Gunicorn already uses siginterrupt when available, and PEP 475 recently introduced new default behavior for Python 3.5 under which system calls should not be interrupted.

Furthermore, while these solutions may not work for every platform and every Python version, we plan to overhaul the inter-process communication for the next major iteration of Gunicorn.

Is there any action that should be taken on this issue or can it be closed?
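
To illustrate the PEP 475 behavior mentioned above, a small self-contained example, assuming Python 3.5+ (nothing here is Gunicorn-specific; SIGALRM merely stages an interruption while accept() is blocking):

    import signal
    import socket
    import threading
    import time

    # The handler returns normally, so under PEP 475 (Python 3.5+) the
    # interrupted accept() below is retried by the interpreter instead of
    # raising InterruptedError.
    signal.signal(signal.SIGALRM, lambda signum, frame: print("SIGALRM handled"))

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    port = srv.getsockname()[1]

    client = threading.Thread(
        target=lambda: (time.sleep(2),
                        socket.create_connection(("127.0.0.1", port)).close()))
    client.start()

    signal.alarm(1)             # SIGALRM arrives while accept() is blocking
    conn, addr = srv.accept()   # survives the signal on 3.5+; EINTR on older versions
    print("accepted", addr)
    conn.close()
    srv.close()
    client.join()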

@ivarref

ivarref commented Oct 4, 2016

Hi @tilgovi, and thanks for a fine framework.

I think the issue here is that worker processes will receive SIGINT/SIGTERM and simply do a sys.exit(0) even if they are in the middle of handling a request.

I have reproduced a similar issue here: https://github.com/ivarref/gunicorn-swarm
Should I open a separate issue for this?

Correct me if I'm wrong.
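
For anyone trying to reproduce this, a hypothetical minimal setup in the spirit of the linked repository (module name, sleep duration and command line are made up for illustration):

    # slow_app.py -- run with: gunicorn --graceful-timeout 3 slow_app:app
    import time

    def app(environ, start_response):
        print("handling request ...")
        time.sleep(5)  # long enough that a SIGTERM to the master lands mid-request
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"done\n"]

Hammering it with requests in one terminal while sending SIGTERM to the master pid in another is roughly what the linked scripts automate.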

@tilgovi
Collaborator

tilgovi commented Oct 4, 2016

INT causes sys.exit but TERM does not.

Is this different from what you see? Can someone tell me what Gunicorn version, OS, Python version, and worker class they see interruptions on?

As far as I know, graceful shutdown works on every worker type we ship on Py2.7 and Py3.4 on Linux. I believe I was testing 3.4 and not 3.5 last time I tried.
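
To make the INT/TERM distinction concrete, a generic sketch of the two handler styles (a simplified illustration, not Gunicorn's actual worker code; listener and handle_request are hypothetical parameters):

    import signal
    import sys

    alive = True

    def handle_int(signum, frame):
        # Immediate shutdown: raises SystemExit right away, even mid-request.
        sys.exit(0)

    def handle_term(signum, frame):
        # Graceful shutdown: only flip a flag; the loop below finishes the
        # current request and then stops accepting new ones.
        global alive
        alive = False

    signal.signal(signal.SIGINT, handle_int)
    signal.signal(signal.SIGTERM, handle_term)

    def serve(listener, handle_request):
        while alive:
            # A real worker also needs a timeout here so it notices the flag
            # when no new connection arrives.
            conn, _ = listener.accept()
            handle_request(conn)  # completes even if SIGTERM arrived meanwhile
            conn.close()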

@ivarref

ivarref commented Oct 5, 2016

Hi @tilgovi and thanks for the fast reply.

Let me focus on SIGTERM.

I added some logging to Gunicorn:master and got the following output:

[2016-10-05 12:33:12 +0000] [24020] [INFO] Starting gunicorn 19.6.0
[2016-10-05 12:33:12 +0000] [24020] [INFO] Listening at: http://0.0.0.0:8080 (24020)
[2016-10-05 12:33:12 +0000] [24020] [INFO] Using worker: sync
[2016-10-05 12:33:12 +0000] [24028] [INFO] Booting worker with pid: 24028

# At this point I started hammering the server with requests in a different terminal. See the commit linked below for details.

handling request ...

# At this point I executed kill 24020 in a different terminal.

[2016-10-05 12:33:35 +0000] [24020] [INFO] Arbiter: Received signal 15
[2016-10-05 12:33:35 +0000] [24020] [INFO] Handling signal: term
[2016-10-05 12:33:35 +0000] [24020] [INFO] Arbiter: Handle SIGTERM
[2016-10-05 12:33:35 +0000] [24020] [INFO] gracefully closing socket ...
handling request ...
[2016-10-05 12:33:36 +0000] [24020] [INFO] waiting before gracefully closing socket, 0
handling request ...
[2016-10-05 12:33:37 +0000] [24020] [INFO] waiting before gracefully closing socket, 1
handling request ...
[2016-10-05 12:33:38 +0000] [24020] [INFO] waiting before gracefully closing socket, 2

[2016-10-05 12:33:38 +0000] [24020] [INFO] all server sockets should be closed by now
# ^^^ Now all listening sockets should have been closed

[2016-10-05 12:33:38 +0000] [24020] [INFO] killing workers ...
[2016-10-05 12:33:38 +0000] [24020] [INFO] killing workers in 3 ...
handling request ...
# ^^^ But the worker keeps accepting new requests

[2016-10-05 12:33:39 +0000] [24020] [INFO] killing workers in 2 ...
handling request ...
# ^^^ And another request is processed

[2016-10-05 12:33:41 +0000] [24020] [INFO] killing workers in 1 ...
[2016-10-05 12:33:41 +0000] [24020] [INFO] killing workers ... GO
[2016-10-05 12:33:41 +0000] [24028] [INFO] Worker calling sys.exit(0) (pid: 24028)
[2016-10-05 12:33:41 +0000] [24028] [INFO] Worker exiting (pid: 24028)
[2016-10-05 12:33:41 +0000] [24020] [INFO] Arbiter: Handle SIGCHLD
[2016-10-05 12:33:41 +0000] [24020] [INFO] Shutting down: Master

Isn't it strange that the worker keeps handling requests after the server socket has been closed?
Am I missing something?

The line "handling request ..." has been de-duplicated where it appeared multiple times between regular log messages.

Here is a fork of Gunicorn:master showcasing this behaviour (I added some logging statements and scripts to reproduce it):
ivarref@b4049c4

Some details of my system:
uname -a => Linux (...) 4.4.0-38-generic #57-Ubuntu SMP Tue Sep 6 15:42:33 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
./venv/bin/python --version => Python 2.7.12
lsb_release -a => Ubuntu 16.04.1 LTS, xenial

@tilgovi What do you think? It looks strange to me that workers keep accepting new requests after the server socket is closed. But I'm no IPC/socket expert, so I may very well have missed something.

Thanks for your time.

@benoitc
Owner

benoitc commented Dec 27, 2016

"Isn't it strange that the worker keeps handling requests after the server socket has been closed?
Am I missing something?"

This is the purpose of graceful shutdown, i.e. we let the worker try to finish the connections it has already accepted within the graceful timeout; if it is not done by then, it is killed:
http://docs.gunicorn.org/en/stable/signals.html#master-process

Maybe some connections were already in the backlog and had been accepted before the workers got the signal?
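
To illustrate the backlog point: the kernel completes the TCP handshake as soon as a connection fits in the listen backlog, so a client can already be connected before the server ever calls accept(). A small self-contained demonstration (the 2-second sleep stands in for a busy worker):

    import socket
    import threading
    import time

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(5)  # up to 5 connections may wait in the kernel backlog
    port = srv.getsockname()[1]

    def client():
        conn = socket.create_connection(("127.0.0.1", port))
        print("client: connected before the server called accept()")
        conn.sendall(b"hello")
        print("client: got reply:", conn.recv(16))
        conn.close()

    t = threading.Thread(target=client)
    t.start()

    time.sleep(2)                # simulate a worker busy with another request
    conn, _ = srv.accept()       # the queued connection is accepted only now
    conn.sendall(conn.recv(16))  # echo back
    conn.close()
    srv.close()
    t.join()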

@benoitc added this to Other in Bugs Feb 26, 2017
@tilgovi
Collaborator

tilgovi commented Apr 28, 2018

Closing in favor of #1236

@tilgovi closed this as completed Apr 28, 2018