Missing unix socket after restart #1524

Closed
si14 opened this issue Jun 7, 2017 · 17 comments

@si14

si14 commented Jun 7, 2017

(may be related to #1523 or #1298)

Unix domain socket gets deleted when Gunicorn is restarted as a service:

ubuntu@host:~$ sudo service gunicorn --full-restart
ubuntu@host:~$ ls /run/gunicorn/
pid
ubuntu@host:~$ sudo service gunicorn stop
ubuntu@host:~$ sudo service gunicorn start
ubuntu@host:~$ ls /run/gunicorn/
pid  socket
ubuntu@host:~$ sudo service gunicorn restart
ubuntu@host:~$ ls /run/gunicorn/
pid
ubuntu@host:~$ sudo service gunicorn stop
ubuntu@host:~$ sudo service gunicorn start
ubuntu@host:~$ ls /run/gunicorn/
pid  socket

Gunicorn is run by systemd; the service is configured as recommended in the docs.

Full service config (slightly redacted):

[Unit]
Description=gunicorn daemon
Requires=gunicorn.socket
After=network.target

[Service]
PIDFile=/run/gunicorn/pid
User=some_user
Group=some_group
RuntimeDirectory=gunicorn
WorkingDirectory=some_path
ExecStart=venv_path/bin/gunicorn --pid /run/gunicorn/pid   \
          --workers 6  --max-requests 1000  \
          --name some_gunicorn  --statsd-host localhost:8125  \
          --bind unix:/run/gunicorn/socket app.wsgi
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID
PrivateTmp=true

Environment=DATABASE_URL=...
Environment=DJANGO_SETTINGS_MODULE=...

[Install]
WantedBy=multi-user.target

Gunicorn version is 19.7.1

@valq7711

valq7711 commented Aug 2, 2017

Hi, I had the same problem, and after two days of reading the systemd docs I understood why this happens. The core problem is that the Unix socket file (not to be confused with the gunicorn.socket unit) is placed in the runtime directory, which, together with PrivateTmp=true, means it gets removed while the daemon is stopping (restart = stop + start). My solution: just place the socket anywhere else, i.e. in gunicorn.socket change ListenStream=/run/gunicorn/socket to ListenStream=/run/gunicorn.sock (for example), and change --bind in gunicorn.service to match (--bind=unix:/run/gunicorn.sock).
I also added two extra lines to the [Socket] section (just in case):

SocketUser=someuser
SocketGroup=somegroup 

@alizain

alizain commented Sep 7, 2017

@valq7711, thanks for the response. I spent quite some time debugging this before I found this issue.

@benoitc, is this the expected solution to working with systemd? If so, the docs should be updated.

@benoitc
Owner

benoitc commented Sep 7, 2017 via email

@alizain

alizain commented Sep 7, 2017

I'll take a look on the weekend, for sure!

@tilgovi
Collaborator

tilgovi commented Sep 7, 2017

I do not know of any change I made that would fix this that is not already on 19.7.1.

@benoitc
Owner

benoitc commented Sep 7, 2017

@tilgovi I was thinking of #1310, but that is different. I misread the ticket.

Anyway, I think it's expected that a socket created by gunicorn is deleted when gunicorn is stopped. Why should it survive in that case? fd inheritance is different, and I'm not sure that the Requires= directive is actually creating that socket. In any case, the inherited fd should be passed differently.

@alizain

alizain commented Sep 7, 2017

The problem is:

  • the gunicorn.socket unit creates a socket in the /run/gunicorn directory.
  • when the first request comes into the socket, systemd starts up the gunicorn.service unit to respond, which wipes the /run/gunicorn directory.
  • gunicorn.service complains about not having a socket file, and dies.

This happens when gunicorn is configured exactly the way it is explained in gunicorn's documentation about deploying with systemd.

I don't know enough about systemd to say that it is a functionality issue. I'm guessing it's expected behaviour with systemd. If so, gunicorn's documentation should be updated.

@benoitc
Owner

benoitc commented Sep 7, 2017

@alizain thanks for the problem description.

I'm not a Linux/systemd user myself, so I will need to check as well. Reading the current docs, it seems that the socket is created via /etc/systemd/system/gunicorn.socket.

At this point I need to think about what the correct behaviour is :) I will need to do some research.

@benoitc
Owner

benoitc commented Sep 7, 2017

If I'm correct, systemd is the one listening on the socket, so we should probably only use the FD given by systemd, but I may be wrong.
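
For reference, the descriptor-passing convention mentioned here can be sketched in Python. This is a minimal illustration of the LISTEN_PID / LISTEN_FDS environment protocol that systemd documents for socket activation, not Gunicorn's own code. The function names inherited_fds and inherited_sockets are hypothetical, and relying on socket.socket(fileno=...) to detect the socket family assumes Python 3.7 or later.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # systemd passes inherited descriptors starting at fd 3


def inherited_fds(environ=None, pid=None):
    """Return the descriptors systemd passed to this process, or [] if none apply."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []  # the descriptors were meant for a different process
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []  # variables missing or malformed: not socket-activated
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + max(count, 0)))


def inherited_sockets():
    """Wrap each inherited descriptor in a socket object without binding again."""
    return [socket.socket(fileno=fd) for fd in inherited_fds()]
```

With this approach the process never creates or unlinks the socket file itself; it only accepts connections on descriptors that systemd already opened.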

@valq7711

valq7711 commented Sep 7, 2017

Hi!
http://0pointer.de/blog/projects/socket-activation.html:

Socket activation makes it possible to start all four services completely simultaneously, without any kind of ordering. Since the creation of the listening sockets is moved outside of the daemons themselves

Thus, a socket isn't a private part of any service (it's only a bus that provides communication), so the private directory of a service is an illogical place for it (in my opinion).

@tilgovi tilgovi self-assigned this Oct 31, 2017
@ErnstHaagsman

I recently encountered this bug as well. Here's my project, which reproduces the issue in a Vagrant box: https://github.com/ErnstHaagsman/grouporder/tree/a14dd4078d9fd97f6a9d8350a49da309b107180a

By running:

sudo systemctl restart gunicorn
systemctl status gunicorn

It looks like gunicorn thinks the socket works ("Feb 23 18:14:25 vagrant gunicorn[22503]: [2018-02-23 18:14:25 +0000] [22503] [INFO] Listening at: unix:/run/gunicorn/gunicorn.sock (22503)")

However, ls /run/gunicorn shows that there's only a pid file, and no socket file there.
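
The combination described above (a process that reports it is listening while no socket file exists) is consistent with how Unix domain sockets behave once their path is removed. The following is a minimal sketch, not a reproduction of what systemd does: it binds a socket in a temporary directory, deletes the file, and checks what still works. The function name demo_unlinked_socket and the file name demo.sock are made up for the illustration.

```python
import os
import socket
import tempfile


def demo_unlinked_socket():
    """Bind a Unix socket, delete its file, and report what still works."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)
        server.listen(1)

        os.unlink(path)  # stands in for the runtime directory being wiped

        still_listening = server.fileno() >= 0  # the descriptor is still open
        path_exists = os.path.exists(path)

        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            client.connect(path)
            connect_error = None
        except OSError as exc:
            connect_error = type(exc).__name__  # new clients cannot find the path
        finally:
            client.close()
            server.close()
    return still_listening, path_exists, connect_error
```

The server keeps a valid listening descriptor, but new clients have no path to connect to, which would explain a log line saying "Listening at" alongside an empty directory.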

@pe224

pe224 commented Jun 5, 2018

Can confirm that deployment with systemd as described in the current docs doesn't work.
Moving the socket as suggested by @valq7711 works for me.

@valintepes

Ran into the same issue. In my case I set PrivateTmp=false, and was able to route to the Python app on the first request each time I killed the service and restarted the socket. Subsequent curls fail, with nginx saying "no such file or directory".

I can also confirm @valq7711 's suggestion worked.

@ykeyani

ykeyani commented Aug 24, 2018

I spent quite a few hours narrowing it down to the socket as well, using the systemd example in the docs. It's definitely not obvious, and a note should probably be added to the example.

@valq7711 's solution works.

ju1m pushed a commit to ju1m/nixpkgs that referenced this issue Sep 14, 2018
When /run/rmilter/ is wiped out the rmilter.sock is removed,
causing postfix to fail to contact rmilter.
Probably like described here: benoitc/gunicorn#1524 (comment)
@arianitu
Contributor

Confirmed changing path from /run/gunicorn/socket to /run/gunicorn.sock fixes the problem.

Can we get the docs updated for systemd?

@tilgovi
Collaborator

tilgovi commented Oct 12, 2018

@arianitu I would be happy to review and merge a PR! Look in docs/source/deploy.rst.

@arianitu
Contributor

@tilgovi opened #1895
