Socket removed after reload #1775
Also, not sure how this is related, but maybe it can help pinpoint the issue(s) with sockets: reloading puma also removes the socket on v3.12.1 with ruby 2.6.2 and 2.5.5 when application preloading is enabled. In that situation there is no error message, since there is no additional attempt to remove the socket, but the socket disappears after the first reload too. The systemd puma documentation states that preload is compatible:
But the sample configurations do not specify preload, so maybe this wasn't really tested. Maybe I'm looking in the wrong place, but I didn't see any tests of signal management & reload management. Maybe we should add integration tests covering these usages in https://github.com/puma/puma/blob/master/test/test_integration.rb ?
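To illustrate what such a test would assert, here is a minimal, stdlib-only sketch (not an actual puma integration test): a forked child stands in for the puma master, binds a unix socket, and receives a reload signal; the test then checks the socket file still exists and accepts connections. The names `fake_master` and `socket_survives_reload?` are illustrative, not puma's API.

```ruby
require "socket"
require "tmpdir"

# Sketch of the property a puma integration test for reload should check:
# after a USR1 "reload", the unix socket file must still exist and accept
# connections. A forked child simulates the master process.
def socket_survives_reload?
  Dir.mktmpdir do |dir|
    path = File.join(dir, "puma.sock")

    fake_master = fork do
      server = UNIXServer.new(path)
      trap("USR1") { }                    # a real master would restart workers here
      trap("TERM") { server.close; exit } # note: close does NOT unlink the file
      loop { sleep }
    end

    sleep 0.5                             # crude wait for the child to bind
    Process.kill("USR1", fake_master)     # simulate the reload signal
    sleep 0.5

    survived = File.socket?(path)
    UNIXSocket.new(path).close if survived  # it still accepts connections
    Process.kill("TERM", fake_master)
    Process.wait(fake_master)
    survived
  end
end

puts socket_survives_reload?  # => true
```

A real puma test would spawn an actual puma process and send the signals via `pumactl` or `Process.kill`, but the assertion would be the same.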
This issue might be better described as: "phased restart" is not currently compatible with systemd socket activation. The systemd documentation you quote doesn't actually claim otherwise: cluster mode, preload, and socket activation work with systemd-level restarts (graceful, but non-rolling: the master and all workers restart in parallel). Assuming this is true, I'm not sure if the limitation can/should be fixed. If you really need phased restart (e.g. because your load time is too long), then you might need, as a workaround, to disable systemd socket activation.
Ok, so if I understand correctly, when using puma with systemd socket activation, puma should not be reloaded via a reload signal; this isn't supported by design? It should only be stopped then started by systemd, and systemd will keep the socket active during the stop/start transition?
I'm personally not sure what the answer to your first question, "…by design?", is. That puma phased/rolling restart unlinks the UNIX socket file on reload (which I must presume has some motivation) is not going to work with systemd socket activation. There might be some permission scheme that would disallow puma from doing this, but I'd guess that would abort the reload. As for your second question: yes, systemd will gracefully start a new worker or cluster of workers and shut down the old without any request loss. The major difference is that the entire cluster is started in parallel (like a cold start) instead of restarting workers serially, as with phased/rolling restart. That can be an advantage, in that it more likely avoids a single user hitting different deployed versions on subsequent requests within a short time period. The linked restart doc discusses this for migrations, for example. A related question: could you also work around this by using TCP instead of a UNIX socket? I see lots of puma issues specific to UNIX sockets where the OP isn't super clear why they want UNIX sockets in the first place. A minor possible performance advantage would be one reason, but IMO likely not measurably significant, at least at peak throughput if Rails is in use.
Also, regarding your original comment on testing: yes, I agree there are many test coverage holes in this area and the potential for regressions is high. As part of my work on puma+systemd, I have https://github.com/dekellum/lionfish . I just updated this for puma 3.12.1 to sanity-check that recent releases haven't totally hosed systemd and (TCP) socket activation. Parts of this could be reused for puma tests and CI. It's unlikely I'll find time soon, but I think my next step would be to add CI to lionfish; if that looks good, I could then repackage it for a puma PR. I'd accept a PR for lionfish, if you wanted to work on it.
Well, I was kind of asking myself the same question: UNIX sockets seem to be suggested in the documentation, but they seem to cause many issues, and the performance gains are not really well documented either. There is a security advantage in that no port is opened, but I don't think that is a very good argument either, since you should configure your binding & firewall correctly anyway. I think I'll move to a 0.0.0.0 port socket.
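For reference, that switch is a small change in the puma config. A sketch, where the `PORT` environment variable and the default of 3000 are assumptions, not values from this thread:

```ruby
# Instead of a unix socket, e.g.:
#   bind ENV.fetch('PUMA_SOCK') { 'unix://tmp/sockets/puma.sock' }
# bind a TCP port on all interfaces. `PORT` and the 3000 default are
# hypothetical; adjust to your deployment.
port ENV.fetch('PORT') { 3000 }, '0.0.0.0'
# equivalently:
# bind 'tcp://0.0.0.0:3000'
```

With a TCP bind there is no socket file for a restart to unlink, which sidesteps this issue entirely.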
So hopefully there are two possible workarounds, but it does seem like #1685 must have caused this regression. Reading there that …
Any progress on this one? Seems like Puma doesn't work with systemd socket activation anymore as of 4.0.0. |
Can you provide more details @calmyournerves? The original issue here was on 3.12. |
Hey @nateberkopec! I think this issue was reported against master. The changes in #1685 (which was already merged when this issue was created) seem to cause the socket to be removed when Puma gets stopped. As described in the first comment, removing the logic around `@own_binder` appears to be the cause.
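A stdlib-only demonstration (not puma code) of why explicit unlinking exists in the first place: merely closing a `UNIXServer` leaves the socket file on disk, so a server that created the file has to unlink it itself. The regression is doing that unlink unconditionally, even for a socket the process did not create (e.g. one inherited via systemd socket activation).

```ruby
require "socket"
require "tmpdir"

# Closing a unix server socket does not remove its file; an explicit
# File.unlink is needed. This is the cleanup a binder performs on close,
# and the step that must be skipped for inherited (activated) sockets.
def close_leaves_file?
  Dir.mktmpdir do |dir|
    path = File.join(dir, "demo.sock")
    server = UNIXServer.new(path)  # creates the socket file
    server.close                   # closes the fd; the file stays behind
    left = File.exist?(path)
    File.unlink(path) if left      # the explicit cleanup step
    left
  end
end

puts close_leaves_file?  # => true
```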
Ran into this updating from 3.12.1 to 4.0.0 using socket activation. Socket is removed on restart. |
ruby 2.6.3

config/puma.rb:

```ruby
# frozen_string_literal: true

# Puma can serve each request in a thread from an internal thread pool.
# The `threads` method setting takes two numbers: a minimum and maximum.
# Any libraries that use thread pools should be configured to match
# the maximum value specified for Puma. Default is set to 5 threads for minimum
# and maximum; this matches the default thread size of Active Record.
#
max_threads_count = ENV.fetch('RAILS_MAX_THREADS') { 5 }
min_threads_count = ENV.fetch('RAILS_MIN_THREADS') { max_threads_count }
threads min_threads_count, max_threads_count

# Specifies the `environment` that Puma will run in.
#
environment ENV.fetch('RAILS_ENV') { 'development' }

# early_hints true

# Specifies the number of `workers` to boot in clustered mode.
# Workers are forked webserver processes. If using threads and workers together
# the concurrency of the application would be max `threads` * `workers`.
# Workers do not work on JRuby or Windows (both of which do not support
# processes).
#
workers ENV.fetch('WEB_CONCURRENCY') { 2 }

# Use the `preload_app!` method when specifying a `workers` number.
# This directive tells Puma to first boot the application and load code
# before forking the application. This takes advantage of Copy On Write
# process behavior so workers use less memory. If you use this option
# you need to make sure to reconnect any threads in the `on_worker_boot`
# block.
#
# preload_app!

# Allow workers to reload bundler context when master process is issued
# a USR1 signal. This allows proper reloading of gems while the master
# is preserved across a phased-restart. (incompatible with preload_app)
# (off by default)
#
prune_bundler

# Bind unix domain socket
bind ENV.fetch('PUMA_SOCK') { 'unix://tmp/sockets/puma.sock' }

# Use `path` as the file to store the server info state. This is
# used by `pumactl` to query and control the server.
#
state_path ENV.fetch('SERVER_STATE') { 'tmp/pids/puma.state' }

# === Puma control rack application ===
# Start the puma control rack application on `url`. This application can
# be communicated with to control the main server. Additionally, you can
# provide an authentication token, so all requests to the control server
# will need to include that token as a query parameter. This allows for
# simple authentication.
#
# Check out https://github.com/puma/puma/blob/master/lib/puma/app/status.rb
# to see what the app has available.
#
activate_control_app ENV.fetch('PUMACTL_SOCK') { 'unix://tmp/sockets/pumactl.sock' }

# Store the pid of the server in the file at `path`.
#
pidfile 'tmp/pids/puma.pid'

# Change the default timeout of workers
#
worker_timeout 10

# Allow puma to be restarted by `rails restart` command.
plugin :tmp_restart
```

config/deploy.rb:

```ruby
after 'deploy:published', 'puma:phased-restart'
```

After:

```
$ ps aux | grep puma
webapp   18304  0.0  0.8 368084   8324 ?      Sl   04:17   0:00 puma 4.0.0 (unix://tmp/sockets/puma.sock) [20190630041534]
webapp   30011  0.2 11.7 591616 118548 ?      Sl   04:54   0:01 puma: cluster worker 0: 18304 [20190630041534]
webapp   30082  0.2 11.8 591596 119296 ?      Sl   04:54   0:01 puma: cluster worker 1: 18304 [20190630041534]
webapp   30871  0.0  0.2 226484   2756 pts/0  S+   05:02   0:00 grep --color=auto puma
$ ls /srv/webapp/repository/current/tmp/sockets
pumactl.sock
```

puma.sock is missing and Nginx can't establish a connection.
@nateberkopec this isn't related to systemd. With `pumactl -S tmp/pids/puma.state -F config/puma.rb phased-restart`: puma 4 removes puma.sock, puma 3.12 doesn't.
Can you send us a PR? Is it possible to add a test for this behavior, since it was working and then we broke it? Otherwise I can't guarantee we won't break it again.
PR is here: #1829
Using puma 4.0.1, which should have this fix? The socket is still missing after restart.
Well, that's not good. Moving discussion to #1842.
Steps to reproduce
I followed the instructions on the wiki to set up puma with systemd on Ubuntu 18.04 (https://github.com/puma/puma/blob/master/docs/systemd.md#socket-activation), but puma reload is not working as expected with the latest version of puma.
During a reload, I see logs (with `PUMA_DEBUG=1` set) which highlight the problem.

After the reload, the worker processes are reloaded & running, but nginx isn't serving anything from that point on, since the `puma.sock` socket is removed (not only closed; the file itself is no longer there).

I think that removing the `@own_binder` logic in #1685 may now cause puma to incorrectly unlink the socket, as `Puma::Binder#close` does unlink unix socket files, and it does not do so in the correct order during a phased restart (the unlink occurs after the new workers are spawned, when the old workers are stopped):

puma/lib/puma/binder.rb, lines 51 to 54 in 200c95d
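To make the ownership idea concrete, here is a minimal sketch of the kind of guard a flag like `@own_binder` provides. This is illustrative code, not puma's actual `Binder` API: the class `TinyBinder` and its method names are invented for the example. The point is that a binder should unlink only socket files it created itself, never ones it inherited (e.g. from systemd socket activation).

```ruby
require "socket"

# Illustrative sketch: track which unix socket paths this process
# created, and unlink only those on close. An inherited
# (socket-activated) listener is adopted, not created, so its file
# is left alone for the activating process to manage.
class TinyBinder
  def initialize
    @listeners = []    # [io, path_or_nil] pairs
    @owned_paths = []  # unix socket files we created ourselves
  end

  # Create a new unix listener; we own its file.
  def bind_unix(path)
    io = UNIXServer.new(path)
    @owned_paths << path
    @listeners << [io, path]
    io
  end

  # Adopt an already-open listener (e.g. an fd passed in by systemd);
  # we do NOT own its file.
  def inherit_unix(io, path)
    @listeners << [io, path]
    io
  end

  # Close all listeners, removing only the files we created.
  def close
    @listeners.each do |io, path|
      io.close
      File.unlink(path) if path && @owned_paths.include?(path)
    end
    @listeners.clear
  end
end
```

Under this scheme, a phased restart that closes an inherited listener leaves the socket file in place, so the activating process (systemd) can keep handing it to new workers.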
Expected behavior
Should keep working via unix socket after reload.
Actual behavior
Unix socket is removed after being activated by new restarted worker.
System configuration
Ruby version: 2.5.5
Rails version: 5.1.6.2
Puma version: master
Edit: Just for your information, sadly I am now using port binding with my systemd setup (without any issues). Thanks to everyone investigating & trying to fix these socket issues.