Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pre-existing socket file removed when TERM is issued after USR2 (if puma is running in cluster mode) #2816

Closed
khustochka opened this issue Feb 1, 2022 · 2 comments · Fixed by #2817
Labels

Comments

@khustochka
Copy link
Contributor

khustochka commented Feb 1, 2022

Describe the bug
Puma, when running in cluster mode, deletes the socket file if a TERM signal is issued after USR2 signal. This can be a problem when using socket activation.

I am running puma on a Linux server, in cluster mode, preloaded app, using systemd, socket activation. I noticed Permission denied exceptions in my logs when restarting puma. When I moved the socket file into a writable folder, those exceptions stopped, but instead the socket file just started to disappear. Further investigation showed this only happens if puma was reloaded previously. (For clarity: restart here means systemd stopping the process and then starting it again; reload means I have issued USR2 signal or pumactl reload after deployment).

Environment:
I was able to reproduce this on various configurations:

OS: Linux, MacOS
Ruby: 3.1.0, 3.0.3
Puma: 5.6.1, 5.5.2

Puma config and systemd unit file do not really matter, because as shown below, this can be reproduced on a minimalistic app. The only things that matter are that puma is bound to a socket, and is run in cluster mode.

To Reproduce
I tried to reproduce this bug locally, on MacOS, Ruby 3.1.0, puma 5.6.1.

In a new folder I created hello.ru:

run lambda { |env| [200, {"Content-Type" => "text/plain"}, ["Hello World"]] }

and Gemfile:

source "https://rubygems.org"; gem "puma"

and ran bundle install

Because I encountered some inconsistent behaviour, I tried to test a matrix of different settings, performing the following steps:

  1. Socket file setup:
    No pre-existing socket file: I made sure to rm puma.sock
    Pre-existing socket file: I ran touch puma.sock, to create a socket file (removing it first if it existed).

  2. Run puma.

Single mode:

puma -b unix://puma.sock --pidfile ./puma.pid hello.ru

Cluster mode:

puma -b unix://puma.sock --pidfile ./puma.pid --workers 2 hello.ru
  1. Issue signals:

TERM

kill -TERM `cat puma.pid`

USR2+TERM

kill -USR2 `cat puma.pid`

followed by

kill -TERM `cat puma.pid`
  1. Check if puma.sock file is preserved or removed after termination.

Results

Single -- Cluster --
No pre-existing socket file TERM Socket file stays TERM Socket file removed
No pre-existing socket file USR2+TERM Socket file stays USR2+TERM Socket file removed
Pre-existing socket file TERM Socket file stays TERM Socket file stays
Pre-existing socket file USR2+TERM Socket file stays USR2+TERM Socket file removed

I do not know what is the correct behaviour for each one of these combinations, but the result of issuing USR2+TERM with a pre-existing socket file in cluster mode looks especially inconsistent compared to others.

Expected behaviour

I expect the pre-activated socket not being removed after I reload my app at some point and then later try restarting it.

Additional notes:

Here is my production log, showing where exactly an attempt to remove the socket file happens:

Jan 31 06:27:22 neue systemd[1]: Starting Quails App (Puma)...
Jan 31 06:27:24 neue ruby[15922]: % ENV['LISTEN_FDS'] "1"  env_hash['LISTEN_PID'] "15922"
Jan 31 06:27:24 neue ruby[15922]: % Registered unix:/run/quails/puma.sock for activation from LISTEN_FDS
Jan 31 06:27:24 neue ruby[15922]: [15922] * Enabling systemd notification integration
Jan 31 06:27:24 neue ruby[15922]: [15922] Pinging systemd watchdog every 15.0 sec
Jan 31 06:27:24 neue ruby[15922]: [15922] Puma starting in cluster mode...
Jan 31 06:27:24 neue ruby[15922]: [15922] * Puma version: 5.6.1 (ruby 3.1.0-p0) ("Birdie's Version")
Jan 31 06:27:24 neue ruby[15922]: [15922] *  Min threads: 2
Jan 31 06:27:24 neue ruby[15922]: [15922] *  Max threads: 2
Jan 31 06:27:24 neue ruby[15922]: [15922] *  Environment: production
Jan 31 06:27:24 neue ruby[15922]: [15922] *   Master PID: 15922
Jan 31 06:27:24 neue ruby[15922]: [15922] *      Workers: 2
Jan 31 06:27:24 neue ruby[15922]: [15922] *     Restarts: (✔) hot (✖) phased
Jan 31 06:27:24 neue ruby[15922]: [15922] * Preloading application
Jan 31 06:27:31 neue ruby[15922]: [15922] * Activated unix:///run/quails/puma.sock
Jan 31 06:27:31 neue ruby[15922]: [15922] Use Ctrl-C to stop
Jan 31 06:27:31 neue ruby[15922]: [15922] - Worker 0 (PID: 15927) booted in 0.01s, phase: 0
Jan 31 06:27:31 neue ruby[15922]: [15922] - Worker 1 (PID: 15931) booted in 0.0s, phase: 0
Jan 31 06:27:31 neue systemd[1]: Started Quails App (Puma).
Jan 31 06:28:35 neue systemd[1]: Reloading Quails App (Puma).
Jan 31 06:28:36 neue ruby[15922]: [15922] - Gracefully shutting down workers...
Jan 31 06:28:36 neue ruby[15965]: Command restart sent success
Jan 31 06:28:36 neue systemd[1]: Reloaded Quails App (Puma).
Jan 31 06:28:36 neue ruby[15922]: [15922] * Restarting...
Jan 31 06:28:37 neue ruby[15922]: % ENV['LISTEN_FDS'] "1"  env_hash['LISTEN_PID'] "15922"
Jan 31 06:28:37 neue ruby[15922]: % Registered unix:/run/quails/puma.sock for activation from LISTEN_FDS
Jan 31 06:28:37 neue ruby[15922]: [15922] * Enabling systemd notification integration
Jan 31 06:28:37 neue ruby[15922]: [15922] Pinging systemd watchdog every 15.0 sec
Jan 31 06:28:37 neue ruby[15922]: [15922] Puma starting in cluster mode...
Jan 31 06:28:37 neue ruby[15922]: [15922] * Puma version: 5.6.1 (ruby 3.1.0-p0) ("Birdie's Version")
Jan 31 06:28:37 neue ruby[15922]: [15922] *  Min threads: 2
Jan 31 06:28:37 neue ruby[15922]: [15922] *  Max threads: 2
Jan 31 06:28:37 neue ruby[15922]: [15922] *  Environment: production
Jan 31 06:28:37 neue ruby[15922]: [15922] *   Master PID: 15922
Jan 31 06:28:37 neue ruby[15922]: [15922] *      Workers: 2
Jan 31 06:28:37 neue ruby[15922]: [15922] *     Restarts: (✔) hot (✖) phased
Jan 31 06:28:37 neue ruby[15922]: [15922] * Preloading application
Jan 31 06:28:43 neue ruby[15922]: [15922] * Inherited unix:///run/quails/puma.sock
Jan 31 06:28:43 neue ruby[15922]: [15922] Use Ctrl-C to stop
Jan 31 06:28:43 neue ruby[15922]: [15922] - Worker 0 (PID: 15970) booted in 0.01s, phase: 0
Jan 31 06:28:43 neue ruby[15922]: [15922] - Worker 1 (PID: 15974) booted in 0.0s, phase: 0
Jan 31 06:28:55 neue systemd[1]: Stopping Quails App (Puma)...
Jan 31 06:28:55 neue ruby[15974]: [15974] ! Detected parent died, dying
Jan 31 06:28:55 neue ruby[15970]: [15970] ! Detected parent died, dying
Jan 31 06:28:55 neue ruby[15922]: /home/vk/www/quails/shared/bundle/ruby/3.1.0/gems/puma-5.6.1/lib/puma/binder.rb:464:in `unlink': Permission denied @ apply2files - /run/quails/puma.sock (Errno::EACCES)
Jan 31 06:28:55 neue ruby[15922]:         from /home/vk/www/quails/shared/bundle/ruby/3.1.0/gems/puma-5.6.1/lib/puma/binder.rb:464:in `block in close_listeners'
Jan 31 06:28:55 neue ruby[15922]:         from /home/vk/www/quails/shared/bundle/ruby/3.1.0/gems/puma-5.6.1/lib/puma/binder.rb:459:in `each'
Jan 31 06:28:55 neue ruby[15922]:         from /home/vk/www/quails/shared/bundle/ruby/3.1.0/gems/puma-5.6.1/lib/puma/binder.rb:459:in `close_listeners'
Jan 31 06:28:55 neue ruby[15922]:         from /home/vk/www/quails/shared/bundle/ruby/3.1.0/gems/puma-5.6.1/lib/puma/launcher.rb:220:in `close_binder_listeners'
Jan 31 06:28:55 neue ruby[15922]:         from /home/vk/www/quails/shared/bundle/ruby/3.1.0/gems/puma-5.6.1/lib/puma/cluster.rb:325:in `block in setup_signals'
Jan 31 06:28:55 neue ruby[15922]:         from /home/vk/www/quails/shared/bundle/ruby/3.1.0/gems/puma-5.6.1/lib/puma/cluster.rb:437:in `wait_readable'
Jan 31 06:28:55 neue ruby[15922]:         from /home/vk/www/quails/shared/bundle/ruby/3.1.0/gems/puma-5.6.1/lib/puma/cluster.rb:437:in `run'
Jan 31 06:28:55 neue ruby[15922]:         from /home/vk/www/quails/shared/bundle/ruby/3.1.0/gems/puma-5.6.1/lib/puma/launcher.rb:182:in `run'
Jan 31 06:28:55 neue ruby[15922]:         from /home/vk/www/quails/shared/bundle/ruby/3.1.0/gems/puma-5.6.1/lib/puma/cli.rb:81:in `run'
Jan 31 06:28:55 neue ruby[15922]:         from /home/vk/www/quails/shared/bundle/ruby/3.1.0/gems/puma-5.6.1/bin/puma:10:in `<top (required)>'
Jan 31 06:28:55 neue ruby[15922]:         from /home/vk/www/quails/shared/bundle/ruby/3.1.0/bin/puma:25:in `load'
Jan 31 06:28:55 neue ruby[15922]:         from /home/vk/www/quails/shared/bundle/ruby/3.1.0/bin/puma:25:in `<main>'
Jan 31 06:28:55 neue systemd[1]: quails.service: Main process exited, code=exited, status=1/FAILURE

Here you also can see a message Detected parent died, dying which I occasionally saw locally, but was not able to reliably reproduce.

There were previously other issues with socket removal, e.g. #1988.

Thank you!

@nateberkopec
Copy link
Member

I expect the pre-activated socket not being removed after I reload my app at some point and then later try restarting it.

Seems reasonable. I think there's probably a race or interaction going on here between the two signal handlers, since a USR2 restart can take a considerable amount of time to complete.

The next step here will be converting your (very good!) description of the problem using a simple app into a test. We do have a lot of existing tests around restarts and signals which would be a good place to start.

khustochka added a commit to khustochka/puma that referenced this issue Feb 2, 2022
The issue was that when inheriting the socket it was added to the @unix_paths list even if it was pre-existing.
khustochka added a commit to khustochka/puma that referenced this issue Feb 2, 2022
The issue was that when inheriting the socket it was added to the @unix_paths list even if it was pre-existing.
Test added.

Closed puma#2816.
khustochka added a commit to khustochka/puma that referenced this issue Feb 2, 2022
The issue was that inherited socket was added to the @unix_paths list even if it was pre-existing.
Test added.

Closed puma#2816.
@khustochka
Copy link
Contributor Author

Thank you! I have created a test, and also wrote a bash script to bisect the commits. I think I have a fix, submitted a PR.

khustochka added a commit to khustochka/puma that referenced this issue Feb 16, 2022
The issue was that inherited socket was added to the @unix_paths list even if it was pre-existing.
Test added.

Closed puma#2816.
khustochka added a commit to khustochka/puma that referenced this issue Feb 19, 2022
The issue was that inherited socket was added to the @unix_paths list even if it was pre-existing.
Test added.

Closed puma#2816.
khustochka added a commit to khustochka/puma that referenced this issue Feb 19, 2022
The issue was that inherited socket was added to the @unix_paths list even if it was pre-existing.
Test added.

Closed puma#2816.
nateberkopec pushed a commit that referenced this issue Feb 20, 2022
The issue was that inherited socket was added to the @unix_paths list even if it was pre-existing.
Test added.

Closed #2816.
MSP-Greg pushed a commit to MSP-Greg/puma that referenced this issue Apr 2, 2022
)

The issue was that inherited socket was added to the @unix_paths list even if it was pre-existing.
Test added.

Closed puma#2816.
MSP-Greg pushed a commit to MSP-Greg/puma that referenced this issue Apr 2, 2022
)

The issue was that inherited socket was added to the @unix_paths list even if it was pre-existing.
Test added.

Closed puma#2816.
MSP-Greg pushed a commit to MSP-Greg/puma that referenced this issue Apr 3, 2022
)

The issue was that inherited socket was added to the @unix_paths list even if it was pre-existing.
Test added.

Closed puma#2816.
nateberkopec pushed a commit that referenced this issue Aug 22, 2022
The issue was that inherited socket was added to the @unix_paths list even if it was pre-existing.
Test added.

Closed #2816.
JuanitoFatas pushed a commit to JuanitoFatas/puma that referenced this issue Sep 9, 2022
)

The issue was that inherited socket was added to the @unix_paths list even if it was pre-existing.
Test added.

Closed puma#2816.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants
@khustochka @nateberkopec and others