Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No longer do phased-restart on systemd but works on run locally #2476

Closed
gencer opened this issue Nov 2, 2020 · 14 comments
Closed

No longer do phased-restart on systemd but works on run locally #2476

gencer opened this issue Nov 2, 2020 · 14 comments

Comments

@gencer
Copy link

gencer commented Nov 2, 2020

Describe the bug
When I run:

bundle exec puma -e production -C config/puma/production.rb

Phased restart is available and I am able to send phased-restart request from Systemd or pumactl. However, If this app started via systemd systemctl start api.target then phased restart is not available.

Puma config:

threads_count = ENV.fetch('RAILS_MAX_THREADS') { 5 }.to_i
threads threads_count, threads_count

bind 'tcp://127.0.0.1:' + ENV.fetch('PORT') { 3003.to_s }
environment ENV.fetch('RAILS_ENV') { 'production' }

app_dir = '/disk5/api.com/current' # Rails.root.join('config','database_mail.yml')
shared_dir = '/disk5/api.com/shared'
directory app_dir
tag 'api'
pidfile "#{shared_dir}/tmp/pids/puma.pid"
state_path "#{shared_dir}/tmp/pids/puma.state"
stdout_redirect "#{shared_dir}/log/puma.stdout.log", "#{shared_dir}/log/puma.stderr.log", true
activate_control_app "unix://#{shared_dir}/tmp/sockets/pumactl.sock", { no_token: true }
workers ENV.fetch('WEB_CONCURRENCY') { 20 }


prune_bundler
fork_worker

before_fork do
  if defined?(Sequel)
    ::Sequel::Model.db.disconnect
  end
end

on_worker_boot do
  ::SequelRails.setup Rails.env if defined?(SequelRails)
end

lowlevel_error_handler do |ex, env|
  Raven.capture_exception(
    ex,
    :message => ex.message,
    :extra => { :puma => env },
    :transaction => 'Puma',
  )
  # note the below is just a Rack response
  [500, {}, ["An internal error has occurred and we got the report.\n"]]
end

systemd.service:

[Unit]
PartOf=api.target

[Service]
User=gencer
Type=simple
WorkingDirectory=/disk5/api.com/releases/20201101064018
Environment=PORT=3003
Environment=PS=web.1
Environment="WEB_CONCURRENCY=20"
Environment="RAILS_MAX_THREADS=5"
Environment="PORT=3003"
Environment="RAILS_ENV=production"
Environment="MALLOC_ARENA_MAX=2"
ExecStart=/bin/bash -lc 'exec -a "api-web.1" /home/gencer/.rbenv/bin/rbenv exec bundle exec --keep-file-descriptors puma -e production -C config/puma/production.rb'
Restart=always
StandardInput=null
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=%n
ExecStop=/bin/kill -QUIT $MAINPID
ExecReload=/bin/kill -s SIGUSR1 $MAINPID
KillMode=mixed
TimeoutStopSec=5

syslog when systemctl start api.target is ran:

Nov  2 10:19:32 server-130 systemd[1]: Started api.1.service.
Nov  2 10:19:32 server-130 systemd[1]: Reached target bitpaket.target.
Nov  2 10:19:32 server-130 api.1.service[694185]: [694185] Puma starting in cluster mode...
Nov  2 10:19:32 server-130 api.1.service[694185]: [694185] * Version 5.0.4 (ruby 2.7.1-p83), codename: Spoony Bard
Nov  2 10:19:32 server-130 api.1.service[694185]: [694185] * Min threads: 5, max threads: 5
Nov  2 10:19:32 server-130 api.1.service[694185]: [694185] * Environment: production
Nov  2 10:19:32 server-130 api.1.service[694185]: [694185] * Process workers: 20
Nov  2 10:19:32 server-130 api.1.service[694185]: [694185] * Preloading application
Nov  2 10:19:33 server-130 api.1.service[694185]: I, [2020-11-02T10:19:33.225408 #694185]  INFO -- sentry: ** [Raven] Raven 3.1.1 ready to catch errors

Error in stdout when phased-restart is initiated:

[687615] * phased-restart called but not available, restarting normally.
[687615] - Gracefully shutting down workers...
[687615] * Restarting...
[687615] Puma starting in cluster mode...
[687615] * Version 5.0.4 (ruby 2.7.1-p83), codename: Spoony Bard
[687615] * Min threads: 5, max threads: 5
[687615] * Environment: production
[687615] * Process workers: 20
[687615] * Preloading application

To Reproduce
Just start with above config via systemd. No phased-restart. Then run this command from SSH Shell and phased restart is available.

Expected behavior
phased restart should be available when puma started from systemd.

Desktop (please complete the following information):

  • OS: Linux (Ubuntu20.04LTS)
  • Puma Version 5.0.4

Am I missing something here? It was working on v4.x. I re-read docs but it seems everything is correct. Configuration is also correct because when I run puma manually with above config, phased restart is working. But when do systemd start, it says preloading app. Strange.

@gencer
Copy link
Author

gencer commented Nov 2, 2020

P.S.: I also see some errors on syslog like:

Nov  2 10:25:32 server-130 api.1.service[695841]: [695841] ! WARNING: Detected 7 Thread(s) started in app boot:
Nov  2 10:25:32 server-130 api.1.service[695841]: [695841] ! #<Thread:0x000055fabac31058 /disk5/api.com/shared/bundle/ruby/2.7.0/gems/semantic_logger-4.7.2/lib/semantic_logger/appender/async.rb:77 sleep_forever> - /disk5/api.com/shared/bundle/ruby/2.7.0/gems/semantic_logger-4.7.2/lib/semantic_logger/appender/async.rb:150:in `pop'
Nov  2 10:25:32 server-130 api.1.service[695841]: [695841] ! #<Thread:0x000055fabb16f3d0 /disk5/api.com/shared/bundle/ruby/2.7.0/gems/concurrent-ruby-1.1.7/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:332 sleep_forever> - /disk5/api.com/shared/bundle/ruby/2.7.0/gems/concurrent-ruby-1.1.7/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:337:in `pop'
Nov  2 10:25:32 server-130 api.1.service[695841]: [695841] ! #<Thread:0x000055fabb16e4a8 /disk5/api.com/shared/bundle/ruby/2.7.0/gems/elastic-apm-3.11.0/lib/elastic_apm/transport/base.rb:140 sleep_forever> - /disk5/api.com/shared/bundle/ruby/2.7.0/gems/elastic-apm-3.11.0/lib/elastic_apm/transport/worker.rb:58:in `pop'
Nov  2 10:25:32 server-130 api.1.service[695841]: [695841] ! #<Thread:0x000055fabb16ce50 /disk5/api.com/shared/bundle/ruby/2.7.0/gems/concurrent-ruby-1.1.7/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:332 sleep> - /disk5/api.com/shared/bundle/ruby/2.7.0/gems/concurrent-ruby-1.1.7/lib/concurrent-ruby/concurrent/synchronization/mutex_lockable_object.rb:46:in `sleep'
Nov  2 10:25:32 server-130 api.1.service[695841]: [695841] ! #<Thread:0x000055fabb17cff8 /disk5/api.com/shared/bundle/ruby/2.7.0/gems/concurrent-ruby-1.1.7/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:332 sleep_forever> - /disk5/api.com/shared/bundle/ruby/2.7.0/gems/concurrent-ruby-1.1.7/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:337:in `pop'
Nov  2 10:25:32 server-130 api.1.service[695841]: [695841] ! #<Thread:0x000055fabb1dee60 sleep> -
Nov  2 10:25:32 server-130 api.1.service[695841]: [695841] ! #<Thread:0x000055fabb4c46e8 /disk5/api.com/shared/bundle/ruby/2.7.0/gems/elastic-apm-3.11.0/lib/elastic_apm/transport/connection/http.rb:107 sleep> - /disk5/api.bitpak

@gencer
Copy link
Author

gencer commented Nov 2, 2020

I believe I located the issue.

Environment called WEB_CONCURRENCY that I use to fill worker count causes this. I don't know why but I've changed it to PUMA_WORKERS and it worked. Phased restart is now available.

@dentarg
Copy link
Member

dentarg commented Nov 2, 2020

Am I missing something here? It was working on v4.x

That's because WEB_CONCURRENCY wasn't used in Puma before 5.0, see https://github.com/puma/puma/blob/d4d1ed374a72d0df020d9512707b9eea17344135/5.0-Upgrade.md#welcome-to-puma-5-spoony-bard and the linked post.

Environment called WEB_CONCURRENCY that I use to fill worker count causes this. I don't know why but I've changed it to PUMA_WORKERS and it worked. Phased restart is now available.

You have restored the behaviour you had before with 4.x. The environment variable PUMA_WORKERS isn't read by Puma: https://github.com/puma/puma/search?q=%22PUMA_WORKERS%22&type=code


It is still weird that you had these issues, from what I see here, the change that Puma 5.0 brought shouldn't have changed the behaviour for your setup... 🤔

@dentarg
Copy link
Member

dentarg commented Nov 2, 2020

Ah, you are using fork_worker... I'm not sure how well that plays with systemd, there a some related issues open right now: https://github.com/puma/puma/issues?q=is%3Aissue+is%3Aopen+systemd+fork_worker

@gencer
Copy link
Author

gencer commented Nov 2, 2020

@dentarg I've removed experimental fork_worker from configuration. To me, it seems phased-restart works with fork_worker at the moment but as it is an experimental feature, I've gone ahead and removed it. In fact I've added fork_worker today just to try it.

It may cause some serious problems in the future as stated on other issues.

P.S.: I've tested WEB_CONCURRENCY without fork_worker. Still has the same problem. Will use PUMA_WORKERS for now.

@nateberkopec
Copy link
Member

I think @dentarg might be misreading the config linked.

PUMA_WORKERS isn't doing anything.

When you say "removing WEB_CONCURRENCY" I assume you mean "I removed the line which sets WEB_CONCURRENCY in my systemd.service". Did you remove the workers line in puma.rb too?

When you boot Puma now, how many workers are started?

@nateberkopec
Copy link
Member

The question here is why is your app preloading.

You don't have preload_app anywhere and you're using prune_bundler, which should disable it.

@nateberkopec
Copy link
Member

You also posted a config with fork_worker in it, but there was no corresponding log output on startup.

I think you have an issue regarding what config file is being loaded in production.

@gencer
Copy link
Author

gencer commented Nov 2, 2020

I've removed this line from systemd service:

Environment=WEB_CONCURRENCY=10

Instead added this:

Environment=PUMA_WORKERS=10

And used this variable in config/puma/production.rb:

workers (ENV["PUMA_WORKERS"] || 20).to_i

This way, I got Phased Restart Available. Also it works via pumactl.

BTW, fork_worker seems does not produce any output. What kind of output should I expect when I used it? I want to try now to make sure (just in case)

When I run with this configuration, 10 workers started (instead of 20) and phased restart is available.

@gencer
Copy link
Author

gencer commented Nov 2, 2020

My final configurations:

config/puma/production.rb:

threads_count = ENV.fetch('RAILS_MAX_THREADS') { 10 }.to_i
threads threads_count, threads_count

app_dir = '/disk5/api.com/current' # Rails.root.join('config','database_mail.yml')
shared_dir = '/disk5/api.com/shared'
directory app_dir
tag 'api'
pidfile "#{shared_dir}/tmp/pids/puma.pid"
state_path "#{shared_dir}/tmp/pids/puma.state"
stdout_redirect "#{shared_dir}/log/puma.stdout.log", "#{shared_dir}/log/puma.stderr.log", true
activate_control_app "unix://#{shared_dir}/tmp/pids/pumactl.sock", { no_token: true }

bind "unix://#{shared_dir}/tmp/sockets/puma1.sock"

environment ENV.fetch('RAILS_ENV') { 'production' }

workers ENV.fetch('PUMA_WORKERS') { 40 }.to_i

prune_bundler
fork_worker # No log for this?

before_fork do
  if defined?(Sequel)
    ::Sequel::Model.db.disconnect
  end
end

on_worker_boot do
  ::SequelRails.setup Rails.env if defined?(SequelRails)
end
#

lowlevel_error_handler do |ex, env|
  Raven.capture_exception(
    ex,
    :message => ex.message,
    :extra => { :puma => env },
    :transaction => 'Puma',
  )
  # note the below is just a Rack response
  [500, {}, ["An internal error has occurred and we got the report.\n"]]
end

systemd/api.service:

[Unit]
Description = api Puma (instance %i)
PartOf = api.service
ReloadPropagatedFrom = api.service

[Service]
Type = simple
User = gencer
Group = www-data

Environment = PATH=/home/gencer/.rbenv/shims:/usr/local/bin:/usr/bin:/bin
Environment = RBENV_VERSION=2.7.1
Environment = RBENV_ROOT=/home/gencer/.rbenv
Environment = PWD=/disk5/api.com/current
Environment = MALLOC_ARENA_MAX=2
EnvironmentFile=/disk5/api.com/shared/.env # see below for content
WorkingDirectory = /disk5/api.com/current

ExecStart  = /home/gencer/.rbenv/bin/rbenv exec bundle exec pumactl start -F config/puma/production.rb
ExecReload  = /home/gencer/.rbenv/bin/rbenv exec bundle exec pumactl phased-restart -F config/puma/production.rb
ExecStop  = /home/gencer/.rbenv/bin/rbenv exec bundle exec pumactl stop -F config/puma/production.rb
# puma.%i.pid
PIDFile = /disk5/api.com/shared/tmp/pids/puma.pid
TimeoutStopSec = 22
Restart = always

[Install]
WantedBy = multi-user.target

.env:

RAILS_ENV=production
PUMA_WORKERS=20
RAILS_MAX_THREADS=5
MALLOC_ARENA_MAX=2

In this configuration here is the results:

● api.service - api Puma Service
     Loaded: loaded (/etc/systemd/system/api.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2020-11-02 15:55:16 UTC; 4min 22s ago
   Main PID: 957227 (ruby)
      Tasks: 273 (limit: 450000)
     Memory: 488.0M
     CGroup: /system.slice/api.service
             ├─957227 puma 5.0.4 (unix:///disk5/api.com/shared/tmp/sockets/puma1.sock) [api]
             ├─957258 puma: cluster worker 0: 957227 [api]
             ├─957364 puma: cluster worker 1: 957227 [api]
             ├─957367 puma: cluster worker 2: 957227 [api]
             ├─957371 puma: cluster worker 3: 957227 [api]
             ├─957374 puma: cluster worker 4: 957227 [api]
             ├─957379 puma: cluster worker 5: 957227 [api]
             ├─957383 puma: cluster worker 6: 957227 [api]
             ├─957394 puma: cluster worker 7: 957227 [api]
             ├─957401 puma: cluster worker 8: 957227 [api]
             ├─957418 puma: cluster worker 9: 957227 [api]
             ├─957430 puma: cluster worker 10: 957227 [api]
             ├─957449 puma: cluster worker 11: 957227 [api]
             ├─957464 puma: cluster worker 12: 957227 [api]
             ├─957480 puma: cluster worker 13: 957227 [api]
             ├─957496 puma: cluster worker 14: 957227 [api]
             ├─957518 puma: cluster worker 15: 957227 [api]
             ├─957537 puma: cluster worker 16: 957227 [api]
             ├─957549 puma: cluster worker 17: 957227 [api]
             ├─957554 puma: cluster worker 18: 957227 [api]
             └─957566 puma: cluster worker 19: 957227 [api]
"
Nov 02 15:55:16 server-130 rbenv[957227]: [957227] * Pruning Bundler environment
Nov 02 15:55:16 server-130 rbenv[957227]: [957227] Puma starting in cluster mode...
Nov 02 15:55:16 server-130 rbenv[957227]: [957227] * Version 5.0.4 (ruby 2.7.1-p83), codename: Spoony Bard
Nov 02 15:55:16 server-130 rbenv[957227]: [957227] * Min threads: 5, max threads: 5
Nov 02 15:55:16 server-130 rbenv[957227]: [957227] * Environment: production
Nov 02 15:55:16 server-130 rbenv[957227]: [957227] * Process workers: 20
Nov 02 15:55:16 server-130 rbenv[957227]: [957227] * Phased restart available
Nov 02 15:55:16 server-130 rbenv[957227]: [957227] * Listening on unix:///disk5/api.com/shared/tmp/sockets/puma1.sock
Nov 02 15:55:16 server-130 rbenv[957227]: [957227] Use Ctrl-C to stop

@nateberkopec
Copy link
Member

OK, I'm not sure why that worked for you, but it sounds like you've solved your issue and there's no problem with Puma?

@dentarg
Copy link
Member

dentarg commented Nov 7, 2020

The question here is why is your app preloading.

You don't have preload_app anywhere ...

@nateberkopec Puma 5 (#2143) changed the default, if there are more than 1 worker, preload_app! is the default (I've submitted #2481 to make this more clear in the upgrade guide)

@dentarg
Copy link
Member

dentarg commented Nov 7, 2020

Seems like it is only if number of workers is set via WEB_CONCURRENCY that preloading happens by default, see #2481 (comment)

That would explain why not using WEB_CONCURRENCY made it work.

@nateberkopec
Copy link
Member

nateberkopec commented Nov 9, 2020

Closing and moving future discussion to #2483 because the issue is now clear.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants