
Using supervisor to monitor forks #154

Open
troex opened this issue Jul 6, 2021 · 4 comments
@troex

troex commented Jul 6, 2021

Hi, I'm migrating puma->falcon for a fairly loaded app (200-500 req/s, average response time below 30ms). Once on Falcon, memory usage started growing over time quite fast. My first thought was memory allocation/fragmentation; switching to MALLOC_ARENA_MAX=2 and later to jemalloc did improve the situation, but didn't resolve the issue completely.

[Screenshot: Heroku metrics, 2021-07-06 13:39]

The next idea: since we have a supervisor that can report stats on sibling processes, maybe its functionality could be extended to restart forks when they start to bloat, or at least to report the memory leak.

My initial idea was that I could use something like:

  supervisor do
    Async do
      puts 'my async monitoring code here'
      sleep 5 # report to rollbar, dump heap to S3 for debug and other options & finally restart the fork
      puts 'report'
    end
    puts 'non-blocking'
  end

It's probably not intended to be used like that, but maybe it would be a good idea if we could have multiple reactors inside the supervisor, including custom ones?

Next I tried the code below, which runs after fork (in my case in config.ru):

Async do
  sleep 10 + rand(1..5) # just as an example
  puts "Restarting self: #{Process.pid}"
  Process.kill(:INT, Process.pid)
end
puts 'Non blocking'

This works, however after around ~250 restarts it ended up with:

Restarting self: 85225
   52m     info: Async::Container::Forked [oid=0x161c] [ec=0x1630] [pid=75731] [2021-07-06 05:46:18 +0300]
               | #<Async::Container::Process Falcon::Service::Application for local.mydomain.com> exited with pid 85225 exit 0
   52m    error: Async::Container::Process [ec=0x1630] [pid=85228] [2021-07-06 05:46:18 +0300]
               |   Errno::EMFILE: Too many open files - pipe
               |   → /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-1.29.1/lib/async/reactor.rb:71 in `new'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-1.29.1/lib/async/reactor.rb:71 in `selector'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-1.29.1/lib/async/reactor.rb:74 in `initialize'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-1.29.1/lib/async/reactor.rb:52 in `new'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-1.29.1/lib/async/reactor.rb:52 in `run'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-1.29.1/lib/kernel/async.rb:28 in `Async'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/falcon-0.39.1/lib/falcon/service/application.rb:90 in `block in setup'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-container-0.16.11/lib/async/container/process.rb:90 in `block (2 levels) in fork'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-container-0.16.11/lib/async/container/process.rb:85 in `fork'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-container-0.16.11/lib/async/container/process.rb:85 in `block in fork'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-container-0.16.11/lib/async/container/process.rb:121 in `initialize'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-container-0.16.11/lib/async/container/process.rb:84 in `new'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-container-0.16.11/lib/async/container/process.rb:84 in `fork'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-container-0.16.11/lib/async/container/forked.rb:39 in `start'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-container-0.16.11/lib/async/container/generic.rb:166 in `block in spawn'
   52m    error: Async::Container::Forked [oid=0x161c] [ec=0x1630] [pid=75731] [2021-07-06 05:46:18 +0300]
               | pid 85228 exit 1
   52m    error: Falcon::Command [ec=0xf14] [pid=75731] [2021-07-06 05:46:18 +0300]
               |   Errno::EMFILE: Too many open files
               |   → /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-container-0.16.11/lib/async/container/channel.rb:31 in `pipe'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-container-0.16.11/lib/async/container/channel.rb:31 in `initialize'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-container-0.16.11/lib/async/container/process.rb:115 in `initialize'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-container-0.16.11/lib/async/container/process.rb:84 in `new'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-container-0.16.11/lib/async/container/process.rb:84 in `fork'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-container-0.16.11/lib/async/container/forked.rb:39 in `start'
               |     /Users/troex/.rbenv/versions/3.0.1/lib/ruby/gems/3.0.0/gems/async-container-0.16.11/lib/async/container/generic.rb:166 in `block in spawn'

I was expecting that a fork would release all of its open files when restarting. One thing I noticed is that if I send INT once to each forked process, the memory usage seems to be stable and not growing over time. I'm still figuring out what causes the memory bloating.
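For what it's worth, here is a minimal sketch of how I could check whether descriptors are actually leaking across restarts; the fd-counting helper is just plain Ruby of my own, not anything provided by Falcon or async-container:

# Debugging sketch: count open file descriptors in the current process.
# /dev/fd is available on Linux and macOS; otherwise fall back to counting
# live IO objects via ObjectSpace.
def open_fd_count
  if File.directory?('/dev/fd')
    # Subtract 1 for the descriptor used to read /dev/fd itself.
    Dir.children('/dev/fd').size - 1
  else
    ObjectSpace.each_object(IO).count { |io| !io.closed? rescue false }
  end
end

puts "Open descriptors in #{Process.pid}: #{open_fd_count}"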

I know this is a long story and probably not really an issue with Falcon itself, but I think with my use case I can really push Falcon to its edges in a good way; I just need to figure out how to do it properly.

To summarize the above, I'm looking for two things here:

  1. How can I use the supervisor to implement my own custom monitoring logic?
  2. What is the proper way to restart a fork from inside itself or from the supervisor? (A rough sketch of what I mean is below.)
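
To illustrate (2), this is roughly the kind of watchdog I have in mind for each fork, building on the INT-based restart above; the 60-second interval, the 512 MB threshold and the ps-based RSS check are just my own assumptions, not Falcon API:

# Sketch of a per-fork memory watchdog (would go in config.ru, after fork).
# When resident memory exceeds the threshold, restart the fork the same way
# as above, by sending INT to itself.
MEMORY_LIMIT_KB = 512 * 1024

Async do |task|
  loop do
    task.sleep 60
    rss_kb = `ps -o rss= -p #{Process.pid}`.to_i # resident set size, in KB
    next if rss_kb <= MEMORY_LIMIT_KB

    puts "Restarting self: #{Process.pid} (RSS #{rss_kb} KB)"
    Process.kill(:INT, Process.pid)
    break
  end
end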
@ioquatix
Member

ioquatix commented Jul 6, 2021

Thanks for trying this out and reporting all the details of your use case. I'll review it in more detail tomorrow.

@ioquatix ioquatix self-assigned this Jul 6, 2021
@troex
Author

troex commented Jul 6, 2021

Update: it just happened again.
[Screenshot: Heroku metrics, 2021-07-06 16:28]
My goal is to be able to prevent situations like this. I don't have the exact logs at the moment, but basically I'd like to avoid the app crashing on Heroku after a couple of restarts.

@10allday

Any updates on this?

@ioquatix
Member

I would love to allocate some time to investigating this.
