Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More elaborate exception handling. #2700

Merged
merged 2 commits into from Sep 16, 2021

Conversation

ioquatix
Copy link
Contributor

@ioquatix ioquatix commented Sep 15, 2021

Fixes #2699.

@ioquatix ioquatix changed the title More elaborate exception handling. Fixes #2699. More elaborate exception handling. Sep 15, 2021
@nateberkopec
Copy link
Member

Can you adapt the code you put in your issue report into a test? Otherwise LGTM

@ioquatix
Copy link
Contributor Author

ioquatix commented Sep 16, 2021

@nateberkopec please tell me any nits or other issues with this PR. I am not fully across the Puma test suite so I just cobbled something together based on some of the other integration tests.

@ioquatix ioquatix force-pushed the listen-loop-exception-handling branch from 2f5e6e1 to ccd0843 Compare September 16, 2021 02:32
@MSP-Greg
Copy link
Member

@ioquatix

I was waiting for CI. For a test, see:

https://github.com/MSP-Greg/puma/commit/c358b3b14d939ba04e800662b237d79fbeb9a24e.patch

Test & Master https://github.com/MSP-Greg/puma/actions/runs/1239898360
All appropriate runs failed 'TestIntegrationSingle#test_closed_listener'

Test & PR2700 https://github.com/MSP-Greg/puma/actions/runs/1239909432
All appropriate runs passed 'TestIntegrationSingle#test_closed_listener'

Windows is skipped due to the test using SIGTERM. As mentioned, I tested closing the server, which is where the infinite loop happened. I needed to add a timeout so the infinite loop was caught if it happened, like with master...

@ioquatix
Copy link
Contributor Author

ioquatix commented Sep 16, 2021

@MSP-Greg do you want to just add your commit to my commit? I am happy for you to delete my test changes.

If checked, users with write access to puma/puma can add new commits to your listen-loop-exception-handling branch. You can always change this setting later.

You should be able to directly push to my branch on ioquatix/puma.

@MSP-Greg MSP-Greg force-pushed the listen-loop-exception-handling branch from ccd0843 to 39a5cdd Compare September 16, 2021 03:07
@MSP-Greg
Copy link
Member

@ioquatix

Thanks. Sorry for the delay. I squashed to remove your test, I saved the code in a patch file if...

@ioquatix
Copy link
Contributor Author

My test was a total hack. I am happy you did something better :) Thanks for your commitment to Puma.

@ioquatix
Copy link
Contributor Author

Also it should be said, that there is a material change to the listen loop - rescue Object changed to rescue StandardError. This means non-StandardError exceptions will propagate out and I think that's desirable since these often represent critical failures like memory allocation errors, etc. Basically, at this point, we should let the instance fail. In our production bug... the load balancer continues to try and send traffic for a short while and we end up with a whole batch of failed requests. I'd rather have it fail fast.

@nateberkopec
Copy link
Member

Also it should be said, that there is a material change to the listen loop

Yeah, I think I'm willing to call this a bugfix rather than an API change.

@nateberkopec nateberkopec merged commit cb08034 into puma:master Sep 16, 2021
@ioquatix ioquatix deleted the listen-loop-exception-handling branch September 16, 2021 19:52
@ioquatix
Copy link
Contributor Author

ioquatix commented Sep 20, 2021

I tested this in production today and it does seem to be working.

However, I don't imagine this solved our underlying problem, but I was hoping it would mitigate some of the issue.

Maybe I'm misunderstanding what's going on, but we are now re-raising the exception on EBADF.

I would expect this to log from this line: https://github.com/puma/puma/blame/master/lib/puma/server.rb#L376 (permalink)

While we still see the occasional instance go down, I'm now not seeing EBADF errors either. I guess all bets are off if there is some kind of VM corruption, but this does seem at least slightly odd to me. I think we should at least see "Exception handling servers" in the logs somewhere. What we see instead is: "Gracefully shutting down workers..." (unexpectedly).

dentarg added a commit to dentarg/puma that referenced this pull request Oct 30, 2021
In an effort to have green CI more often.

It was introduced in puma#2700

I've seen it fail like this on macos-10.15 and Ruby
- 2.2: https://github.com/puma/puma/runs/4055009489?check_suite_focus=true#step:9:739
- 2.3: https://github.com/puma/puma/runs/4055009508?check_suite_focus=true#step:9:661
- macos-11 2.4: https://github.com/puma/puma/runs/4055009711?check_suite_focus=true#step:9:667

      1) Error:
    TestIntegrationSingle#test_closed_listener:
    EOFError: end of file reached
        /Users/runner/work/puma/puma/test/helpers/integration.rb:160:in `readpartial'
        /Users/runner/work/puma/puma/test/helpers/integration.rb:160:in `block (2 levels) in read_body'
        /Users/runner/work/puma/puma/test/helpers/integration.rb:159:in `loop'
        /Users/runner/work/puma/puma/test/helpers/integration.rb:159:in `block in read_body'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:88:in `block in timeout'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:32:in `block in catch'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:32:in `catch'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:32:in `catch'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:103:in `timeout'
        /Users/runner/work/puma/puma/test/helpers/integration.rb:158:in `read_body'
        /Users/runner/work/puma/puma/test/test_integration_single.rb:187:in `test_closed_listener'
        /Users/runner/work/puma/puma/test/helper.rb:82:in `block (4 levels) in run'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:88:in `block in timeout'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:98:in `call'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:98:in `timeout'
        /Users/runner/work/puma/puma/test/helper.rb:80:in `block (3 levels) in run'

And like this on
- macos-10.15 Ruby 2.4: https://github.com/puma/puma/runs/4055009526?check_suite_focus=true#step:9:618
- macos-11 Ruby 2.3: https://github.com/MSP-Greg/puma/runs/3616720219?check_suite_focus=true#step:9:574 (when it was introduced)

      1) Error:
    TestIntegrationSingle#test_closed_listener:
    TimeoutEveryTestCase::TestTookTooLong: execution expired
        /Users/runner/work/puma/puma/test/helpers/integration.rb:124:in `gets'
        /Users/runner/work/puma/puma/test/helpers/integration.rb:124:in `connect'
        /Users/runner/work/puma/puma/test/test_integration_single.rb:187:in `test_closed_listener'
        /Users/runner/work/puma/puma/test/helper.rb:82:in `block (4 levels) in run'
        /Users/runner/hostedtoolcache/Ruby/2.4.10/x64/lib/ruby/2.4.0/timeout.rb:103:in `timeout'
        /Users/runner/work/puma/puma/test/helper.rb:80:in `block (3 levels) in run'
dentarg added a commit to dentarg/puma that referenced this pull request Oct 30, 2021
In an effort to have green CI more often.

It was introduced in puma#2700

I've seen it fail like this on macos-10.15 and Ruby
- 2.2: https://github.com/puma/puma/runs/4055009489?check_suite_focus=true#step:9:739
- 2.3: https://github.com/puma/puma/runs/4055009508?check_suite_focus=true#step:9:661
- macos-11 2.4: https://github.com/puma/puma/runs/4055009711?check_suite_focus=true#step:9:667

      1) Error:
    TestIntegrationSingle#test_closed_listener:
    EOFError: end of file reached
        /Users/runner/work/puma/puma/test/helpers/integration.rb:160:in `readpartial'
        /Users/runner/work/puma/puma/test/helpers/integration.rb:160:in `block (2 levels) in read_body'
        /Users/runner/work/puma/puma/test/helpers/integration.rb:159:in `loop'
        /Users/runner/work/puma/puma/test/helpers/integration.rb:159:in `block in read_body'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:88:in `block in timeout'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:32:in `block in catch'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:32:in `catch'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:32:in `catch'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:103:in `timeout'
        /Users/runner/work/puma/puma/test/helpers/integration.rb:158:in `read_body'
        /Users/runner/work/puma/puma/test/test_integration_single.rb:187:in `test_closed_listener'
        /Users/runner/work/puma/puma/test/helper.rb:82:in `block (4 levels) in run'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:88:in `block in timeout'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:98:in `call'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:98:in `timeout'
        /Users/runner/work/puma/puma/test/helper.rb:80:in `block (3 levels) in run'

And like this on
- macos-10.15 Ruby 2.4: https://github.com/puma/puma/runs/4055009526?check_suite_focus=true#step:9:618
- macos-11 Ruby 2.3: https://github.com/MSP-Greg/puma/runs/3616720219?check_suite_focus=true#step:9:574 (when it was introduced)

      1) Error:
    TestIntegrationSingle#test_closed_listener:
    TimeoutEveryTestCase::TestTookTooLong: execution expired
        /Users/runner/work/puma/puma/test/helpers/integration.rb:124:in `gets'
        /Users/runner/work/puma/puma/test/helpers/integration.rb:124:in `connect'
        /Users/runner/work/puma/puma/test/test_integration_single.rb:187:in `test_closed_listener'
        /Users/runner/work/puma/puma/test/helper.rb:82:in `block (4 levels) in run'
        /Users/runner/hostedtoolcache/Ruby/2.4.10/x64/lib/ruby/2.4.0/timeout.rb:103:in `timeout'
        /Users/runner/work/puma/puma/test/helper.rb:80:in `block (3 levels) in run'
dentarg added a commit to dentarg/puma that referenced this pull request Oct 30, 2021
In an effort to have green CI more often.

It was introduced in puma#2700

I've seen it fail like this on macos-10.15 and Ruby
- 2.2: https://github.com/puma/puma/runs/4055009489?check_suite_focus=true#step:9:739
- 2.3: https://github.com/puma/puma/runs/4055009508?check_suite_focus=true#step:9:661
- macos-11 2.4: https://github.com/puma/puma/runs/4055009711?check_suite_focus=true#step:9:667

      1) Error:
    TestIntegrationSingle#test_closed_listener:
    EOFError: end of file reached
        /Users/runner/work/puma/puma/test/helpers/integration.rb:160:in `readpartial'
        /Users/runner/work/puma/puma/test/helpers/integration.rb:160:in `block (2 levels) in read_body'
        /Users/runner/work/puma/puma/test/helpers/integration.rb:159:in `loop'
        /Users/runner/work/puma/puma/test/helpers/integration.rb:159:in `block in read_body'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:88:in `block in timeout'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:32:in `block in catch'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:32:in `catch'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:32:in `catch'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:103:in `timeout'
        /Users/runner/work/puma/puma/test/helpers/integration.rb:158:in `read_body'
        /Users/runner/work/puma/puma/test/test_integration_single.rb:187:in `test_closed_listener'
        /Users/runner/work/puma/puma/test/helper.rb:82:in `block (4 levels) in run'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:88:in `block in timeout'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:98:in `call'
        /Users/runner/hostedtoolcache/Ruby/2.2.10/x64/lib/ruby/2.2.0/timeout.rb:98:in `timeout'
        /Users/runner/work/puma/puma/test/helper.rb:80:in `block (3 levels) in run'

And like this on
- macos-10.15 Ruby 2.4: https://github.com/puma/puma/runs/4055009526?check_suite_focus=true#step:9:618
- macos-11 Ruby 2.3: https://github.com/MSP-Greg/puma/runs/3616720219?check_suite_focus=true#step:9:574 (when it was introduced)

      1) Error:
    TestIntegrationSingle#test_closed_listener:
    TimeoutEveryTestCase::TestTookTooLong: execution expired
        /Users/runner/work/puma/puma/test/helpers/integration.rb:124:in `gets'
        /Users/runner/work/puma/puma/test/helpers/integration.rb:124:in `connect'
        /Users/runner/work/puma/puma/test/test_integration_single.rb:187:in `test_closed_listener'
        /Users/runner/work/puma/puma/test/helper.rb:82:in `block (4 levels) in run'
        /Users/runner/hostedtoolcache/Ruby/2.4.10/x64/lib/ruby/2.4.0/timeout.rb:103:in `timeout'
        /Users/runner/work/puma/puma/test/helper.rb:80:in `block (3 levels) in run'

Can't see that `Puma::IS_OSX` has ever existed, so fix that (changed in puma#2576).
@dentarg
Copy link
Member

dentarg commented Nov 3, 2021

Maybe I'm misunderstanding what's going on, but we are now re-raising the exception on EBADF.

I would expect this to log from this line: https://github.com/puma/puma/blame/master/lib/puma/server.rb#L376 (permalink)

@ioquatix That showed up in CI, see #2748

JuanitoFatas pushed a commit to JuanitoFatas/puma that referenced this pull request Sep 9, 2022
* More elaborate exception handling. Fixes puma#2699.

* Add TestIntegrationSingle#test_closed_listener

Co-authored-by: MSP-Greg <Greg.mpls@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Puma listen thread can fail with infinite loop.
4 participants