Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PR 1952] 2st verification - updated tests & GitHub Actions #1956

Closed
wants to merge 5 commits into from

Conversation

MSP-Greg
Copy link
Member

@MSP-Greg MSP-Greg commented Sep 10, 2019

PR #1955 shows that the tests in this PR do not pass on current master.

Below is a summary of the changes here. The last commit adds two tests that use UNIXSockets for the bind address, verifying that unix files seem to be handled correctly during shutdowns and restarts.

lib files - binder.rb, cluster.rb, launcher.rb

  1. binder.rb - Add PR fix Binder#close #1886

  2. cluster.rb - Add PR Fix Worker external TERM signalling #1952, additional minor changes

  3. launcher.rb - #close_binder_listeners - close io's as fast as possible. Unlink files in separate iteration.

test_integration.rb

Request handling during server TERM - two tests

#test_term_closes_listeners_cluster_tcp
#test_term_closes_listeners_cluster_unix

using #term_closes_listeners_cluster

Send requests 10 per second. Send 10, then :TERM server, then send another 30.
No more than 10 should throw Errno::ECONNRESET.

Request handling during phased restart - two tests

#test_usr1_all_respond_cluster_tcp
#test_usr1_all_respond_cluster_unix

using #usr1_all_respond_cluster

Send requests 1 per second. Send 1, then :USR1 server, then send another 24.
All should be responded to, and at least three workers should be used

Stuck worker tests - two tests

#test_stuck_external_term_spawn_cluster
Tests whether externally TERM'd 'stuck' workers are proper re-spawned.

#test_stuck_phased_restart_cluster
Tests whether 'stuck' workers are properly shutdown during phased-restart.

GitHub Actions - workflow.yml

  1. Install ragel for Ubuntu and macOS. Both intermittently failed compile without doing so.

  2. Windows - Add step to update MSYS2 gcc, install ragel and proper OpenSSL package.

  3. Actions CI seems stable, and it finishes well before either Travis or AppVeyor.

Misc test files changes

Added parallelize_me! to a few files, misc changes

1. binder.rb   Add PR puma#1886

2. cluster.rb  Add PR puma#1952, additional minor changes

3. launcher.rb #close_binder_listeners close io's as fast as possible
1. test_integration.rb - Rewrite so all tests create a server via IO.popen.

2. Update some tests for parallelize_me!

3. Add config/worker_shutdown_timeout_2.rb, rackup/sleep_pid.ru
cluster.rb - clean up duplicate term? method

workflow.yml - better step descriptions, add PR's

def setup
@port = 0
@port = UniquePort.call
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we just use port = 0 everywhere, I wonder?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. Only thing is that all the code would have to be changed for the 'clients', but we could set that in wait_for_server_to_boot?

I'm !AFK now, so as soon as the CI finishes, I'll have a look, as it's pretty 'fresh' in my mind...

@@ -217,11 +217,12 @@ def restart_args
end

def close_binder_listeners
@binder.listeners.each do |l, io|
io.close
# close binder listeners as fast as possible, so separate loop
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want the listener sockets closed quickly once shutdown starts. Anything that requires checking the disk can slow down. Small issue.

@MSP-Greg
Copy link
Member Author

@nateberkopec

One typo I didn't explicitly mention:

def teardown
if defined?(@server) && @server
begin
Process.kill "INT", @server.pid
rescue
Errno::ESRCH

Line 25 should be combined with line 24. In case you've got another commit waiting.

@@ -50,14 +50,6 @@ def env(sock)

def close
@ios.each { |i| i.close }
@unix_paths.each do |i|
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't fix the problem, as bind paths are still not removed after program stop.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few days ago, I was really clear on what was where in terms of Binder @ios, @unix_paths, and @listeners. I should have added some comments...

Paths are closed in Binder#close_listeners, They aren't in Binder#close

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See this assertion I've added to master. Anything that purports to fix this problem needs to pass this test (which is failing on master).

@@ -70,6 +74,34 @@ def run(*)

module TestSkips

# finds bad signals, value is an array of symbols
BAD_SIGNAL_LIST = begin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a huge fan of this, what's wrong with Signal.list?

Copy link
Member Author

@MSP-Greg MSP-Greg Sep 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Windows Signal.list includes :TERM, but it generates an error when used. Wasn't sure if other OS's that aren't tested might also behave the same way.

Needed a process to kill to test for it...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the idea, but let's just remove :TERM from the list if on Windows, this spawning-a-process thing feels like the nuclear option.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if :TERM doesnt work on windows, why do tests that use not fail? e.g. test_term_closes_listeners or test_term_suppress

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of of course, because you can't fork on windows. D'oh!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's just remove :TERM from the list if on Windows, this spawning-a-process thing feels like the nuclear option

Understandable.

I really hate defining by OS, as opposed to actually checking for functionality. Maybe it (:TERM) could change in a year or something. Probably unlikely...

@nateberkopec
Copy link
Member

OK, let's try again with this:

  • 1 commit that adds the tests described in the OP
  • 1 commit each for the 3 changes described in the lib section of the OP (so, 3 more commits)
  • Save everything else, e.g. Github Actions, the signal test helper changes, etc for other PRs.

@MSP-Greg MSP-Greg mentioned this pull request Sep 11, 2019
6 tasks
@MSP-Greg
Copy link
Member Author

Of the above three items, '1 commit that adds the tests described in the OP' is the messiest. I'm happy to look at it or do the PR, but you may want to do it.

I was told I'd be receiving info about an ongoing project early this week, which will keep me busy for several days. Hence, the PR. I've now been told it will be later this week or next...

MSP-Greg added a commit to MSP-Greg/puma that referenced this pull request Sep 13, 2019
MSP-Greg added a commit to MSP-Greg/puma that referenced this pull request Sep 13, 2019
test_integration_cluster.rb

Request handling during server TERM - two tests

`#test_term_closes_listeners_cluster_tcp`
`#test_term_closes_listeners_cluster_unix`

using `#term_closes_listeners_cluster`

Send requests 10 per second.  Send 10, then :TERM server, then send another 30.
No more than 10 should throw Errno::ECONNRESET.

Request handling during phased restart - two tests

`#test_usr1_all_respond_cluster_tcp`
`#test_usr1_all_respond_cluster_unix`

using `#usr1_all_respond_cluster`

Send requests 1 per second.  Send 1, then :USR1 server, then send another 24.
All should be responded to, and at least three workers should be used

Stuck worker tests - two tests

`#test_stuck_external_term_spawn_cluster`
Tests whether externally TERM'd 'stuck' workers are proper re-spawned.

`#test_stuck_phased_restart_cluster`
Tests whether 'stuck' workers are properly shutdown during phased-restart.

helper files/methods changes

1. helper file changes to allow binding to TCP or UNIX, see kwarg unix:
2. Skip on Windows for signal TERM
MSP-Greg added a commit to MSP-Greg/puma that referenced this pull request Sep 13, 2019
test_integration_cluster.rb

Request handling during server TERM - two tests

`#test_term_closes_listeners_tcp`
`#test_term_closes_listeners_unix`

using `#term_closes_listeners`

Send requests 10 per second.  Send 10, then :TERM server, then send another 30.
No more than 10 should throw Errno::ECONNRESET.

Request handling during phased restart - two tests

`#test_usr1_all_respond_tcp`
`#test_usr1_all_respond_unix`

using `#usr1_all_respond`

Send requests 1 per second.  Send 1, then :USR1 server, then send another 24.
All should be responded to, and at least three workers should be used

Stuck worker tests - two tests

`#test_stuck_external_term_spawn`
Tests whether externally TERM'd 'stuck' workers are proper re-spawned.

`#test_stuck_phased_restart`
Tests whether 'stuck' workers are properly shutdown during phased-restart.

helper files/methods changes

1. helper file changes to allow binding to TCP or UNIX, see kwarg unix:
2. Skip on Windows for signal TERM
MSP-Greg added a commit to MSP-Greg/puma that referenced this pull request Sep 13, 2019
test_integration_cluster.rb

Request handling during server TERM - two tests

`#test_term_closes_listeners_tcp`
`#test_term_closes_listeners_unix`

using `#term_closes_listeners`

Send requests 10 per second.  Send 10, then :TERM server, then send another 30.
No more than 10 should throw Errno::ECONNRESET.

Request handling during phased restart - two tests

`#test_usr1_all_respond_tcp`
`#test_usr1_all_respond_unix`

using `#usr1_all_respond`

Send requests 1 per second.  Send 1, then :USR1 server, then send another 24.
All should be responded to, and at least three workers should be used

Stuck worker tests - two tests

`#test_stuck_external_term_spawn`
Tests whether externally TERM'd 'stuck' workers are proper re-spawned.

`#test_stuck_phased_restart`
Tests whether 'stuck' workers are properly shutdown during phased-restart.

helper files/methods changes

1. helper file changes to allow binding to TCP or UNIX, see kwarg unix:
2. Skip on Windows for signal TERM
MSP-Greg added a commit to MSP-Greg/puma that referenced this pull request Sep 14, 2019
test_integration_cluster.rb

Request handling during server TERM - two tests

`#test_term_closes_listeners_tcp`
`#test_term_closes_listeners_unix`

using `#term_closes_listeners`

Send requests 10 per second.  Send 10, then :TERM server, then send another 30.
No more than 10 should throw Errno::ECONNRESET.

Request handling during phased restart - two tests

`#test_usr1_all_respond_tcp`
`#test_usr1_all_respond_unix`

using `#usr1_all_respond`

Send requests 1 per second.  Send 1, then :USR1 server, then send another 24.
All should be responded to, and at least three workers should be used

Stuck worker tests - two tests

`#test_stuck_external_term_spawn`
Tests whether externally TERM'd 'stuck' workers are proper re-spawned.

`#test_stuck_phased_restart`
Tests whether 'stuck' workers are properly shutdown during phased-restart.

helper files/methods changes

1. helper file changes to allow binding to TCP or UNIX, see kwarg unix:
2. Skip on Windows for signal TERM
MSP-Greg added a commit to MSP-Greg/puma that referenced this pull request Sep 17, 2019
test_integration_cluster.rb

Request handling during server TERM - two tests

`#test_term_closes_listeners_tcp`
`#test_term_closes_listeners_unix`

using `#term_closes_listeners`

Send requests 10 per second.  Send 10, then :TERM server, then send another 30.
No more than 10 should throw Errno::ECONNRESET.

Request handling during phased restart - two tests

`#test_usr1_all_respond_tcp`
`#test_usr1_all_respond_unix`

using `#usr1_all_respond`

Send requests 1 per second.  Send 1, then :USR1 server, then send another 24.
All should be responded to, and at least three workers should be used

Stuck worker tests - two tests

`#test_stuck_external_term_spawn`
Tests whether externally TERM'd 'stuck' workers are proper re-spawned.

`#test_stuck_phased_restart`
Tests whether 'stuck' workers are properly shutdown during phased-restart.

helper files/methods changes

1. helper file changes to allow binding to TCP or UNIX, see kwarg unix:
2. Skip on Windows for signal TERM
MSP-Greg added a commit to MSP-Greg/puma that referenced this pull request Sep 18, 2019
test_integration_cluster.rb

Request handling during server TERM - two tests

`#test_term_closes_listeners_tcp`
`#test_term_closes_listeners_unix`

using `#term_closes_listeners`

Send requests 10 per second.  Send 10, then :TERM server, then send another 30.
No more than 10 should throw Errno::ECONNRESET.

Request handling during phased restart - two tests

`#test_usr1_all_respond_tcp`
`#test_usr1_all_respond_unix`

using `#usr1_all_respond`

Send requests 1 per second.  Send 1, then :USR1 server, then send another 24.
All should be responded to, and at least three workers should be used

Stuck worker tests - two tests

`#test_stuck_external_term_spawn`
Tests whether externally TERM'd 'stuck' workers are proper re-spawned.

`#test_stuck_phased_restart`
Tests whether 'stuck' workers are properly shutdown during phased-restart.

helper files/methods changes

1. helper file changes to allow binding to TCP or UNIX, see kwarg unix:
2. Skip on Windows for signal TERM
@MSP-Greg MSP-Greg deleted the update-lib-test-actions branch September 18, 2019 15:59
MSP-Greg added a commit to MSP-Greg/puma that referenced this pull request Sep 19, 2019
test_integration_cluster.rb

Request handling during server TERM - two tests

`#test_term_closes_listeners_tcp`
`#test_term_closes_listeners_unix`

using `#term_closes_listeners`

Send requests 10 per second.  Send 10, then :TERM server, then send another 30.
No more than 10 should throw Errno::ECONNRESET.

Request handling during phased restart - two tests

`#test_usr1_all_respond_tcp`
`#test_usr1_all_respond_unix`

using `#usr1_all_respond`

Send requests 1 per second.  Send 1, then :USR1 server, then send another 24.
All should be responded to, and at least three workers should be used

Stuck worker tests - two tests

`#test_stuck_external_term_spawn`
Tests whether externally TERM'd 'stuck' workers are proper re-spawned.

`#test_stuck_phased_restart`
Tests whether 'stuck' workers are properly shutdown during phased-restart.

helper files/methods changes

1. helper file changes to allow binding to TCP or UNIX, see kwarg unix:
2. Skip on Windows for signal TERM
MSP-Greg added a commit to MSP-Greg/puma that referenced this pull request Sep 19, 2019
test_integration_cluster.rb

Request handling during server TERM - two tests

`#test_term_closes_listeners_tcp`
`#test_term_closes_listeners_unix`

using `#term_closes_listeners`

Send requests 10 per second.  Send 10, then :TERM server, then send another 30.
No more than 10 should throw Errno::ECONNRESET.

Request handling during phased restart - two tests

`#test_usr1_all_respond_tcp`
`#test_usr1_all_respond_unix`

using `#usr1_all_respond`

Send requests 1 per second.  Send 1, then :USR1 server, then send another 24.
All should be responded to, and at least three workers should be used

Stuck worker tests - two tests

`#test_stuck_external_term_spawn`
Tests whether externally TERM'd 'stuck' workers are proper re-spawned.

`#test_stuck_phased_restart`
Tests whether 'stuck' workers are properly shutdown during phased-restart.

helper files/methods changes

1. helper file changes to allow binding to TCP or UNIX, see kwarg unix:
2. Skip on Windows for signal TERM
nateberkopec pushed a commit that referenced this pull request Sep 19, 2019
* Update test_integration files per PR #1956

test_integration_cluster.rb

Request handling during server TERM - two tests

`#test_term_closes_listeners_tcp`
`#test_term_closes_listeners_unix`

using `#term_closes_listeners`

Send requests 10 per second.  Send 10, then :TERM server, then send another 30.
No more than 10 should throw Errno::ECONNRESET.

Request handling during phased restart - two tests

`#test_usr1_all_respond_tcp`
`#test_usr1_all_respond_unix`

using `#usr1_all_respond`

Send requests 1 per second.  Send 1, then :USR1 server, then send another 24.
All should be responded to, and at least three workers should be used

Stuck worker tests - two tests

`#test_stuck_external_term_spawn`
Tests whether externally TERM'd 'stuck' workers are proper re-spawned.

`#test_stuck_phased_restart`
Tests whether 'stuck' workers are properly shutdown during phased-restart.

helper files/methods changes

1. helper file changes to allow binding to TCP or UNIX, see kwarg unix:
2. Skip on Windows for signal TERM

* Misc updates, debug output, cleanup

* Add comments

* fix test_int_signal_with_background_thread_in_jruby per review

* TestIntegrationCluster#term_closes_listeners - add interleaved assert

* cluster.rb - remove duplicate Worker#term? method
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants