Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove use of json gem from WorkerHandle#ping! #2473

Conversation

cjlarose
Copy link
Member

@cjlarose cjlarose commented Oct 31, 2020

Description

This change is the first in a series meant to address phased restart errors related to the json gem described in #2471. The changes for this specific PR are described in a comment in the same issue: #2471 (comment)

This PR essentially reverts the changes to the handling of puma worker stats introduced by #2124 in a way that intentionally avoids the use of the json gem.

I added an integration test to ensure that users can upgrade/downgrade the JSON gem as part of phased restart (it reliably reproduces the error reported in #2471). While I was in there, I added a similar integration test for upgrading/downgrading the nio4r gem. That test was effectively fixed by #2427.

Your checklist for this pull request

  • I have reviewed the guidelines for contributing to this repository.
  • I have added an entry to History.md if this PR fixes a bug or adds a feature. If it doesn't need an entry to HISTORY.md, I have added [changelog skip] or [ci skip] to the pull request title.
  • I have added appropriate tests if this PR fixes a bug or adds a feature.
  • My pull request is 100 lines added/removed or less so that it can be easily reviewed.
  • If this PR doesn't need tests (docs change), I added [ci skip] to the title of the PR.
  • If this closes any issues, I have added "Closes #issue" to the PR description or my commit messages.
  • I have updated the documentation accordingly.
  • All new and existing tests passed, including Rubocop.

@cjlarose cjlarose changed the title Remove use of json gem from worker handle ping Remove use of json gem from WorkerHandle#ping! Oct 31, 2020
@cjlarose cjlarose changed the title Remove use of json gem from WorkerHandle#ping! Remove use of json gem from WorkerHandle#ping! Oct 31, 2020
@nateberkopec
Copy link
Member

Really bad-ass tests here. Nice work.

@nateberkopec nateberkopec merged commit 5f8b5fc into puma:master Nov 1, 2020
@MSP-Greg
Copy link
Member

MSP-Greg commented Nov 2, 2020

@cjlarose

I was also looking at this the weekend, but added another phased restart, as the some people posting with the issue implied that the problem happened after more than a single phased restart.

So, I added the following to the bottom of TestWorkerGemIndependence#change_gem_version_during_phased_restart:

    # 2nd phased restart, already installed
    set_release_symlink File.expand_path(old_app_dir, __dir__)
    start_phased_restart

    connection = connect
    initial_reply = read_body(connection)
    assert_equal old_version, initial_reply

That failed. Also, I believe that $LOAD_PATH & ENV['BUNDLER_GEMFILE'] always use absolute/realdirpath entries, so that may also affect things. I haven't gotten much further.

When debugging, things got messy. I was looking at a lot of things, including $LOAD_PATH, which, if there are a lot entries, may add a lot of disk hits/lookups when code calls require. When multiple phased restart's are done with a symlinked directory, there may be alot of entries in $LOAD_PATH that point to deleted folders?

And lastly, the tests currently run under a Bundler context, so maybe pass a block to bundle_install so cli_server '--prune-bundler -w 1' can run within with_unbundled_env?

@MSP-Greg
Copy link
Member

MSP-Greg commented Nov 2, 2020

@cjlarose

What I said above could be interpreted a few ways. I was working on it intermittently, but my 'gut feeling' was that this is really messy. Hence, I'm asking for your opinion. I'm looking at a generic issue something like:

Puma may have dependencies on external gems and/or Ruby default gems. These may also be shared by the application. Given an indeterminate number of phased restarts, will changes to those gems work correctly in both Puma and the app?

@cjlarose
Copy link
Member Author

cjlarose commented Nov 2, 2020

When multiple phased restart's are done with a symlinked directory, there may be alot of entries in $LOAD_PATH that point to deleted folders?

I haven't observed this. What I normally see (assuming prune_bundler is on) is that the LOAD_PATH will contain the normal default Ruby stuff and entries for puma itself (and for json) from the directory in which the puma master process started. I haven't seen LOAD_PATH on the master process gain new entries after phased restarts (with the exception of the json gem before this PR)

And lastly, the tests currently run under a Bundler context, so maybe pass a block to bundle_install so cli_server '--prune-bundler -w 1' can run within with_unbundled_env?

Running cli_server in the unbundled env seems like the way to go. I'll open a PR for this.

Puma may have dependencies on external gems and/or Ruby default gems. These may also be shared by the application. Given an indeterminate number of phased restarts, will changes to those gems work correctly in both Puma and the app?

If there are any external gems on which the Puma master process relies, application developers will be unable to upgrade/downgrade that gem with a phased restart. If any of those gems have native extensions, operators will be unable to delete those gems on disk for old releases. Given this, I think our best path forward is to make it so that the Puma master process has no dependencies on any external gems. The good news is that we're almost there. #2427 removed the dependency on nio4r in the master process and soon we'll get rid of json. I think that's the last one.

@MSP-Greg
Copy link
Member

MSP-Greg commented Nov 2, 2020

@cjlarose

I haven't observed this.

At present, not being able to use fork locally (Windows), I always assumed that the $LOAD_PATH in fork'd code was the same as the master process $LOAD_PATH. Never output in CI, so I don't know.

By chance, have you tried adding an additional phased-restart?

Thanks, Greg

@cjlarose
Copy link
Member Author

cjlarose commented Nov 2, 2020

By chance, have you tried adding an additional phased-restart?

Doing an additional phased restart passes just fine, as long as you change the start_phased_restart helper to expect the new phase number

diff --git a/test/test_worker_gem_independence.rb b/test/test_worker_gem_independence.rb
index 05385884..9ce63b57 100644
--- a/test/test_worker_gem_independence.rb
+++ b/test/test_worker_gem_independence.rb
@@ -56,6 +56,14 @@ class TestWorkerGemIndependence < TestIntegration
     connection = connect
     new_reply = read_body(connection)
     assert_equal new_version, new_reply
+
+    # back to old version
+    set_release_symlink File.expand_path(old_app_dir, __dir__)
+    start_phased_restart expected_phase: '2'
+
+    connection = connect
+    initial_reply = read_body(connection)
+    assert_equal old_version, initial_reply
   end
 
   def current_release_symlink
@@ -67,10 +75,10 @@ class TestWorkerGemIndependence < TestIntegration
     FileUtils.symlink target_dir, current_release_symlink, force: true
   end
 
-  def start_phased_restart
+  def start_phased_restart(expected_phase: '1')
     Process.kill :USR1, @pid
 
-    true while @server.gets !~ /booted, phase: 1/
+    true while @server.gets !~ /booted, phase: #{expected_phase}/
   end
 
   def with_unbundled_env

@cjlarose
Copy link
Member Author

cjlarose commented Nov 2, 2020

At present, not being able to use fork locally (Windows), I always assumed that the $LOAD_PATH in fork'd code was the same as the master process $LOAD_PATH. Never output in CI, so I don't know.

Yeah, $LOAD_PATH is the same after a fork. In a puma worker process, $LOAD_PATH will be the same as the puma master process up until it executes require 'bundler/setup' (or the application modifies its $LOAD_PATH manually or via gem).

@MSP-Greg
Copy link
Member

MSP-Greg commented Nov 2, 2020

long as you change the

How many stupid mistakes do I get a month?

Re $LOAD_PATH, makes sense. I should be running Ubuntu sometime soon...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants