
cluster.rb - rearrange SIGURG & fork_worker code, small test fix [changelog skip] #2449

Merged
merged 1 commit into puma:master from the fork-worker branch on Oct 28, 2020

Conversation

MSP-Greg
Member

Description

fork_worker and SIGURG should be independent. Rearrange code for that. Small test fix.

@wjordan - agree? I was working on some refactoring, wanted to add an 'http' based 'refork' command in ControlCLI, and noticed it...

Your checklist for this pull request

  • I have reviewed the guidelines for contributing to this repository.
  • I have added an entry to History.md if this PR fixes a bug or adds a feature. If it doesn't need an entry to HISTORY.md, I have added [changelog skip] or [ci skip] to the pull request title.
  • I have added appropriate tests if this PR fixes a bug or adds a feature.
  • My pull request is 100 lines added/removed or less so that it can be easily reviewed.
  • If this PR doesn't need tests (docs change), I added [ci skip] to the title of the PR.
  • If this closes any issues, I have added "Closes #issue" to the PR description or my commit messages.
  • I have updated the documentation accordingly.
  • All new and existing tests passed, including Rubocop.

@nateberkopec
Member

Not sure I agree, as now the process can respond to SIGURG even when you didn't configure it to...

@nateberkopec added the waiting-for-review (Waiting on review from anyone) label on Oct 25, 2020
@MSP-Greg
Member Author

MSP-Greg commented Oct 25, 2020

I think SIGURG is about 'reforking' workers when you desire. fork_worker is about an automatic behavior that does the same when a worker reaches a given 'handled request' count?

@MSP-Greg
Member Author

JFYI, SIGURG / refork is currently only available as a signal in control_cli. I've got additional code that adds a 'request' type command. It also adds two tests that run it with both command styles.

@nateberkopec merged commit ff5f145 into puma:master on Oct 28, 2020
@MSP-Greg deleted the fork-worker branch on October 28, 2020 at 14:09
@cjlarose
Member

cjlarose commented Oct 28, 2020

I'm not sure this makes sense.

SIGURG was introduced in order to support "reforking" when using the fork_worker option in #2099. fork_worker is a totally different mechanism by which workers fork from worker 0 instead of from the puma master process. It happens to also include behavior for automatically reforking after a number of requests, but that's not the main point of the option.

Issuing SIGURG to a cluster that isn't configured to use fork_worker would technically work, but it's awkward since in that configuration, it's basically the same as a phased restart except worker 0 doesn't get replaced (it sticks around). There isn't really any value in doing that for clusters that have fork_worker turned off (which is the default).

@MSP-Greg
Member Author

I may have misinterpreted the code.

I thought fork_worker was an option that automatically reforked workers (index > 0) from worker 0, based on the number of requests they've processed. SIGURG does the same thing, but is triggered manually.

Hence, it seemed that they could be 'decoupled'. One calls fork_worker! automatically, one calls it manually?

JFYI, at present, I can't use fork locally, so the fork code/options aren't quite as clear. Hopefully that will change soon...

@cjlarose
Member

> I thought fork_worker was an option that automatically reforked workers (index > 0) from worker 0, based on the number of requests they've processed. SIGURG does the same thing, but is triggered manually.

It's kinda confusing since the fork_worker option both switches to a different way of performing worker forking and lets you specify how frequently to perform a refork operation.

In retrospect, it would have been clearer if fork_worker were a boolean option and there were a separate refork_worker_frequency integer option specifying the number of requests to wait between reforks. refork_worker_frequency would only be configurable if fork_worker was on.
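For illustration, here's how the two shapes compare in a puma config file. The first line uses the existing fork_worker DSL with a request count, as described in this thread; the commented-out pair uses the hypothetical option names from this comment (refork_worker_frequency does not exist in Puma):

```ruby
# config/puma.rb (sketch)

# today: one option carries both meanings
fork_worker 1000                # fork workers from worker 0 AND automatically refork after 1000 requests

# the hypothetical split described above (not real Puma options):
# fork_worker true              # only: fork workers from worker 0
# refork_worker_frequency 1000  # only: automatic refork cadence, in requests
```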

Another idea that might be cool is if the automatically-refork-workers-after-a-number-of-requests feature was just implemented as a plugin.

Of course if we did this now, it'd be a breaking change.
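A rough sketch of the plugin idea mentioned above, assuming Puma's plugin API (Puma::Plugin.create, the start(launcher) hook, and in_background) and that the launcher exposes its options; a time-based trigger stands in for the request count purely to keep the sketch short:

```ruby
# Hypothetical plugin, not shipped code; would be loaded with `plugin :auto_refork`
# in the config, assuming Puma's usual plugin lookup.
require 'puma/plugin'

Puma::Plugin.create do
  def start(launcher)
    # Only meaningful when fork_worker is enabled in the config.
    return unless launcher.options[:fork_worker]

    in_background do
      loop do
        sleep 3600                       # hypothetical interval; the real feature counts requests
        Process.kill :URG, Process.pid   # ask the cluster to refork workers from worker 0
      end
    end
  end
end
```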

@cjlarose
Member

I think it might make sense to revert this. As soon as a change like this lands in a release, it's difficult to remove since it'd be a breaking change for anyone that starts using SIGURG, but doesn't have fork_worker enabled.

@MSP-Greg
Copy link
Member Author

Confused.

> breaking change for anyone that starts using SIGURG, but doesn't have fork_worker enabled.

In 5.0.4, what does SIGURG do if fork_worker isn't enabled?

@cjlarose
Member

> In 5.0.4, what does SIGURG do if fork_worker isn't enabled?

I think it would do nothing. But with the change proposed in this PR, SIGURG would perform a phased restart of all workers except worker 0, even for clusters that don't have fork_worker enabled.

So if users start relying on that signal to do that, it'd be awkward to remove that functionality without a major version bump.

@cjlarose
Member

I think I understand where I wasn't totally clear. My claim is not that this change introduces a breaking change. My claim is instead that reverting this change after it makes it into a release would be.

@MSP-Greg
Member Author

idk.

With this change, a user can perform a fork_worker! without also adding automatic calls to it based on request count.

So, it seems that it should either stay as is with this PR, or SIGURG (and/or fork_worker?) should be removed? Confused.

@MSP-Greg
Member Author

Or, why is fork_worker needed as a switch to allow SIGURG to function?

@cjlarose
Member

Ok I think we're on the same page as to what effect this PR has for users with fork_worker turned off, but we just don't agree on what the behavior should be.

> Or, why is fork_worker needed as a switch to allow SIGURG to function?

Technically, fork_worker isn't needed to allow the behavior that SIGURG enables. But for users with fork_worker off, it enables functionality that those users would not reasonably want. The risk is that a user would start using SIGURG with fork_worker off, observe that it restarts some of their workers in testing, but later find out that what they really wanted was a true phased restart with SIGUSR1. They could, for example, unknowingly push an update to their application and issue a SIGURG, thinking it was a phased restart, but end up with one worker running the old version of their application.

In short, I think enabling SIGURG for clusters with fork_worker off is an unnecessary footgun. A reasonable compromise, I think, might be to change the SIGURG functionality as follows:

  1. If fork_worker is on, execute fork_worker!.
  2. Otherwise, execute phased_restart.

Then there's no footgun.
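A minimal sketch of that compromise as it might sit in the cluster's SIGURG handling; the option key and method names are taken from this discussion, not quoted from cluster.rb:

```ruby
# Sketch only: the branching described in the compromise above.
Signal.trap "SIGURG" do
  if @options[:fork_worker]
    fork_worker!     # refork non-zero workers from worker 0
  else
    phased_restart   # otherwise behave like an ordinary phased restart
  end
end
```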

@MSP-Greg
Member Author

MSP-Greg commented Oct 28, 2020

I guess at some point users have to RTFM. We've got two different signals for fork_worker! and phased_restart. If users confuse the two, idk. Puma has a lot of options, and I've never thought about stupid/conflicting configurations, but I'm sure they exist.

I think before this PR, if fork_worker is set to 0, then SIGURG works, but no automatic behavior is added. I never liked weird 'exceptions' like that, but it does decouple the two.

That's really what I think should be decoupled, the automatic behavior from SIGURG. Which way is best?

@dentarg
Member

dentarg commented Oct 28, 2020

> They could, for example, unknowingly push an update to their application and issue a SIGURG, thinking it was a phased restart, but end up with one worker running the old version of their application.

I agree with @cjlarose; it doesn't make sense to allow users to restart all except one worker. Trying to debug the scenario quoted above doesn't sound fun.

@MSP-Greg
Member Author

Given that SIGURG != SIGUSR1 and refork != phased_restart, I find it difficult to understand how someone might mix the two up. Also, the 'Fork Worker' document describes refork as quoted below, which also makes it clear that refork/SIGURG is not a means of restarting Puma.


> 'Refork' for additional copy-on-write improvements in running applications. Fork-worker mode introduces a new refork command that re-loads all nonzero workers by re-forking them from worker 0.


One item, which I can add: at present, pumactl can only refork via a signal. It should also work when pumactl is set up against a server started with `--control-url`.

Re reverting this, I'm okay with it. I'll let @nateberkopec decide...

@nateberkopec
Member

A lot to read here, will come back to this soon.

@nateberkopec
Member

I think the best thing here is to revert. If we want SIGURG's behavior to be fork_worker if configured to || phased restart, that can be a separate PR.
