
cluster.rb - rearrange SIGURG & fork_worker code, small test fix [changelog skip] #2449

Merged
merged 1 commit into puma:master from the fork-worker branch on Oct 28, 2020

Conversation

MSP-Greg
Member

Description

fork_worker and SIGURG should be independent. Rearrange code for that. Small test fix.

@wjordan - agree? I was working on some refactoring, wanted to add an 'http' based 'refork' command in ControlCLI, and noticed it...

Your checklist for this pull request

  • I have reviewed the guidelines for contributing to this repository.
  • I have added an entry to History.md if this PR fixes a bug or adds a feature. If it doesn't need an entry to HISTORY.md, I have added [changelog skip] or [ci skip] to the pull request title.
  • I have added appropriate tests if this PR fixes a bug or adds a feature.
  • My pull request is 100 lines added/removed or less so that it can be easily reviewed.
  • If this PR doesn't need tests (docs change), I added [ci skip] to the title of the PR.
  • If this closes any issues, I have added "Closes #issue" to the PR description or my commit messages.
  • I have updated the documentation accordingly.
  • All new and existing tests passed, including Rubocop.

@nateberkopec
Member

Not sure I agree, as now the process can respond to SIGURG even when you didn't configure it to...

@nateberkopec added the waiting-for-review (Waiting on review from anyone) label on Oct 25, 2020
@MSP-Greg
Member Author

MSP-Greg commented Oct 25, 2020

I think SIGURG is about 'reforking' workers when you desire. fork_worker is about an automatic behavior that does the same when a worker reaches a given 'handled request' count?

@MSP-Greg
Member Author

JFYI, SIGURG / refork is currently only available as a signal in control_cli. I've got additional code that adds a 'request' type command. It also adds two tests that run it with both command styles.

@nateberkopec merged commit ff5f145 into puma:master on Oct 28, 2020
@MSP-Greg deleted the fork-worker branch on October 28, 2020 at 14:09
@cjlarose
Member

cjlarose commented Oct 28, 2020

I'm not sure this makes sense.

SIGURG was introduced in order to support "reforking" when using the fork_worker option in #2099. fork_worker is a totally different mechanism by which workers fork from worker 0 instead of from the puma master process. It happens to also include behavior for automatically reforking after a number of requests, but that's not the main point of the option.

Issuing SIGURG to a cluster that isn't configured to use fork_worker would technically work, but it's awkward since in that configuration, it's basically the same as a phased restart except worker 0 doesn't get replaced (it sticks around). There isn't really any value in doing that for clusters that have fork_worker turned off (which is the default).

@MSP-Greg
Member Author

I may have misinterpreted the code.

I thought fork_worker was an option that automatically reforked workers (index > 0) from worker 0, based on the number of requests they've processed. SIGURG does the same thing, but is triggered manually.

Hence, it seemed that they could be 'decoupled'. One calls fork_worker! automatically, one calls it manually?

JFYI, at present, I can't use fork locally, so the fork code/options aren't quite as clear. Hopefully that will change soon...

@cjlarose
Member

> I thought fork_worker was an option that automatically reforked workers (index > 0) from worker 0, based on the number of requests they've processed. SIGURG does the same thing, but is triggered manually.

It's kinda confusing since the fork_worker option both switches to a different way of performing worker forking and lets you specify how frequently to perform a refork operation.

In retrospect, it would have been clearer if fork_worker were a boolean option and there were a separate refork_worker_frequency integer option specifying the number of requests to wait between reforks. refork_worker_frequency would only be configurable if fork_worker was on.
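For illustration, here's how the two shapes compare in a puma config file. The first line uses the existing fork_worker DSL with a request count, as described in this thread; the commented-out pair uses the hypothetical option names from this comment (refork_worker_frequency does not exist in Puma):

```ruby
# config/puma.rb (sketch)

# today: one option carries both meanings
fork_worker 1000                # fork workers from worker 0 AND automatically refork after 1000 requests

# the hypothetical split described above (not real Puma options):
# fork_worker true              # only: fork workers from worker 0
# refork_worker_frequency 1000  # only: automatic refork cadence, in requests
```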

Another idea that might be cool is if the automatically-refork-workers-after-a-number-of-requests feature was just implemented as a plugin.

Of course if we did this now, it'd be a breaking change.
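A rough sketch of the plugin idea mentioned above, assuming Puma's plugin API (Puma::Plugin.create, the start(launcher) hook, and in_background) and that the launcher exposes its options; a time-based trigger stands in for the request count purely to keep the sketch short:

```ruby
# Hypothetical plugin, not shipped code; would be loaded with `plugin :auto_refork`
# in the config, assuming Puma's usual plugin lookup.
require 'puma/plugin'

Puma::Plugin.create do
  def start(launcher)
    # Only meaningful when fork_worker is enabled in the config.
    return unless launcher.options[:fork_worker]

    in_background do
      loop do
        sleep 3600                       # hypothetical interval; the real feature counts requests
        Process.kill :URG, Process.pid   # ask the cluster to refork workers from worker 0
      end
    end
  end
end
```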

@cjlarose
Member

I think it might make sense to revert this. As soon as a change like this lands in a release, it's difficult to remove since it'd be a breaking change for anyone that starts using SIGURG, but doesn't have fork_worker enabled.

@MSP-Greg
Copy link
Member Author

Confused.

> breaking change for anyone that starts using SIGURG, but doesn't have fork_worker enabled.

In 5.0.4, what does SIGURG do if fork_worker isn't enabled?

@cjlarose
Member

> In 5.0.4, what does SIGURG do if fork_worker isn't enabled?

I think it would do nothing. But with the change proposed in this PR, SIGURG would perform a phased restart of all workers except worker 0, even for clusters that don't have fork_worker enabled.

So if users start relying on that signal to do that, it'd be awkward to remove that functionality without a major version bump.

@cjlarose
Member

I think I understand where I wasn't totally clear. My claim is not that this change introduces a breaking change. My claim is instead that reverting this change after it makes it into a release would be.

@MSP-Greg
Member Author

idk.

With this change, a user can perform a fork_worker! without also adding automatic calls to it based on request count.

So, it seems that it should either stay as is with this PR, or SIGURG (and/or fork_worker?) should be removed? Confused.

@MSP-Greg
Member Author

Or, why is fork_worker needed as a switch to allow SIGURG to function?

@cjlarose
Member

Ok I think we're on the same page as to what effect this PR has for users with fork_worker turned off, but we just don't agree on what the behavior should be.

> Or, why is fork_worker needed as a switch to allow SIGURG to function?

Technically, fork_worker isn't needed to allow the behavior that SIGURG enables. But for users with fork_worker off, it enables functionality that those users would not reasonably want. The risk is that a user would start using SIGURG with fork_worker off, observe that it restarts some of their workers in testing, but later find out that what they really wanted was a true phased restart with SIGUSR1. They could, for example, unknowingly push an update to their application and issue a SIGURG, thinking it was a phased restart, but end up with one worker running the old version of their application.

In short, I think enabling SIGURG for clusters with fork_worker off is an unnecessary footgun. A reasonable compromise, I think, might be to change the SIGURG functionality as follows:

  1. If fork_worker is on, execute fork_worker!.
  2. Otherwise, execute phased_restart.

Then there's no footgun.
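A minimal sketch of that compromise as it might sit in the cluster's SIGURG handling; the option key and method names are taken from this discussion, not quoted from cluster.rb:

```ruby
# Sketch only: the branching described in the compromise above.
Signal.trap "SIGURG" do
  if @options[:fork_worker]
    fork_worker!     # refork non-zero workers from worker 0
  else
    phased_restart   # otherwise behave like an ordinary phased restart
  end
end
```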

@MSP-Greg
Member Author

MSP-Greg commented Oct 28, 2020

I guess at some point users have to RTFM. We've got two different signals for fork_worker! and phased_restart. If users confuse the two, idk. Puma has a lot of options, and I've never thought about stupid/conflicting configurations, but I'm sure they exist.

I think before this PR, if fork_worker is set to 0, then SIGURG works, but no automatic behavior is added. I never liked weird 'exceptions' like that, but it does decouple the two.

That's really what I think should be decoupled, the automatic behavior from SIGURG. Which way is best?

@dentarg
Member

dentarg commented Oct 28, 2020

> They could, for example, unknowingly push an update to their application and issue a SIGURG, thinking it was a phased restart, but end up with one worker running the old version of their application.

I agree with @cjlarose; it doesn't make sense to allow users to restart all except one worker. Trying to debug the scenario quoted above doesn't sound fun.

@MSP-Greg
Member Author

Given that SIGURG != SIGUSR1 and refork != phased_restart, I find it difficult to understand how someone might mix the two up. Also, the 'Fork Worker' document describes refork as quoted below, which also makes it clear that refork/SIGURG is not a means of restarting Puma.


> 'Refork' for additional copy-on-write improvements in running applications. Fork-worker mode introduces a new refork command that re-loads all nonzero workers by re-forking them from worker 0.


One item, which I can add: at present, pumactl can only refork via a signal. It should also work when pumactl is set up against a server started with `--control-url`.

Re reverting this, I'm okay with it. I'll let @nateberkopec decide...

@nateberkopec
Member

A lot to read here, will come back to this soon.

@nateberkopec
Member

I think the best thing here is to revert. If we want SIGURG's behavior to be fork_worker if configured to || phased restart, that can be a separate PR.
