Reactor refactor #2279
I'm gonna wait for Evan on this. One comment from me is that the Reactor class had a lot of docs before and I'd like to keep a similar level of documentation.
At a minimum there need to be some simplifications. But even before that, I agree with @nateberkopec that it's probably too much to lose all that great documentation. You should re-add docs to describe the algorithm and how it works.
42baa55 to 9b32780
Finished another pass, please take another look:
If the client error-handling consolidation is too much for this PR I could try to break it out into a separate PR; it's related to simplifying the request-buffering code path in general. I also did another pass on the documentation to preserve as much of the existing, relevant details as possible. Note however:
Finally, it's probably easier to review
lib/puma/client.rb
[@timeout_at - Time.now, 0].max
end

def <=>(other)
This feels weird to me conceptually. Clients are the same if they time out at the same time?
This method was used by `SortedSet` (which calls `#sort!` under the hood to arrange its items); it wasn't meant to convey identity, only ordering. To avoid confusion I changed `@timeouts` back to an Array that calls `#sort_by!` to arrange its items in 5175792.
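To illustrate the point about ordering versus identity, here is a minimal sketch (not Puma's code) of keeping a plain Array ordered by deadline with `#sort_by!`, so the client class never needs a `<=>` method. `TimedClient` is a hypothetical stand-in for the real client class.

```ruby
# Hypothetical stand-in for the client; only the deadline matters here.
TimedClient = Struct.new(:name, :timeout_at)

now = Time.now
timeouts = [
  TimedClient.new("slow", now + 30),
  TimedClient.new("fast", now + 5),
  TimedClient.new("mid",  now + 10),
]

# Order by deadline only; no comparison operator is defined on the
# client, so nothing implies two clients are "equal" when they share
# a deadline.
timeouts.sort_by!(&:timeout_at)

order = timeouts.map(&:name)
```

The next client to time out is then always `timeouts.first`.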
Left a quick comment but I'm gonna need more time to read the changes to reactor + server.
9b32780 to 1849516
Updated:
Two related refactoring thoughts that would go in slightly different directions from this PR:
It turns out it's quite possible to use I like how simple and readable the extension code ended up, though the underlying
Yes, agreed (sometimes I find myself regretting nio4r, although it doesn't cause any new issues). Will come back to review everything else soon.
Let me know if you have a chance to review everything else, and if there's anything else I can do to help the next review pass on this.
Based on the #2371 issue reports I'm assuming the answer to this question is that we want to revert to the original consistent behavior on this 😅
b830a1c to 093f43c
Refactor Reactor into a more generic IO-with-timeout monitor, using a Queue to simplify the implementation. Move request-buffering logic into Server#reactor_wakeup. Fixes bug in managing timeouts on clients. Move, update and rewrite documentation to match updated class structure.
OK, rebased and ready for another review. Some changes done as part of the rebase:
@wjordan I found that this branch (tested on de2f108) actually re-introduces a problem previously fixed by your own #2122:
It seems like it's possible now for a thread in the ThreadPool to add a client to the Reactor after the Reactor has started to shut down. That alone isn't a problem (there's even explicit code to handle this case), but it is possible for I have a reproducible test case here: https://github.com/cjlarose/puma-phased-restart-errors/tree/reactor_reactor This uses MRI on Linux in a Docker container. You might have to run it a while in order to produce the failure. I think re-introducing the
Unrelated: This branch also fixes flakiness in a few tests like
@cjlarose thanks for catching this! It may be a subtly different bug from the one in #2122, since I'm pretty sure there's a still-passing test that covers the issue described in that PR. I have a couple ideas on how to fix the issue (one of which is to leave the mutex in place), but I'll also spend some time on writing a test that might reliably trigger this bug to prevent future regressions.
That'd be awesome. I've also written a test that does something similar: it just performs a bunch of hot restarts on a single-mode puma server while concurrently performing a bunch of requests. The expectation is that all clients eventually get a successful response. It doesn't pass on all platforms just yet because of various issues in puma, but I'm working on fixing those problems. If you come up with a way to test the
- In `Reactor#shutdown`, `@selector` can be closed before the call to `#wakeup`, so catch/ignore the `IOError` that may be thrown.
- `Reactor#wakeup!` can delete elements from the `@timeouts` array so calling it from an `#each` block can cause the array iteration to miss elements. Call `@block` directly instead.
- Change `Reactor#add` to return `false` if the reactor is already shut down instead of invoking the block immediately, so a client-request currently being processed can continue, rather than re-adding to the thread-pool (which may already be shutting down and unable to accept new work).
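The second point is easy to reproduce outside Puma. The sketch below is only an illustration with plain symbols standing in for clients: deleting from an Array inside its own `#each` block shifts the remaining elements under the iteration index, so every other element is skipped, while calling a block that does not mutate the array visits everything.

```ruby
# Mutating variant: each "wakeup" removes the item from the array
# being iterated, as a stand-in for a wakeup that deletes its entry.
timeouts = [:a, :b, :c, :d]
visited_mutating = []
timeouts.each do |item|
  visited_mutating << item
  timeouts.delete(item)
end
leftover = timeouts.dup

# Direct variant: the block is called for each item and the array is
# left alone during iteration.
timeouts = [:a, :b, :c, :d]
visited_direct = []
block = ->(item) { visited_direct << item }
timeouts.each(&block)
```

In the mutating variant only `:a` and `:c` are visited and `:b` and `:d` are left behind, which is the kind of missed element the commit message describes.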
The most recent concurrency fixes look good. Just a minor comment.
end
# Wakeup all remaining objects on shutdown.
@timeouts.each(&@block.method(:call))
I think we can pass the block directly, no?
@timeouts.each(&@block)
Yes of course, silly me! Harmless enough since this has been merged, but worth slipping a fix into a future PR.
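For reference, a small sketch of why the two forms behave the same for a block that takes one argument: `&` converts a `Method` object to a proc via `#to_proc`, while a `Proc` can be passed as-is. The `block` and `items` names here are made up for the example.

```ruby
block = proc { |item| item.to_s.upcase }
items = [:a, :b]

# Goes through a Method object for Proc#call, then back to a proc.
via_method = items.map(&block.method(:call))

# Passes the proc straight through as the block.
direct = items.map(&block)
```

Both calls produce the same result; the direct form just skips the extra `Method` object.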
This is really top-notch work. I'm so happy this can be merged.
* Test adding connection to Reactor after shutdown

  Modifies `TestPumaServer#shutdown_requests` to pause `Reactor#add` until after shutdown begins, to ensure requests are handled correctly for this edge case. Adds unit-test coverage for the fix introduced in #2377 and updated in #2279.
* Fix Queue#close implementation for Ruby 2.2

  Allow `ClosedQueueError` to be raised when `Queue#<<` is called.
* Pass `@block` directly instead of `@block.method(:call)`
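As a rough sketch of the edge case these commits cover, assuming a Ruby version where `Queue#close` and `ClosedQueueError` are built in (2.3 or later): once the input queue is closed, `Queue#<<` raises, and an `add` method can turn that into a `false` return so the caller keeps handling the request itself. `MiniReactor` is a hypothetical class for illustration, not the Puma implementation.

```ruby
# Hypothetical, stripped-down reactor: only the input queue is modelled.
class MiniReactor
  def initialize
    @input = Queue.new
  end

  # Returns true if the item was queued, false if the reactor has
  # already shut down and can no longer accept work.
  def add(item)
    @input << item
    true
  rescue ClosedQueueError
    false
  end

  def shutdown
    @input.close
  end
end

reactor = MiniReactor.new
added_before = reactor.add(:client_a)
reactor.shutdown
added_after = reactor.add(:client_b)
```

Here `added_before` is `true` and `added_after` is `false`, so a caller could branch on the return value instead of relying on the reactor to invoke the block.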
Description
The `Reactor` class was getting pretty complicated, hard to reason about and had a tricky bug I was working on fixing (#2282), so this PR is a refactoring pass with a focus on simplicity and more carefully separating concerns between the related classes.

The Reactor has a simple purpose: run a select loop on a collection of IOs, with the added feature of also waking up an IO when a specified timeout has been reached. With the help of `SortedSet` and `Queue` and using the built-in `Selector#wakeup` feature, the Reactor class can focus on this one task while being a bit easier to understand.
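To make the "select loop with timeouts" idea in the description concrete, here is a heavily simplified sketch. It is an assumption-laden illustration, not the code in this PR: stdlib `IO.select` stands in for the nio4r selector, there is no `Selector#wakeup` equivalent (so new entries are only picked up between passes), and `TinyReactor`, `Entry` and `run_once` are hypothetical names.

```ruby
# Hypothetical IO-with-timeout monitor built on IO.select.
class TinyReactor
  Entry = Struct.new(:io, :timeout_at)

  def initialize(&block)
    @block = block      # called with each IO that is ready or timed out
    @input = Queue.new  # thread-safe hand-off from other threads
    @entries = []       # kept sorted by deadline
  end

  def add(io, timeout)
    @input << Entry.new(io, Time.now + timeout)
  end

  # One pass: register queued entries, wait until an IO is readable or
  # the nearest deadline passes, then wake up whatever qualifies.
  def run_once
    @entries << @input.pop until @input.empty?
    @entries.sort_by!(&:timeout_at)
    return if @entries.empty?

    wait = [@entries.first.timeout_at - Time.now, 0].max
    # IO.select returns nil when the wait times out.
    readable = (IO.select(@entries.map(&:io), nil, nil, wait) || [[]]).first
    now = Time.now
    woken, @entries = @entries.partition { |e|
      readable.include?(e.io) || e.timeout_at <= now
    }
    woken.each { |e| @block.call(e.io) }
  end
end

woken = []
reactor = TinyReactor.new { |io| woken << io }
r1, w1 = IO.pipe
r2, _w2 = IO.pipe
reactor.add(r1, 5)     # becomes readable once data is written
reactor.add(r2, 0.05)  # never readable, so it is woken by its timeout
w1.write("x")
reactor.run_once until woken.size == 2
```

In this usage `r1` is woken because it is readable and `r2` because its deadline passes; the real class additionally has to be interruptible while blocked in select, which is what the built-in wakeup feature is for.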