Meter accept(2) while taking into account races around thread pool mutex #1563
Conversation
As part of the work in puma#1278 we refactored the conditional in `ThreadPool#wait_until_not_full` because we found it difficult to reason about at the RailsConf '17 roundtable. We broke it up a bit and everyone went away happier. However, it turns out that you need to account for both `@waiting` and `@todo.size` together to determine whether to accept more work from the socket, and the refactor was not equivalent to the old code. There is no guarantee that request threads signalled when work is added to the thread pool's queue will become active, take the mutex and decrement `@waiting` before the main worker thread runs `ThreadPool#wait_until_not_full`. If we run that before the request thread decrements `@waiting`, we end up hitting the early exit of `return @waiting > 0` and taking more work off the socket than the pool has capacity for. The correct way to determine the number of outstanding requests queued in the pool is `@todo.size - @waiting`, so I've reverted to the old behaviour and augmented the comment above it.

Fixes #1471
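The restored check looks roughly like this (a sketch reconstructed from the description above; the names match Puma's `ThreadPool` internals but this isn't the verbatim source):

```ruby
# Sketch of ThreadPool#wait_until_not_full after the revert (reconstructed,
# not verbatim). The server calls this before blocking on accept(2).
def wait_until_not_full
  @mutex.synchronize do
    loop do
      return if @shutdown

      # @todo.size counts queued requests; @waiting counts threads asleep on
      # the pool's condition variable. A signalled-but-not-yet-woken thread
      # is still counted in @waiting, so the two must be read together:
      # accept more work only while outstanding work fits the threads we
      # could still spawn.
      return if @todo.size - @waiting < @max - @spawned

      @not_full.wait @mutex
    end
  end
end
```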
/cc @nateberkopec @schneems @evanphx

We encountered this issue in development and it slowed down requests on a four-worker, single-threaded setup. Adding to the impact, it even caused deadlocks when a request made a request to the same API endpoint and that internal request was picked up by the same worker 😬. We monkeypatched it to the old behaviour and it's been working for the past few months without issue. If you wanna chat about this one, I'm at RailsConf again this year 👍

To debug this, I turned into a puts debugger and put loads of lines around the 3 operations. Using the test case from @ivanpesin in this comment, I got this when everything worked normally:
And then I ran the same producer of requests again and I encountered the bug. One of the worker processes took the first two requests and handed the third one to the second process 🤔 Here's the different output:
The difference here is that we ran `wait_until_not_full` before the signalled request thread had woken up and decremented `@waiting`.
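To see why the early exit misfires, here's a minimal, self-contained Ruby demo (not Puma code) of the underlying scheduling fact: signalling a condition variable doesn't mean the sleeping thread has run before the signaller's next check:

```ruby
mutex   = Mutex.new
cond    = ConditionVariable.new
waiting = 0

worker = Thread.new do
  mutex.synchronize do
    waiting += 1
    cond.wait(mutex)  # sleep until signalled, then reacquire the mutex
    waiting -= 1      # runs only after the signaller releases the mutex
  end
end

Thread.pass until worker.status == "sleep"  # let the worker reach cond.wait

mutex.synchronize do
  cond.signal
  # The worker is awake but blocked reacquiring the mutex, so a check on
  # `waiting` here still sees the stale value; this is the analogue of the
  # pool's `@waiting > 0` early exit firing before the request thread runs.
  puts "waiting = #{waiting}"  # => waiting = 1
end

worker.join
```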
cc @jasquat if you wanted to try this out
Thanks @josler and @dannyfallon. This does seem to have fixed my issue.
Do you know what effect, if any, this will have on the backlog metric?
👋 This should have a tiny effect on the backlog size, and mainly in a positive way: it should become more accurate. When a running process falls foul of this behaviour, the backlog is inflated by requests the pool never had capacity for. We don't want to take requests off the socket and put them into the thread pool unless we have capacity in the pool, but because of the mutex sequencing in this race condition, caused by threads not waking up fast enough/before the server enters `wait_until_not_full`, we accept more than we can handle. So in a single-threaded worker with a naive request shape (i.e. the request is ready to be served and doesn't get put into the reactor after being added to the pool) I expect that we'll never have a backlog greater than zero.
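To make that concrete, here's a worked example of both checks, with hypothetical local variables standing in for the pool's instance state:

```ruby
# Single-threaded worker: one thread allowed (max) and spawned (spawned).
max, spawned, waiting = 1, 1, 1
todo = [:request_a]  # A was queued; its thread is signalled but not yet awake

# Buggy 3.9.x behaviour: the early exit on `waiting > 0` accepts request B
# even though the only thread hasn't picked up A yet.
puts waiting > 0                          # => true (B is wrongly accepted)

# Restored check: outstanding work vs. spare thread capacity.
puts todo.size - waiting < max - spawned  # => 1 - 1 < 1 - 1 => false (blocks)
```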
Thanks for the work and the explanation! It looks like #1471 provides a test case for this behavior. Can you confirm the issue is gone with this patch?
Ahh, so if there is a free thread to process requests, there shouldn't ever be a request waiting in the backlog. Makes sense. I'm currently using backlog as a proxy for "are we keeping up with the work". It sounds like this won't impact that?
The owner of that issue confirms this PR fixes it in the above comment 👍
Yep, and when the thread pool is full with no available threads, the backlog builds on the socket, not in the process.
Just for clarity, here we're speaking about Puma's `backlog` stat, i.e. the number of requests sitting in the thread pool's queue.

You should continue to monitor the backlog stat because the server process can still end up with a backlog: for example, if the server takes requests off the socket that weren't fully ready they end up in the Reactor, and when they're eventually ready they're handed back to the thread pool; if every thread is busy at that point they sit in the backlog.

As a bit of a tangent on backlog monitoring, I think if you want to know whether you're keeping up with the work you also need to monitor the socket's backlog: if the socket backlog grows you're under capacity, not just when the threadpool(s) have backlogs.
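If it's useful, here's a minimal sketch of watching that stat from inside the server process. It assumes a Puma version where `Puma.stats` is available and returns a JSON string (single mode shown; cluster mode nests per-worker stats):

```ruby
require "json"

# Hedged sketch: poll Puma's stats from a background thread (e.g. started
# from an initializer once the server is booted). Assumes single mode,
# where the payload looks like {"backlog":0,"running":5}.
Thread.new do
  loop do
    stats = JSON.parse(Puma.stats)
    backlog = stats["backlog"].to_i  # requests queued in the thread pool
    warn "puma backlog: #{backlog}" if backlog > 0
    sleep 10
  end
end
```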
In a somewhat desperate attempt for better performance, we put this into production on Monday and it's been working great.
One issue to consider is the "slow client" attack problem. If we're only ever allowing the same number of connections into the reactor as there are threads in the current thread pool, then it will be easy to exhaust that and force the server to become unresponsive.
@schneems, does using nginx mitigate that?
Yes. Nginx has its own reactor that will handle slow clients. However, many people are using Puma over Unicorn explicitly for the slow-client protection (so they get it without also having to run Nginx). We could consider vendoring in Nginx like Passenger does. Also, I think there was a proposal to run Puma via Passenger a while ago. I do not think this is the kind of thing we can just quietly drop support for.
Requests spinning in the Reactor until they're ready don't affect or inflate the thread pool's `@todo` queue, so they don't count against its capacity check.

Absolutely agree with you on slow clients, but I don't believe that is something that either the original #1278 PR or this one is trying to solve? Unless I'm mistaken, this PR doesn't affect any of these client/slow-client interactions; I'm just trying to restore the thread pool's full-capacity check to the one that was in Puma < 3.9.0 because, due to the way Puma puts request threads that are waiting for work to sleep, we can accidentally suck up more requests from the server than we can process after the thread(s) are woken up.
Okay, I've read through a lot of the Puma source code and added docs in #1575 and #1576. So now I think I actually understand what's going on in Puma (in general). At least enough to say more comprehensible things about this PR.

Modifying this does not impact the ability to deal with slow clients at all, yay! Because the Reactor, not the thread pool, is what buffers slow clients, this check only changes how eagerly we pull work off the socket for threads that can actually run it. Based on my newfound understanding of how Puma works, I think this PR should be merged. 🎉🎉🎉

Honestly this is a tangent, but since I'm thinking out loud I'll mention it here: this does present a problem for measuring the backpressure of the server, as with this patch the backlog will always be at or close to zero. I don't think we can get metrics from the TCP socket as to how many requests are waiting. Even if we could, they might not be fully buffered requests, so that wouldn't really be a good indicator anyway. I'll open up another issue.

Thanks for the PR. I'll see what I can do to get this merged.
Thanks a ton for the PR and bearing with me while I figure out how Puma works!!
🙌 Thanks for not leaving this one in purgatory @schneems and for documenting all the gnarly bits you didn't understand while you got enough context - I wish I'd done it myself after my own voyage of discovery too 😄 |
Thought I'd jump in here and note that it is technically possible to get info on waiting requests in the TCP socket through the kernel's own TCP statistics. I've found that monitoring this metric closely provides a very good early warning for potential cascading failures in a production system.
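One way to read that on Linux (an assumption about mechanism, not necessarily the setup described above) is the `TCP_INFO` socket option: for a listening socket, the `tcpi_unacked` field counts connections that have completed the handshake but have not yet been `accept(2)`ed:

```ruby
require "socket"

# Hedged sketch, Linux-only: pull struct tcp_info off a *listening* socket.
def socket_backlog(listener)
  raw = listener.getsockopt(Socket::IPPROTO_TCP, Socket::TCP_INFO).data
  # tcpi_unacked sits at byte offset 24 on x86_64 Linux (seven u8 fields
  # plus a padding/bitfield byte, then four u32s before it). Verify the
  # offset against your kernel's headers before relying on it.
  raw.unpack1("@24L")
end

server = TCPServer.new("127.0.0.1", 9292)
puts socket_backlog(server)  # => 0 until clients connect faster than we accept
```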