Accepting more connections than pool availability #1453
Comments
Any chance that this issue is behind #1405?
Update: I played with this a bit with nginx and JMeter (fast and slow clients), tweaking when Puma decides to accept new connections. After being out of the weeds for a bit, the thoughts that remain:

A Reactor refactor that treats these differently comes to mind (slight 👀 regarding #1403 (comment)):
I don't think I can be of much help here, but if I may ask: what do you mean by a connection being "not ready yet"? I'm reading https://github.com/puma/puma/blob/master/docs/architecture.md and I don't see how any connection picked up by a worker thread can be "not ready".
@jf great question (and it was mine too before digging into this). The high-level explanation of "not ready" is that the HTTP request has not come over the wire completely yet. In other words, the connection is established, but we can't start work on the request (hand it off to the web application) because we haven't read all of the request's bytes yet. There are many scenarios, some where this probably won't happen. This is over-simplified since there are a lot of variables here (socket type, nginx config, etc.), but that's a high-level explanation of my understanding.
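To make that concrete, here's a minimal, stand-alone sketch (not Puma's actual code; the port and buffer size are arbitrary) of what "not ready" looks like at the socket level: the connection is accepted, but a non-blocking read returns only part of the request, or nothing at all.

```ruby
require "socket"

# Minimal sketch, not Puma's actual code: accept a connection, then try a
# non-blocking read. With a slow client the full request may not have
# arrived yet -- that connection is "not ready" and can't be handed to the
# app until the rest of the bytes show up.
server = TCPServer.new(9292)            # arbitrary port for illustration
client = server.accept

begin
  chunk = client.read_nonblock(16_384)
  if chunk.include?("\r\n\r\n")         # naive "end of headers" check, illustration only
    puts "request fully buffered -- ready to hand off to the app"
  else
    puts "partial request (#{chunk.bytesize} bytes) -- not ready yet"
  end
rescue IO::WaitReadable
  puts "connection accepted, but no bytes readable yet -- not ready"
end
```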
@eprothro last time I checked (2–3 minor versions ago), Puma performs both reads and writes from the socket in worker threads, so this is also work for them. Slow clients emitting the request bytes slowly will, one way or another, fill the waiting socket's buffer and increase pool contention as much as ready requests do. In most deployment scenarios in the wild this won't have much impact, as most standard reverse proxy configs just buffer the whole request and dump it to the back-end in one or two packets, but in the exceptional cases I would not see the benefit of the accept-then-queue approach.
@HoneyryderChuck I remember thinking that too and was surprised when I contrived a local scenario that proved me wrong. From the first comment:
So if this is unchanged, and I was right to begin with (neither guaranteed!), slow clients may be able to fill the reactor with more requests than the pool size. My instinct was/is the same as yours: this is probably not a significant problem in the real world. On the other hand, I and others have seen some failure profiles that make me scratch my head...
Fixed in #2079.
#1278 was a fantastic improvement, allowing back pressure to the client via the socket backlog. Failure modes are much more graceful now with Puma. Thank you!
However, recently I was surprised to learn that accepted connections are placed in the thread pool work queue prior to getting placed "in" the Reactor (which would then put them back in the work queue when ready).

I assume this is because in the common case and/or with a fast client (e.g. certain reverse proxy configurations), connections will be ready for read immediately and not need the trip to the Reactor and back into the work queue. This makes sense.
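For illustration, here is a rough sketch of that round trip with hypothetical names (plain queues and threads standing in for Puma's ThreadPool and Reactor); it is not Puma's implementation, just the shape of the accept → work queue → reactor → work queue flow described above.

```ruby
require "socket"

# Rough sketch of the flow described above, with plain Queues standing in
# for Puma's ThreadPool and Reactor (hypothetical structure, not Puma's
# actual implementation).
work_queue = Queue.new
reactor_in = Queue.new

# Accept loop: connections go straight onto the work queue, with no
# readiness check at accept time.
acceptor = Thread.new do
  server = TCPServer.new(9292)
  loop { work_queue << server.accept }
end

# Single worker thread: if the request isn't readable yet, park the
# connection in the "reactor" and move on -- the thread is free again,
# so the server happily accepts more connections.
worker = Thread.new do
  loop do
    conn = work_queue.pop
    if IO.select([conn], nil, nil, 0)          # bytes already readable?
      conn.readpartial(16_384)                 # "do the work" (grossly simplified)
      conn.write("HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n")
      conn.close
    else
      reactor_in << conn                       # not ready: off to the reactor
    end
  end
end

# Reactor stand-in: wait until a parked connection becomes readable, then
# put it back on the work queue.
reactor = Thread.new do
  loop do
    conn = reactor_in.pop
    IO.select([conn])
    work_queue << conn
  end
end

[acceptor, worker, reactor].each(&:join)
```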
However, with the current logic this means that the Server can still suck up more requests than its pool has actual availability for, especially for large requests or slow clients.

Imagine the scenario below with 1 worker and 1 thread for simplicity of example (it applies just as well to multi-process and/or multi-thread):

- a connection is accepted and added to the work queue
- the thread picks it up, finds the request is not yet fully readable, so the connection is handed to the Reactor and the thread finishes

This will continue accepting connections without limit, until the thread(s) are all busy with "real" work.
I verified this experimentally with a slow client (JMeter with low httpclient.socket.http.cps), and watched requests get sucked up continuously until enough reads completed and all workers were busy generating/writing responses.

I don't think this is an enormous problem, just wanted to mention it since the goal as of #1278 is to not suck up more requests than intended.
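For anyone wanting to reproduce without JMeter, something like the following slow client (port and pacing are assumptions; any small numbers will do) exercises the same behavior: the connection is accepted long before the request is fully readable.

```ruby
require "socket"

# A stand-in for the JMeter setup above: a client that dribbles out an
# ordinary GET request a few bytes at a time, so the server accepts the
# connection well before the request is fully readable.
request = "GET / HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n"

socket = TCPSocket.new("127.0.0.1", 9292)
request.each_char.each_slice(4) do |chunk|
  socket.write(chunk.join)
  sleep 0.25                     # roughly 16 bytes per second
end
puts socket.read                 # the response, once the server finally has the full request
socket.close
```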
One potential solution would be:

- a Reactor method returning the count of connections being monitored (+= currently makes these changes opaque to @sockets, here)
- considering that count, in addition to @waiting, when determining when to accept connections

However, after prototyping with something like that for a bit, I'm realizing that keep-alive functionality is coupled to this "overcommitting" of the worker pool, so "fixing" this is likely a significant refactor.
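A rough sketch of what that might look like, with hypothetical method and variable names (this is not Puma's real API): the reactor exposes how many connections it is monitoring, and the accept decision counts those connections as committed work alongside the queue, rather than looking at idle threads alone.

```ruby
# Hypothetical names throughout -- a sketch of the idea, not Puma's API.
class SketchReactor
  def initialize
    @sockets = []
    @mutex   = Mutex.new
  end

  def add(conn)
    @mutex.synchronize { @sockets << conn }
  end

  # Proposed addition: how many connections are parked here, each of which
  # will come back as work once it becomes readable.
  def monitored_count
    @mutex.synchronize { @sockets.size }
  end
end

# Accept a new connection only if idle threads outnumber the work already
# committed: queued work plus connections sitting in the reactor.
def can_accept?(idle_threads, queued_work, reactor)
  idle_threads > queued_work + reactor.monitored_count
end

reactor = SketchReactor.new
puts can_accept?(1, 0, reactor)   # => true: one free thread, nothing pending
reactor.add(:slow_connection)
puts can_accept?(1, 0, reactor)   # => false: that free thread is already spoken for
```

The catch, as noted above, is that keep-alive connections also live in this "accepted but not currently being worked" state, so a strict capacity check like this would need to distinguish them from genuinely pending requests.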