Requests to Puma hanging due to issues with Keep-Alive, Content-Length, and HTTP_VERSION headers #1565
Comments
This may be related to the fact that the request has both a `Version` header and an HTTP version in the request line.
I have confirmed that this issue is avoided if the request does not include a `Version` header.
Interesting bug. `HTTP_VERSION` shouldn't be used to carry the HTTP version (it's not required by the spec, at least). However Puma, Unicorn and Thin (at least) all set it, and I bet that some middlewares rely on it. Good luck 👍
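To illustrate the kind of dependency the comment above is worried about, here is a hypothetical middleware sketch (my own, not from this thread; the class name and header are invented) that silently relies on the non-spec `HTTP_VERSION` key being present:

```ruby
# Hypothetical Rack middleware showing how downstream code can come to
# depend on servers setting the non-spec HTTP_VERSION env key.
class VersionTaggingMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    status, headers, body = @app.call(env)
    # Falls back to HTTP/1.0 when the key is absent.
    headers["X-Seen-Protocol"] = env["HTTP_VERSION"] || "HTTP/1.0"
    [status, headers, body]
  end
end
```

If every major server sets the key, code like this works everywhere and nobody notices the dependency until a server stops setting it.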
It comes from here, set via this, and lives in the Ragel grammar here and here. As part of the CGI header convention, every client-provided header gets transformed to uppercase, has its dashes replaced with underscores, and is prefixed with `HTTP_`. The problem is that we're already setting `HTTP_VERSION` ourselves, so a client-supplied `Version:` header ends up merged with it. Though it's not in the Rack spec, this is also a problem in Rack itself (rack/rack#970). We can fix Puma, and we can open PRs on Thin/Unicorn too, but I'd love some thoughts from the maintainers of Puma on it before I do.
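A minimal sketch of the mechanism described above (my own illustration, not Puma's actual parser — the helper names are invented): the CGI-style key mapping means a client-sent `Version:` header lands on the same env key the server already set from the request line, and duplicate header values get comma-joined.

```ruby
# CGI-style header mapping: upcase, dashes to underscores, HTTP_ prefix.
def rack_env_key(header_name)
  "HTTP_" + header_name.upcase.tr("-", "_")
end

# Duplicate keys are comma-joined, as HTTP permits for repeatable headers.
def add_header(env, name, value)
  key = rack_env_key(name)
  env[key] = env.key?(key) ? "#{env[key]}, #{value}" : value
end

env = { "HTTP_VERSION" => "HTTP/1.1" }  # server already set this from the request line
add_header(env, "Version", "HTTP/1.1") # client also sent a literal Version: header
env["HTTP_VERSION"]                    # => "HTTP/1.1, HTTP/1.1"
```

Which is exactly the doubled value reported in the issue below.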
Hi everyone, is there any news on this subject? Can we help? :)
Adding some research here from my exploration... There may be some interaction with chunked encodings, as was the case for this issue with node.js: nodejs/node-v0.x-archive#940. I did try playing with the chunked header, but it did not appear to help the `ab -k` numbers, at least on a JRuby + Rails setup. As another really weird data point, @noahgibbs's "RSB" benchmarks seem to work just fine with `ab -k` in the rack benchmark app:
Throwing it at a generated Rails app has the hanging behavior. I'm not sure what's different between these.
And for completeness here's ab -k against a simple generated Rails app:
I have a (maybe?) JRuby-specific variant on this problem to report. When I run a very simple Rack server and use the wrk testing tool (https://github.com/wg/wrk), I get 13335 reqs/sec with CRuby 2.6, but 246.65 reqs/sec with JRuby 9.2.5.0. One way to repro this bug is to start from public AMI ami-088f9a7191b2befa7. After creating an instance based on it, log in as the "ubuntu" user. You'll have RVM and Ruby available. There's a trivial Rack "Hello, world" app in ~ubuntu/config.ru. From there:

```
rvm use jruby-9.2.5.0
puma
# on a different login, or after backgrounding Puma
wrk -d 5 http://localhost:9292
```

If you substitute "rvm use 2.6.0" for "rvm use jruby-9.2.5.0" you should get around 13,000 reqs/sec. With JRuby it's reliably between 247 reqs/sec and 250 reqs/sec. The AMI isn't doing anything terribly complicated. It builds its own "wrk" binary because wrk isn't reliably packaged, but it's using normal Ubuntu and RVM.
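For reference, a trivial "Hello, world" `config.ru` of the kind the comment describes (my own reconstruction; the AMI's exact file may differ) looks like this:

```ruby
# config.ru -- minimal Rack "Hello, world" app; run with `puma` (or
# `rackup`) in the same directory.
app = lambda do |env|
  [200, { "Content-Type" => "text/plain" }, ["Hello, world"]]
end

# `run` is only defined when this file is loaded by a Rack builder.
run app if respond_to?(:run)
```

Because the app does no real work, benchmarks against it measure almost pure server and protocol overhead, which is why the keep-alive behaviour dominates the numbers.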
Just FYI, in Rack 2.1 we use
Turned out this was: GHSA-7xx3-m584-x994 |
I found this issue when debugging Puma's performance curve (https://github.com/socketry/falcon-benchmark) - I live streamed the investigation here: https://www.youtube.com/watch?v=2u6JRvKh7Dg - at the time I didn't really realise it was a serious issue. So, sorry about that.

Basically, when you look at Puma's "latency" in response to changes in concurrency, any server with a fixed pool size should show a linear increase in latency as the number of simultaneous requests increases. If you have 8 threads, and you make 8 requests that take 10ms each, the same configuration with 16 requests should mean the first 8 requests take 10ms each, and the 2nd 8 requests must wait a further 10ms.

For the second batch of connections, the request was sent but no response was received. Unfortunately, when you look at this on a graph (which you can see I was making on falcon-benchmark), where you'd normally expect latency to increase as the number of concurrent connections increases, it stayed the same for Puma, which was super confusing to me. If you throw 8 connections at Puma and it has 8 threads, you expect the latency to be equal to the latency of the request itself. But if you throw more connections at Puma, you will encounter queuing latency too, and that wasn't showing up. I made a fork of
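The expected behaviour described above can be written down as a toy model (my own arithmetic, not from the benchmark): with a fixed pool of `threads` workers and a fixed `service_time` per request, requests are served in batches, so worst-case latency should grow linearly with concurrency.

```ruby
# Toy queuing model for a fixed-size thread pool: requests are served in
# batches of `threads`, so the slowest request waits for all the batches
# ahead of it. A server whose measured latency stays flat as concurrency
# grows is not answering the queued connections at all.
def worst_case_latency(concurrency, threads, service_time)
  batches = (concurrency.to_f / threads).ceil
  batches * service_time
end

worst_case_latency(8, 8, 10)   # => 10  (one full batch)
worst_case_latency(16, 8, 10)  # => 20  (second batch waits for the first)
```

Puma's flat curve in the benchmark is what made the unanswered second batch visible.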
Neither did we! So don't worry about it. Thanks for the writeup.
Issue
I have a situation in which requests to a Puma server are hanging when the `Connection: Keep-Alive` header is passed. I noticed that the response Puma was returning to the client didn't contain a `Content-Length`, and also didn't have chunked transfer encoding. I don't believe that's a valid response, so that's the first issue.

Digging a bit deeper into Puma's source, I noticed a strange thing on line 672 of `server.rb`. Even though the `Version` header contained `HTTP/1.1`, somehow the request on the Puma side showed the value `HTTP/1.1, HTTP/1.1`. See the image:

Since line 672 of `server.rb` checks for exact equality (`http_11 = if env[HTTP_VERSION] == HTTP_11`), it doesn't believe the request is HTTP/1.1. It then goes on to respond without either a chunked transfer encoding or a content length, which I believe is invalid. Unfortunately I don't know how to dig deeper to figure out where the "doubled" version is coming from. When I use Wireshark to look at the data on the wire, I don't see the doubling of the HTTP version.

Here's the request:
Here's the response. Notice that the response is HTTP/1.0, the "Connection: Keep-Alive" is still there, but there's no Content-Length.
Sorry I don't have an easy way to reproduce this, but I've tried to include as much information as I can. There seem to be two issues on Puma's side:

1. The response carries neither a `Content-Length` header nor chunked transfer encoding, which doesn't look like a valid Keep-Alive response.
2. The request's `HTTP_VERSION` shows a doubled value (`HTTP/1.1, HTTP/1.1`), which defeats the HTTP/1.1 check in `server.rb`.
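A sketch of the failure mode described in this report (my simplified reading, not Puma's actual code; `framing_for` is an invented helper): the doubled value fails the exact `==` comparison, so the server falls back to HTTP/1.0-style framing with neither a `Content-Length` nor chunked encoding.

```ruby
HTTP_11 = "HTTP/1.1"

# Simplified version detection: an exact equality check, as in the
# server.rb line quoted above. A comma-joined value fails the check.
def http_11?(env)
  env["HTTP_VERSION"] == HTTP_11
end

# How the body's framing would be chosen under that check.
def framing_for(env, headers)
  if http_11?(env)
    headers.key?("Content-Length") ? :content_length : :chunked
  else
    # HTTP/1.0 fallback: no chunked encoding. Without a Content-Length the
    # client can only detect the end of the body when the connection
    # closes -- which never happens under Keep-Alive, hence the hang.
    headers.key?("Content-Length") ? :content_length : :close_delimited
  end
end

framing_for({ "HTTP_VERSION" => "HTTP/1.1, HTTP/1.1" }, {})  # => :close_delimited
```

This matches the observed symptoms: an HTTP/1.0 response, no `Content-Length`, and a client waiting forever on a kept-alive connection.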
I can work around this issue by not passing `Connection: Keep-Alive` to the server, but it seems like there's an underlying issue in Puma that should probably be addressed.

System configuration
Ruby version: 2.3.5
Rails version: 4.2.10
Puma version: 3.11.4