Puma::IOBuffer can be more efficient and used in more places #2457
Comments
Certainly! For inclusion, we'll be looking at changes in results in the |
I rewrote the With this in mind, I changed it back to the original implementation but:
On the |
Anyone ever try the NIO::ByteBuffer class?
Linking the original removal of the C extension here: #1980
@MSP-Greg I haven't used the NIO class in particular, but this kind of buffer/string builder was what inspired the issue in the first place. As I mentioned above, I could not get the Ruby implementation to keep up with the C appending functions for

@nateberkopec Interesting to see the discussion on the other topic. For some reason it got into my head that this was pretty old code, but it's only been a year or so. In the discussions it's mentioned that we have to support all the way back to Ruby 2.2. Is there a list somewhere of supported Rubies? I'm not advocating we ruthlessly drop support, but I do want to point out that even 2.4 is no longer supported by the core Ruby team, and supporting older versions implicitly encourages people to run potentially insecure Ruby versions.

After a good night's sleep I had some more ideas and will do some extra tests later this week.

EDIT: After some more digging, it seems that TCP_NODELAY is NOT set by default for most apps, since even though
The test suite is the spec - if it ain't in the test suite, we don't support it. We'll investigate the Ruby support issue again when we start thinking about Puma 6.0, but that is at least 1 month away.
In terms of the supported version: Line 30 in b127d4c
As Nate mentioned, CI gives a good idea, as it may show platform info that can't be placed in the gemspec. Sorry for mentioning NIO::ByteBuffer. At present, string operations aren't considered a bottleneck in request processing. As to |
I finally got around to benchmarking a few more variants:
That leaves the few % to be gained from optimizing how the underlying String for the IOBuffer is initialized. Since Ruby 2.2 is still supported, that won't work either, because setting the initial capacity was only added later. I'll just add it to my long-term todo list and come back in a few years 🙂
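For reference, the "initial capacity" idea could look roughly like the sketch below. It assumes the `capacity:` keyword of `String.new`, which to my knowledge arrived in Ruby 2.4; the helper name `new_header_buffer` and the 512-byte default are made up for illustration and are not part of Puma.

```ruby
# Hypothetical helper, not Puma code: preallocate the header buffer when the
# running Ruby supports String.new(capacity:), otherwise fall back to a plain
# empty String so older Rubies keep working.
CAPACITY_SUPPORTED = Gem::Version.new(RUBY_VERSION) >= Gem::Version.new("2.4")

def new_header_buffer(capacity = 512) # 512 is an arbitrary guess
  CAPACITY_SUPPORTED ? String.new(capacity: capacity) : String.new
end

buf = new_header_buffer
buf << "HTTP/1.1 200 OK\r\n"
buf << "Content-Type: text/plain\r\n"
```

The capacity is only a hint; the String still grows normally if the headers turn out to be larger than the guess.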
I'll mark as "6.0" since we'll consider dropping 2.2 support at that time.
Benchmark shows the current append implementation is slower in
|
In Passenger we use writev() to perform a gathered write. This way we don't even have to allocate that final string. Not sure whether Ruby core has support for writev nowadays. In Passenger we wrote our own support for it, through a native extension.
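On the Ruby core question: as far as I know, IO#write has accepted multiple arguments since Ruby 2.5, and the interpreter may turn that into a writev(2) call where the platform supports it. Whether writev is actually used is an implementation detail I would not rely on. A minimal sketch using a pipe in place of a client socket:

```ruby
# Sketch only: a pipe stands in for the client socket.
# Assumes Ruby >= 2.5, where IO#write takes several strings at once.
r, w = IO.pipe

# No joined String is built on the Ruby side; the pieces are passed as-is.
written = w.write("HTTP/1.1 200 OK\r\n", "Content-Length: 2\r\n", "\r\n", "ok")
w.close

data = r.read
r.close
```

`written` is the total number of bytes written across all arguments.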
Thanks for the post.
Given that, and also issues with encoding and bodies that need Puma to support 'chunking', I personally thought using a Again personally, I think a lot of code re IO and sockets was developed when memory and bandwidth were much lower than they are today. @wjordan and others are looking at other options, like |
Yes. I'll add to the comments
At the moment, Puma::IOBuffer is just a subclass of String, with the append method overridden and reset aliased to clear. Strangely, the overridden append seems to implement the same behavior as a normal String, but that is not important for this issue.

The main use of IOBuffer is during request serving, where it is used to accumulate lines for the response status line and headers. During the course of a request, several lines are appended to the empty string, and in the end they are all fed to fast_write in a single call. Because a String in Ruby is implemented as a dynamic array of bytes representing codepoints, these repeated additions will cause the underlying array to be doubled in size and copied whenever the String requires more space. Because the initial size is very low and all the headers together can easily be a few hundred bytes, this doubling occurs several times, incurring allocations and copying each time. It's still amortized O(1), but the constant is higher than it could be.

A simple way to prevent this would be to change the IOBuffer class so that it maintains a total_size along with an array of Strings that have been added so far. At the end, when to_s is called on it, it allocates a single String with the correct capacity straight away and copies the entire array of strings into that.

The early hints construction in str_early_hints could probably also benefit from being buffered this way. I can create a PR if desired?
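To make the proposal concrete, here is a rough sketch of the chunk-collecting buffer described above. The class name `ChunkedIOBuffer` is hypothetical and this is not Puma's actual IOBuffer; it also assumes `String.new(capacity:)`, which as far as I know needs Ruby 2.4 or newer.

```ruby
# Hypothetical sketch of the proposed buffer, not Puma's implementation.
class ChunkedIOBuffer
  def initialize
    @chunks = []      # strings appended so far
    @total_size = 0   # running byte count of all chunks
  end

  # Record each piece without copying it into a growing String.
  def append(*strs)
    strs.each do |s|
      s = s.to_s
      @chunks << s
      @total_size += s.bytesize
    end
    self
  end
  alias_method :<<, :append

  def reset
    @chunks.clear
    @total_size = 0
    self
  end

  # Allocate the final String once, sized for everything collected so far,
  # then copy the chunks into it.
  def to_s
    out = String.new(capacity: @total_size) # assumes Ruby >= 2.4
    @chunks.each { |chunk| out << chunk }
    out
  end
end

buf = ChunkedIOBuffer.new
buf.append("HTTP/1.1 200 OK\r\n")
buf.append("Content-Type: ", "text/plain", "\r\n")
buf << "\r\n"
response_head = buf.to_s
```

The trade-off is one extra Array plus a reference per chunk, in exchange for a single allocation of the final String. Whether that is a net win would need to be measured against the current String-based version.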