_write_to_socket is changing input, which breaks write #962
Comments
Looks like a duplicate of #961, but with better info. I was planning to investigate this first thing in the morning. But if you got a repro, I can try to squeeze a release fix tonight, but it's really late, not sure I'm sharp enough. |
Yikes, let's not super duper rush this. I will work on a PR today so you have something easy peasy to merge first thing in your morning! It will take me a few hours to set this up... a failing test kind of relies on side effects and the speed of write_nonblock ... |
More generally I see how the mutation is a big mistake, there should be a |
I totally get this |
Oh I think I get it now:
That's going to be tricky to reproduce, and worse to test, we'd need something like Toxiproxy. |
Yeah, fixing is kind of easy, testing super hard... but I am willing to give it a bit of a shot... we could cheat with mocking but I really hate that. Also... ideally we can amend this pipeline to not make string copies ... but it may be tricky. |
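One way to exercise partial writes without a real slow socket is a stub IO that only accepts a few bytes per call. The sketch below is an illustration, not the test from the eventual PR: `ChunkedIO` and `write_all` are hypothetical names, and the write loop is a simplified non-mutating stand-in rather than the redis-rb implementation.

```ruby
# Hypothetical stub: accepts at most `chunk` bytes per call, the way a
# socket with a nearly full send buffer would, and records what it received.
class ChunkedIO
  attr_reader :received

  def initialize(chunk)
    @chunk = chunk
    @received = String.new(encoding: Encoding::BINARY)
  end

  def write_nonblock(buffer, exception: false)
    accepted = buffer.byteslice(0, @chunk)
    @received << accepted.b
    accepted.bytesize
  end
end

# Illustrative write loop that never mutates its argument.
def write_all(io, data)
  remaining = data
  total = 0
  until remaining.empty?
    written = io.write_nonblock(remaining, exception: false)
    total += written
    remaining = remaining.byteslice(written..-1)
  end
  total
end

payload = "\u3042" * 10 # 10 characters, 30 bytes in UTF-8
original = payload.dup
io = ChunkedIO.new(4)   # 4 is deliberately not a multiple of 3
written = write_all(io, payload)

puts written
puts io.received == payload.b
puts payload == original
```

A test built this way can assert two things at once: every byte reached the IO in order, and the caller's string is unchanged afterwards. Picking a chunk size that splits multibyte characters is what makes the character/byte confusion visible.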
There is also the discrepancy between slice! and byteslice in play, there may be an edge case there. |
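To make that discrepancy concrete, here is a small sketch (variable names are illustrative): `slice!` counts characters on a UTF-8 string, while `byteslice` always counts bytes, so feeding a byte count from `write_nonblock` into `slice!` can remove too much.

```ruby
# "\u3042" is one character but three bytes in UTF-8.
utf8 = "\u3042" * 4 # 4 characters, 12 bytes

# slice! on a UTF-8 string is character-indexed:
by_chars = utf8.dup
by_chars.slice!(0, 3) # removes 3 characters, i.e. 9 bytes
puts by_chars.bytesize

# byteslice is byte-indexed regardless of encoding:
by_bytes = utf8.byteslice(3..-1) # drops the first 3 bytes only
puts by_bytes.bytesize

# On a binary (ASCII-8BIT) copy, slice! is effectively byte-indexed too:
binary = utf8.b
binary.slice!(0, 3)
puts binary.bytesize
```

So if the socket reports 3 bytes written and the buffer is still UTF-8, `slice!(0, 3)` would silently throw away 6 bytes that were never sent.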
IIRC |
Seems like I did... We need to copy that buffer in |
it's fine if you cast the string to not be
>> "\x03\u3042\xff".b.slice(0,3).bytesize
=> 3
I think that's our fix. I'm almost tempted to release it now, but without a repro it's kind of risky. |
I wonder if there is any way to get by with less copying. I kind of wish there was a write_nonblock that takes offsets. I think under some conditions partial string copies can be really cheap? |
Not as far as I know. The only real optimization in strings is duping a frozen string, that creates a "shared string":
But a slice is always a fully new string:
>> puts ObjectSpace.dump(s[0..40])
{"address":"0x7fc8d3abdc10", "type":"STRING", "class":"0x7fc8d40c7598", "bytesize":41, "value":"foobarbazfoobarbazfoobarbazfoobarbazfooba", "encoding":"UTF-8", "memsize":82, "flags":{"wb_protected":true}}
Anyway, right now we mostly want a quick fix; we can see later if there's some more performance to squeeze. But AFAIK using a mutable binary string as a buffer is the fastest method. |
sure I support this, I can give a shot at a repro here. But I agree from everything I can tell about this that a simple |
I just pushed cbcb700, I'm very tempted to release it. How much worse can it be? |
I support releasing it, I am confident it will improve things. |
Done. I pushed a |
I made this PR that adds a test and moves stuff around a tiny bit. Up to you if you want to merge or rework it. |
So I wrote the following benchmark:

require 'benchmark/ips'

# Fake socket: accepts at most 65_527 bytes per call, so large payloads
# always need more than one write_nonblock call.
class FakeIO
  def write_nonblock(buffer, exception: false)
    raise 'wtf' if buffer.empty?
    [buffer.bytesize, 65_527].min # popular socket buffer size
  end
end

# Current behaviour: copy once with .b, then slice! and byteslice.
# The wait_* branches are never reached with FakeIO.
class CurrentBufferedWriter
  def initialize(io)
    @io = io
  end

  def _write_to_socket(data)
    total_bytes_written = 0
    loop do
      case bytes_written = @io.write_nonblock(data, exception: false)
      when :wait_readable
        unless wait_readable(@write_timeout)
          raise Redis::TimeoutError
        end
      when :wait_writable
        unless wait_writable(@write_timeout)
          raise Redis::TimeoutError
        end
      when nil
        raise Errno::ECONNRESET
      when Integer
        total_bytes_written += bytes_written
        if bytes_written < data.bytesize
          data.slice!(0, bytes_written)
        else
          return total_bytes_written
        end
      end
    end
  end

  def write(data)
    data = data.b
    length = data.bytesize
    total_count = 0
    loop do
      count = _write_to_socket(data)
      total_count += count
      return total_count if total_count >= length
      data = data.byteslice(count..-1)
    end
  end
end

# Copy once into a binary buffer, then trim that private buffer in place.
class MutatingBufferedWriter
  def initialize(io)
    @io = io
  end

  def write(data)
    buffer = data.b
    length = data.bytesize
    total_bytes_written = 0
    loop do
      case bytes_written = @io.write_nonblock(buffer, exception: false)
      when :wait_readable
        unless wait_readable(@write_timeout)
          raise Redis::TimeoutError
        end
      when :wait_writable
        unless wait_writable(@write_timeout)
          raise Redis::TimeoutError
        end
      when nil
        raise Errno::ECONNRESET
      when Integer
        total_bytes_written += bytes_written
        return total_bytes_written if total_bytes_written >= length
        buffer.slice!(0, bytes_written)
      end
    end
  end
end

# Never mutate: take the unsent tail with byteslice after each partial write.
class CopyingBufferedWriter
  def initialize(io)
    @io = io
  end

  def write(data)
    length = data.bytesize
    total_bytes_written = 0
    loop do
      case bytes_written = @io.write_nonblock(data, exception: false)
      when :wait_readable
        unless wait_readable(@write_timeout)
          raise Redis::TimeoutError
        end
      when :wait_writable
        unless wait_writable(@write_timeout)
          raise Redis::TimeoutError
        end
      when nil
        raise Errno::ECONNRESET
      when Integer
        total_bytes_written += bytes_written
        return total_bytes_written if total_bytes_written >= length
        data = data.byteslice(bytes_written..-1)
      end
    end
  end
end

current = CurrentBufferedWriter.new(FakeIO.new)
mutating = MutatingBufferedWriter.new(FakeIO.new)
copying = CopyingBufferedWriter.new(FakeIO.new)

PAYLOAD = ("\u3042" * 32_000).freeze # 96_000 bytes, so two writes per call

Benchmark.ips do |x|
  x.config(:time => 10, :warmup => 2)
  x.report('current') { current.write(PAYLOAD) }
  x.report('mutating') { mutating.write(PAYLOAD) }
  x.report('copying') { copying.write(PAYLOAD) }
  x.compare!
end

On 2.7.1 I get:
The difference is so big that I can hardly believe it. |
Holy cow, turns out byteslice can create shared strings:
Ok, let's go with that. |
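For anyone who wants to check the shared-string behaviour on their own Ruby, a rough sketch using `ObjectSpace.dump` follows. Which flags appear (and whether a given slice is shared at all) is an implementation detail that may differ between Ruby versions, so this only prints the flags rather than assuming a particular result; the variable names are illustrative.

```ruby
require 'objspace'
require 'json'

source = ("foobarbaz" * 100).freeze # 900 bytes, long enough to be heap-allocated

tail = source.byteslice(100..-1)  # slice that runs to the end of the source
middle = source.byteslice(100, 50) # slice that stops before the end

tail_info = JSON.parse(ObjectSpace.dump(tail))
middle_info = JSON.parse(ObjectSpace.dump(middle))

# Inspect the flags; a "shared" entry would mean no byte copy was made.
puts "tail flags:   #{tail_info.fetch('flags', {}).inspect}"
puts "middle flags: #{middle_info.fetch('flags', {}).inspect}"
puts "tail memsize: #{tail_info['memsize']}, bytesize: #{tail_info['bytesize']}"
```

Comparing `memsize` against `bytesize` is another hint: a string that owns its buffer reports a memsize at least as large as its contents.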
@casperisfine wow the copying is so cool, I did not know about byteslice being so efficient! |
This is an incredibly tricky area of Ruby I/O and has some rough edge cases. There is an issue discussing introducing https://bugs.ruby-lang.org/issues/13626 Maybe we should introduce optional timeouts for read/write and other blocking operations. |
I think we can close this now. Thanks all! |
This code mutates the input data it is given:
redis-rb/lib/redis/connection/ruby.rb
Line 87 in d896ae2
And so is this code:
redis-rb/lib/redis/connection/ruby.rb
Line 106 in d896ae2
The simplest fix here is simply to stop the slice! call in _write_to_socket.
This is very urgent @byroot latest release is broken. Ideally we should have some sort of test to catch this.
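As a stripped-down illustration of why in-place trimming is dangerous (this is not the redis-rb code; `send_and_trim` and `send_and_copy` are hypothetical helpers), compare a helper that calls `slice!` on its argument with one that returns the remainder via `byteslice`:

```ruby
# Hypothetical helper that trims its argument in place after a partial write.
def send_and_trim(data, accepted_bytes)
  data.slice!(0, accepted_bytes) # mutates the caller's object
  data
end

# Hypothetical helper that leaves its argument alone and returns the rest.
def send_and_copy(data, accepted_bytes)
  data.byteslice(accepted_bytes..-1)
end

command = String.new("SET key value")
send_and_trim(command, 4)
puts command.inspect # the caller's string has lost its first 4 bytes

command2 = String.new("SET key value")
rest = send_and_copy(command2, 4)
puts command2.inspect # unchanged
puts rest.inspect
```

Any caller that keeps a reference to the original string, or computes offsets from its original length, sees inconsistent data after the first helper runs.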
End result is that if you are lucky, partial things are shipped to redis, if you are unlucky stuff explodes on: