Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

open_buffer corrupts the buffer it's given #512

Closed
Taywee opened this issue Nov 29, 2021 · 2 comments
Closed

open_buffer corrupts the buffer it's given #512

Taywee opened this issue Nov 29, 2021 · 2 comments
Assignees
Milestone

Comments

@Taywee
Copy link

Taywee commented Nov 29, 2021

This is maybe related to #280, but in my case, the archive doesn't just have its order changed, but is completely corrupted. Here is a complete example, including the archive embedded (Just a simple zip with a file foo.txt with the contents "FOO!\n" and bar.txt with the contents "BAR?\n"):

#!/usr/bin/env ruby

require 'zip'

archive = "PK\x03\x04\n\x00\x02\x00\x00\x00\x8Cr}S\x9F$j\xC5\x05\x00\x00\x00\x05\x00\x00\x00\a\x00\x1C\x00foo.txtUT\t\x00\x03\x17D\xA5a'D\xA5aux\v\x00\x01\x04\xE8\x03\x00\x00\x04\xE8\x03\x00\x00FOO!\nPK\x03\x04\n\x00\x02\x00\x00\x00\x90r}S\x90\xFA\x8A\x10\x05\x00\x00\x00\x05\x00\x00\x00\a\x00\x1C\x00bar.txtUT\t\x00\x03 D\xA5a'D\xA5aux\v\x00\x01\x04\xE8\x03\x00\x00\x04\xE8\x03\x00\x00BAR?\nPK\x01\x02\x1E\x03\n\x00\x02\x00\x00\x00\x8Cr}S\x9F$j\xC5\x05\x00\x00\x00\x05\x00\x00\x00\a\x00\x18\x00\x00\x00\x00\x00\x01\x00\x00\x00\xA4\x81\x00\x00\x00\x00foo.txtUT\x05\x00\x03\x17D\xA5aux\v\x00\x01\x04\xE8\x03\x00\x00\x04\xE8\x03\x00\x00PK\x01\x02\x1E\x03\n\x00\x02\x00\x00\x00\x90r}S\x90\xFA\x8A\x10\x05\x00\x00\x00\x05\x00\x00\x00\a\x00\x18\x00\x00\x00\x00\x00\x01\x00\x00\x00\xA4\x81F\x00\x00\x00bar.txtUT\x05\x00\x03 D\xA5aux\v\x00\x01\x04\xE8\x03\x00\x00\x04\xE8\x03\x00\x00PK\x05\x06\x00\x00\x00\x00\x02\x00\x02\x00\x9A\x00\x00\x00\x8C\x00\x00\x00\x00\x00"

Zip::File.open_buffer(archive) do |archive|
  raise RuntimeError.new unless archive.read('bar.txt') == "BAR?\n"
end

puts 'First read worked'

Zip::File.open_buffer(archive) do |archive|
  raise RuntimeError.new unless archive.read('bar.txt') == "BAR?\n"
end

puts 'Second read worked'

This fails on attempting to read the second time, because the buffer has been corrupted:

$ ./corrupt.rb
First read worked
/usr/lib64/ruby/gems/2.6.0/gems/rubyzip-2.3.2/lib/zip/file.rb:406:in `get_entry': No such file or directory - bar.txt (Errno::ENOENT)
	from /usr/lib64/ruby/gems/2.6.0/gems/rubyzip-2.3.2/lib/zip/file.rb:259:in `get_input_stream'
	from /usr/lib64/ruby/gems/2.6.0/gems/rubyzip-2.3.2/lib/zip/file.rb:295:in `read'
	from corrupt.rb:14:in `block in <main>'
	from /usr/lib64/ruby/gems/2.6.0/gems/rubyzip-2.3.2/lib/zip/file.rb:156:in `open_buffer'
	from corrupt.rb:13:in `<main>'

I'm not against the buffer being rewritten for whatever purpose, and I know that you can avoid this by either duplicating the buffer or freezing it to cause the write to fail, but it definitely shouldn't be corrupted and unable to be treated as an equivalent zip buffer after opening, especially when no changes have been made. In my production use-case, I got a zip file that was unable to be read at all, and I didn't even know it was corrupt until long after the initial zip read, when it had been stored in a database and later retrieved. I'm currently working around this by just freezing the buffer, which is an effective solution for me.

@hainesr hainesr self-assigned this Nov 30, 2021
@hainesr hainesr added the bug label Nov 30, 2021
@hainesr hainesr added this to the 3.0 milestone Nov 30, 2021
@hainesr
Copy link
Member

hainesr commented Nov 30, 2021

Hi @Taywee, many thanks for this, and for your comprehensive report.

This is a bug. Just reading a buffer shouldn't rewrite anything, and even if it did, if nothing changes it should just write out the same data. So maybe it's two bugs.

Stopping it changing the buffer if there's no change to the archive is easy enough, so I'll get that sorted ASAP.

@hainesr
Copy link
Member

hainesr commented Nov 30, 2021

Right, this is basically fixed in master now. The buffer is no longer rewritten if nothing in the archive changes.

I'm going to leave this issue open though, because I want to debug why the data was changing even if nothing in the archive had changed. It's going to take a bit more investigation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants