Docx files are corrupted when zipped using RubyZip #449
Comments
I was able to mitigate this issue by using Nokogiri to parse the XML while zipping the files. Here is the gist for what I was able to do: https://gist.github.com/tinabel/ddd5cc9b0dd762986918520a132800d2
I'd love to have a crack at investigating this issue, but I know nothing about Word documents and I'm struggling to reproduce it. What I've tried:
require "zip"
# The second argument (true) creates the archive if it does not already exist.
zipfile = Zip::File.open("/opt/windows/word.zip", true)
zipfile.add("zipped.docx", "/opt/windows/test.docx")
zipfile.close
$ zipinfo /opt/windows/word.zip
Archive: /opt/windows/word.zip
Zip file size: 9683 bytes, number of entries: 1
-rw-r--r-- 5.2 unx 12278 t- defN 20-May-29 18:38 zipped.docx
1 file, 12278 bytes uncompressed, 9563 bytes compressed: 22.1%
$ diff /opt/windows/test.docx /opt/windows/zipped.docx
This all works. I'm using I'm sure I'm missing something important!
I have now tried downloading the gist linked above and running it on a directory of assorted I'm using Ruby 2.7.2 and have repeated this with RubyZip 2.3.x and 3.0 (HEAD). If someone can send me a file that is corrupted when zipped with RubyZip, I'd love to investigate this further. Or is there another way I should be testing this? Otherwise, I wonder whether the files are getting corrupted in another step before RubyZip gets hold of them. I cannot reproduce this issue with the information currently available, sorry.
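For anyone trying to narrow down where the bytes change, one option is to compare a digest of the original file with a digest of the copy extracted from the archive. This is only a minimal sketch using Ruby's standard library; `same_bytes?` is a hypothetical helper name, not part of RubyZip, and the paths in the usage comment are placeholders.

```ruby
require "digest"

# Hypothetical helper (not part of RubyZip): returns true when both files
# contain exactly the same bytes, judged by their SHA-256 digests.
def same_bytes?(path_a, path_b)
  Digest::SHA256.file(path_a).hexdigest == Digest::SHA256.file(path_b).hexdigest
end

# Usage (placeholder paths): compare the original with the extracted copy.
#   same_bytes?("original.docx", "extracted/original.docx")
```

If the digests already differ before the file reaches RubyZip, the corruption is probably happening in an earlier step.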
@hainesr I'd open the discussions tab on the repo, and move this issue there.
Hi there -- I'm having an issue with docx file headers being corrupted whenever I zip them up using RubyZip. I've tried to use the write_buffer solution, but I'm also zipping files recursively and it's not quite working right. Will there be any solution for docx that doesn't involve using write_buffer?
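Since a .docx file is itself a ZIP container, an intact one normally starts with the ZIP local file header signature `PK\x03\x04`. A quick way to see whether the header of a given file has been damaged is to read its first four bytes in binary mode. This is a rough sketch only; `zip_signature?` is a hypothetical helper name, and passing this check does not prove the rest of the file is valid.

```ruby
# Signature bytes at the start of a typical non-empty ZIP archive.
ZIP_MAGIC = "PK\x03\x04".b

# Hypothetical helper: returns true when the file begins with the ZIP
# signature. Reads in binary mode so no newline or encoding conversion
# can alter the bytes being checked.
def zip_signature?(path)
  File.binread(path, 4) == ZIP_MAGIC
end
```

Running this on the .docx both before zipping and after extracting it shows at which step the header changes, if it changes at all.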