Archive with 6 entries only yields 1 #510
Comments
I noticed that |
Hi @mttkay, thanks for this. I have no idea what is going on with the

For the TL;DR: I think you're seeing #493, which is fixed but not released yet. But even with that fix, you can't read files created by Archive with

Archive doesn't construct Zip archives like most tools out there do. It streams files through its compressor and directly to disk in place, and then doesn't go back and fill in local header information for each file. This is fine in its own way, and is allowed for in the spec, but it's really for when you can't seek around in a file (i.e. when you really are streaming). The result is that there's no information in the local header about how big the compressed data is, so you can't decompress it using the local header alone. You need the Central Directory to fill in the gaps, because you do know how big the compressed data is for each entry by the time you get to writing out the CD at the end. So you can't use

Aside: Archive gives us headaches all the time, and we keep seeing new ones. It's amazing how much effort is wasted as a knock-on effect of one (unfortunately wildly popular) tool taking a lax attitude towards implementing standards, but we are where we are. Think Different I guess? Rant over 😆

Sorry that was all a bit rambling, but hopefully makes some sense. |
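To make the local header point concrete, here is a minimal sketch in plain Ruby (no rubyzip involved) that reads the fixed 30-byte part of a local file header and reports whether general purpose bit 3 is set, which is how the Zip spec marks an entry whose sizes are deferred to a trailing data descriptor. The method name `inspect_local_header` and the sample bytes are made up for illustration; the header is hand-built, not taken from the archive in this issue.

```ruby
# Fixed part of a ZIP local file header, little-endian:
# signature(4) version(2) flags(2) method(2) mtime(2) mdate(2)
# crc32(4) compressed_size(4) uncompressed_size(4) name_len(2) extra_len(2)
LOCAL_HEADER_FORMAT = 'VvvvvvVVVvv'
LOCAL_HEADER_SIGNATURE = 0x04034b50
DATA_DESCRIPTOR_FLAG = 0x0008 # general purpose bit 3

# Hypothetical helper: returns the entry name, whether the sizes were
# deferred to a data descriptor, and the compressed size as recorded
# in the local header.
def inspect_local_header(bytes)
  fields = bytes.unpack(LOCAL_HEADER_FORMAT)
  raise ArgumentError, 'not a local file header' unless fields[0] == LOCAL_HEADER_SIGNATURE

  flags = fields[2]
  name_length = fields[9]
  {
    name: bytes.byteslice(30, name_length),
    sizes_deferred: (flags & DATA_DESCRIPTOR_FLAG) != 0,
    compressed_size: fields[7]
  }
end

# Hand-built header for a streamed entry: bit 3 set, CRC and sizes zero.
streamed_header = [LOCAL_HEADER_SIGNATURE, 20, DATA_DESCRIPTOR_FLAG, 8,
                   0, 0, 0, 0, 0, 5, 0].pack(LOCAL_HEADER_FORMAT) + 'a.txt'

p inspect_local_header(streamed_header)
```

When `sizes_deferred` is true and `compressed_size` is zero, a reader that only walks local headers has no way of knowing where the entry's data ends without decompressing it or consulting the Central Directory.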
I remember that

Thanks for the detailed explanation -- that does sound difficult to deal with. 🤔 Happy to close this as a duplicate, though I'm still not sure whether the two issues merely coincided or are separate issues altogether.

If it helps at all, here is the MR on GitLab where the issue was first detected by the reviewer: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/74525

Here is a link to the archive they used for testing: https://gitlab.com/gitlab-org/gitlab/uploads/167723f76729d7a2e6342f6bbae2a786/Archive.zip

It could be interesting to run the (unreleased)
I'm saying this because with that same file created by |
Apologies for not coming back to you on this for so long. Thank you again for such a detailed write-up. I also can't reproduce this locally on Linux. I'll drop the zip file you link to above into a branch and push it through CI and see what happens. In the meantime, the interesting bit of that zipinfo output you link to is this:
Which is printed for each entry, except the first one. I have no idea what is going on there yet... |
No worries at all, it's not urgent. We have mitigated the problem by taking a product related trade-off and applying stricter limits. We are also looking to potentially move some of the heavier tasks of this kind into a Golang binary that we can shell out to, but that is just a proposal at this point. |
I don't have a Mac so CI will have to do...
I can't seem to reproduce this in CI. The below are links into the MacOS runs where I've printed out the
Unless I've mis-interpreted the issue you describe above - please let me know if so. Thank you for your help in getting to the bottom of this! |
Yes that looks correct, very odd 🤔 It was tough for me to investigate because I do not have access to a mac myself. Thank you for adding this test; this is great. I just realized I did not specify in the issue description that GitLab is still on RubyZip 2.0.0, so not the latest version. When first investigating the issue where this popped up I remember running against the latest version on my machine to rule out any differences due to version drift, but the person who stumbled on this on their Mac during a code review did not IIRC. I'm happy to close this out if this isn't reproducible. We can always investigate on our end again should it occur again, and re-file with a reproducible test case if necessary. It's always tough to tackle issues that have no reproducible test case attached. |
Ah, OK, I've now moved that commit onto the
Those lines show that So it looks like I can't reproduce this at the moment, sorry. |
If we have passing tests and no test case to reproduce this issue, I am fine with just closing this out. We can always re-open or re-file should we find out how to actually reproduce this reliably. |
Thanks @mttkay, I agree. I'll close this for now and we can come back to it if we get a reproducible example. |
I'm puzzled by this behavior and can't make sense of it: we have an archive with 6 files that was created with the `Archive` tool on macOS:

It was just put together for testing. While working on #506, we found that the EOCD entry count for this archive is completely off, by several orders of magnitude (the field `total_number_of_entries_in_cdir_on_this_disk`), but only when running rubyzip over it on macOS (the count is correct on Linux).

Moreover, when iterating this archive via `Zip::InputStream`, it only ever yields a single entry, and this happens on Linux, too (maybe these problems are unrelated):

This prints 1.

When inspecting which entry it picked up:

Why is only a single entry being yielded and all other files are ignored? Does anything stand out about this archive? Here is what `zipinfo -v` has to say about it: https://gist.github.com/mttkay/9732ad4bb2d204716b15e04ecf880378
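For cross-checking the EOCD entry count independently of rubyzip, a rough sketch in plain Ruby might look like the following. It searches backwards for the End of Central Directory signature and unpacks the two 16-bit entry counts. The method name `eocd_entry_counts` and the sample bytes are hypothetical, and the backwards search is naive: it assumes the signature bytes do not also occur inside a trailing archive comment, and it does not handle Zip64.

```ruby
# End of Central Directory record, little-endian:
# signature(4) disk_no(2) cd_start_disk(2) entries_on_this_disk(2)
# total_entries(2) cd_size(4) cd_offset(4) comment_len(2)
EOCD_SIGNATURE = "PK\x05\x06".b
EOCD_FORMAT = 'VvvvvVVv'
EOCD_FIXED_SIZE = 22

# Hypothetical helper: returns the two entry counts stored in the EOCD
# record, or nil if no EOCD signature is found in the data.
def eocd_entry_counts(data)
  data = data.b # treat the input as raw bytes
  offset = data.rindex(EOCD_SIGNATURE)
  return nil if offset.nil?

  fields = data.byteslice(offset, EOCD_FIXED_SIZE).unpack(EOCD_FORMAT)
  { entries_on_this_disk: fields[3], total_entries: fields[4] }
end

# Hand-built sample: 10 bytes of filler followed by an EOCD claiming 6 entries.
sample = ('x' * 10).b + [0x06054b50, 0, 0, 6, 6, 0, 10, 0].pack(EOCD_FORMAT)
p eocd_entry_counts(sample)

# Against a real file (path is an assumption):
# p eocd_entry_counts(File.binread('Archive.zip'))
```

If the raw bytes report 6 on both platforms, the wrong count would have to come from how the record is located or parsed rather than from the file itself.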