split feature is non-compliant #98

Open
jstanley0 opened this issue Sep 25, 2013 · 3 comments

@jstanley0
Contributor

The split feature implemented in Pull request 75 doesn't conform to the standard, in that:

  1. Local and central directory header records must never be split across a segment boundary. (8.5.2) The rubyzip split code doesn't even look at the zip file content; it just blindly chops it up into fixed-size pieces.
  2. The central directory entry for each file should indicate a disk number where the file starts (4.3.12); this is hard-coded to 0 in rubyzip.
  3. The end-of-central-directory record should indicate disk numbers where the central directory begins and ends, and also the number of entries located on the last disk in addition to the total number of entries. (4.3.16) rubyzip is again hard-coded to assume there is only one disk.

(section numbers refer to version 6.5.2 of http://www.pkware.com/documents/casestudies/APPNOTE.TXT)
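
To make points 2 and 3 concrete, here is a minimal Ruby sketch that reads the disk-related fields out of an end-of-central-directory record. It is not rubyzip code: `parse_eocd` and the hash keys are names made up for illustration, and it only handles the classic (non-zip64) record layout described in 4.3.16.

```ruby
# Signature of the end-of-central-directory record, little-endian on disk.
EOCD_SIGNATURE = [0x06054b50].pack('V')

# Hypothetical helper: find the last EOCD record in `bytes` and decode its
# fixed 22-byte part. A robust reader would also check that the signature
# is not just a coincidence inside the archive comment.
def parse_eocd(bytes)
  bytes = bytes.b # treat the input as raw bytes
  pos = bytes.rindex(EOCD_SIGNATURE)
  return nil if pos.nil? || bytes.bytesize - pos < 22

  _sig, this_disk, cd_start_disk, entries_here, entries_total,
    cd_size, cd_offset, comment_len = bytes.byteslice(pos, 22).unpack('VvvvvVVv')

  {
    this_disk: this_disk,                # number of this disk
    cd_start_disk: cd_start_disk,        # disk where the central directory starts
    entries_on_this_disk: entries_here,  # central directory entries on this disk
    entries_total: entries_total,        # central directory entries overall
    cd_size: cd_size,
    cd_offset: cd_offset,                # relative to the start of cd_start_disk
    comment_length: comment_len
  }
end

# A made-up record for the last segment of a three-segment archive.
sample = [0x06054b50, 2, 1, 3, 10, 460, 1024, 0].pack('VvvvvVVv')
info = parse_eocd(sample)
puts info[:this_disk]            # disk number of the last segment
puts info[:entries_on_this_disk] # differs from entries_total when the directory spans disks
```

In a single-disk archive both disk numbers are 0 and the two entry counts are equal, which is what the hard-coded values described above amount to.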

In addition, the test code doesn't test whether anyone might be able to read the split archive; it merely strings the pieces back together and tests reading the reconstituted file. The zip specification is, however, designed not to require stringing pieces of split archives together--or even generating a single big archive to begin with, as fogeys like me who remember spanning archives across floppy disks would know. It's designed so that the central directory and any particular file can be located in-place in their segment (or "disk"). rubyzip doesn't accomplish this.
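
As a rough illustration of what "located in-place" implies for a writer, the sketch below assigns each entry a starting disk number and an offset within that disk, moving to a fresh segment whenever a local header would otherwise straddle a boundary. Everything here is hypothetical (`plan_segments`, `Placement`, and the input shape are invented for the example), and it ignores the central directory, data descriptors, and any other bookkeeping a real split writer needs.

```ruby
# Where an entry's local header would land in a split archive.
Placement = Struct.new(:name, :disk, :offset)

# entries: array of [name, header_size, data_size] in bytes (assumed shape).
# segment_size: maximum size of each segment in bytes.
def plan_segments(entries, segment_size)
  disk = 0
  used = 0 # bytes already written to the current segment

  entries.map do |name, header_size, data_size|
    raise ArgumentError, 'header larger than a segment' if header_size > segment_size

    # A header must not be split: if it does not fit, start the next segment.
    if used + header_size > segment_size
      disk += 1
      used = 0
    end
    placement = Placement.new(name, disk, used)

    # File data, unlike headers, may spill over into following segments.
    total = used + header_size + data_size
    disk += total / segment_size
    used = total % segment_size

    placement
  end
end

plan = plan_segments([['a', 30, 50], ['b', 30, 100], ['c', 30, 10]], 100)
plan.each { |p| puts "#{p.name}: disk #{p.disk}, offset #{p.offset}" }
```

The `disk` and `offset` values are the ones that would go into each central directory entry (point 2 above), so a reader can open the right segment and seek straight to the file without concatenating anything.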

@jstanley0
Contributor Author

FYI: This feature actually isn't important to me. I just happen to be in the middle of implementing zip64 write support, which adds more fields related to disk numbers (as the zip64 end of central directory record itself can be split across disks). And I am continuing to hard-code everything to assume a single disk. This behavior makes me feel bad unless the inconsistency with the split feature is noted.

@simonoff
Member

I don't like the current implementation of split either, but I don't have time right now to do it the proper way.

@hainesr hainesr added the bug label May 29, 2021
@hainesr
Member

hainesr commented May 29, 2021

I've been looking at the splitting code myself (and well remember plugging multiple floppies in to unzip large files). I do wonder if this feature is relevant anymore - but as we have it, it should at least follow the standard. I'm going to highlight the distinction between spanning [1] and splitting [2] and assume we don't need to support spanning anymore!

Anyway, this is to say I will try and get round to this but probably won't hit v3.0.

[1] "segmenting a ZIP file across multiple removable media"
[2] "does not require writing each segment to a unique removable medium and instead supports placing all pieces onto local or non-removable locations such as file systems, local drives, folders, etc"

@hainesr hainesr added this to the Future milestone May 29, 2021