ZipContainer cannot properly handle zip files with DOS attributes #134

Open
belltailjp opened this issue Sep 9, 2020 · 0 comments
Labels
cat:bug Bug report or fix.

belltailjp commented Sep 9, 2020

I found that the current pfio ZipContainer cannot handle a Zip file with DOS-compatible external attributes.

# Creation of hello_dos.zip is explained later.
$ zipinfo hello_dos.zip
Archive:  hello_dos.zip
Zip file size: 318 bytes, number of entries: 2
drwx---     2.0 fat        0 bx stor 20-Sep-09 16:34 FOO/
-rw----     2.0 fat        6 tx stor 20-Sep-09 16:34 FOO/HELLO.TXT
2 files, 6 bytes uncompressed, 6 bytes compressed:  0.0%

$ python
>>> import pfio
>>> container = pfio.open_as_container('hello_dos.zip')
>>> container.isdir('FOO')
False                  # <--------- Supposed to be True because FOO is a directory in the zip

# Since it cannot recognize the directory, it cannot list the contents of the directory either
>>> list(container.list('FOO/'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "xxxxxxxxxxxxxxxxxxxxx/pfio/containers/zip.py", line 190, in list
    "{} is not a directory".format(path_or_prefix))
NotADirectoryError: FOO is not a directory
...

Reproduction

Assuming a Linux (Ubuntu 18.04) environment.

(1) Create some test data

$ mkdir foo
$ echo hello > foo/hello.txt

# OK case: Unix attributes (referred to below as the Unix zip)
$ zip -r hello_unix.zip foo
  adding: foo/ (stored 0%)
  adding: foo/hello.txt (stored 0%)

# NG case: DOS attributes
$ zip -rk hello_dos.zip foo
  adding: FOO/ (stored 0%)
  adding: FOO/HELLO.TXT (stored 0%)

Here, the -k option of the zip command stores DOS-like attributes in the external attribute field of the ZIP file.

-k
--DOS-names
Attempt to convert the names and paths to conform to MSDOS, store only the MSDOS attribute (just the user write attribute from Unix), and mark the entry as made under MSDOS (even though it was not); for compatibility with PKUNZIP under MSDOS which cannot handle certain names such as those with two dots.

What actually happens is explained in the ZIP file format specification:

If the external file attributes are compatible with MS-DOS and can be read by PKZIP for DOS version 2.04g then this value will be zero
https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT
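
For readers without the zip command at hand, a roughly similar archive can be built with Python's standard zipfile module. This is only a sketch: it assumes that setting create_system to 0 and leaving the upper 16 bits of external_attr at zero approximates what zip -k writes, and the helper name make_dos_like_zip is made up for illustration.

```python
import io
import zipfile


def make_dos_like_zip():
    """Build an in-memory ZIP whose entries look as if made under MS-DOS."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Directory entry: only the MS-DOS directory bit (0x10) is set.
        d = zipfile.ZipInfo("FOO/")
        d.create_system = 0      # 0 = MS-DOS/FAT in the "version made by" field
        d.external_attr = 0x10   # upper 16 bits (Unix mode) stay zero
        zf.writestr(d, b"")

        # File entry: MS-DOS archive bit (0x20). A nonzero value is used
        # because zipfile may fill in a default Unix mode when it is 0.
        f = zipfile.ZipInfo("FOO/HELLO.TXT")
        f.create_system = 0
        f.external_attr = 0x20
        zf.writestr(f, b"hello\n")
    buf.seek(0)
    return buf


with zipfile.ZipFile(make_dos_like_zip()) as zf:
    dos_dir = zf.getinfo("FOO/")
    print(dos_dir.create_system, hex(dos_dir.external_attr))
```

The resulting buffer can be written to disk and passed to pfio.open_as_container to check whether the behavior above reproduces with it.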

(2) Check by zipfile

The ZIP file created the second way has a different external_attr, which makes its ZipInfo object slightly different.

>>> import zipfile
>>> zipfile.ZipFile('hello_unix.zip').getinfo('foo/')
<ZipInfo filename='foo/' filemode='drwxrwxr-x' external_attr=0x10>

>>> zipfile.ZipFile('hello_dos.zip').getinfo('FOO/')
<ZipInfo filename='FOO/' external_attr=0x10>

Note that filemode does not appear in the ZipInfo object for the DOS zip.
This is because filemode is parsed from the upper 16 bits of external_attr (cf. zipfile.py#L393-L396), and in the DOS zip those bits are all zero.
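
The way filemode is derived from external_attr can be sketched as follows. This is not the actual zipfile code; the helper name describe_external_attr is hypothetical, and the two sample values are the ones one would expect for the directory entries of the two archives above (0o40775 corresponds to drwxrwxr-x, 0x10 is the MS-DOS directory bit).

```python
import stat


def describe_external_attr(external_attr):
    """Split a ZIP external_attr value into its Unix mode and DOS parts."""
    unix_mode = external_attr >> 16      # upper 16 bits: Unix st_mode, if any
    dos_attr = external_attr & 0xFFFF    # lower 16 bits: MS-DOS attributes
    return stat.filemode(unix_mode), dos_attr


# Expected shape for 'foo/' in hello_unix.zip: Unix mode plus DOS directory bit.
unix_dir = (0o40775 << 16) | 0x10
# Expected shape for 'FOO/' in hello_dos.zip: DOS directory bit only.
dos_dir = 0x10

print(describe_external_attr(unix_dir))
print(describe_external_attr(dos_dir))
```

With the upper bits at zero, the mode string carries no file type at all, so anything that looks only at the Unix mode cannot tell that the entry is a directory.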

(3) pfio

Since pfio (ZipContainer.stat) relies on the external attributes to determine whether a given name is a directory, it cannot properly handle the directory.

>>> pfio.open_as_container('hello_unix.zip').stat('foo')
<ZipFileStat filename="foo/" mode="drwxrwxr-x">
>>> pfio.open_as_container('hello_unix.zip').isdir('foo')
True

>>> pfio.open_as_container('hello_dos.zip').stat('FOO')
<ZipFileStat filename="FOO/" mode="?---------">
>>> pfio.open_as_container('hello_dos.zip').isdir('FOO')
False

Reason

CPython and earlier versions of pfio check for a directory simply by looking for a trailing '/' in the entry name, but I (yes, I did! 😇) introduced the dependency on the external attrs field in #114...
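
A directory check that does not depend solely on the Unix mode bits could look roughly like the sketch below. The function name is_dir_entry is hypothetical and this is not the pfio implementation; it combines the trailing '/' convention with the Unix mode and the MS-DOS directory bit (0x10), under the assumption that any one of these signals is sufficient.

```python
import stat
import zipfile


def is_dir_entry(info):
    """Hypothetical helper: decide whether a zipfile.ZipInfo is a directory."""
    # 1. Trailing '/' in the entry name (works regardless of attributes).
    if info.filename.endswith("/"):
        return True
    # 2. Unix mode bits, when the archive carries them.
    if stat.S_ISDIR(info.external_attr >> 16):
        return True
    # 3. MS-DOS directory attribute in the low bits.
    return bool(info.external_attr & 0x10)


# Entries shaped like the ones in hello_dos.zip (values are assumptions).
dos_dir = zipfile.ZipInfo("FOO/")
dos_dir.external_attr = 0x10
dos_file = zipfile.ZipInfo("FOO/HELLO.TXT")
dos_file.external_attr = 0x20

print(is_dir_entry(dos_dir), is_dir_entry(dos_file))
```

Checking the name first keeps the behavior for Unix-made archives unchanged while no longer requiring a nonzero mode for DOS-made ones.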

@belldandyxtq belldandyxtq added the cat:bug Bug report or fix. label Sep 9, 2020