Empty dirs change introduced is incompatible with distlib library #294
Comments
Easy way to reproduce:
In [1]: import distlib.wheel
In [3]: w=distlib.wheel.Wheel("D:\\azure_mgmt_iothub-0.8.0-py2.py3-none-any.whl")
In [4]: w.verify()
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-4-00bb035fd43f> in <module>()
----> 1 w.verify()
D:\VEnvs\testiot\Lib\site-packages\distlib\wheel.py in verify(self)
795 if u_arcname.endswith('/RECORD.jws'):
796 continue
--> 797 row = records[u_arcname]
798 if row[2] and str(zinfo.file_size) != row[2]:
799 raise DistlibException('size mismatch for '
KeyError: 'azure/' |
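The failure above can be reproduced without distlib. The following is a minimal sketch, assuming only what the traceback shows: the verifier looks up every archive member in a mapping built from RECORD. The names `build_archive`, `check_sizes` and the `records` dictionary are hypothetical stand-ins, and the `skip_directories` flag illustrates one possible way for a verifier to tolerate directory entries; it is not presented as distlib's actual fix.

```python
import io
import zipfile


def build_archive():
    """Build an in-memory zip holding one directory entry and one file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("azure/", "")  # explicit directory entry, as written by wheel 0.33.2
        zf.writestr("azure/mgmt/__init__.py", "x = 1\n")
    return buf


# Hypothetical stand-in for the parsed RECORD: path -> (path, hash, size).
# Directories are not listed, matching the situation in the traceback.
records = {"azure/mgmt/__init__.py": ("azure/mgmt/__init__.py", "", "6")}


def check_sizes(archive, records, skip_directories):
    """Compare each member's size with RECORD, optionally ignoring directories."""
    with zipfile.ZipFile(archive) as zf:
        for zinfo in zf.infolist():
            if skip_directories and zinfo.is_dir():
                continue
            row = records[zinfo.filename]  # KeyError for 'azure/' if not skipped
            if row[2] and str(zinfo.file_size) != row[2]:
                raise ValueError("size mismatch for " + zinfo.filename)
    return True
```

Calling `check_sizes(build_archive(), records, skip_directories=False)` raises `KeyError: 'azure/'`, while passing `skip_directories=True` lets the check complete.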
What do you suggest as a fix? If I simply revert the change, then those empty PEP 420 namespace packages will continue to be left out of wheels. |
Answering my own question: it's not. Perhaps the solution is to:
|
I've created a possible solution for the problem in the |
I have a fairly strong preference that |
I'm -1 on changing the format of the RECORD. |
I don't think anybody is suggesting a format change for RECORD. But I'm beginning to think that maybe a PEP amendment is in order to address the issue of empty directories in a wheel, one way or another. In the meantime, how do you feel about my proposed fix? It would restore old behavior for 99.99% of projects. |
By the way, can you point out a project that is suffering from the lack of empty directories in wheels? I've been unable to produce even an sdist of such a thing. |
Here's the example, built on pypa/sample-namespace-packages:
As you can see, building the |
I think this is reasonable. Tools obviously have different expectations, and the only real reason that I'm aware of for this change is to work around the Python issue that zipimport doesn't handle namespace imports if the directory entry isn't there. I don't think there's any real issue over what the spec specifies - requiring directory entries seems fine to me if that helps some people. But IMO the "zipfile tools all do this" argument is fairly weak - after all Python's own zipfile module doesn't, or this issue would never have arisen. Let's either lock down a specific behaviour in the spec, or accept that wheels without directory entries are entirely valid. |
Python's zipfile module does do this when building a zipfile from a directory tree:
The reason this issue arose was because |
But not when adding files individually, "by hand". Honestly, though, the "why" of this isn't important. The reality is that the wheel spec doesn't require compatible wheels to include directory entries. It would be pretty simple to change the spec, at which point fixing the tools is non-controversial. Why waste energy debating what's "right" when just proposing a spec change will likely go through with little or no problem? |
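The two behaviours being contrasted here can be seen side by side. This is a sketch using only the standard library; the helper names `names_from_tree` and `names_by_hand` and the `ns/pkg` layout are made up for illustration. It relies on `ZipFile.write()` storing a directory path as an entry with a trailing slash, whereas `writestr()` on a file path records only that file.

```python
import io
import os
import tempfile
import zipfile


def names_from_tree():
    """Zip a directory tree by walking it and passing every path to write()."""
    with tempfile.TemporaryDirectory() as root:
        pkg = os.path.join(root, "ns", "pkg")
        os.makedirs(pkg)
        with open(os.path.join(pkg, "__init__.py"), "w") as f:
            f.write("")
        buf = io.BytesIO()
        with zipfile.ZipFile(buf, "w") as zf:
            for dirpath, dirnames, filenames in os.walk(root):
                for name in dirnames + filenames:
                    full = os.path.join(dirpath, name)
                    # Directories become entries ending in '/'
                    zf.write(full, os.path.relpath(full, root))
        with zipfile.ZipFile(buf) as zf:
            return sorted(zf.namelist())


def names_by_hand():
    """Add a single file individually; parent directories stay implicit."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("ns/pkg/__init__.py", "")
    with zipfile.ZipFile(buf) as zf:
        return sorted(zf.namelist())
```

The first helper yields entries for `ns/` and `ns/pkg/` in addition to the file; the second yields the file alone.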
@jaraco Seems that the example you provided is not valid due to:
This indicates that |
I see two immediate ways forward:
I am also curious about what prompted the patch in the first place. Was there a real world project out there which was having problems? |
In that example,
Thanks - corrected.
This issue stems from pypa/packaging-problems#212.
To consider, neither option (1) nor option (2) will help the situation for wheels that were released (or end up being released) with 0.33.2. Option (2), the My preference would be:
|
I am thoroughly confused. The issue here was about empty directories, correct? Since the |
I took a more thorough look at the example and I see the problem now: the wheels are used directly without being unpacked first, which causes the problem. That said, until we get distlib sorted out, it would probably be prudent to retract this change since it's breaking things for people right now. |
I agree. Does that mean to pull 0.33.2 also? |
I'm not sure which is better: yanking the affected releases from PyPI or releasing a 0.33.4. People who have 0.33.2/3 installed right now might continue to have a bad time unless a new release is made. Or is this what you mean for me to do anyway? |
It's your call. I might try just pulling 0.33.2/3 and see if that's sufficient. If workflows have cached/saved versions, a 0.33.4 might be called for, and you may want to do that in advance just to head off that possibility. |
I've yanked both 0.33.2 and 0.33.3. I'm releasing 0.33.4 to replace them. |
Thanks @agronholm for addressing this issue! |
I remember making the decision to leave out directories. It is easier. If it was git or mercurial you'd have to put a .keep file in there, would that work? |
@dholth It wouldn't. The issue isn't that directories aren't indicated. It's that they're implicit. You have an entry like this:
But you don't have any entries for But if you were to unzip and rezip any wheel using popular zip tools, it no longer has the issue. |
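To make "implicit" concrete, the sketch below lists the directories that an archive's member names imply but that have no entry of their own. The function name `implicit_directories` is hypothetical, and the input is simply the list that `ZipFile.namelist()` would return.

```python
import posixpath


def implicit_directories(names):
    """Return directories implied by member paths but lacking their own entry."""
    explicit = {n for n in names if n.endswith("/")}
    implied = set()
    for name in names:
        # Walk up through every parent of this member (zip paths use '/')
        parent = posixpath.dirname(name.rstrip("/"))
        while parent:
            implied.add(parent + "/")
            parent = posixpath.dirname(parent)
    return sorted(implied - explicit)
```

For a wheel whose only member is `azure/mgmt/__init__.py`, both `azure/` and `azure/mgmt/` are implicit; an archive re-zipped by a tool that writes directory entries would return an empty list.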
An updated distlib was released a few hours ago which addresses the directory entries handling issue. The next question is, when will it be sufficiently widespread to allow me to reintroduce the patch? |
What does older distlib think about the directory entry plus a hash for the empty string?
|
It may not like it or it may ignore it. I would recommend against having directory entries in RECORD or a hash of the empty string anywhere. |
Inside the wheel RECORD is something you can hash to check the entire wheel. If you leave out directory entries then two wheels with the same RECORD hash could install a different set of namespace packages. To preserve that feature the directory entries must be included in RECORD, but the hash of the empty string would not be necessary. |
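The point about identical RECORD hashes can be illustrated with two toy archives. This is a sketch under stated assumptions: the project name `demo-1.0`, the simplified RECORD text (hashes left blank) and the helpers `build` and `record_digest` are all invented for the example and do not reflect a real wheel.

```python
import hashlib
import io
import zipfile

RECORD_NAME = "demo-1.0.dist-info/RECORD"  # hypothetical project
# Simplified RECORD content: real wheels also carry a hash per file
RECORD_TEXT = "ns/pkg/__init__.py,,0\ndemo-1.0.dist-info/RECORD,,\n"


def build(extra_dirs=()):
    """Build an archive with a fixed RECORD plus optional empty directory entries."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for directory in extra_dirs:
            zf.writestr(directory, "")  # directory entry not listed in RECORD
        zf.writestr("ns/pkg/__init__.py", "")
        zf.writestr(RECORD_NAME, RECORD_TEXT)
    return buf


def record_digest(buf):
    """Return the SHA-256 of RECORD together with the set of member names."""
    with zipfile.ZipFile(buf) as zf:
        digest = hashlib.sha256(zf.read(RECORD_NAME)).hexdigest()
        return digest, set(zf.namelist())
```

`record_digest(build())` and `record_digest(build(["other_ns/"]))` return the same digest but different member sets, which is the "is this the same?" gap that arises when directory entries are left out of RECORD.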
You're right, although I fail to see any potential attack vector for malicious packages here. |
Good point. I hadn't thought of that. But it only affects completely empty directories, which I don't think are at stake here. Another way we may want to think about it: perhaps wheels could disallow empty directories or only include RECORD for completely empty directories. There's no need for a RECORD of directories that are there for the purpose of containing other modules and packages, which is what namespace packages typically do, as in that case, the RECORD already implicitly defines the namespace package. In other words, the presence of Of course, that is a seemingly valid use-case outside of zip files:
But not in zip files, even with directory entries:
That seems like an unnecessary variance, but also probably ignorable for the more common case of namespace packages containing anything. |
Unexpected present or missing directories wouldn't be a very good attack, no. Maybe it is better to think about it as an "is this the same?" question instead of an "is this malicious?" question. |
These details are getting fiddly enough that I think it's essential that whatever gets finally agreed is recorded in the wheel spec. |
It looks like 0.33.2 change is incompatible with latest distlib package. If I try to do distlib.wheel.install() it throws an error because folder ZipInfo object filename is not in the records: https://bitbucket.org/pypa/distlib/src/3b0fd333c8fb15bc04e570a23ee4836caefb7951/distlib/wheel.py#lines-521
I don't see any details about Folders in PEP (https://www.python.org/dev/peps/pep-0427/), so not clear if this is something that should be addressed by wheel or distlib package.