Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore pre 22.3 wheel cache pathing #11538

Closed
wants to merge 3 commits into from

Conversation

cboylan
Copy link
Contributor

@cboylan cboylan commented Oct 20, 2022

Prior to bad03ef wheel cache paths incorporated source material hashes in their paths. This commit which ended up in 22.3 stopped including that information. This is problematic for two reasons. First our cache is no longer encoding data integrity information that was previously intentionally included. Second it means anyone upgrading from < 22.3 to 22.3 will have orphaned wheel cache entries.

The fix here is to update the Link object to set Link.link_hash in the Link.from_json method. Otherwise the hash information is simply missing.

This will cause anyone upgrading from 22.3 to newer to have orphaned wheels, but that seems worthwhile considering 22.3 hasn't been around as long as the previous implementation and we get stronger data integrity controls out of it.

This fixes #11527

@cboylan
Copy link
Contributor Author

cboylan commented Oct 20, 2022

Please feel free to edit/update/rewrite this PR as necessary.

@cboylan
Copy link
Contributor Author

cboylan commented Oct 24, 2022

I believe these test failures are caused by an update to git to fix a security issue and are not related to my change. I don't see an issue or PR for this yet. Does anyone know if this is being addressed or is this something that needs to be done?

FWIW my understanding of the issue is that you are only exposed if you don't trust the git repo that is being added as a submodule. If you do trust it you can do something like:

git -c 'protocol.file.allow=always' submodule add /path/to/submodule

It looks like pip's tests may build the git repo from scratch in which case it would be trusted. But I'm not familiar enough with pip's test suite to be confident of that.

@pfmoore
Copy link
Member

pfmoore commented Oct 24, 2022

I think you're right in your analysis. Would you be interested in creating a PR to fix the test?

@sbidoul do you know why we have a dedicated test for submodules? It feels like we're going into too much detail on "git stuff" - the test feels like it's simply checking that git works the way it should, not testing any pip specific functionality...

@sbidoul
Copy link
Member

sbidoul commented Oct 25, 2022

@pfmoore actually, pip does have dedicated code to handle git submodules and I think that test is to exercise that part.

for hash_name, hash_value in hashes.items():
if hash_name in _SUPPORTED_HASHES:
link_hash = LinkHash(name=hash_name, value=hash_value)
break
Copy link
Member

@sbidoul sbidoul Oct 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should get the first hash found by iterating _SUPPORTED_HASHES so as to get a deterministic result that does not depend on the order of hashes provided by the index?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. If the index starts to support newer hashes we'll end up orphaning cache entries, but we'll only do so when the hashes become stronger (due to the order of _SUPPORTED_HASHES entries ) and not randomly due to index changes.

For updates like this do you prefer a follow on commit or a squash update and force push?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to find squashes cleaner (you can click force-pushed to see what changed), but I don't think we have any strong enforced preferences in the code base.

@sbidoul
Copy link
Member

sbidoul commented Oct 25, 2022

Thanks for this PR, @cboylan. I'm not super familiar with pip's hash checking code but, echoing what you said in #11527 (comment), I'm a bit worried that this bug may have affected something worse than the cache and that our test suite did not catch it.

@edmorley
Copy link
Contributor

edmorley commented Nov 3, 2022

@cboylan Hi! The failing submodule test has been marked as an expected failure to fix CI as of #11555 - could you rebase on main to pick that change up? It would be great to see if this can be included in a 22.3.1 release (presuming one occurs for eg #11547) :-)

Prior to bad03ef wheel cache paths
incorporated source material hashes in their paths. This commit which
ended up in 22.3 stopped including that information. This is problematic
for two reasons. First our cache is no longer encoding data integrity
information that was previously intentionally included. Second it means
anyone upgrading from < 22.3 to 22.3 will have orphaned wheel cache
entries.

The fix here is to update the Link object to set Link.link_hash in the
Link.from_json method. Otherwise the hash information is simply missing.

This will cause anyone upgrading from 22.3 to newer to have orphaned
wheels, but that seems worthwile considering 22.3 hasn't been around as
long as the previous implementation and we get stronger data integrity
controls out of it.

This fixes pypa#11527
@pradyunsg
Copy link
Member

@pfmoore it would be good to include this in the bugfix release as well. Any concerns (other than the news fragment)?

@pfmoore
Copy link
Member

pfmoore commented Nov 3, 2022

As long as someone reviews, approves and commits it before the weekend, I'm fine with it. But I note @sbidoul's comment above:

I'm a bit worried that this bug may have affected something worse than the cache and that our test suite did not catch it.

Are we sure that the fix here is correct? I do not want to merge this, release 22.3.1, and then face calls that we need a 22.3.2 because there's a bigger problem here...

I will note that we've had basically nobody else complaining that #11527 is a significant issue, nor have we had any other bug reports that I'm aware of that are related to this. So I don't consider this a serious issue that needs a bugfix release in its own right. So personally, I'd rather we ensure that we get it right rather that rush a fix just to get into 22.3.1. If people with more expertise in the cache and hash checking mode than I have are comfortable it's OK, though, I'm happy to accept their word on the matter.

@sbidoul
Copy link
Member

sbidoul commented Nov 3, 2022

#11557 was reported as a hash checking regression. But I've not investigated myself. I don't have much time at hand at the moment, unfortunately.

@cboylan
Copy link
Contributor Author

cboylan commented Nov 3, 2022

Are we sure that the fix here is correct? I do not want to merge this, release 22.3.1, and then face calls that we need a 22.3.2 because there's a bigger problem here...

I will note that we've had basically nobody else complaining that #11527 is a significant issue, nor have we had any other bug reports that I'm aware of that are related to this. So I don't consider this a serious issue that needs a bugfix release in its own right. So personally, I'd rather we ensure that we get it right rather that rush a fix just to get into 22.3.1. If people with more expertise in the cache and hash checking mode than I have are comfortable it's OK, though, I'm happy to accept their word on the matter.

For whatever it is worth, addressing this isn't urgent for me either. I do think this is an improvement and addresses a regression, but I'm happy to take the time to make sure it is correct as well.

This was suggested in code review.

Co-authored-by: Stéphane Bidoul <stephane.bidoul@acsone.eu>
news/11527.bugfix.rst Outdated Show resolved Hide resolved
This came up in code review. It was pointed out that the news entry appears in the changelog so needs to be written for that audience.

Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
@n1ngu
Copy link

n1ngu commented Jan 4, 2023

Fixes #11692 as well!

@sbidoul
Copy link
Member

sbidoul commented Jan 8, 2023

This one would be superseded by #11696, but we probably want keep the new test from this PR.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 25, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Wheel cache is not forward/backward compatible between 22.2.x and 22.3
7 participants