Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hash-checking mode breaks after new distributions are added #5874

Closed
di opened this issue Oct 10, 2018 · 7 comments
Closed

Hash-checking mode breaks after new distributions are added #5874

di opened this issue Oct 10, 2018 · 7 comments
Labels
auto-locked Outdated issues that have been locked by automation C: requirement file Using `requirements.txt` type: enhancement Improvements to functionality

Comments

@di
Copy link
Sponsor Member

di commented Oct 10, 2018

What's the problem this feature will solve?
Currently, it's possible to add new distributions to an existing release on PyPI at any point after the release was initially made.

Occasionally when this happens, the new distribution is preferred by pip over previous distributions. For example, initially there is only a source distribution, then 1 week later the maintainer adds a built distribution.

For users who have included hashes in their requirements files, this occasionally leads to breakage (THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE.) because pip is preferring a distribution that didn't exist when the user added the hashes, and (correctly) refusing to install it.

The current behavior is roughly:

  • find the most ideal distribution
  • check if that distribution's hash is in the list of hashes
  • install if yes, fail if no

Describe the solution you'd like
The behavior of pip could be this:

  • find any installable distributions
  • sort them from most to least ideal (i.e., platform-specific wheels before source distributions)
  • install the first one that has a matching hash
  • fail if it can't install any of them

This would maintain the same security that hash-checking mode provides, while avoiding the potential for breakage when new distributions are added to a release.

Alternative Solutions
Some alternate things we could do:

  • PyPI could require that maintainers upload their distributions all at once
    • This would add an additional maintenance burden for the maintainers who build distributions on a variety of different platforms
  • PyPI could require that maintainers upload their distributions within some arbitrary window
    • This has all the same problems we currently have, just in a narrower window of time
  • We could tell users that they need to use --only-binary and --no-binary flags
    • This is an additional burden to users
    • This could still break if the maintainer deletes the .tar.gz source distribution and uploads a .zip source distribution (or vice versa)
    • This could still break if the maintainer adds a universal wheel when there used to be only a platform-specific wheel.
  • We could tell users that they need to specify direct links to distributions on pythonhosted.org instead
    • This is an additional burden to users
    • This doesn't always work if the dependencies need to be installable across multiple platforms

Additional context
Some additional discussion at pypa/packaging.python.org#564.

@dstufft
Copy link
Member

dstufft commented Oct 10, 2018

I think it's a good idea to do the solution described here, to limit our installation to only things we have hashes for. Ideally I think we'd print a warning of some kind if we're not installing the preferred file due to it not having a hash listed though.

@uranusjr
Copy link
Member

I believe this is the same as #3634.

@bennuttall
Copy link

I opened a similar issue (which I've now closed in favour of this) #6395 describing this problem but for distributions on additional indexes. We had some reports of issues from people trying to install certbot on Raspberry Pi and picking up wheels from piwheels.org but having a hash-mismatch.

Also see related but different #6394 - proposal to add a flag to remove additional indexes.

alexbecker added a commit to alexbecker/pip that referenced this issue May 5, 2019
Changes the behavior of `pip install` in hash-checking mode to filter
out any candidates whose hashes (obtained via URL) do not match the
hashes provided. This prevents a HashMismatch error when a more
preferred binary distribution is upload for a release after a user
pins the hashes for that release.

Note that a second hash comparison is performed when the candidate is
downloaded. This is important because the first check is not secure: it
trusts that the hash in the URL is the same as the hash in the content,
and it also does not error when the user is in hash-checking mode but
has not provided a hash.
@alexbecker
Copy link

I think #6464 solves this and addresses @dstufft's concern by logging a warning when a file is skipped due to not having hashes. Before I polish it up (add tests, fix 1 test that broke for reasons orthogonal to correctness), can I get approval-in-principal on the approach?

@indygreg
Copy link

On quick glance, the behavior in #6464 - ignoring artifacts for which we don't have a pinned hash - seems like a reasonable solution to this issue! Thank you for coding it up!

cjerdonek added a commit that referenced this issue Jul 14, 2019
Address #5874: Prefer candidates with allowed hashes
@cjerdonek
Copy link
Member

This has now been addressed by PR #6699. In a subsequent PR, I'll do what @dstufft suggested above (I'll file a separate issue about that in a bit):

Ideally I think we'd print a warning of some kind if we're not installing the preferred file due to it not having a hash listed though.

There is also a pending PR #6714 that adds some additional debug logging.

@di
Copy link
Sponsor Member Author

di commented Jul 16, 2019

Thank you @cjerdonek!

@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Aug 15, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Aug 15, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
auto-locked Outdated issues that have been locked by automation C: requirement file Using `requirements.txt` type: enhancement Improvements to functionality
Projects
None yet
Development

No branches or pull requests

7 participants