
Reject packages without a Requires-Python #3889

Open
pganssle opened this issue May 3, 2018 · 15 comments

@pganssle
Contributor

pganssle commented May 3, 2018

A problem I've come across many times now is that a package drops support for an old Python version and even adds a python_requires directive to its setup.py, only to cut a release using an old version of setuptools that doesn't understand the directive and thus silently drops it. See, e.g., sphinx. This causes a bunch of problems, because the erroneously uploaded releases then pollute pip installs of that package unless they are removed from PyPI or a post-release is made.
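
For reference, a minimal setup.py with the directive set (the version bounds here are just an example):

```python
from setuptools import setup

setup(
    name="example",
    version="1.0",
    # Written into the Requires-Python metadata field on upload.
    # Only setuptools >= 24.2.0 understands this argument (older
    # versions silently drop it), and only pip >= 9.0 respects it.
    python_requires=">=2.7, !=3.0.*, !=3.1.*, !=3.2.*",
)
```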

Since there is no good reason to upload a package without explicitly specifying a Requires-Python directive, I think we can assume that this is an error and reject packages that don't have a Requires-Python.

My recommendation for migrating to the "always throw an error" version:

  1. Start throwing a warning when Requires-Python is missing, with a link to the documentation on how to add Requires-Python
  2. Update twine and all tools that upload packages (and setuptools, even though its upload command is deprecated) to automatically transform that warning into an error, overridable by --allow-no-python-requires
  3. Add a backwards-compatibility API that allows you to upload packages without Requires-Python by pointing your upload endpoint at something other than the default
  4. Upgrade the warning to an error, dropping support for --allow-no-python-requires.

I think we could swap the order of 2 and 3 pretty easily - I'm guessing that updating the upload tools would be easier to do, which is why I put them in this order, but 2 could be implemented in terms of 3.

In terms of time frame, I don't know how aggressive you (the Warehouse team) want to be. I think the first 3 steps can happen as soon as an implementation is available. It's probably a big ask to have all package maintainers switch over in a relatively short period of time, but I think a long deprecation period will be harmful, given that a large number of packages are likely to start dropping support for Python 2 soon, which will cause a ton of headaches if people aren't including Requires-Python in their metadata. Maybe 6 months or so?

@di
Member

di commented May 3, 2018

Since there is no good reason to upload a package without explicitly specifying a Requires-Python directive, I think we can assume that this is an error and reject packages that don't have a Requires-Python.

One use case would be a package that is compatible (or at least, thinks it's compatible) with all versions of Python. What should this field be for those packages, something like Requires-Python: >0?

  1. Start throwing a warning when Requires-Python is missing, with a link to the documentation on how to add Requires-Python

I don't believe displaying a warning for a successful upload is possible at the moment without making changes to twine and the upload API, but this would probably be a beneficial change to make in general.

Worth mentioning that #726 might help with this a bit, because maintainers would be able to verify that the Requires-Python metadata they set actually made it to PyPI.

@pganssle
Contributor Author

pganssle commented May 3, 2018

@di Yes, agreed that #726 would be very helpful.

One use case would be a package that is compatible (or at least, thinks it's compatible) with all versions of Python. What should this field be for those packages, something like Requires-Python: >0?

Presumably python_requires="*.*" or something like that? >0 would also work, though I really doubt there are any packages out there that actually support Python<2.0 anyway.
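
For what it's worth, clients treat this field as a PEP 440 specifier set, so it can be checked with the packaging library (a quick sketch; note that "*.*" isn't actually a valid specifier):

```python
from packaging.specifiers import SpecifierSet

spec = SpecifierSet(">=2.7, !=3.0.*, !=3.1.*")  # a typical Requires-Python value
print("2.7.15" in spec)  # True
print("3.0.1" in spec)   # False

# An unbounded specifier like ">0" matches every release version,
# as does an empty specifier set:
print("1.5.2" in SpecifierSet(">0"))  # True
print("1.5.2" in SpecifierSet(""))    # True
```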

I don't believe displaying a warning for a successful upload is possible at the moment without making changes to twine and the upload API, but this would probably be a beneficial change to make in general.

Hm. That's a bit of a problem, because half the issue here is that people tend to install setuptools or twine once and then never upgrade it. The people with updated versions of twine and setuptools are the least in need of this change.

Other possible options:

  1. Rolling blackouts as was done for TLS and the PyPI.org transition, skipping the "raise a warning" phase entirely.
  2. Send an e-mail to people who upload new packages without a Requires-Python directive (this might be way more e-mail than can be done safely)
  3. Do some verification to check whether the source distribution includes a python_requires directive whose metadata failed to upload, and send an e-mail only to those people (with instructions on how to fix it).
  4. Implement the "warning" as a "fail only the first time": parse the metadata, and if it doesn't contain Requires-Python, hash the payload and store it, then return an error explaining why it failed. Uploading the exact same payload a second time would then allow the upload to proceed (rough sketch below).
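
A rough sketch of what option 4 could look like on the server side (names here are hypothetical, and Warehouse would persist the hashes rather than keep them in memory):

```python
import hashlib

seen_payloads = set()  # hypothetical: real storage would be a database table

def check_upload(metadata: dict, payload: bytes) -> None:
    """Reject the first upload missing Requires-Python; allow an identical retry."""
    if metadata.get("requires_python"):
        return  # metadata is present, nothing to enforce
    digest = hashlib.sha256(payload).hexdigest()
    if digest in seen_payloads:
        return  # same payload uploaded again: treat the error as acknowledged
    seen_payloads.add(digest)
    raise ValueError(
        "Upload rejected: no Requires-Python in the metadata. "
        "Uploading the identical file again will be accepted."
    )
```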

If #726 is happening soon, option 4 could be implemented as: if you upload without a Requires-Python, you must go through the 2-phase upload process and check a big red box that says "I don't want my package to properly advertise its requirements" before the upload is allowed to proceed.

@edmorley

It seems that the other half of this issue is "how can we make sure users update setuptools/twine more often?" - which would help for more than just this case.

For example:

  • making twine perform version checks periodically and output a warning if it or setuptools/wheel is out of date (a sketch of such a check follows this list)
  • actively working with CI providers to ensure the tooling they use is up to date (it looks like Travis' PyPI deploy addon already self-updates, but there are likely other common workflows that don't use it, as well as other providers)
  • updating as many of the "how to publish your Python package" docs/guides as possible to begin with a pip install -U ...
  • (plus the idea of adding support for passing back a warning message on successful upload mentioned above, so at least new installs are more likely to stay up to date in the future)
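
The periodic self-check could be as simple as comparing the installed version against PyPI's JSON API (a sketch; it assumes the tools are installed, and real tooling would cache the result and fail quietly when offline):

```python
import json
import urllib.request
from importlib.metadata import version  # Python 3.8+

def warn_if_outdated(package: str) -> None:
    """Print a warning if the installed version lags the latest on PyPI."""
    with urllib.request.urlopen(f"https://pypi.org/pypi/{package}/json") as resp:
        latest = json.load(resp)["info"]["version"]
    installed = version(package)
    if installed != latest:
        print(f"warning: {package} {installed} is installed but {latest} "
              f"is available; consider: pip install -U {package}")

for tool in ("twine", "setuptools", "wheel"):
    warn_if_outdated(tool)
```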

Regarding users already on older broken versions - less disruptive options (vs brownouts/blocking) might be:

  • displaying a banner on the package page or when the user logs in
  • emailing the user after upload

@Carreau
Contributor

Carreau commented Aug 25, 2018

It seems that the other half of this issue is "how can we make sure users update setuptools/twine more often?"

Have the upload tools send a header identifying which tool and version they are. Then slowly refuse uploads from old versions and from tools that don't set these headers.
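
Upload clients generally already identify themselves in the User-Agent header (e.g. twine/1.13.0), so the gating could look something like this on the Warehouse side (the minimum versions below are made up):

```python
import re

MIN_VERSIONS = {"twine": (1, 13), "setuptools": (40, 0)}  # illustrative cutoffs

def upload_allowed(user_agent: str) -> bool:
    """Accept uploads only from known tools at or above a minimum version."""
    match = re.match(r"(?P<tool>[\w.-]+)/(?P<ver>\d+(?:\.\d+)*)", user_agent or "")
    if not match or match["tool"] not in MIN_VERSIONS:
        return False  # unknown tool or missing header
    ver = tuple(int(part) for part in match["ver"].split("."))
    return ver >= MIN_VERSIONS[match["tool"]]

print(upload_allowed("twine/1.13.0"))  # True
print(upload_allowed("twine/1.9.1"))   # False
print(upload_allowed(""))              # False
```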

sgnn7 added a commit to cyberark/cyberark-conjur-cli that referenced this issue May 24, 2019
We now use twine and can publish our package to PyPI!

Twine is required to properly push metadata to PyPI irrelevant of the
underlying python versions

Upstream: pypi/warehouse#3889
@brainwane
Contributor

We discussed this today in IRC. It sounds like we do want to do this because -- among other things -- it would help with the new pip resolver rollout pypa/pip#6536 and with stuff @techalchemy has to deal with in Pipenv.

I would love stats on how many projects have, as their most recent release, a release whose metadata has no Requires-Python. Is it possible to get that % or number?

Although it would be really nice to have #726 before implementing this, I'm ok with possibly going ahead and doing it without that feature anyway.

@dstufft
Member

dstufft commented Apr 3, 2020

Yea that should be doable, I suspect the number is really low though.

@pfmoore
Contributor

pfmoore commented Apr 4, 2020

it would help with the new pip resolver rollout

I'm not clear how it would help with the new resolver rollout. Can you clarify?

As far as I am aware, Requires-Python is essentially just another dependency that the resolver (new or old) has to deal with. Mandating it doesn't make any difference to pip (other than meaning that we'll now never have a case of a package with no dependencies, which makes a lot of our tests a little bit unrealistic, albeit not in a way that matters in practice 😉)

I'm a bit worried that this proposal will result in projects setting over-strict constraints (requiring only what they test on and support, for example, when older versions may well still work on a "don't come to us if something goes wrong" basis).

@brainwane
Contributor

@pradyunsg I think you were in that discussion (in IRC today about Requires-Python) and you seemed actively interested in getting Warehouse to mandate it, but maybe I misread you - could you answer @pfmoore's question? If I misunderstood and we don't particularly need this for the resolver work, please do say so and I'll happily back off.

@Carreau
Contributor

Carreau commented Apr 7, 2020

I'm a bit worried that this proposal will result in projects setting over-strict constraints (requiring only what they test on and support, for example, when older versions may well still work on a "don't come to us if something goes wrong" basis).

Well, if you are an advanced user and really want to get a package onto an old Python version, you can probably figure it out. But if you are a new user and pip install foo does not work, you are at a loss. So I would see over-enforcing as better than being too lax.

And the whole point of Requires-Python is exactly so that those old-Python users don't install a too-recent version of the package. I implemented it in Warehouse/pip mostly because IPython wanted to drop 2.x and couldn't without breaking installs for users on old Pythons. I would even expect the opposite of what you describe: projects dropping support in CI, breaking on old Pythons, and forgetting to update Requires-Python.

As far as I am aware, Requires-Python is essentially just another dependency that the resolver (new or old) has to deal with

I have not been part of some of the discussions about the resolver, so pardon me if I'm mistaken. But, unlike the other dependencies, pip cannot change the current version of Python, right? Requiring Requires-Python may dramatically reduce the search space of the resolver, since the Python version is not a variable. It would even be strange to me if the resolver knew about Requires-Python at all; wouldn't you just immediately filter out any package that is incompatible with the current Python and give the resulting set to the resolver?
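
Something like this pre-filtering step, run before resolution even starts (a sketch with the packaging library; package names, versions, and specifiers are made up):

```python
from packaging.specifiers import InvalidSpecifier, SpecifierSet

def filter_by_python(candidates, python_version):
    """Drop releases whose Requires-Python excludes the running interpreter."""
    for name, release, requires_python in candidates:
        if requires_python:
            try:
                if python_version not in SpecifierSet(requires_python):
                    continue  # incompatible with this interpreter: skip it
            except InvalidSpecifier:
                pass  # malformed metadata: keep it, let later stages decide
        yield name, release

releases = [
    ("somepkg", "2.0.0", ">=3.5"),
    ("somepkg", "1.8.5", ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*"),
    ("oldpkg", "0.1", None),  # no Requires-Python: nothing to filter on
]
print(list(filter_by_python(releases, "2.7.16")))
# [('somepkg', '1.8.5'), ('oldpkg', '0.1')]
```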

@pradyunsg
Contributor

I think this is closer to "PyPI should have better, more strict metadata checks" than it is to "would help with the dependency resolver".

It'd be nice to have this metadata be mandatory, to make package authors actively consider the question prior to publishing. I don't think this directly affects the new resolver's rollout, but, like all PyPI metadata improvements, it would have an indirect effect on the resolver.

@techalchemy

Requiring Requires-Python may dramatically reduce the search space of the resolver, since the Python version is not a variable. It would even be strange to me if the resolver knew about Requires-Python at all; wouldn't you just immediately filter out any package that is incompatible with the current Python and give the resulting set to the resolver?

Exactly this -- and without clean, enforced metadata, in many cases we may not know which versions of Python the uploader intended their package to work on. As someone who maintains a tool that might run on Python 3 but be asked to resolve dependencies against Python 2, for example, it is incredibly useful to know whether a package is intended to work on Python 3 or Python 2 -- sometimes the act of resolution itself requires that you execute the package, and in that case I'd much rather learn from metadata than from a failed execution.

To elaborate a bit further: in pipenv we create lockfiles which might be compatible with multiple versions of Python. In an ideal universe, you would be able, using only one runtime, to generate a lockfile with a resolved set of packages for multiple versions of Python based purely on metadata, simply by reading manifests and including the relevant markers.
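
For example, lockfile entries carry environment markers, which can be evaluated against an arbitrary environment (rather than the running interpreter) with the packaging library; values here are illustrative:

```python
from packaging.markers import Marker

marker = Marker('python_version < "3.0"')
# Evaluate against explicit environments instead of the current interpreter:
print(marker.evaluate({"python_version": "2.7"}))  # True
print(marker.evaluate({"python_version": "3.8"}))  # False
```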

And yes, I approach this from the perspective of dependency resolution, but I do also think that having good, robust metadata is important for many reasons.

I don't think this directly affects the new resolver's rollout

Everyone will still have to deal with the world as it is until rules are in place and enforced and there is some useful data to leverage, but my goal when I talk about this is to envision the world as it could be in the future. Just turning the switch on tomorrow won't fix the problem right away, but if we had done it two years ago my life would be significantly easier, even if there were still some old packages without properly populated fields.

@pfmoore
Contributor

pfmoore commented Apr 29, 2020

my goal when I talk about this is to envision the world as it could be in the future.

Agreed. My point above (which I didn't state well) was that Requires-Python metadata, while useful, can (in the most general sense) only be evaluated at the same time as the other dependency constraints on a package, after we've read the metadata. And making it mandatory doesn't really change that code.

It's true that on PyPI the Requires-Python data is available from the index, and so can reduce the search space - and this is great. But pip (and any other real-world Python resolver) has to handle other indexes, and even local directories containing distribution files. For those, there's no saving and Requires-Python has to be processed in the constraint evaluation phase, not the file selection phase. So we do the check in both places - one for efficiency, the other for correctness.

Don't get me wrong - I'm 100% in favour of better, more strictly enforced, and static metadata. I just don't think having this piece will materially affect how we code the resolver in pip.

@Carreau
Contributor

Carreau commented Apr 29, 2020

But pip (and any other real-world Python resolver) has to handle other indexes, and even local directories containing distribution files.

Ah, thanks, I forgot to consider this case. Though I'm guessing that in those cases the number of local packages will likely be way smaller than on PyPI...

Warehouse also recently implemented "yank", which I see as a move toward cleaner metadata: you can "yank" releases with incorrect metadata. Publishing releases with incorrect metadata (including Requires-Python) is part of life, and it can be fixed by yanking and publishing a patch release.

Also, I think we can add a step 0 to @pganssle's proposal: we can actually get twine and other packaging tools to emit warnings even before Warehouse does.

@brainwane
Contributor

@pfmoore said:

Don't get me wrong - I'm 100% in favour of better, more strictly enforced, and static metadata. I just don't think having this piece will materially affect how we code the resolver in pip.

OK! Thanks for the consensus, pip developers - that helps me know that I should not nudge to try to get this implemented for the resolver work's sake.

@itamarst

The motivating failure mode happened again this week with pyrsistent, impacting many packages that still support Python 2: tobgu/pyrsistent#208
