Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Namespace packages with dots in them fail to upload to PyPI #743

Closed
Bengt opened this issue Mar 15, 2021 · 17 comments · Fixed by #745
Closed

Namespace packages with dots in them fail to upload to PyPI #743

Bengt opened this issue Mar 15, 2021 · 17 comments · Fixed by #745

Comments

@Bengt
Copy link

Bengt commented Mar 15, 2021

Your Environment

  1. Operating system:

python:slim-buster

  1. Python version:

3.7.8

  1. twine installation method:

pip from official PyPI

  1. Installed twine version:

twine-3.4.0-py3-none-any.whl

$ twine --version
twine version 3.4.0 (importlib_metadata: 3.7.3, packaging: 20.9, pkginfo:
1.7.0, requests: 2.25.1, requests-toolbelt: 0.9.1, tqdm: 4.59.0)
  1. Package repository target
  • The package's PKG-INFO file
Metadata-Version: 2.1
Name: mosaik.SimConfig
Version: 0.1.0rc20210315231626
Summary: The missing implementation of mosaik's SimConfig dictionary.
Home-page: https://gitlab.com/offis.energy/mosaik/mosaik.simconfig
Author: Bengt Lüers
Author-email: bengt.lueers@gmail.com
License: UNKNOWN
Description: # mosaik SimConfig
        
        The missing implementation of mosaik's SimConfig dictionary.
        
        [...]

Platform: UNKNOWN
Classifier: License :: OSI Approved :: GNU Lesser General Public License v2 (LGPLv2)
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: Implementation :: PyPy
Description-Content-Type: text/markdown

.pypirc file

cat: /root/.pypirc: No such file or directory

The Issue

When I run twine, I get an error:

$ twine upload --verbose dist/*
Using configuration from /root/.pypirc
Uploading distributions to https://upload.pypi.org/legacy/
  dist/mosaik.SimConfig-0.1.0rc20210315223057-py2.py3-none-any.whl (14.0 KB)
username set by command options
password set by command options
username: Bengt
password: <hidden>
Uploading mosaik.SimConfig-0.1.0rc20210315223057-py2.py3-none-any.whl
100%|██████████| 20.7k/20.7k [00:00<00:00, 31.8kB/s]
Content received from server:
<html>
 <head>
  <title>400 Start filename for 'mosaik-simconfig' with 'mosaik-simconfig'.</title>
 </head>
 <body>
  <h1>400 Start filename for 'mosaik-simconfig' with 'mosaik-simconfig'.</h1>
  The server could not comply with the request since it is either malformed or otherwise incorrect.<br/><br/>
Start filename for &#x27;mosaik-simconfig&#x27; with &#x27;mosaik-simconfig&#x27;.
 </body>
</html>
HTTPError: 400 Bad Request from https://upload.pypi.org/legacy/
Start filename for 'mosaik-simconfig' with 'mosaik-simconfig'.

Source: https://gitlab.com/offis.energy/mosaik/mosaik.simconfig/-/jobs/1100239627

Additional occurrences:

https://gitlab.com/mosaik.hdf5-storage/mosaik.hdf5-storage/-/jobs/1100176050
https://gitlab.com/offis.energy/mosaik/mosaik.scenario-tools/-/jobs/1100175841
https://gitlab.com/offis.energy/mosaik/mosaik.eid/-/jobs/1100176922
https://gitlab.com/offis.energy/mosaik/mosaik.householdsim_semver/-/jobs/1100175569

Steps to Reproduce

        -   python setup.py bdist_wheel
        -   pip install --quiet --upgrade twine
        -   twine --version  # required for reporting bugs
        -   twine upload --verbose dist/*

Source: https://gitlab.com/offis.energy/mosaik/mosaik.simconfig/-/blob/master/.gitlab-ci.yml#L88

@godlygeek
Copy link

The breaking change appears to be 0bd26af#diff-30ee60b48548622c54dfa266e095021f403ad23ab4848c74b2deed5df329ae00R62

>>> import pkg_resources
>>> pkg_resources.safe_name("mosaik.SimConfig")
'mosaik.SimConfig'
>>> import packaging.utils
>>> packaging.utils.canonicalize_name("mosaik.SimConfig")
'mosaik-simconfig'

@kgreav
Copy link

kgreav commented Mar 15, 2021

Same problem here, just broke my build. Downgrading to 3.3.0 as a workaround solves the issue.

@bhrutledge
Copy link
Contributor

@jaraco Are you able to look into this?

@bhrutledge
Copy link
Contributor

bhrutledge commented Mar 16, 2021

In the meantime, if someone could open a PR with a failing test that reproduces this (maybe in test_package.py), it would be greatly appreciated.

pablogsal added a commit to pablogsal/twine that referenced this issue Mar 16, 2021
Commit 0bd26af introduced a regression
that was causing some namespace packages with dots in them fail to
upload to PyPI. The reason is because
`packaging.utils.canonicalize_name()` is not equivalent to
`pkg_resources.safe_name()` as the former transforms the "." into "-",
while the later only does it for non-alphanumeric/. characters.

Closes: pypa#743
pablogsal added a commit to pablogsal/twine that referenced this issue Mar 16, 2021
Commit 0bd26af introduced a regression
that was causing some namespace packages with dots in them fail to
upload to PyPI. The reason is because
`packaging.utils.canonicalize_name()` is not equivalent to
`pkg_resources.safe_name()` as the former transforms the "." into "-",
while the later only does it for non-alphanumeric/. characters.

Closes: pypa#743
pablogsal added a commit to pablogsal/twine that referenced this issue Mar 16, 2021
Commit 0bd26af introduced a regression
that was causing some namespace packages with dots in them fail to
upload to PyPI. The reason is because
`packaging.utils.canonicalize_name()` is not equivalent to
`pkg_resources.safe_name()` as the former transforms the "." into "-",
while the later only does it for non-alphanumeric/. characters.

Closes: pypa#743
pablogsal added a commit to pablogsal/twine that referenced this issue Mar 16, 2021
Commit 0bd26af introduced a regression
that was causing some namespace packages with dots in them fail to
upload to PyPI. The reason is because
`packaging.utils.canonicalize_name()` is not equivalent to
`pkg_resources.safe_name()` as the former transforms the "." into "-",
while the later only does it for non-alphanumeric/. characters.

Closes: pypa#743
pablogsal added a commit to pablogsal/twine that referenced this issue Mar 16, 2021
Commit 0bd26af introduced a regression
that was causing some namespace packages with dots in them fail to
upload to PyPI. The reason is because
`packaging.utils.canonicalize_name()` is not equivalent to
`pkg_resources.safe_name()` as the former transforms the "." into "-",
while the later only does it for non-alphanumeric/. characters.

Closes: pypa#743
pablogsal added a commit to pablogsal/twine that referenced this issue Mar 16, 2021
Commit 0bd26af introduced a regression
that was causing some namespace packages with dots in them fail to
upload to PyPI. The reason is because
`packaging.utils.canonicalize_name()` is not equivalent to
`pkg_resources.safe_name()` as the former transforms the "." into "-",
while the later only does it for non-alphanumeric/. characters.

Closes: pypa#743
pablogsal added a commit to pablogsal/twine that referenced this issue Mar 16, 2021
Commit 0bd26af introduced a regression
that was causing some namespace packages with dots in them fail to
upload to PyPI. The reason is because
`packaging.utils.canonicalize_name()` is not equivalent to
`pkg_resources.safe_name()` as the former transforms the "." into "-",
while the later only does it for non-alphanumeric/. characters.

Closes: pypa#743
@pablogsal
Copy link
Contributor

pablogsal commented Mar 16, 2021

Ah, I forgot that I cand easily DDoS an issue by mentioning the issue number in the commit message :)

In the meantime, if someone could open a PR with a failing test that reproduces this (maybe in test_package.py), it would be greatly appreciated.

Opened PR here: #745

@pradyunsg
Copy link
Member

pradyunsg commented Mar 16, 2021

Looks like PyPI wasn't ready for this change, since it also uses pkg_resources.safe_name: https://github.com/pypa/warehouse/blob/ea21a674b6165d61340ae976da67a08729902f6a/warehouse/forklift/legacy.py#L1193

I guess a decent plan would be to merge #745, and then start allowing both canonical / safe_name on warehouse's end point and then revert #745.

/cc @ewdurbin @di

@bhrutledge
Copy link
Contributor

It's clear that packaging.utils.canonicalize_name and pkg_resources.safe_name do different things, but it's not clear to me that canonicalize_name is the "correct" way. If it is, and if we follow @pradyunsg's suggestion, would that result in changing/breaking existing project's names?

@pradyunsg
Copy link
Member

I'm pretty sure we want to get to the point where we're using canonicalize_name everywhere -- it's backed by a PEP (503, I think?) -- but safe_name is purely implementation defined. AFAIK this won't cause any changes to the served index pages, because the names served by the index are normalised using canonicalize_name (or something equivalent).

@ewdurbin
Copy link
Member

I agree that 0bd26af#diff-30ee60b48548622c54dfa266e095021f403ad23ab4848c74b2deed5df329ae00L61-R62 is an invalid change.

As a utility it seems that pkg_resources.safe_name is generating a valid distribution name per PEP 508 from an arbitrary string.

packaging.utils.canonicalize_name certainly has a different purpose per PEP 503 which is to normalize names for the Simple Index and PyPI URLS only.

The possible result of packaging.utils.canonicalize_name is a subset of valid project names, which seems to be what is causing the issue here. An otherwise valid distribution name passed through this function may be modified.

@ewdurbin
Copy link
Member

I'm not positive warehouse's behavior is correct here either. Seems we might need a validator utility in packaging that ensures a given distribution name is valid? Currently warehouse defines the regex from PEP 508 inline at https://github.com/pypa/warehouse/blob/ea21a674b6165d61340ae976da67a08729902f6a/warehouse/forklift/legacy.py#L199-L201

@ewdurbin
Copy link
Member

It may make sense for warehouse to use canonicalize name in the comparison noted by @pradyunsg at https://github.com/pypa/warehouse/blob/ea21a674b6165d61340ae976da67a08729902f6a/warehouse/forklift/legacy.py#L1192-L1193, however I'm concerned that this might paper over the impedance mismatch noted in #743 (comment)

@pradyunsg
Copy link
Member

pradyunsg commented Mar 16, 2021

Hmm... I wonder how disruptive it would be to require the incoming names (going "into" PyPI) to be canonical as PEP 503 defines.

The benefit is that PyPI (and publishing tooling) would only use PEP 503 names (other than for validation / legacy projects) and pip (and other consumption tooling) would need to deal with PEP 508 stuff.

@ewdurbin
Copy link
Member

ewdurbin commented Mar 16, 2021

That would certainly break valid distribution names like namespace.foobar. I don't think it's a great idea. Remember, 503 is about the Simple Index specification, not metadata.

@ewdurbin
Copy link
Member

I guess one final thought here specifically is that what we're calling namespaces here should probably be spoken as "pseudonamespaces" since there is really no implication to them in the project name currently.

Actual namespace support is something that has been discussed and is likely to see fruition in the near future.

bhrutledge added a commit that referenced this issue Mar 17, 2021
…ity (#745)

* Fix the "safe_name" attribute of PackageFile for backwards compatibility

Commit 0bd26af introduced a regression
that was causing some namespace packages with dots in them fail to
upload to PyPI. The reason is because
`packaging.utils.canonicalize_name()` is not equivalent to
`pkg_resources.safe_name()` as the former transforms the "." into "-",
while the later only does it for non-alphanumeric/. characters.

Closes: #743

* Update docstring of safe_name()

Co-authored-by: Brian Rutledge <brian@bhrutledge.com>

Co-authored-by: Brian Rutledge <brian@bhrutledge.com>
@bhrutledge
Copy link
Contributor

The fix has been released: https://pypi.org/project/twine/3.4.1/

Thanks @pablogsal for the quick fix, and @Bengt and @godlygeek for the initial report.

@Bengt
Copy link
Author

Bengt commented Mar 17, 2021

I can confirm that 3.4.1 fixes all issues I was having. Thanks for all the discussion and the quick fix, everybody.

@sigmavirus24
Copy link
Member

@Bengt thanks for the excellent report that made it easy to track down and fix this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants