Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_make_user_agent_string() failing in Debian build #870

Closed
stefanor opened this issue Feb 5, 2022 · 12 comments · Fixed by #871
Closed

test_make_user_agent_string() failing in Debian build #870

stefanor opened this issue Feb 5, 2022 · 12 comments · Fixed by #871
Labels

Comments

@stefanor
Copy link
Contributor

stefanor commented Feb 5, 2022

Your Environment

  1. Your operating system: Debian/unstable
  2. Version of python you are running: 3.9.10 and 3.10.2.
  3. How did you install twine? Building from the sdist.
  4. Version of twine you have installed: 3.8.0

The Issue

Please describe the issue that you are experiencing.

    def test_make_user_agent_string(default_repo):
        """Add twine and its dependencies to User-Agent session header."""
        assert "User-Agent" in default_repo.session.headers
    
        user_agent = default_repo.session.headers["User-Agent"]
        packages = (
            "twine/",
            "requests/",
            "requests-toolbelt/",
            "pkginfo/",
            "importlib_metadata/",
        )
        for p in packages:
>           assert p in user_agent
E           AssertionError: assert 'importlib_metadata/' in 'twine/3.8.0 colorama/0.4.4 importlib-metadata/4.6.4 keyring/23.5.0 pkginfo/1.8.2 readme-renderer/24.0 requests-toolbelt/0.9.1 requests/2.25.1 rfc3986/1.5.0 tqdm/4.57.0 urllib3/1.26.5 CPython/3.9.10'

On a sailboat with terrible Internet connectivity right now, I don't have the bandwidth to dig around. But:

Not sure exactly what changed here to cause importlib-metadata to not become importlib_metadata It seems that the change to list_dependencies_and_versions() in #858 expected importlib_metadata and that isn't being delivered in this run.

Maybe the difference is that in my environment importlib_metadata's metadata is in egg-info form, not dist-info form.

Steps to Reproduce

If the issue is predictable and consistently reproducible, please list the steps here.

  1. Run the tests with python3-importlib-metadata from the Debian archive.
@bhrutledge
Copy link
Contributor

@stefanor It's not clear to me why you're trying to run the tests in the environment that you described, or why you've even got an environment set up that way. However, it does sound like it's deviating from what I consider to be Twine's "standard" development/test environment, i.e. virtual environments, managed by tox, where Twine and its dependencies are installed via pip install -e .. Unless there's a compelling reason, I don't think we'll do much to support other setups.

That said, I'd be surprised if #858 is the root cause, because it doesn't change anything related to the version of importlib_metadata.

@sigmavirus24
Copy link
Member

I suspect that this is because importlib_metadata isn't going through the right processes to normalize the module name in PKGINFO. We rely on that to generate the user-agent string. If that isn't generated properly, I don't think there's anything for us to do.

@stefanor
Copy link
Contributor Author

stefanor commented Feb 7, 2022

It's not clear to me why you're trying to run the tests in the environment that you described

We (Debian) like to run the test-suites of packages when we build them (and when any of their dependencies change), so that we can be sure that we're shipping something that's working.

or why you've even got an environment set up that way

importlib_metadata is currently built with setup.py in Debian. That will probably change in the future, but that's how things stand right now.

That said, I'd be surprised if #858 is the root cause

Now that I'm off an island with better cell coverage, and can play around, it looks simpler than I thought:

$ python3 -m venv testve
$ testve/bin/python -m pip install wheel
$ testve/bin/python -m pip install twine
...
Successfully installed Pygments-2.11.2 SecretStorage-3.3.1 bleach-4.1.0 certifi-2021.10.8 cffi-1.15.0 charset-normalizer-2.0.11 colorama-0.4.4 cryptography-36.0.1 docutils-0.18.1 idna-3.3 importlib-metadata-4.10.1 jeepney-0.7.1 keyring-23.5.0 packaging-21.3 pkginfo-1.8.2 pycparser-2.21 pyparsing-3.0.7 readme-renderer-32.0 requests-2.27.1 requests-toolbelt-0.9.1 rfc3986-2.0.0 six-1.16.0 tqdm-4.62.3 twine-3.8.0 urllib3-1.26.8 webencodings-0.5.1 zipp-3.7.0
$ testve/bin/python -m twine --version
twine version 3.8.0 (pkginfo: 1.8.2, readme-renderer: 32.0, requests: 2.27.1, requests-
toolbelt: 0.9.1, urllib3: 1.26.8, tqdm: 4.62.3, importlib-metadata: 4.10.1, keyring: 23.5.0,
rfc3986: 2.0.0, colorama: 0.4.4)
$ testve/bin/python -m pip install twine==3.7.1
$ testve/bin/python -m twine --version
twine version 3.7.1 (importlib_metadata: 4.10.1, pkginfo: 1.8.2, requests: 2.27.1, requests-
toolbelt: 0.9.1, tqdm: 4.62.3)

That's a pretty standard setup, and the name changed, I'd assume that's #858.

How much do you care about the User-Agent format?

@bhrutledge
Copy link
Contributor

Ah, I see that I misspoke; #858 clearly changes from an explicit list of package names to using Twine's metadata, which could result in the observed behavior (which I was able to reproduce from the previous comment). Given that, I don't understand why the test is passing in Twine's CI, and I don't know how to reproduce the failing test.

How much do you care about the User-Agent format?

Not very much, I think.

@sigmavirus24
Copy link
Member

How much do you care about the User-Agent format?

PyPI can parse it and use it for the purpose of debugging why a request went bad. Since PyPI can parse both importlib_metadata and it's normalized importlib-metadata as the same thing by normalizing them, it shouldn't matter. I doubt other barely supported package repositories care about the U-A string.

@sigmavirus24
Copy link
Member

I'll also point out that we were sending a static, constrained list of dependency names and versions. Now we seem to be including things that aren't strictly necessary to send which means that from a privacy stand-point we're potentially sending more information than a user might be comfortable (given users are probably already uncomfortable that we're sending packages and versions in the U-A).

Would it be plausible to filter what we send to PyPI to what it would need to understand why a thing failed? I don't imagine PyPI should care that we're using readme-renderer, bleach, SecretStorage, keyring, colorama, etc. It might care about requests, requests-toolbelt, urllib3, pkginfo, and importlib-metadata though

@sigmavirus24
Copy link
Member

We probably also want to see if PyPI even uses any of this and if it's worth sending in the U-A. When I added it, I had talked to Donald about the potential value but I don't know if it's ever been realized

@stefanor
Copy link
Contributor Author

stefanor commented Feb 9, 2022

After a little more playing around:

From what I can tell, the difference in behaviour is that when tox runs it has the twine sources in the current working directory, and prefers those to the twine module installed in the venv.
That means it is reading twine's .egg-info instead of its .dist-info, and getting the non-normalized module name.

So, I'd suggest that the test is checking for the wrong string, but as noted above, this isn't a big deal.

For the moment, I'm carrying a patch in Debian to change the checked string.

@bhrutledge
Copy link
Contributor

I'll also point out that we were sending a static, constrained list of dependency names and versions.
Would it be plausible to filter what we send to PyPI to what it would need to understand why a thing failed?
We probably also want to see if PyPI even uses any of this and if it's worth sending in the U-A.

This all sounds reasonable to me, so I'm reopening this issue.

@bhrutledge bhrutledge reopened this Feb 9, 2022
@bhrutledge bhrutledge added the bug label Feb 9, 2022
stefanor added a commit to stefanor/twine that referenced this issue Feb 9, 2022
stefanor added a commit to stefanor/twine that referenced this issue Feb 9, 2022
stefanor added a commit to stefanor/twine that referenced this issue Feb 9, 2022
@stratakis
Copy link

We also got the same issue on Fedora while trying to build the latest version as an rpm and running the tests.

stefanor added a commit to stefanor/twine that referenced this issue Feb 27, 2022
Revert back to a manual list of packages reported in the User-Agent
string and --version. Reverting the part of pypa#858 that switched to
parsing twine's requires.

See: pypa#870
@bhrutledge
Copy link
Contributor

bhrutledge commented Feb 27, 2022

We probably also want to see if PyPI even uses any of this and if it's worth sending in the U-A. When I added it, I had talked to Donald about the potential value but I don't know if it's ever been realized

@di Before I go searching the warehouse code, do you have a quick answer to whether or not PyPI uses the packages and versions that Twine is sending in the User-Agent string?

@bhrutledge
Copy link
Contributor

A code search in warehouse seems to indicate that User-Agent is just used to set the uploaded_via field on the File and Release models, which only seems to be used in the metadata for BigQuery.

So, it seems fine to remove the dependencies from the User-Agent string:

dependencies = cli.list_dependencies_and_versions()
user_agent_string = (
user_agent.UserAgentBuilder("twine", twine.__version__)
.include_extras(dependencies)
.include_implementation()
.build()
)

bhrutledge added a commit that referenced this issue Feb 28, 2022
* Canonicalize dependencies

Closes: #870

* Explicitly select packages for User-Agent inclusion

Revert back to a manual list of packages reported in the User-Agent
string and --version. Reverting the part of #858 that switched to
parsing twine's requires.

See: #870

* Update package list

* Remove dependencies from User-Agent

* Add changelog entry

Co-authored-by: Brian Rutledge <brian@bhrutledge.com>
bhrutledge added a commit that referenced this issue Mar 2, 2022
* Canonicalize dependencies

Closes: #870

* Explicitly select packages for User-Agent inclusion

Revert back to a manual list of packages reported in the User-Agent
string and --version. Reverting the part of #858 that switched to
parsing twine's requires.

See: #870

* Update package list

* Remove dependencies from User-Agent

* Add changelog entry

Co-authored-by: Brian Rutledge <brian@bhrutledge.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants