Backtracking tries to download all versions of package #2044

Open
mireq opened this issue Jan 13, 2024 · 5 comments
Labels
docs Documentation related question User question resolver Related to dependency resolver support User support

Comments

@mireq

mireq commented Jan 13, 2024

Let's assume I already have this requirements.txt file:

urllib3==2.1.0

Now I add boto3 to requirements.in:

urllib3
boto3

Now the command pip-compile --resolver=backtracking -v requirements.in tries to download the metadata file, or the whole package file, for every version (when I use the PyPI servers directly rather than a caching server on my local network):

pip-compile --resolver=backtracking -v requirements.in 
Using indexes:
  https://pypi.org/simple

                          ROUND 1                           
  Collecting urllib3 (from -r requirements.in (line 1))
    Obtaining dependency information for urllib3 from https://files.pythonhosted.org/packages/96/94/c31f58c7a7f470d5665935262ebd7455c7e4c7782eb525658d3dbf4b9403/urllib3-2.1.0-py3-none-any.whl.metadata
    Using cached urllib3-2.1.0-py3-none-any.whl.metadata (6.4 kB)
  Collecting boto3 (from -r requirements.in (line 2))
    Obtaining dependency information for boto3 from https://files.pythonhosted.org/packages/e3/f7/93a4ba1cd2cc4ee95f871b0890e4ed60e52365110a074e7265279750a736/boto3-1.34.18-py3-none-any.whl.metadata
    Using cached boto3-1.34.18-py3-none-any.whl.metadata (6.6 kB)
  Collecting botocore<1.35.0,>=1.34.18 (from boto3->-r requirements.in (line 2))
    Obtaining dependency information for botocore<1.35.0,>=1.34.18 from https://files.pythonhosted.org/packages/ec/79/cc5e52bfc3cf7c26ba452c348fe6a765a888730b692d783e64a175243572/botocore-1.34.18-py3-none-any.whl.metadata
    Using cached botocore-1.34.18-py3-none-any.whl.metadata (5.6 kB)
  Collecting jmespath<2.0.0,>=0.7.1 (from boto3->-r requirements.in (line 2))
    Using cached jmespath-1.0.1-py3-none-any.whl (20 kB)
  Collecting s3transfer<0.11.0,>=0.10.0 (from boto3->-r requirements.in (line 2))
    Obtaining dependency information for s3transfer<0.11.0,>=0.10.0 from https://files.pythonhosted.org/packages/12/bb/7e7912e18cd558e7880d9b58ffc57300b2c28ffba9882b3a54ba5ce3ebc4/s3transfer-0.10.0-py3-none-any.whl.metadata
    Using cached s3transfer-0.10.0-py3-none-any.whl.metadata (1.7 kB)
  Collecting python-dateutil<3.0.0,>=2.1 (from botocore<1.35.0,>=1.34.18->boto3->-r requirements.in (line 2))
    Using cached python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
  INFO: pip is looking at multiple versions of botocore to determine which version is compatible with other requirements. This could take a while.
  Collecting boto3 (from -r requirements.in (line 2))
    Obtaining dependency information for boto3 from https://files.pythonhosted.org/packages/3b/34/0bdbd20d688438a46f8e9255e9e9a06ef350689a05e9a7233babff554978/boto3-1.34.17-py3-none-any.whl.metadata
    Using cached boto3-1.34.17-py3-none-any.whl.metadata (6.6 kB)
  Collecting botocore<1.35.0,>=1.34.17 (from boto3->-r requirements.in (line 2))
    Obtaining dependency information for botocore<1.35.0,>=1.34.17 from https://files.pythonhosted.org/packages/4a/ed/9f3ec1754d1a444997d7d18aeaef2611a3694783b6baf062c2e74627c4b3/botocore-1.34.17-py3-none-any.whl.metadata
    Using cached botocore-1.34.17-py3-none-any.whl.metadata (5.6 kB)
  Collecting boto3 (from -r requirements.in (line 2))
    Obtaining dependency information for boto3 from https://files.pythonhosted.org/packages/8b/dc/26c1c654cb6a177fc0b7ca7f916cd61daf045a42ca091fce44906d65be9f/boto3-1.34.16-py3-none-any.whl.metadata
    Using cached boto3-1.34.16-py3-none-any.whl.metadata (6.6 kB)
  Collecting botocore<1.35.0,>=1.34.16 (from boto3->-r requirements.in (line 2))
    Obtaining dependency information for botocore<1.35.0,>=1.34.16 from https://files.pythonhosted.org/packages/6d/84/36a78ba9d992baf3ed48dc0ad2bdb711d27033de0b088d71f4cfcd698bde/botocore-1.34.16-py3-none-any.whl.metadata
    Using cached botocore-1.34.16-py3-none-any.whl.metadata (5.6 kB)
...

In this case, downloading takes hours.

From the perspective of the resolver, it is correct to try other versions.

I don't know whether this can be solved in some smart way. I could suggest not checking older versions when the installed package version is already higher than the maximum version that botocore allows (in this case), but that would not be correct.

This is more of a discussion topic than a real bug report, because I think pip-tools behaves correctly. Maybe the best solution would be to extend the pip server metadata to include all dependencies of a package in a single request, grouped into version ranges so that it is not one gigantic file. It would contain sections like '>=1.0,<1.2': {'dependencies': ....
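
To make the grouping idea concrete, here is a minimal Python sketch. Nothing in it is an existing PyPI or pip API: the package data, the dependency names (libfoo, libbar) and the function names are all made up for illustration, and the output format only loosely follows the '>=1.0,<1.2' shape suggested above (it uses inclusive bounds to keep the code short).

```python
from itertools import groupby

# Hypothetical per-version dependency lists for a made-up package.
# Versions are tuples so that they sort numerically.
PER_VERSION = {
    (1, 0, 0): ("libfoo>=1.0",),
    (1, 0, 1): ("libfoo>=1.0",),
    (1, 1, 0): ("libfoo>=1.0",),
    (1, 2, 0): ("libfoo>=2.0", "libbar"),
    (1, 2, 1): ("libfoo>=2.0", "libbar"),
}


def fmt(version):
    """Render a version tuple such as (1, 2, 0) as '1.2.0'."""
    return ".".join(str(part) for part in version)


def group_by_dependencies(per_version):
    """Collapse runs of consecutive versions that share the same dependencies."""
    ordered = sorted(per_version.items())  # oldest version first
    grouped = {}
    for deps, run in groupby(ordered, key=lambda item: item[1]):
        versions = [version for version, _ in run]
        # Inclusive range covering the first and last version of the run.
        spec = f">={fmt(versions[0])},<={fmt(versions[-1])}"
        grouped[spec] = {"dependencies": list(deps)}
    return grouped


if __name__ == "__main__":
    for spec, entry in group_by_dependencies(PER_VERSION).items():
        print(spec, entry)
```

With data like this, five per-version entries collapse into two range entries, which is the saving the proposal is after. It only works when the dependencies are known statically for every version.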

@webknjaz
Member

I sometimes add a separate file with extra constraints at the top of the input file, as in -c broken-version-constraints.txt, where I have some of the transitive deps tightened up. Maybe it'll work for you too.
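
A rough sketch of that layout, written as a small Python helper so it can be tried in a scratch directory. The helper name write_inputs is hypothetical, and the botocore bound is only an illustrative assumption, not a value from this thread. Pick whatever bound fits your own conflict.

```python
from pathlib import Path
import tempfile


def write_inputs(directory, requirements, constraints,
                 constraints_name="broken-version-constraints.txt"):
    """Write a constraints file and a requirements.in that references it."""
    directory = Path(directory)
    # The constraints file holds the tightened transitive dependencies.
    (directory / constraints_name).write_text("\n".join(constraints) + "\n")
    # The "-c <file>" line goes at the top of the input file.
    lines = [f"-c {constraints_name}", *requirements]
    in_file = directory / "requirements.in"
    in_file.write_text("\n".join(lines) + "\n")
    return in_file


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # "botocore>=1.34.0" is an illustrative assumption only.
        path = write_inputs(tmp, ["urllib3", "boto3"], ["botocore>=1.34.0"])
        print(path.read_text())
```

A lower bound on a transitive dependency limits how far back the resolver can walk, so an unsatisfiable combination should fail quickly instead of fetching metadata for every old release.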

@webknjaz webknjaz added docs Documentation related question User question support User support resolver Related to dependency resolver labels Jan 24, 2024
@mireq
Author

mireq commented Jan 27, 2024

@webknjaz I have a script that temporarily removes problematic dependencies. They are added back automatically with the correct version in the dependency build process. I can solve this problem easily when I know exactly what the problem is, but it's not always so obvious.

I have solved this problem for my case. My intent is to start a discussion about making the PyPI package registry more efficient.

Look at how fast npm can resolve packages. That is because the npm registry sends package metadata for all versions in one request. This is a good start. I would also like to see version grouping, so that if a range of versions has the same dependencies, the same list does not have to be sent for every version.

@webknjaz
Member

First of all, pip-tools is not a place to discuss how indexes work.
Second, there's package metadata that is served per-version already, it's standardized, there's a PEP. And it's implemented in Warehouse.
Finally, it's simply impossible for PyPI to know all the metadata, because sdists are dynamic in nature and may produce different dependencies in different cases. See https://dustingram.com/articles/2018/03/05/why-pypi-doesnt-know-dependencies/ for details.

You can force pip to disregard sdists using the --only-binary and --prefer-binary CLI options, which can be passed through pip-tools too. But that's about it.
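
For anyone wondering what passing those options through looks like, here is a small sketch that builds the command line in Python. It assumes pip-compile's --pip-args option and pip's --only-binary=:all: form; the function name build_pip_compile_command is made up, and the command is only printed here, not executed.

```python
import shlex


def build_pip_compile_command(in_file="requirements.in", wheels_only=True):
    """Build a pip-compile argument list, optionally restricting pip to wheels."""
    command = ["pip-compile", "--resolver=backtracking", "-v"]
    if wheels_only:
        # Forwarded to pip by pip-tools; tells pip to ignore sdists entirely.
        command.append("--pip-args=--only-binary=:all:")
    command.append(in_file)
    return command


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_pip_compile_command()))
```

Note that with sdists disabled, resolution will fail for any package that publishes no compatible wheel, so this is a trade-off rather than a general fix.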

@mireq
Author

mireq commented Jan 29, 2024

@webknjaz Yes, metadata is served per version, and this is part of the problem. npm, for example, serves all dependencies for a package in a single request. In contrast, pip-tools (pip) needs hundreds of requests just to check package dependencies (this is shown in my first message).

The second problem is something that should be eliminated in the future. It was a really bad decision to allow code execution even for mostly static metadata.

I don't know if this is the best place to discuss this, but it's a problem that directly affects pip-tools. It can't be properly fixed by pip-tools, but it is possible to reduce the backtracking depth, for example by eagerly trying lower versions of the dependents instead of just downgrading the first problematic package to any old version.

@webknjaz
Member

It's not going to be possible to eliminate building sdists. Like ever. The reasons are explained in the article. It'd simply break the ecosystem.

If you want to draft a PR with more concrete ideas, you can try. But I don't see anything actionable here.
