
pip-compile doesn't cache local requirements correctly. #1545

Closed
mattdee123 opened this issue Dec 17, 2021 · 6 comments
Labels
cache: Related to dependency cache
resolver: Related to dependency resolver

Comments

@mattdee123

I have a large repository with multiple .in files that depend on each other via -r. There is a shared library, included with -e, which a number of the sub-files pull in. This makes compiling the files take a long time.

This seems to be due to the cache in _get_ireq_with_name being keyed on the full ireq object. Although every include refers to the same library, the ireqs carry some different fields (for example, which file they came from), so they compare unequal and each one misses the cache. I suspect that keying the cache on ireq.local_file_path instead would speed things up.
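A minimal sketch of the idea, using a hypothetical stand-in class rather than pip's real InstallRequirement: instances pointing at the same local package compare unequal, so an object-keyed cache misses every time, while keying on local_file_path prepares the metadata only once:

```python
from functools import lru_cache

# Hypothetical stand-in for pip's InstallRequirement (NOT the real class):
# two instances can point at the same local package yet compare unequal,
# because fields such as the including file differ.
class FakeIreq:
    def __init__(self, local_file_path, comes_from):
        self.local_file_path = local_file_path
        self.comes_from = comes_from

prepare_calls = 0

def prepare_metadata(path):
    # Stands in for the expensive "Preparing metadata (setup.py)" step.
    global prepare_calls
    prepare_calls += 1
    return "pip-tools"

@lru_cache(maxsize=None)
def get_name_by_path(local_file_path):
    # Keying on the path collapses every include of the same package
    # into a single metadata preparation.
    return prepare_metadata(local_file_path)

ireqs = [FakeIreq("./pip-tools", f"req.in (line {i})") for i in range(1, 6)]
names = [get_name_by_path(ireq.local_file_path) for ireq in ireqs]

print(prepare_calls)  # -> 1: metadata prepared once instead of five times
```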

Environment Versions

  1. OS Type: macOS 11.6
  2. Python version: Python 3.8.10
  3. pip version: pip 21.3.1
  4. pip-tools version: 6.4.1

Steps to replicate

  1. Create a requirements file req.in in the parent directory of this repository (or of any local package; this one is just a convenient example):
-e file:./pip-tools
-e file:./pip-tools
-e file:./pip-tools
-e file:./pip-tools
-e file:./pip-tools

(Obviously this kind of file would never be written by hand, but the same repetition can arise when multiple files included via -r each pull in the same package.)
2. Run pip-compile --verbose req.in

Expected result

This should run fairly quickly

Actual result

This takes longer than I'd expect, and with the --verbose flag you can see that each round spends most of its time printing output like:


                        ROUND 1
Obtaining file:///./pip-tools (from -r req.in (line 2))
  Preparing metadata (setup.py) ... done
Obtaining file:///./pip-tools (from -r req.in (line 3))
  Preparing metadata (setup.py) ... done
Obtaining file:///./pip-tools (from -r req.in (line 4))
  Preparing metadata (setup.py) ... done
Obtaining file:///./pip-tools (from -r req.in (line 1))
  Preparing metadata (setup.py) ... done
Obtaining file:///./pip-tools (from -r req.in (line 5))
  Preparing metadata (setup.py) ... done

This would be much faster if it didn't have to repeat all this work for the same package. That should be possible, since it happens only inside _get_ireq_with_name to determine the package's name.

Happy to submit a PR using ireq.local_file_path as a cache key, though I'm not sure if that's correct.

@richafrank
Contributor

Thanks @mattdee123. FWIW, #1519, which is awaiting review, replaces _get_ireq_with_name with a more robust solution. It looks like it improves this caching issue as well: ROUND 1 still has the same output you pasted, but subsequent rounds no longer do. Certainly nice from the speed perspective, though I haven't looked into whether that's a feature or a bug of the change...

@mattdee123
Author

Nice, that sounds like it'll help somewhat, though having to load the file multiple times in ROUND 1 is still slower than should technically be required.

It feels to me like a fully correct solution would be to add a layer of caching somewhere along the call path.

Locally, I've tried the following monkeypatch, which adds a cache around piptools.repositories.pypi.PyPIRepository.get_dependencies, and it seems to work just fine, though of course I'm not sure whether there are edge cases where it wouldn't:

from piptools.repositories import pypi

unpatched_get_dependencies = pypi.PyPIRepository.get_dependencies
local_dep_cache = {}

def patched_get_dependencies(self, ireq):
    cached = local_dep_cache.get(ireq.local_file_path)
    if cached is not None:
        # get_dependencies both returns the dependencies and populates data on
        # the input ireq; we cache the "prepared" object and copy its fields
        # over to replicate that side effect.
        dependencies, prepared_ireq = cached
        for k, v in ireq.__dict__.items():
            if not v:  # only fill in fields that are still unset/falsy
                ireq.__dict__[k] = prepared_ireq.__dict__[k]
        return dependencies
    result = unpatched_get_dependencies(self, ireq)
    if ireq.local_file_path:
        local_dep_cache[ireq.local_file_path] = (result, ireq)
    return result

pypi.PyPIRepository.get_dependencies = patched_get_dependencies

@richafrank added the cache label on Jan 8, 2022
@deifactor

This is biting me as well; we have a bunch of local packages that use pyproject.toml, and it makes building painfully slow. I'm resorting to building wheels and then editing the generated requirements.txt to patch the -e packages back in.

@AndydeCleyre
Contributor

You may wish to try #1539, adding --resolver=backtracking or setting PIP_TOOLS_RESOLVER=backtracking.
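For reference, the two ways of selecting the backtracking resolver mentioned above (a CLI/config sketch; assumes a pip-tools version that includes #1539):

```shell
# One-off, via the CLI flag:
pip-compile --resolver=backtracking req.in

# Or persistently, via the environment variable:
export PIP_TOOLS_RESOLVER=backtracking
pip-compile req.in
```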

@AndydeCleyre added the resolver label on Apr 28, 2022
@deifactor

@AndydeCleyre That works great! Hopefully that lands soon.

@atugushev
Member

This has been fixed in #1539 with the backtracking resolver; try pip-compile --resolver backtracking. The resolver was released as part of pip-tools v6.8.0. Please let us know if it doesn't resolve your issue. Thanks!
