
Consistently use get_requirement to cache Requirement construction #12663

Open · wants to merge 3 commits into main from cache-Requirements-construction
Conversation

notatallshaw (Contributor) commented Apr 30, 2024

I have been investigating other easy performance improvements for cases with long backtracking times, so I profiled pip install --dry-run --ignore-installed apache-airflow[all] with all packages cached and with #12657 already applied.
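
The PR doesn't state the exact profiling tooling, so as a hypothetical sketch: a run like this under cProfile produces stats that a call-graph renderer (e.g. gprof2dot) can visualize. Note that pip._internal is internal API and may change.

    import cProfile
    from pip._internal.cli.main import main as pip_main  # internal API, may change

    # Run the resolution in-process under the profiler and dump stats
    # that a call-graph tool can render.
    profiler = cProfile.Profile()
    profiler.runcall(
        pip_main,
        ["install", "--dry-run", "--ignore-installed", "apache-airflow[all]"],
    )
    profiler.dump_stats("airflow.prof")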

[Before call graph: airflow-before-cache]

In this scenario, _parseNoCache takes almost 28% of the run time during profiling, largely because Requirement objects are constructed many times in a loop.

With Requirement construction cached, apache-airflow[all] went from 174 seconds to 143 seconds, and _parseNoCache dropped to 3% of the run time during profiling:

[After call graph: airflow-all-cache-requirements]

Further, for the highly pathological requirements "urllib3>2" "boto3<1.20", total time went from 559 seconds to 117 seconds.

Testing very simple requirements such as requests and pandas, performance appeared to be about 1-3% better, but within the margin of error. However, I did observe that for these requirements all Requirement objects were constructed at least twice: for requests the final CacheInfo was CacheInfo(hits=11, misses=11, maxsize=None, currsize=11), and for pandas it was CacheInfo(hits=86, misses=86, maxsize=None, currsize=86). So it does make sense there would be a small saving even for requirements that don't backtrack.
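
(For readers unfamiliar with these numbers: functools.lru_cache exposes them via .cache_info() on the wrapped callable. A self-contained illustration, using the standalone packaging library rather than pip's vendored copy:)

    import functools
    from packaging.requirements import Requirement

    # Wrap the constructor in a cache, replay some requirement strings,
    # then read the hit/miss statistics off the wrapper.
    cached_requirement = functools.lru_cache(maxsize=None)(Requirement)

    for s in ["requests", "requests", "idna>=2.5"]:  # stand-in resolver traffic
        cached_requirement(s)

    print(cached_requirement.cache_info())
    # CacheInfo(hits=1, misses=2, maxsize=None, currsize=2)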

I set maxsize to 1024 preemptively to head off questions about unbounded memory usage. Of the requirements I tested, only apache-airflow[all] exceeded the cache size, so I tested a few data points:

  • 128 maxsize: CacheInfo(hits=2176, misses=60088, maxsize=128, currsize=128)
  • 256 maxsize: CacheInfo(hits=2402, misses=59862, maxsize=256, currsize=256)
  • 512 maxsize: CacheInfo(hits=55408, misses=6856, maxsize=512, currsize=512)
  • 1024 maxsize: CacheInfo(hits=56087, misses=6127, maxsize=1024, currsize=1024)
  • 2048 maxsize: CacheInfo(hits=56307, misses=5906, maxsize=2048, currsize=2048)
  • None maxsize: CacheInfo(hits=59185, misses=3079, maxsize=None, currsize=3079)

It appears there is some threshold, dependent on the dependency graph, above which the hit rate goes up significantly. Of all the graphs I tested, 1024 was firmly above that threshold.

For apache-airflow[all], the time difference between a maxsize of 1024 and None on my machine was 146 seconds vs. 143 seconds.
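
To replay this kind of comparison, a hypothetical harness along these lines works, with requirement_strings standing in for whatever strings the resolver actually parses during a run:

    import functools
    from packaging.requirements import Requirement

    def hit_rate(maxsize, requirement_strings):
        # Fresh cache of the given size; replay the strings through it.
        cached = functools.lru_cache(maxsize=maxsize)(Requirement)
        for s in requirement_strings:
            cached(s)
        info = cached.cache_info()
        return info.hits / (info.hits + info.misses)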

Happy to take a different approach if someone spots one.

@notatallshaw changed the title Cache Requirements construction Cache Requirement construction Apr 30, 2024
@ichard26 self-requested a review April 30, 2024 14:14
ichard26 (Member) left a comment


Disclaimer: I am not a pip committer; please do not take any suggestions in my review as binding.

A few things worth highlighting:

  • There's already a caching wrapper function for Requirement construction in the utilities; let's use it instead of creating another one (see the usage sketch below). The maxsize argument should probably be bumped, however. 1

    @functools.lru_cache(maxsize=512)
    def get_requirement(req_string: str) -> Requirement:
        """Construct a packaging.Requirement object with caching"""
        # Parsing requirement strings is expensive, and is also expected to happen
        # with a low diversity of different arguments (at least relative to the
        # number constructed). This method adds a cache to requirement object
        # creation to minimize repeated parsing of the same string to construct
        # equivalent Requirement objects.
        return Requirement(req_string)

  • Packaging 22.0+ has a hand-rolled parser (replacing pyparsing) which is apparently 95% faster. Will the caching remain beneficial once Upgrade vendored packaging lib #12300 lands? (To test this, the easiest way is likely to import Requirement from your pip dev environment and install the latest version of packaging, bypassing the vendored copy.)

I'd wager that it's still worth the caching, but it would be good to know by how much after packaging is bumped.
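
To make the suggestion concrete, a hedged sketch of what call sites look like after the change (assuming the helper's home of pip._internal.utils.packaging):

    from pip._internal.utils.packaging import get_requirement

    # Before: req = Requirement("requests>=2.0")  # parsed on every call
    req = get_requirement("requests>=2.0")  # parsed once, then served from cache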

Footnotes

  1. It is used globally, so there would likely be more cache pressure.

@ichard26 added the "type: performance" (Commands take too long to run) label Apr 30, 2024
notatallshaw (Contributor, Author) commented Apr 30, 2024

Awesome, I will pull that PR branch and reprofile on it and see if this still takes a significant amount of time.

  • There's already a caching wrapper function for Requirement construction in the utilities, let's use it instead of creating another one. The maxsize argument should probably be bumped, however. 1

I didn't realize that existed; I can also profile switching all Requirement construction to it and checking cache pressure under different scenarios.

@notatallshaw force-pushed the cache-Requirements-construction branch from 0c89f1d to 172b0a7 on April 30, 2024 22:51
@notatallshaw changed the title Cache Requirement construction Consistently use get_requirement to cache Requirement construction Apr 30, 2024
@notatallshaw force-pushed the cache-Requirements-construction branch from 172b0a7 to 3ccb656 on April 30, 2024 22:53
notatallshaw (Contributor, Author) commented Apr 30, 2024

There's already a caching wrapper function for Requirement construction in the utilities, let's use it instead of creating another one

Fixed: I made usage consistent across the pip codebase and bumped maxsize. Testing cache pressure:

  • apache-airflow[all]: CacheInfo(hits=78198, misses=5803, maxsize=2048, currsize=2048)
  • "urllib3>2" "boto3<1.20": CacheInfo(hits=786557, misses=1783, maxsize=2048, currsize=1588)
  • pandas: CacheInfo(hits=91, misses=86, maxsize=2048, currsize=86)
  • requests: CacheInfo(hits=17, misses=24, maxsize=2048, currsize=20)

Packaging 22.0+ has a hand-rolled parser (replacing pyparsing) which is apparently 95% faster.

Indeed, I found that parsing is significantly faster, but due to the O(n^2) nature of resolving, the performance improvements were still noticeable, at least with pathological examples.

With #12300 I found apache-airflow[all] improved from 141 seconds to 137 seconds, and "urllib3>2" "boto3<1.20" improved from 146 seconds to 112 seconds.

I still saw very small improvements for simple requirements that didn't involve backtracking, but instead of running dozens of times I had to run hundreds of times for the statistics to approach significance.

So this PR is a big improvement if #12300 doesn't land, and still a significant improvement for pathological requirements if #12300 does land.

@notatallshaw force-pushed the cache-Requirements-construction branch from ed2c8db to 34bcf02 on April 30, 2024 23:54
ichard26 (Member) left a comment


Unfortunately, Requirement instance attributes are mutable, so mayhem is possible if a requirement is ever modified in place, but it appears that the codebase does not do that. It'll just have to be something to be careful of in future code reviews.

Thanks @notatallshaw for the in-depth investigation here!

@ichard26 requested a review from uranusjr May 1, 2024 20:25
pfmoore (Member) commented May 1, 2024

I made the same comment recently on another PR. The Requirement object is a huge chunk of our technical debt, unfortunately. One thing that would be nice is to enforce the immutability that we're starting to rely on - but how we do that without a performance cost isn't clear to me (and adding a performance cost to allow us to implement a performance improvement without risk is clearly silly...)

notatallshaw (Contributor, Author) commented May 1, 2024

but how we do that without a performance cost isn't clear to me (and adding a performance cost to allow us to implement a performance improvement without risk is clearly silly...)

I'm not sure the two actions are actually that counterproductive.

E.g., if Requirement became a frozen dataclass, it might be more costly to construct by some constant factor. But a constant factor really isn't a problem (unless it's massive); this PR reduces the number of times Requirement objects have to be constructed, at least in cases of long backtracking, by far more than a constant factor.

And #12300 likely reduces the constant cost of construction by significantly more than the increase that switching to a frozen dataclass might incur.
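
To illustrate the direction being floated (a hypothetical sketch, not pip's actual code): a frozen dataclass pays a small constant construction cost in exchange for enforced immutability and an auto-generated __hash__:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class FrozenRequirementExample:  # hypothetical; fields only mirror the idea
        name: str
        specifier: str = ""
        extras: frozenset = frozenset()

    r = FrozenRequirementExample("requests", ">=2.0")
    # r.name = "flask"  # raises dataclasses.FrozenInstanceError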

notatallshaw (Contributor, Author) commented

Unfortunately Requirement instance attributes are mutable so mayhem is possible if a requirement is ever modified in-place, but it appears that the codebase does not do that. It'll just have to be something to be careful of in future code reviews.

To be clear though, this PR doesn't introduce the assumption that Requirement won't be mutated: construction was already being cached, just not consistently across pip's codebase.

Further, Requirement has had a __hash__ method since, I believe, it was introduced. It is always precarious to mutate an object with a __hash__ method, as there's significant risk it could be used as a key in a dictionary. Partly, this is the fun of Python's "we're all consenting adults here" philosophy 😄.
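
A small demonstration of the hazard (hedged: __hash__ behaviour varies across packaging versions, which is exactly why mutating cached instances is risky):

    from functools import lru_cache
    from packaging.requirements import Requirement

    get_req = lru_cache(maxsize=None)(Requirement)

    a = get_req("requests>=2.0")
    b = get_req("requests>=2.0")
    assert a is b  # the cache hands out one shared instance

    table = {a: "seen"}
    a.extras.add("socks")  # mutating a also mutates b (same object)...
    # ...and if __hash__ incorporates the extras, lookups like table[b]
    # can now miss or raise KeyError.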

pfmoore (Member) commented May 1, 2024

Fair point - as a static typing skeptic, I should be the one arguing the "consenting adults" position 🙂 Consider my reservations withdrawn.

@notatallshaw force-pushed the cache-Requirements-construction branch from 34bcf02 to b453c38 on May 4, 2024 21:31
notatallshaw (Contributor, Author) commented

Rebased now that #12300 has landed.

pradyunsg (Member) commented

I'm mildly concerned about the number of caches we've thrown into the resolver context, especially since we have nothing to enforce mutation prevention. While we control all the pieces here, so this is theoretically "safe", we don't have anything to prevent us from mutating these objects.

We should likely look into ensuring we don't mutate these objects if they've been entered into the cache.

notatallshaw (Contributor, Author) commented

Is there a preferred way you have to prevent mutation?

I would be happy to do a follow up PR that turned the objects I've recently touched into data classes, with the appropriate flags set.

The current resolution algorithm is O(n^2), and if caches aren't used then pip can end up calling things O(n^3) times or worse. So while this is the way the resolution works, I foresee myself and others finding more places where caches improve the resolution.

pradyunsg (Member) commented

Is there a preferred way you have to prevent mutation?

None at the moment, hence the "we should look into" statement. To be clear, I am not concerned about this specific PR -- I'm on board with it; it's the broader pattern we have that I am concerned about. 😅
