Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance by caching find_spec #2408

Merged

Conversation

crazybolillo
Copy link
Contributor

Certain checkers upstream on pylint like import-error heavily use find_spec. This method is IO intensive as it looks for files across several search paths to return a ModuleSpec.

Since imports across files may repeat themselves it makes sense to cache this method in order to speed up the linting process.

Local testing shows that caching reduces the total amount of calls to find_module methods (used by find_spec) by about 50%. Linting the test repository in the related issue goes from 40 seconds to 37 seconds. This was on a NVME disk and after warmup, so timing gains may be bigger on slower file systems like the one mentioned in the referenced issue.

Closes pylint-dev/pylint#9310.

Type
🔨 Refactoring

@crazybolillo
Copy link
Contributor Author

The cache feels dirty but lru_cache and its siblings force you to cache the whole thing which is impossible (lists can't be hashed) and feels redundant, if I already have info on the module requested, who cares if the search paths this time are different. I may be horrendously wrong on that last statement, but it was my thought process.

There may be room for improvement, however I think this can get the ball rolling to discuss changes/improvements.

Here is a brief review of the profiling I did.

Before PR

Timings using hyperfine

  Time (mean ± σ):     40.231 s ±  0.429 s    [User: 36.848 s, System: 3.355 s]
  Range (min … max):   39.611 s … 40.846 s    10 runs

Syscalls

  % time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 62.70    5.696665           3   1578640   1260035 newfstatat
 21.24    1.929284           2    756383           clock_gettime
  4.53    0.411450           5     72581     13450 openat
  3.31    0.300422           2    101300           read
  1.99    0.180693           3     59132           close
  1.89    0.172086           2     69186         5 lseek
  1.84    0.166910           2     62868           fstat
  1.69    0.153237           2     58172     58153 ioctl
  0.31    0.028427           3      8483           getcwd
  0.15    0.013854           2      5047           mmap
  0.15    0.013600           2      4724           munmap
  0.13    0.011465           6      1804           getdents64
  0.05    0.004886           4      1149           brk
  0.02    0.001471           6       230           write
  0.00    0.000408           7        56           mprotect
  0.00    0.000221         221         1           rename
  0.00    0.000092          18         5           getrandom
  0.00    0.000076           7        10         4 readlink
  0.00    0.000042          42         1           unlink
  0.00    0.000021           3         6           fcntl
  0.00    0.000015           7         2           prlimit64
  0.00    0.000014           7         2           gettid
  0.00    0.000010          10         1           futex
  0.00    0.000009           9         1           getpid
  0.00    0.000005           1         5           geteuid
  0.00    0.000004           0         5           getuid
  0.00    0.000002           0         3           getgid
  0.00    0.000002           0         3           getegid
  0.00    0.000002           1         2           setfsuid
  0.00    0.000002           1         2           setfsgid
  0.00    0.000001           0        67           rt_sigaction
  0.00    0.000001           0         2         1 access
  0.00    0.000000           0         2           pread64
  0.00    0.000000           0         1           mremap
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         3           uname
  0.00    0.000000           0         1         1 mkdir
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           set_robust_list
  0.00    0.000000           0         1           epoll_create1
  0.00    0.000000           0         1           rseq
------ ----------- ----------- --------- --------- ----------------
100.00    9.085377           3   2779886   1331649 total

After PR

Timings using hyperfine

  Time (mean ± σ):     37.277 s ±  0.131 s    [User: 35.208 s, System: 2.043 s]
  Range (min … max):   37.022 s … 37.508 s    10 runs

Syscalls

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 54.62    2.141208           2    827202    642701 newfstatat
 29.37    1.151274           2    569776           clock_gettime
  4.47    0.175144           4     43641     13443 openat
  2.79    0.109267           2     53100           read
  1.77    0.069260           1     40245         5 lseek
  1.69    0.066270           2     30199           close
  1.63    0.063822           1     33933           fstat
  1.34    0.052577           1     29240     29221 ioctl
  1.16    0.045629           3     11589           getcwd
  0.41    0.016088           8      1804           getdents64
  0.31    0.012145           2      5150           mmap
  0.28    0.011102           2      4836           munmap
  0.12    0.004787           4      1059           brk
  0.02    0.000756           3       229           write
  0.01    0.000429           6        67           rt_sigaction
  0.01    0.000300           5        56           mprotect
  0.00    0.000102          10        10         4 readlink
  0.00    0.000048           8         6           fcntl
  0.00    0.000041           8         5           getrandom
  0.00    0.000016           8         2           prlimit64
  0.00    0.000008           4         2           gettid
  0.00    0.000005           2         2           mremap
  0.00    0.000002           0         3           getgid
  0.00    0.000002           0         5           geteuid
  0.00    0.000002           0         3           getegid
  0.00    0.000002           1         2           setfsuid
  0.00    0.000002           1         2           setfsgid
  0.00    0.000001           0         2         1 access
  0.00    0.000001           0         5           getuid
  0.00    0.000000           0         2           pread64
  0.00    0.000000           0         1           getpid
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         3           uname
  0.00    0.000000           0         1         1 mkdir
  0.00    0.000000           0         1           unlink
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         1           futex
  0.00    0.000000           0         1           set_tid_address
  0.00    0.000000           0         1           set_robust_list
  0.00    0.000000           0         1           epoll_create1
  0.00    0.000000           0         1           rseq
------ ----------- ----------- --------- --------- ----------------
100.00    3.920290           2   1652190    685376 total

@crazybolillo
Copy link
Contributor Author

I also obtained this call graph to find the best place to cache. The issue refers to find_module but I thought it was better to cache up the stack in find_spec because that method calls itself all the finders available. Since I thought it was cool and may be of interest, I am attaching it. You will need to zoom into it.

output

@crazybolillo
Copy link
Contributor Author

crazybolillo commented Apr 5, 2024

Also, I used the repository linked in the issue to obtain all the profiling data. I am still yet to profile the memory allocated for the process execution, will try to obtain that since I want to measure the hit an unbound cache may create.

def wrapper(*args):
key = ".".join(args[0])
if key not in modpath_cache:
modpath_cache[key] = func(*args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this cache also needs to be cleared there ?

def clear_cache(self) -> None:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing that out. The cache is now cleared from there and I also renamed variables a little bit and made the code look more like existing cache code (e.g. the inference tip cache).

Copy link
Contributor

@perrinjerome perrinjerome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, it seems good to clear the cache here as well. I check "Approve" here, but I don't astroid well ( I just submitted an issue yesterday for the first time ). A maintainer should approve, not me, but I can say that these changes look good to me.

Comment on lines 432 to 440
def spec_cache(func):
def wrapper(*args):
key = ".".join(args[0])
if key not in _spec_cache:
_spec_cache[key] = func(*args)

return _spec_cache[key]

return wrapper
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add typing to this function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have typed the function and also the dictionary used for caching (mypy was complaining).

@@ -423,6 +429,18 @@ def _find_spec_with_path(
raise ImportError(f"No module named {'.'.join(module_parts)}")


def spec_cache(func: Callable) -> Callable:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def spec_cache(func: Callable) -> Callable:
def spec_cache(func: Callable[[list[str], Sequence[str] | None], ModuleSpec]) -> Callable[[...], ModuleSpec]:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes have been applied.

@Pierre-Sassoulas Pierre-Sassoulas added the Enhancement ✨ Improvement to a component label Apr 14, 2024
Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the analysis of what need to be cachec. Any reason not to use functools.cacheor functools.lru_cache instead of using a global ?

@crazybolillo
Copy link
Contributor Author

crazybolillo commented Apr 15, 2024

Thank you for the analysis of what need to be cachec. Any reason not to use functools.cacheor functools.lru_cache instead of using a global ?

I really wanted to use something standard, however the problem I found with functools caches is that they take into consideration all function arguments while I thought it only made sense to cache the requested module because:

  • Search paths are lists so they need further processing to hash them
  • I don't think search paths should be cached? If I already know where the package is found, does it matter if you give me different search paths? This point I am not so sure about, but it was my logic.

Copy link
Collaborator

@DanielNoord DanielNoord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we also need a changelog entry?

Code LGTM!

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right thank you for the explanation. In find_spec we have :type modpath: list or tuple, maybe we could force it to be a tuple with something like this:

def make_modpath_hashable(
    func: Callable[[list[str], Sequence[str] | None], ModuleSpec]
) -> Callable[..., ModuleSpec]:
    def wrapper(modpath: list[str], *args) -> ModuleSpec:
        return func(tuple(modpath), *args)
    return wrapper

@make_modpath_hashable
@lru_cache(maxsize=1024)
def find_spec(modpath: list[str], path: Sequence[str] | None = None) -> ModuleSpec:

(I didn't test if it work tbh, but it should ? )

func: Callable[[list[str], Sequence[str] | None], ModuleSpec]
) -> Callable[..., ModuleSpec]:
def wrapper(modpath: list[str], *args) -> ModuleSpec:
key = ".".join(modpath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
key = ".".join(modpath)
key = tuple(modpath)

I suppose it's faster to get something hashable that way ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks better, thanks for the suggestion, I have applied it.

@crazybolillo
Copy link
Contributor Author

I tried the approach but It does not seem to work, main reason being that find_spec uses lists, so wrapping the function to pass it a tuple in order to make it cachable fails.

We would also need to make *args hashable, but that one is tricky, since I don't think order matters for search paths, so if the search path order changes we would get different hashes and the cache wont be hit as much.

What do you think? Maybe I am missing something.

I think we could work a hack use lru_cache internals but the code would be probably more complicated than using our own cache 🤔

@Pierre-Sassoulas Pierre-Sassoulas added this to the 3.2.0 milestone Apr 16, 2024
@Pierre-Sassoulas
Copy link
Member

I'm hesitant to add a global, because pylint/astroid can be used on massive codebase and we had issues with memory constantly increasing in the past. I don't remember all cache clear API where we need to add this global to myself, and would need some time to check it (Jacob probably remember better than me if it comes to it though). Using a global is not impossible but it's easy to make a mistake and create a memory leak. So using lru_cache to set a max number of value stored would be convenient / reassuring. I suppose another (less elegant than adding two decorators) approach would be to create a new cached subfunction inside the current function with the paths we want to cache specifically, what do you think ?

@crazybolillo crazybolillo force-pushed the cache-find-spec-crazybolillo branch 2 times, most recently from 38315c0 to 1b2f354 Compare April 22, 2024 19:29
@crazybolillo
Copy link
Contributor Author

I think I was able to find a solution that lets us use lru_cache without getting into too much decorators or sub functions. I have basically wrapped the arguments in a class so we can control its hash and ignore problematic arguments. The benchmarks still look good as well.

Without cache:

Time (mean ± σ):     58.254 s ±  0.470 s    [User: 53.538 s, System: 4.705 s]
Range (min … max):   57.863 s … 59.150 s    10 runs

With cache:

Time (mean ± σ):     52.131 s ±  0.191 s    [User: 49.164 s, System: 2.954 s]
Range (min … max):   51.910 s … 52.550 s    10 runs

What do you think?

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing, thank you ! The perf increase and the design are great. I need to fix the coverage and the pipeline for pypy 3.10 on main but I think we're going to release astroid 3.2.0 for this alone !

astroid/manager.py Outdated Show resolved Hide resolved
tests/test_manager.py Outdated Show resolved Hide resolved
astroid/interpreter/_import/spec.py Show resolved Hide resolved
astroid/interpreter/_import/spec.py Outdated Show resolved Hide resolved
@crazybolillo
Copy link
Contributor Author

Nuked AstroidManager as suggested. Just waiting on your thoughts about exceptions preventing not found packages from being cached.

@jacobtylerwalls
Copy link
Member

Thanks, I didn't notice that ImportError is raised when nothing is found. Good point.

However, I still think we should add the path to the cache key even when modules were found.

The following is unexpected, and I think it's realistic given the upstream callers like ast_from_module_name(). Pathing is one of the most sensitive parts of astroid.

>>> from astroid.interpreter._import.spec import find_spec
>>> find_spec(['brain'], ['astroid'])
ModuleSpec(name='brain', type=<ModuleType.PKG_DIRECTORY: 3>, location='astroid/brain', origin=None, submodule_search_locations=None)
>>> find_spec(['brain'], ['pylint'])
ModuleSpec(name='brain', type=<ModuleType.PKG_DIRECTORY: 3>, location='astroid/brain', origin=None, submodule_search_locations=None)

context_file: str | None = None,

@jacobtylerwalls
Copy link
Member

For example, the path is part of the cache key for modules here:

astroid/astroid/manager.py

Lines 301 to 305 in 7a3b482

def file_from_module_name(
self, modname: str, contextfile: str | None
) -> spec.ModuleSpec:
try:
value = self._mod_file_cache[(modname, contextfile)]

@crazybolillo
Copy link
Contributor Author

I made the search path part of its cache by converting it into a frozenset however there are lots of cache misses and the performance benefit gets nullified, so maybe this is harder than I thought 🤔 . This is the diff I tried (when compared with the current PR)

diff --git a/astroid/interpreter/_import/spec.py b/astroid/interpreter/_import/spec.py
index c812a612..62bb3e6c 100644
--- a/astroid/interpreter/_import/spec.py
+++ b/astroid/interpreter/_import/spec.py
@@ -428,6 +428,7 @@ def _find_spec_with_path(
 @dataclass(frozen=True)
 class SpecArgs:
     key: tuple
+    path_cache: frozenset | None
     modpath: list[str] = field(compare=False, hash=False)
     path: Sequence[str] | None = field(compare=False, hash=False)
 
@@ -449,7 +450,7 @@ def find_spec(modpath: list[str], path: Sequence[str] | None = None) -> ModuleSp
     :return: A module spec, which describes how the module was
              found and where.
     """
-    return _find_spec(SpecArgs(tuple(modpath), modpath, path))
+    return _find_spec(SpecArgs(tuple(modpath), frozenset(path) if path else None, modpath, path))

Here are the cache specs after running it on the test repo provided in the original issue:

No cache for search path: CacheInfo(hits=39800, misses=34979, maxsize=1024, currsize=1024)
Caching search path: CacheInfo(hits=39755, misses=74619, maxsize=1024, currsize=1024)

Am I missing something? Maybe a better way to cache the path? Otherwise it seems like it may be better to optimize somewhere else.

@crazybolillo
Copy link
Contributor Author

Also since the main point of this PR (perf) may take a while, I can make a new PR to apply the nuking of the Astroid test class separately.

@jacobtylerwalls
Copy link
Member

About the dataclasses, did you consider doing something like this for simplicity?

diff --git a/astroid/interpreter/_import/spec.py b/astroid/interpreter/_import/spec.py
index c812a612..65d7c88d 100644
--- a/astroid/interpreter/_import/spec.py
+++ b/astroid/interpreter/_import/spec.py
@@ -16,7 +16,6 @@ import types
 import warnings
 import zipimport
 from collections.abc import Iterator, Sequence
-from dataclasses import dataclass, field
 from functools import lru_cache
 from pathlib import Path
 from typing import Any, Literal, NamedTuple, Protocol
@@ -425,13 +424,6 @@ def _find_spec_with_path(
     raise ImportError(f"No module named {'.'.join(module_parts)}")
 
 
-@dataclass(frozen=True)
-class SpecArgs:
-    key: tuple
-    modpath: list[str] = field(compare=False, hash=False)
-    path: Sequence[str] | None = field(compare=False, hash=False)
-
-
 def find_spec(modpath: list[str], path: Sequence[str] | None = None) -> ModuleSpec:
     """Find a spec for the given module.
 
@@ -449,15 +441,14 @@ def find_spec(modpath: list[str], path: Sequence[str] | None = None) -> ModuleSp
     :return: A module spec, which describes how the module was
              found and where.
     """
-    return _find_spec(SpecArgs(tuple(modpath), modpath, path))
-
+    return _find_spec(tuple(modpath), tuple(path) if path else None)
 
 @lru_cache(maxsize=1024)
-def _find_spec(args: SpecArgs) -> ModuleSpec:
-    _path = args.path or sys.path
+def _find_spec(modpath: tuple[str], path: tuple[str] | None = None) -> ModuleSpec:
+    _path = path or sys.path
 
     # Need a copy for not mutating the argument.
-    modpath = args.modpath[:]
+    modpath = list(modpath)
 
     submodule_path = None
     module_parts = modpath[:]
@@ -466,7 +457,7 @@ def _find_spec(args: SpecArgs) -> ModuleSpec:
     while modpath:
         modname = modpath.pop(0)
         finder, spec = _find_spec_with_path(
-            _path, modname, module_parts, processed, submodule_path or args.path
+            _path, modname, module_parts, processed, submodule_path or path
         )
         processed.append(modname)
         if modpath:

@jacobtylerwalls
Copy link
Member

Am I missing something? Maybe a better way to cache the path? Otherwise it seems like it may be better to optimize somewhere else.

I'm wondering if the issue is in pylint. We're probably doing some unnecessary "is this import a relative import, if so provide the filepath we started from". It's providing all those filepaths that get checked first and strike out that are adding to the number of calls here.

This diff reduces a ton of calls, but it would need to be checked for correctness:

diff --git a/pylint/checkers/imports.py b/pylint/checkers/imports.py
index ac8962c50..361bd5571 100644
--- a/pylint/checkers/imports.py
+++ b/pylint/checkers/imports.py
@@ -1043,11 +1043,13 @@ class ImportsChecker(DeprecatedMixin, BaseChecker):
         module_file = node.root().file
         context_name = node.root().name
         base = os.path.splitext(os.path.basename(module_file))[0]
-
         try:
-            importedmodname = astroid.modutils.get_module_part(
-                importedmodname, module_file
-            )
+            if isinstance(node, nodes.ImportFrom):
+                importedmodname = astroid.modutils.get_module_part(
+                    importedmodname, module_file
+                )
+            else:
+                importedmodname = astroid.modutils.get_module_part(importedmodname)
         except ImportError:
             pass

@crazybolillo
Copy link
Contributor Author

Yes, it seems like there are lots of calls for the same package with different paths which obliterates the cache. I collected all calls to find_spec with their arguments and grouped them by requested modpath. Here is an excerpt:

** ['jinja2'] 403
         - ['pylint-corpus/venv/lib/python3.11/site-packages/scalene']
         - ['.', 'pylint-corpus/venv/lib/python3.11/site-packages', 'pylint-corpus', 'pylint-corpus/venv/lib/python3.11/site-packages', 'pylint-corpus', '/usr/lib/python311.zip', '/usr/lib/python3.11', '/usr/lib/python3.11/lib-dynload', 'pylint-corpus/venv/lib/python3.11/site-packages', 'astroid']
         - ['pylint-corpus/venv/lib/python3.11/site-packages/jinja2']
         - None
         - ['pylint-corpus/venv/lib/python3.11/site-packages/pkg_resources/_vendor/pyparsing/diagram']
         - ['pylint-corpus/venv/lib/python3.11/site-packages/pip/_vendor/pyparsing/diagram']
         - ['pylint-corpus/venv/lib/python3.11/site-packages/setuptools/_vendor/pyparsing/diagram']

jinja2 was requested 403 times and those were the search paths used. I also wonder why jinja2 is looked for since its not used in the test repo.

I will investigate in pylint.

@jacobtylerwalls jacobtylerwalls modified the milestones: 3.2.0, 3.3.0 May 4, 2024
@crazybolillo crazybolillo force-pushed the cache-find-spec-crazybolillo branch from e593f84 to 53c71a7 Compare May 4, 2024 18:25
@crazybolillo
Copy link
Contributor Author

I followed the suggestion and applied a similar change in pylint. Cache specs do improve a lot. Also it seems like previous tests were running with a polluted virtualenv so that explains stuff like looking for jinja2 😅.

Run times diminished in all cases because of the clean virtualenv, however relatively speaking the performance increase seems even better than the first iteration. I also changed tuples to frozensets since apparently they are faster by a noticeable margin (according to hyperfine).

Stats

No Cache

Time (mean ± σ):     46.954 s ±  1.127 s    [User: 43.222 s, System: 3.700 s] 
Range (min … max):   45.499 s … 48.479 s    10 runs

Cache

tuple

Time (mean ± σ):     38.001 s ±  0.359 s    [User: 35.533 s, System: 2.452 s]
Range (min … max):   37.388 s … 38.509 s    10 runs

frozenset

Time (mean ± σ):     35.961 s ±  0.758 s    [User: 33.585 s, System: 2.337 s]
Range (min … max):   34.297 s … 36.718 s    10 runs

The related PR in pylint that would make this cache work is pylint#9595.

@crazybolillo crazybolillo force-pushed the cache-find-spec-crazybolillo branch from 53c71a7 to 9984bc4 Compare May 4, 2024 18:53
@crazybolillo
Copy link
Contributor Author

Nevermind about the frozenset, when converting it back to a list ordering is sometimes lost so it can't be used, at least not for the sensitive modpath, for the search paths I don't think the order matters too much (?) so we can use frozensets there.

Copy link

codecov bot commented May 4, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.78%. Comparing base (7a3b482) to head (82c839e).
Report is 11 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2408      +/-   ##
==========================================
+ Coverage   92.76%   92.78%   +0.02%     
==========================================
  Files          94       94              
  Lines       11087    11095       +8     
==========================================
+ Hits        10285    10295      +10     
+ Misses        802      800       -2     
Flag Coverage Δ
linux 92.60% <100.00%> (+0.02%) ⬆️
pypy 92.78% <100.00%> (+2.01%) ⬆️
windows 92.69% <100.00%> (+0.32%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
astroid/interpreter/_import/spec.py 97.44% <100.00%> (+0.05%) ⬆️
astroid/manager.py 89.58% <100.00%> (+0.04%) ⬆️

... and 2 files with indirect coverage changes

@jacobtylerwalls
Copy link
Member

The potential indeterminacy of the frozenset does scare me a little bit. Just this week on Discord someone was asking for pointers about debugging indeterminate behavior of the cyclic-import check, and I advised them to look into nondeterminacy in astroid. Leaving a frozenset for the path sounds like exactly that kind of behavior.

I'd suggest using tuple for both. But if you want to prepare a new benchmark and discuss it, that's fine too.

@Pierre-Sassoulas
Copy link
Member

the performance increase seems even better than the first iteration

Right, more than 20% faster is HUGE !

search paths I don't think the order matters too much (?)

Not sure if I understand this right but the order of discovery does matter, right ? Doesn't it affect pylint-dev/pylint#6535 (pinned issue in pylint) ?

@jacobtylerwalls
Copy link
Member

jacobtylerwalls commented May 4, 2024

Yeah, I traced what happens to that path argument, and it eventually ends up here, in an order-sensitive for loop in importlib, so I wouldn't want this to become unordered:

https://github.com/python/cpython/blob/978fba58aef347de4a1376e525df2dacc7b2fff3/Lib/importlib/_bootstrap_external.py#L1522


Side note: it seems like our constructor for Finder is actually useless? I can't find anywhere that uses _path. Could be a small gain to nuke it.

def __init__(self, path: Sequence[str] | None = None) -> None:
self._path = path or sys.path

@jacobtylerwalls jacobtylerwalls modified the milestones: 3.3.0, 3.2.0 May 4, 2024
Certain checkers upstream on pylint like import-error heavily use
find_spec. This method is IO intensive as it looks for files
across several search paths to return a ModuleSpec.

Since imports across files may repeat themselves it makes sense to cache
this method in order to speed up the linting process.

Closes pylint-dev/pylint#9310.
This class predates efforts to have a central interface to control
global state (including caches) and it is no longer needed.
@crazybolillo crazybolillo force-pushed the cache-find-spec-crazybolillo branch from 9984bc4 to 98a0380 Compare May 4, 2024 20:32
@crazybolillo
Copy link
Contributor Author

Ooops , I changed the search paths to tuple as well. Perf is still good, in fact since most of the time we should be getting no search paths I ran hyperfine again and there was not much of an impact.

Copy link
Member

@jacobtylerwalls jacobtylerwalls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, great work!

@crazybolillo
Copy link
Contributor Author

Thanks to everyone as well for the help and reviews 🤝

@jacobtylerwalls jacobtylerwalls merged commit 2ec0115 into pylint-dev:main May 4, 2024
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement ✨ Improvement to a component topic-performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

E0401 (import-error) checks perform a lot of repeated stat calls
5 participants