Plugins: Investigate official importlib.metadata for entrypoint discovery #3458

Closed
dev-zero opened this issue Oct 23, 2019 · 18 comments · Fixed by #5058

Python 3.8 introduced a built-in module for entrypoint discovery https://docs.python.org/3.8/library/importlib.metadata.html with a backport for <3.8 and 2.7 here: https://importlib-metadata.readthedocs.io/en/latest/

While there's nothing wrong with pkg_resources other than that we need reentry to make it scale, we should test importlib.metadata to see whether it would be fast enough for us without reentry.

Release notes from pluggy (pytest) and tox suggest that it performs better than setuptools pkg_resources.
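For reference, discovery via the stdlib module looks roughly like this. A minimal sketch (mine, not AiiDA code; the group name "console_scripts" is used only because it exists in most environments), with a fallback for the dict-based return type of Python 3.8/3.9:

```python
from importlib import metadata

def discover(group):
    """Return the entry points registered under *group*.

    Handles both the plain-dict return type of Python 3.8/3.9 and the
    selectable type of Python >= 3.10.
    """
    eps = metadata.entry_points()
    if hasattr(eps, "select"):              # Python >= 3.10
        return list(eps.select(group=group))
    return list(eps.get(group, []))         # Python 3.8/3.9: dict of group -> list

scripts = discover("console_scripts")
print(f"found {len(scripts)} entry points in 'console_scripts'")
```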

@dev-zero dev-zero added the type/feature request status undecided label Oct 23, 2019
@dev-zero dev-zero added this to the v1.1.0 milestone Oct 23, 2019
@dev-zero dev-zero self-assigned this Oct 23, 2019

ltalirz commented Oct 23, 2019

Sounds great, I'll have a look at this.

P.S. Ah, I see you've already assigned yourself ;-) I'll wait a bit then to see what you discover.

@ltalirz ltalirz self-assigned this Oct 23, 2019

ltalirz commented Oct 29, 2019

As part of this, we should also check whether the dependency on fastentrypoints.py can be removed (see utils/fastentrypoints.py).


ltalirz commented Nov 8, 2019

I started looking into this here:
https://github.com/ltalirz/aiida_core/tree/issue_3458_importlib_metadata

I'm using a basic in-memory cache to make sure we load the entry points just once, but the CLI is still a bit less responsive than with reentry - not terribly, but noticeably.
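Such an in-memory cache can be as simple as a memoized zero-argument function. A sketch of the idea (my illustration using functools.lru_cache, not the branch's actual code) that also times the cached vs uncached call:

```python
import functools
import timeit
from importlib import metadata

@functools.lru_cache(maxsize=1)
def cached_entry_points():
    """Scan the installed distributions once per process; later calls hit the cache."""
    return metadata.entry_points()

cold = timeit.timeit(metadata.entry_points, number=1)      # one full metadata scan
cached_entry_points()                                      # prime the cache
warm = timeit.timeit(cached_entry_points, number=1000) / 1000
print(f"uncached: {cold * 1e3:.2f} ms, cached: {warm * 1e6:.2f} us")
```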

Timings with importlib (on Macbook Pro Retina Mid-2015):

$ python -mtimeit -s 'import importlib_metadata as im' 'im.entry_points()'
5 loops, best of 5: 57 msec per loop

Note: It seems that with importlib_metadata you always have to load the full set of entry points.

Timings with reentry (on Macbook Pro Retina Mid-2015):

$ python -mtimeit  -s 'from reentry.default_manager import PluginManager as p' 'p().get_entry_map()'
500 loops, best of 5: 635 usec per loop

P.S. Importing importlib_metadata takes another ~40ms, compared to 25ms for reentry and 180ms for pkg_resources.

So for importlib_metadata, we're likely looking at a total of ~60 ms + 40 ms = ~100 ms, vs something close to 25 ms for reentry.


ltalirz commented Nov 9, 2019

@dev-zero Do you happen to have an AiiDA installation on a magnetic hard disk, where you could give the two tests above a try?


ltalirz commented Nov 11, 2019

@giovannipizzi @sphuber
So, it seems in my environment, the difference between using reentry and using the built-in entry-point mechanism is down from roughly an order of magnitude to roughly a factor of 4.

Would you mind giving the PR a try and let me know how the autocompletion speed feels to you?
https://github.com/ltalirz/aiida_core/tree/issue_3458_importlib_metadata

It's certainly a bit slower, but at some point we need to consider whether this still warrants the hassle of maintaining reentry and asking people to run reentry scan.


sphuber commented Nov 11, 2019

So, it seems in my environment, the difference between using reentry and using the built-in entry-point mechanism is down from roughly an order of magnitude to roughly a factor of 4.

For me personally the responsiveness of the CLI is very important. I use it all the time, and the slowness of auto-completion really annoys me. A factor of 4 is huge in that context, I feel, and for me it would definitely warrant the continued development of reentry and the annoyance of reentry scan.

However, where does that number come from? I installed your branch and tested time verdi. For your branch I get on the order of 0.380 s; the current develop gives 0.280 s. A 0.1 s difference is not ideal, but nowhere near as bad as what I expected when you said factor of 4.

dev-zero (issue author) commented:

My results on an SSD (the only setup I've tested so far):

develop:

(develop=) $ hyperfine --warmup 3 verdi "verdi data gaussian.basisset"
Benchmark #1: verdi
  Time (mean ± σ):     159.9 ms ±   7.2 ms    [User: 135.9 ms, System: 15.1 ms]
  Range (min … max):   149.3 ms … 171.0 ms    19 runs
 
Benchmark #2: verdi data gaussian.basisset
  Time (mean ± σ):     171.4 ms ±   7.9 ms    [User: 145.9 ms, System: 15.8 ms]
  Range (min … max):   159.3 ms … 186.6 ms    17 runs
 
Summary
  'verdi' ran
    1.07 ± 0.07 times faster than 'verdi data gaussian.basisset'

the importlib branch:

(issue_3458_importlib_metadata=) $ hyperfine --warmup 3 verdi "verdi data" "verdi data gaussian.basisset"
Benchmark #1: verdi
  Time (mean ± σ):     204.2 ms ±   2.8 ms    [User: 175.6 ms, System: 18.3 ms]
  Range (min … max):   201.2 ms … 210.5 ms    14 runs
 
Benchmark #2: verdi data
  Time (mean ± σ):     216.0 ms ±   2.5 ms    [User: 186.5 ms, System: 18.7 ms]
  Range (min … max):   212.0 ms … 220.5 ms    13 runs
 
Benchmark #3: verdi data gaussian.basisset
  Time (mean ± σ):     228.4 ms ±  13.0 ms    [User: 196.6 ms, System: 20.6 ms]
  Range (min … max):   208.9 ms … 246.9 ms    13 runs
 
Summary
  'verdi' ran
    1.06 ± 0.02 times faster than 'verdi data'
    1.12 ± 0.07 times faster than 'verdi data gaussian.basisset'


ltalirz commented Nov 11, 2019

However, where does that number come from.

As described above, this comes from adding up the timings for importing the package (reentry vs importlib_metadata) and the call to load the entry points: 4 × 25 ms = 100 ms, and a 75 ms difference is close to the difference you are seeing as well.

Is the result of time verdi also the time needed for autocompletion?

dev-zero (issue author) commented:

@ltalirz yes, pretty much it seems. Here's a benchmark with the simulated completion for verdi data g<TAB>:

develop:

$ COMP_CWORD="2" COMP_WORDS="verdi data g" _VERDI_COMPLETE=complete-bash hyperfine --warmup 3 verdi
Benchmark #1: verdi
  Time (mean ± σ):     159.4 ms ±   2.5 ms    [User: 137.2 ms, System: 13.7 ms]
  Range (min … max):   156.9 ms … 166.0 ms    18 runs
 

importlib:

$ COMP_CWORD="2" COMP_WORDS="verdi data g" _VERDI_COMPLETE=complete-bash hyperfine --warmup 3 verdi
Benchmark #1: verdi
  Time (mean ± σ):     211.9 ms ±   6.8 ms    [User: 184.3 ms, System: 16.9 ms]
  Range (min … max):   206.9 ms … 232.8 ms    13 runs

dev-zero (issue author) commented:

@ltalirz but it seems there's still a bug in your branch:

COMP_CWORD="2" COMP_WORDS="verdi data g" _VERDI_COMPLETE=complete-bash python -m cProfile -o program.importlib.prof /scratch/tiziano/virtualenvs/aiida/bin/verdi
Traceback (most recent call last):
  File "/scratch/tiziano/work/aiida/aiida_core/aiida/cmdline/params/types/identifier.py", line 65, in __init__
    entry_point = get_entry_point_from_string(entry_point_string)
  File "/scratch/tiziano/work/aiida/aiida_core/aiida/plugins/entry_point.py", line 193, in get_entry_point_from_string
    return get_entry_point(group, name)
  File "/scratch/tiziano/work/aiida/aiida_core/aiida/plugins/entry_point.py", line 298, in get_entry_point
    raise MultipleEntryPointError("Multiple entry points '{}' found in group".format(name, group))
aiida.common.exceptions.MultipleEntryPointError: Multiple entry points 'process.calculation.calcjob' found in group

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/users/tiziano/.pyenv/versions/3.7.5/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/users/tiziano/.pyenv/versions/3.7.5/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/users/tiziano/.pyenv/versions/3.7.5/lib/python3.7/cProfile.py", line 185, in <module>
    main()
  File "/users/tiziano/.pyenv/versions/3.7.5/lib/python3.7/cProfile.py", line 178, in main
    runctx(code, globs, None, options.outfile, options.sort)
  File "/users/tiziano/.pyenv/versions/3.7.5/lib/python3.7/cProfile.py", line 20, in runctx
    filename, sort)
  File "/users/tiziano/.pyenv/versions/3.7.5/lib/python3.7/profile.py", line 62, in runctx
    prof.runctx(statement, globals, locals)
  File "/users/tiziano/.pyenv/versions/3.7.5/lib/python3.7/cProfile.py", line 100, in runctx
    exec(cmd, globals, locals)
  File "/scratch/tiziano/virtualenvs/aiida/bin/verdi", line 6, in <module>
    from aiida.cmdline.commands.cmd_verdi import verdi
  File "/scratch/tiziano/work/aiida/aiida_core/aiida/cmdline/commands/__init__.py", line 21, in <module>
    from aiida.cmdline.commands import (
  File "/scratch/tiziano/work/aiida/aiida_core/aiida/cmdline/commands/cmd_calcjob.py", line 31, in <module>
    @arguments.CALCULATION('calcjob', type=CalculationParamType(sub_classes=('aiida.node:process.calculation.calcjob',)))
  File "/scratch/tiziano/work/aiida/aiida_core/aiida/cmdline/params/types/identifier.py", line 67, in __init__
    raise ValueError('{} is not a valid entry point string: {}'.format(entry_point_string, exception))
ValueError: aiida.node:process.calculation.calcjob is not a valid entry point string: Multiple entry points 'process.calculation.calcjob' found in group

... works in develop.


ltalirz commented Nov 11, 2019

Ok, here are the corresponding timings for my laptop:

importlib_metadata

$ COMP_CWORD="2" COMP_WORDS="verdi data g" _VERDI_COMPLETE=complete-bash hyperfine --warmup 3 verdi
Benchmark #1: verdi
  Time (mean ± σ):     304.1 ms ±  33.8 ms    [User: 254.9 ms, System: 45.2 ms]
  Range (min … max):   283.4 ms … 394.6 ms    10 runs

reentry

$ COMP_CWORD="2" COMP_WORDS="verdi data g" _VERDI_COMPLETE=complete-bash hyperfine --warmup 3 verdi
Benchmark #1: verdi
  Time (mean ± σ):     207.8 ms ±   4.2 ms    [User: 170.5 ms, System: 33.2 ms]
  Range (min … max):   202.0 ms … 219.3 ms    14 runs

Not a negligible difference... On the other hand, most of the time is already spent outside entry-point discovery, so one could look at whether some other package imports could be avoided.

it seems there's still a bug in your branch:

Yeah, verdi data isn't working properly yet - the entry-point interface of importlib_metadata is slightly different; I need to check what to fix there.
I first wanted to get an idea of whether we are going to use it.
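That interface difference is essentially that importlib.metadata has no direct equivalent of the {group: {name: entry_point}} map that pkg_resources/reentry expose, so it has to be rebuilt. A minimal compatibility shim illustrating the idea (my sketch against the stdlib API, not the actual code in the branch):

```python
from collections import defaultdict
from importlib import metadata

def get_entry_map():
    """Build a {group: {name: EntryPoint}} mapping, the shape that
    reentry's get_entry_map() returns."""
    eps = metadata.entry_points()
    # Python >= 3.10 exposes .groups/.select; 3.8/3.9 return a plain dict
    groups = eps.groups if hasattr(eps, "groups") else eps.keys()
    entry_map = defaultdict(dict)
    for group in groups:
        selected = eps.select(group=group) if hasattr(eps, "select") else eps[group]
        for ep in selected:
            entry_map[group][ep.name] = ep
    return dict(entry_map)

emap = get_entry_map()
print(f"{len(emap)} entry-point groups discovered")
```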

giovannipizzi (Member) commented:

Just to report my results.

Ubuntu workstation, develop:

Benchmark #1: verdi
  Time (mean ± σ):     152.0 ms ±   6.6 ms    [User: 115.4 ms, System: 14.0 ms]
  Range (min … max):   144.4 ms … 162.9 ms    20 runs

Ubuntu workstation, branch:

Benchmark #1: verdi
  Time (mean ± σ):     194.6 ms ±   6.4 ms    [User: 174.6 ms, System: 15.7 ms]
  Range (min … max):   187.4 ms … 207.8 ms    14 runs

On my mac: on develop I get ~0.25s and on the branch: ~0.31s

Note: I tested with py3.

I think this is more or less consistent with the other results.

My feeling: while this is a small loss for a very important feature, dropping reentry is probably still a big win, so I would be in favour of going in this direction - especially if we can gain something by optimising other parts of the code called by verdi.

@sphuber sphuber removed this from the v1.1.0 milestone Feb 28, 2020

ltalirz commented May 21, 2021

Since @chrisjsewell brought this up again, I quickly ported the changes to the current develop and gave it a spin on the M1 Macbook Air, python 3.9.1:

reentry (bare)

$ python -mtimeit -n1 -r1 'import reentry'
1 loop, best of 1: 19.6 msec per loop
$ python -mtimeit  -s 'from reentry.default_manager import PluginManager as p' 'p().get_entry_map()'
200 loops, best of 5: 1.07 msec per loop

Still takes about 21ms altogether (compared to 25ms for the previous test on my old MacBook)

importlib (bare)

$ python -mtimeit -n1 -r1 'from importlib import metadata'
1 loop, best of 1: 16.7 msec per loop
$ python -mtimeit -s 'import importlib.metadata as im' 'im.entry_points()'
20 loops, best of 5: 18.8 msec per loop

Now at 17 + 19 = 36 ms (down from ~100 ms on my old MacBook - which, however, used the importlib-metadata backport, not the module in the standard lib).

This is a big difference from my previous benchmark; it leaves only a ~15 ms advantage for reentry.

Note: Since this difference is so large, I tried the same using the importlib-metadata backport package, which can also be installed under python 3.9. This implementation is significantly slower than the one from the standard lib:

$ python -mtimeit -n1 -r1 'import importlib_metadata'
1 loop, best of 1: 31.3 msec per loop
$ python -mtimeit -s 'import importlib_metadata as im' 'im.entry_points()'
5 loops, best of 5: 53.5 msec per loop

Total of 31+54=85ms

verdi autocompletion (reentry, develop branch)

$ COMP_CWORD="2" COMP_WORDS="verdi data g" _VERDI_COMPLETE=complete-bash hyperfine --warmup 3 verdi
Benchmark #1: verdi
  Time (mean ± σ):     209.9 ms ±   5.2 ms    [User: 153.6 ms, System: 34.9 ms]
  Range (min … max):   205.5 ms … 224.0 ms    13 runs

verdi autocompletion (importlib, https://github.com/ltalirz/aiida-core/tree/issue_3458_importlib_metadata_2 )

$ COMP_CWORD="2" COMP_WORDS="verdi data g" _VERDI_COMPLETE=complete-bash hyperfine --warmup 3 verdi
Benchmark #1: verdi
  Time (mean ± σ):     225.9 ms ±   2.6 ms    [User: 169.3 ms, System: 36.1 ms]
  Range (min … max):   222.9 ms … 231.3 ms    12 runs

As expected from the bare tests, the difference in the mean is ~15ms.

With this, at least on the latest MacBooks and with Python >= 3.8, I think there is no longer a reason to use reentry.
While we promise to support Python 3.7 until 26-Dec-2021, I think it's OK to accept a little slowdown there - it won't break anything, and people who want better speed can upgrade to Python 3.8.

Let's see whether we find similar behavior also on other hardware - if yes, I would say we can move ahead with dropping the reentry dependency.

P.S. I'm a little surprised that the absolute time of the autocompletion is not lower (still 200+ ms). We may want to check again at some point whether there are any imports that could be avoided...


ltalirz commented May 21, 2021

P.P.S. I notice that the timings do depend on the number of packages installed in the environment (which makes sense - this will add a linear cost to importlib, while the cost for reentry remains effectively constant).

For reference, in my tests:

$ pip list | wc -l
219
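That linear scaling is easy to probe directly. A quick sketch (my illustration, assuming the stdlib importlib.metadata) that counts the installed distributions and times a full entry-point scan:

```python
import timeit
from importlib import metadata

# entry_points() reads the metadata of every installed distribution,
# so its cost grows roughly linearly with the size of the environment
n_dists = sum(1 for _ in metadata.distributions())
per_call = timeit.timeit(metadata.entry_points, number=5) / 5
print(f"{n_dists} distributions; {per_call * 1e3:.1f} ms per entry_points() call")
```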


ltalirz commented May 21, 2021

Now, here are the same bare tests on the AiiDA lab (Python 3.7; NFS file system):

$ pip list | wc -l
253

reentry (bare)

(base) aiida@168c1dbd1cfe:~$  python -mtimeit -n1 -r1 'import reentry'
1 loop, best of 1: 25.1 msec per loop
(base) aiida@168c1dbd1cfe:~$ python -mtimeit  -s 'from reentry.default_manager import PluginManager as p' 'p().get_entry_map()'
100 loops, best of 5: 2.8 msec per loop

importlib-metadata (bare)

(base) aiida@168c1dbd1cfe:~$ python -mtimeit -n1 -r1 'import importlib_metadata'
1 loop, best of 1: 22.8 msec per loop
$ python -mtimeit -s 'import importlib_metadata as im' 'im.entry_points()'
5 loops, best of 5: 56.4 msec per loop

I guess here one could see a significant speed benefit from upgrading to python 3.8


chrisjsewell commented Aug 10, 2021

As discussed in #5058, I think importlib-metadata >= 4.3 may now be faster than importlib.metadata. For me on Python 3.9, with importlib-metadata 4.6.3 and reentry 1.3.2:

$ python -mtimeit -n1 -r1 'import reentry'
1 loop, best of 1: 36.8 msec per loop
$ python -mtimeit  -s 'from reentry.default_manager import PluginManager as p' 'p().get_entry_map()'
1 loop, best of 5: 1.53 msec per loop
$ python -mtimeit -n1 -r1 'from importlib import metadata'
1 loop, best of 1: 39.1 msec per loop
$ python -mtimeit -s 'import importlib.metadata as im' 'im.entry_points()'
10 loops, best of 5: 30.3 msec per loop
$ python -mtimeit -n1 -r1 'import importlib_metadata'
1 loop, best of 1: 51 msec per loop
$ python -mtimeit -s 'import importlib_metadata as im' 'im.entry_points()'
20 loops, best of 5: 10.7 msec per loop


ltalirz commented Aug 10, 2021

I confirm that there seems to have been a massive improvement from importlib_metadata v4.0.1 (which I was testing before) to v4.6.3:

v4.0.1

$ python -mtimeit -s 'import importlib_metadata as im' 'im.entry_points()'
5 loops, best of 5: 59.3 msec per loop

v4.6.3

$ python -mtimeit -s 'import importlib_metadata as im' 'im.entry_points()'
50 loops, best of 5: 5.94 msec per loop

Both tests under python 3.9.1


ltalirz commented Aug 12, 2021

Final note: I've archived reentry with a detailed explanation & archival note aiidateam/reentry@577e5bb
