Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

♻️ REFACTOR: Remove reentry requirement #5058

Merged
merged 29 commits into from Aug 12, 2021

Conversation

chrisjsewell
Copy link
Member

@chrisjsewell chrisjsewell commented Aug 10, 2021

Utilise the python 3.10 select API (https://docs.python.org/3.10/library/importlib.metadata.html), via importlib-metadata.

The code currently includes a fall back to the old API, since I have previously encountered an issue with old versions of importlib_metadata being installed on Conda: python/importlib_metadata#308

Also added typing to module and mypy type checking

@chrisjsewell
Copy link
Member Author

@ltalirz not sure what was happening with #4960, but yeh wanted to push this through

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Aug 10, 2021

http://www.quantum-espresso.org/ is currently down with crazy error 😬 (hence linkcheck failing)

@ltalirz
Copy link
Member

ltalirz commented Aug 10, 2021

Just to understand: are you building on top of #4960 or not?

Have all references to reentry been removed from the codebase?

@chrisjsewell
Copy link
Member Author

Just to understand: are you building on top of #4960 or not?

no

Have all references to reentry been removed from the codebase?

yes

@codecov
Copy link

codecov bot commented Aug 10, 2021

Codecov Report

Merging #5058 (d540e99) into develop (6f93253) will increase coverage by 0.01%.
The diff coverage is 94.45%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #5058      +/-   ##
===========================================
+ Coverage    80.39%   80.39%   +0.01%     
===========================================
  Files          529      529              
  Lines        36881    36882       +1     
===========================================
+ Hits         29646    29649       +3     
+ Misses        7235     7233       -2     
Flag Coverage Δ
django 74.90% <94.45%> (+0.01%) ⬆️
sqlalchemy 73.80% <94.45%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
aiida/orm/nodes/node.py 96.29% <ø> (ø)
aiida/orm/nodes/process/calculation/calcjob.py 78.05% <0.00%> (ø)
aiida/plugins/__init__.py 100.00% <ø> (ø)
aiida/plugins/entry_point.py 93.75% <95.84%> (+1.94%) ⬆️
aiida/cmdline/commands/cmd_computer.py 80.73% <100.00%> (ø)
aiida/manage/database/integrity/plugins.py 96.56% <100.00%> (-0.11%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6f93253...d540e99. Read the comment docs.

Copy link
Member

@ltalirz ltalirz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice that you keep updating the PR.

I'll stop going through it now and come back later - please let me know once it's ready for review.

As for using the importlib_metadata: I haven't yet looked at your implementation, but I would just like to mention that in python 3.9 (that has the importlib.metadata in the standard library) using the importlib_metadata package as a drop-in replacement was significantly slower, see #3458 (comment)

I.e. if we are going to rely on this, we should make sure to use the fast implementation whenever it is available (maybe you already do this; I will check later).

@@ -1,5 +1,5 @@
[build-system]
requires = ["setuptools>=40.8.0", "wheel", "reentry~=1.3", "fastentrypoints~=0.12"]
Copy link
Member

@ltalirz ltalirz Aug 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remember to mention in the eventual commit message that fastentrypoints is also being removed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to check, is there a reason you know of why we would still need this? I believe the issue it solved is no longer an issue?

Copy link
Member

@ltalirz ltalirz Aug 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you compare the cli scripts produced by pip-installing the PR (e.g. the verdi one) against the current ones?

If the new ones do not import pkg_resources, I guess we can remove the dependency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is without fastentrypoints, so I guess it is now ok to remove:

#!/Users/chrisjsewell/Documents/GitHub/aiida_core_develop/.tox/py38-pre-commit/bin/python
# EASY-INSTALL-ENTRY-SCRIPT: 'aiida-core','console_scripts','verdi'
import re
import sys

# for compatibility with easy_install; see #2198
__requires__ = 'aiida-core'

try:
    from importlib.metadata import distribution
except ImportError:
    try:
        from importlib_metadata import distribution
    except ImportError:
        from pkg_resources import load_entry_point


def importlib_load_entry_point(spec, group, name):
    dist_name, _, _ = spec.partition('==')
    matches = (
        entry_point
        for entry_point in distribution(dist_name).entry_points
        if entry_point.group == group and entry_point.name == name
    )
    return next(matches).load()


globals().setdefault('load_entry_point', importlib_load_entry_point)


if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(load_entry_point('aiida-core', 'console_scripts', 'verdi')())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeh so actually the current develop verdi script is:

#!/Users/chrisjsewell/Documents/GitHub/aiida-core-origin/.tox/py39-pre-commit/bin/python
# -*- coding: utf-8 -*-
# EASY-INSTALL-ENTRY-SCRIPT: 'aiida-core','console_scripts','verdi'
__requires__ = 'aiida-core'
import re
import sys

from aiida.cmdline.commands.cmd_verdi import verdi

if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(verdi())

I guess you need importlib_metadata installed to produce the new script?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I re-added fastentrypoints: I guess that the second script, with the direct verdi import, is still sightly quicker

@chrisjsewell
Copy link
Member Author

I notice that you keep updating the PR.

yeh sorry just minor fixes to the tests

@chrisjsewell
Copy link
Member Author

but I would just like to mention that in python 3.9 (that has the importlib.metadata in the standard library) using the importlib_metadata package as a drop-in replacement was significantly slower

the complication here though comes in that importlib_metadata now allows the new select API, which is only available in python 3.10. So is it still slower to use that API vs the old API with importlib.metadata?

@chrisjsewell
Copy link
Member Author

I would also note that from v4.3.0, there is a claimed performance improvement: python/importlib_metadata#317

setup.json Outdated Show resolved Hide resolved
@chrisjsewell
Copy link
Member Author

Just playing around with autocompletion on my computer, and the performance seems good

@sphuber
Copy link
Contributor

sphuber commented Aug 10, 2021

Just playing around with autocompletion on my computer, and the performance seems good

You upped the time limit for the load time of verdi to 0.5. Is that still necessary after the update to a more recent importlib.metadata? In other words, is with the latest version the load time still slower compared to reentry or do we have comparable or better performance?

@chrisjsewell
Copy link
Member Author

You upped the time limit for the load time of verdi to 0.5.

Haha you noticed my cheat, yeh a few of them were sneaking over (I added a run for all python versions)

  • python 3.7: SUCCESS: loading time 0.37 at iteration 1 below 0.5
  • python 3.8: SUCCESS: loading time 0.33 at iteration 1 below 0.5
  • python 3.9: SUCCESS: loading time 0.42 at iteration 1 below 0.5

Comparing that to the latest run on develop (python 3.8): SUCCESS: loading time 0.34 at iteration 1 below

so not much in it

@chrisjsewell
Copy link
Member Author

Happy to approve once you revert the increase of the time limits for the performance tests

reverted

@ltalirz ltalirz self-requested a review August 12, 2021 08:34
ltalirz
ltalirz previously approved these changes Aug 12, 2021
@chrisjsewell
Copy link
Member Author

ok @sphuber, do you want to give this another try before I merge today?

@giovannipizzi
Copy link
Member

@chrisjsewell, just to clarify if I can look into this in the next few days: do we have until the next dev meeting or you are going to merge this today?

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Aug 12, 2021

do we have until the next dev meeting or you are going to merge this today?

I would say that, now we have made this final improvement (6689fd2) and removed any significant timing difference to reentry, there is no need to wait. Unless obviously you feel strongly we should?

@ltalirz
Copy link
Member

ltalirz commented Aug 12, 2021

From my side I agree that there is no reason to wait anymore

@chrisjsewell
Copy link
Member Author

@ltalirz, if you have time, perhaps you could quickly pop a new comparison of the new vs reentry times on here? (I'm working on another branch at the mo, so can't)

@ltalirz
Copy link
Member

ltalirz commented Aug 12, 2021

this branch

$ COMP_CWORD="2" COMP_WORDS="verdi data g" _VERDI_COMPLETE=complete-bash hyperfine --warmup 3 verdi
Benchmark #1: verdi
  Time (mean ± σ):     272.7 ms ±   2.3 ms    [User: 207.6 ms, System: 42.7 ms]
  Range (min … max):   269.8 ms … 277.9 ms    10 runs

develop

$ COMP_CWORD="2" COMP_WORDS="verdi data g" _VERDI_COMPLETE=complete-bash hyperfine --warmup 3 verdi
Benchmark #1: verdi
  Time (mean ± σ):     261.0 ms ±   9.9 ms    [User: 194.2 ms, System: 43.4 ms]
  Range (min … max):   249.0 ms … 281.0 ms    11 runs

This is really negligible and won't be noticed by users (we're also getting into a region where I would need to make sure the difference is really reproducible)

@giovannipizzi
Copy link
Member

Thanks for the clarification, it was just since you asked for feedback and I didn't understand that the time difference has disappeared. If the time difference is negligible I'm ok with the merge.
Just to report, I also run on my computer and I get:

This branch:

Benchmark #1: verdi
  Time (mean ± σ):     387.7 ms ±  18.9 ms    [User: 315.7 ms, System: 65.9 ms]
  Range (min … max):   371.0 ms … 428.8 ms    10 runs

develop:

Benchmark #1: verdi
  Time (mean ± σ):     338.6 ms ±   5.1 ms    [User: 275.4 ms, System: 57.6 ms]
  Range (min … max):   331.4 ms … 346.4 ms    10 runs

So I'm OK to merge.

However, while looking into it with cProfile+snakeviz, I realised that a call to cmd_computer was taking a lot of time (~80ms) even when the command was not involved (e.g. when expanding "verdi data bands sho").

In this branch, if you comment out the whole content of the cmd_computer.py file, indeed the time goes down significantly from ~370ms to ~305ms.
If you just uncomment this single line:

from aiida.transports import cli as transport_cli

Then the time goes up again.

If you want to give a look into this already now, great, otherwise I'm ok with merging this as is since this would be a further improvement, but in this case could you please open an issue mentioning this? (I also have the feeling that there might be at least one or two similar things one could improve, e.g. also cmd_archive seems to be taking some time, but the cProfile+snakeviz is quite complex to "parse" and I still don't know if it's in addition to the time of cmd_computer or not).

@chrisjsewell
Copy link
Member Author

thanks @giovannipizzi! I will indeed have a quick check

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Aug 12, 2021

the cProfile+snakeviz is quite complex to "parse"

FYI @giovannipizzi maybe check out https://github.com/benfred/py-spy, which is a lot easier to use

sphuber
sphuber previously approved these changes Aug 12, 2021
Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's get this show on the road 🚀

@chrisjsewell
Copy link
Member Author

chrisjsewell commented Aug 12, 2021

commit message:

This commit replaces the use of reentry, for entry point loading,
with importlib_metadata and, in turn,
removes the requirement for users to run reentry scan after installations.

aiida-core makes heavy use of entry-points to define plugins.
The reentry package was introduced to load these plugins since,
at the time, the de facto pkg_resources method for using entry points
was too slow, in particular for responsive CLI usage.
This, however, came with the drawback that users must perform an extra step
to register the plugins before aiida-core can be used, or when new plugins are installed.

In recent years importlib.metadata and its backport importlib_metadata
has replaced pkg_resources, and as of python/importlib_metadata#317
is now on a par with reentry for performance.

For now, we use importlib_metadata for all python versions,
rather than the built-in (as of python 3.8) importlib.metadata,
so that we can use the new python 3.10 API and performance boosts.

@chrisjsewell
Copy link
Member Author

ah @sphuber you ** lol, merging commits when I'm almost finished running the branch update

@chrisjsewell
Copy link
Member Author

are you gonna be merging anymore?

@sphuber
Copy link
Contributor

sphuber commented Aug 12, 2021

ah @sphuber you ** lol, merging commits when I'm almost finished running the branch update

Can't help it you're not quick enough, I updated the other branch before you updated this one.

But yeah, not merging anything in the near future.

@chrisjsewell chrisjsewell dismissed stale reviews from sphuber and ltalirz via d540e99 August 12, 2021 13:39
@chrisjsewell chrisjsewell merged commit 3ad0712 into aiidateam:develop Aug 12, 2021
@chrisjsewell chrisjsewell deleted the remove-reentry branch August 12, 2021 13:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Plugins: Investigate official importlib.metadata for entrypoint discovery
4 participants