Correctly match package/module names in import hook #144
The previously used regular expression tried to handle both exact matches and prefix matches in one go, using this approach:

    re.compile(r'^%s\.?' % pkg)

However, this is incorrect: the literal dot is optional in the pattern, so longer names that merely start with the package name also match. For example, ‘foo’ should match ‘foo’ and ‘foo.bar’, but it also incorrectly matches ‘foobar’:

    >>> re.compile(r'^foo\.?').match('foobar')
    <_sre.SRE_Match object; span=(0, 3), match='foo'>

In practice, a command like this (using the pytest plugin as an example) is supposed to check the ‘flask’ package and any modules below it:

    pytest --typeguard-packages=flask

... but in reality it also checks other packages, such as ‘flask_sqlalchemy’ and ‘flask_redis’, if those happen to be installed.

This can be easily fixed by using simple string matching instead of a regular expression.
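To make the difference concrete, here is a minimal sketch comparing the old pattern with plain string matching. The helper names `matches_old` and `matches_simple` are hypothetical and exist only for illustration; the actual code in this PR may be structured differently.

```python
import re


def matches_old(pkg: str, module_name: str) -> bool:
    """Old approach from the description: the trailing dot is optional."""
    return re.compile(r'^%s\.?' % pkg).match(module_name) is not None


def matches_simple(pkg: str, module_name: str) -> bool:
    """Hypothetical replacement: exact match, or prefix match up to a dot."""
    return module_name == pkg or module_name.startswith(pkg + '.')


# Compare both approaches on a package, a submodule, and an unrelated package.
for name in ('flask', 'flask.app', 'flask_sqlalchemy'):
    print(name, matches_old('flask', name), matches_simple('flask', name))
```

With the old pattern all three names match; with string matching only ‘flask’ and ‘flask.app’ do.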
Alternatively, we could fix the RE to be |
big fan of simple code here. the regex wasn't obviously wrong the first time. sure, it can be fixed by making it more complex. or more clever. i am not a fan of clever. my preference is always to fix things by making them simpler. 😀 |
and to prove my point, your suggestion would still be wrong. 🙃 the string substitution can introduce wildcard characters in the regex. most notably a dot which is actually expected in submodule names. sure, another re.escape() thrown on top would fix that. but the end result would be even more complex/clever and hard to understand at a glance. |
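The wildcard problem described above can be reproduced with the original pattern. The sketch below is only an illustration: the exact regex proposed earlier in this thread is not shown here, so the "escaped" variant is an assumed fix, not the one that was suggested.

```python
import re

pkg = 'foo.bar'

# Unescaped substitution: the dot inside 'foo.bar' becomes a regex wildcard.
unescaped = re.compile(r'^%s\.?' % pkg)
wildcard_hit = unescaped.match('fooxbar') is not None

# Assumed fix: escape the name and require either a dot or end of string.
escaped = re.compile(r'^%s(\.|$)' % re.escape(pkg))
results = {
    name: escaped.match(name) is not None
    for name in ('foo.bar', 'foo.bar.baz', 'fooxbar', 'foo.barbaz')
}

print(wildcard_hit)
print(results)
```

The escaped version behaves correctly, but it takes noticeably more care to read than a plain `==` / `startswith()` check, which is the point made above.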
Fair enough. |
Just wondering, is there anything that needs to be done here before this can be merged? It would be great if this and #143 could make it into a new release, so that it is easier to actually use this project without manually installing from a custom git branch (or maintaining a fork). No rush though. |
Since the test suite was passing before, it would be nice to get a test added that does not pass with the previous implementation. Can you do that? |
sure, i pushed a commit with some tests for the

i also experimented with a more black-box approach, since the above-mentioned test does not actually check the import hook itself. the code below approaches it from a higher level, and actually checks that typeguard successfully hijacked the import (by checking for the injected

    import sys
    from importlib import import_module

    import pytest
    # import path assumed; the original snippet omitted its imports
    from typeguard.importhook import install_import_hook


    def test_package_name_with_match():
        """
        The import hook injects a ‘typeguard’ import into matching modules.
        """
        sys.modules.pop("dummymodule", None)  # unload hack
        with install_import_hook("dummymodule"):
            module = import_module("dummymodule")
        module.typeguard  # typeguard import was injected


    def test_package_name_no_match_prefix():
        """
        The import hook does not hook into non-matching modules.
        """
        sys.modules.pop("dummymodule", None)  # unload hack
        with install_import_hook("dummy"):
            module = import_module("dummymodule")
        with pytest.raises(AttributeError):
            module.typeguard  # typeguard import was not injected
Perfect. Thanks again! |