Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deduplicate module file paths to prevent redundant scans. #7747

Merged
merged 20 commits into from Nov 18, 2022

Conversation

emcd
Copy link
Contributor

@emcd emcd commented Nov 11, 2022

Maintain and reference a set of visited files while expanding the module files from list of module directories, package names, and standalone files to prevent duplicate scans.

Type of Changes

| βœ“ | πŸ› Bug fix |

Description

Closes #6242, #4053

@coveralls
Copy link

coveralls commented Nov 11, 2022

Pull Request Test Coverage Report for Build 3495785504

  • 8 of 8 (100.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.0008%) to 95.413%

Totals Coverage Status
Change from base Build 3492123039: 0.0008%
Covered Lines: 17391
Relevant Lines: 18227

πŸ’› - Coveralls

@Pierre-Sassoulas Pierre-Sassoulas added Bug πŸͺ² Needs backport Needs to be cherry-picked on the current patch version by a pylint's maintainer labels Nov 11, 2022
@Pierre-Sassoulas Pierre-Sassoulas added this to the 2.15.6 milestone Nov 11, 2022
Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for contributing, much appreciated !

pylint/lint/pylinter.py Outdated Show resolved Hide resolved
emcd and others added 3 commits November 11, 2022 18:24
With 'path' as the key, we get deduplication for free
and do not need to reprocess the list for deduplication later.
pylint/lint/expand_modules.py Outdated Show resolved Hide resolved
@Pierre-Sassoulas Pierre-Sassoulas added the Needs review πŸ” Needs to be reviewed by one or multiple more persons label Nov 12, 2022
@emcd emcd marked this pull request as ready for review November 12, 2022 14:59
Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you check if it also fix #689, please ? Maybe we also need to sort the file to be stable between interpreters / operating system. Maybe a dict with an interpreter > 3.7 is enough.

tests/lint/unittest_expand_modules.py Outdated Show resolved Hide resolved
tests/lint/unittest_expand_modules.py Show resolved Hide resolved
doc/whatsnew/fragments/6242.bugfix Outdated Show resolved Hide resolved
doc/whatsnew/fragments/6242.bugfix Outdated Show resolved Hide resolved
@github-actions

This comment has been minimized.

@emcd
Copy link
Contributor Author

emcd commented Nov 13, 2022

Could you check if it also fix #689, please ? Maybe we also need to sort the file to be stable between interpreters / operating system. Maybe a dict with an interpreter > 3.7 is enough.

Do we want to sort the files? As a user of Pylint, I would prefer that it respects the order in which I specify arguments.
If we want reports to appear in a certain order, then it might be better to sort the reports rather the CLI arguments. (But I don't know how well that would interact with parallel jobs.)

Anyway, here's the output I get if I use the reproducer provided earlier this year:

$ pylint --score n --report n b.py a.py
************* Module b
b.py:6:0: C0115: Missing class docstring (missing-class-docstring)
b.py:6:0: C0103: Class name "B" doesn't conform to PascalCase naming style (invalid-name)
b.py:6:0: R0903: Too few public methods (1/2) (too-few-public-methods)
************* Module a
a.py:4:0: C0115: Missing class docstring (missing-class-docstring)
a.py:4:0: C0103: Class name "A" doesn't conform to PascalCase naming style (invalid-name)
a.py:5:4: C0116: Missing function or method docstring (missing-function-docstring)
a.py:5:4: C0104: Disallowed name "foo" (disallowed-name)
a.py:5:18: C0103: Argument name "x" doesn't conform to snake_case naming style (invalid-name)
a.py:4:0: R0903: Too few public methods (1/2) (too-few-public-methods)
$ pylint --score n --report n a.py b.py
************* Module a
a.py:4:0: C0115: Missing class docstring (missing-class-docstring)
a.py:4:0: C0103: Class name "A" doesn't conform to PascalCase naming style (invalid-name)
a.py:5:4: C0116: Missing function or method docstring (missing-function-docstring)
a.py:5:4: C0104: Disallowed name "foo" (disallowed-name)
a.py:5:18: C0103: Argument name "x" doesn't conform to snake_case naming style (invalid-name)
a.py:4:0: R0903: Too few public methods (1/2) (too-few-public-methods)
************* Module b
b.py:6:0: C0115: Missing class docstring (missing-class-docstring)
b.py:6:0: C0103: Class name "B" doesn't conform to PascalCase naming style (invalid-name)
b.py:6:0: R0903: Too few public methods (1/2) (too-few-public-methods)

So, the outputs are identical except for ordering. This looks right to me.

And, if I disable all of the checks that the reporter did not show, then I see:

$ pylint --score n --report n --disable=invalid-name,missing-function-docstring,missing-class-docstring,too-few-public-methods,disallowed-name b.py a.py
$ pylint --score n --report n --disable=invalid-name,missing-function-docstring,missing-class-docstring,too-few-public-methods,disallowed-name a.py b.py

I.e., identical output in spite of ordering. I do not know if this patch can take credit for the fix, but the issue does seem to be fixed one way or another.

@emcd
Copy link
Contributor Author

emcd commented Nov 13, 2022

Could you check if it also fix #689, please ? Maybe we also need to sort the file to be stable between interpreters / operating system. Maybe a dict with an interpreter > 3.7 is enough.

Do we want to sort the files? As a user of Pylint, I would prefer that it respects the order in which I specify arguments. If we want reports to appear in a certain order, then it might be better to sort the reports rather the CLI arguments. (But I don't know how well that would interact with parallel jobs.)

Anyway, here's the output I get if I use the reproducer provided earlier this year:

$ pylint --score n --report n b.py a.py
************* Module b
b.py:6:0: C0115: Missing class docstring (missing-class-docstring)
b.py:6:0: C0103: Class name "B" doesn't conform to PascalCase naming style (invalid-name)
b.py:6:0: R0903: Too few public methods (1/2) (too-few-public-methods)
************* Module a
a.py:4:0: C0115: Missing class docstring (missing-class-docstring)
a.py:4:0: C0103: Class name "A" doesn't conform to PascalCase naming style (invalid-name)
a.py:5:4: C0116: Missing function or method docstring (missing-function-docstring)
a.py:5:4: C0104: Disallowed name "foo" (disallowed-name)
a.py:5:18: C0103: Argument name "x" doesn't conform to snake_case naming style (invalid-name)
a.py:4:0: R0903: Too few public methods (1/2) (too-few-public-methods)
$ pylint --score n --report n a.py b.py
************* Module a
a.py:4:0: C0115: Missing class docstring (missing-class-docstring)
a.py:4:0: C0103: Class name "A" doesn't conform to PascalCase naming style (invalid-name)
a.py:5:4: C0116: Missing function or method docstring (missing-function-docstring)
a.py:5:4: C0104: Disallowed name "foo" (disallowed-name)
a.py:5:18: C0103: Argument name "x" doesn't conform to snake_case naming style (invalid-name)
a.py:4:0: R0903: Too few public methods (1/2) (too-few-public-methods)
************* Module b
b.py:6:0: C0115: Missing class docstring (missing-class-docstring)
b.py:6:0: C0103: Class name "B" doesn't conform to PascalCase naming style (invalid-name)
b.py:6:0: R0903: Too few public methods (1/2) (too-few-public-methods)

So, the outputs are identical except for ordering. This looks right to me.

And, if I disable all of the checks that the reporter did not show, then I see:

$ pylint --score n --report n --disable=invalid-name,missing-function-docstring,missing-class-docstring,too-few-public-methods,disallowed-name b.py a.py
$ pylint --score n --report n --disable=invalid-name,missing-function-docstring,missing-class-docstring,too-few-public-methods,disallowed-name a.py b.py

I.e., identical output in spite of ordering. I do not know if this patch can take credit for the fix, but the issue does seem to be fixed one way or another.

I just checked against the other reproducer in #689 - there is still a problem with which check appears to be associated with which module, but there are no extra check message depending on order:

$ pylint --report n --score n mybase.py override.py
************* Module mymodule.mybase
mybase.py:5:0: C0115: Missing class docstring (missing-class-docstring)
************* Module mymodule.override
override.py:7:4: C0103: Attribute name "myInvalidName" doesn't conform to snake_case naming style (invalid-name)
mybase.py:5:0: R0903: Too few public methods (1/2) (too-few-public-methods)
override.py:5:0: C0116: Missing function or method docstring (missing-function-docstring)
override.py:6:4: C0104: Disallowed name "foo" (disallowed-name)

$ pylint --report n --score n override.py mybase.py
************* Module mymodule.override
override.py:5:0: C0116: Missing function or method docstring (missing-function-docstring)
override.py:6:4: C0104: Disallowed name "foo" (disallowed-name)
************* Module mymodule.mybase
mybase.py:5:0: C0115: Missing class docstring (missing-class-docstring)
override.py:7:4: C0103: Attribute name "myInvalidName" doesn't conform to snake_case naming style (invalid-name)
mybase.py:5:0: R0903: Too few public methods (1/2) (too-few-public-methods)

So, I think that #689 or some variant thereof must remain open.

@github-actions

This comment has been minimized.

@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 2.15.6, 2.15.7 Nov 14, 2022
Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you. I have a suggestion to appease mypy, let me know what you think :)

pylint/lint/expand_modules.py Outdated Show resolved Hide resolved
tests/test_pylint_runners.py Outdated Show resolved Hide resolved
@github-actions

This comment has been minimized.

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great first contribution, on a nasty bug that was open for far too long πŸ‘ (Probably a pre-requisite to fix the multiprocessing too, another very important issue). Final nitpick and let's merge, maybe even in 2.15.6 :)

tests/test_pylint_runners.py Outdated Show resolved Hide resolved
@github-actions

This comment has been minimized.

@Pierre-Sassoulas Pierre-Sassoulas merged commit 7b82cc8 into pylint-dev:main Nov 18, 2022
@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 2.15.7, 2.15.6 Nov 18, 2022
@github-actions
Copy link
Contributor

πŸ€– According to the primer, this change has no effect on the checked open source code. πŸ€–πŸŽ‰

This comment was generated for commit c67f204

@Pierre-Sassoulas Pierre-Sassoulas modified the milestones: 2.15.6, 2.15.7 Nov 19, 2022
@Pierre-Sassoulas Pierre-Sassoulas removed the Needs review πŸ” Needs to be reviewed by one or multiple more persons label Nov 21, 2022
@Pierre-Sassoulas Pierre-Sassoulas added Backported and removed Needs backport Needs to be cherry-picked on the current patch version by a pylint's maintainer labels Nov 23, 2022
Pierre-Sassoulas added a commit to Pierre-Sassoulas/pylint that referenced this pull request Nov 23, 2022
…#7747)

* Use dict for 'expand_modules' result rather than list.
With 'path' as the key, we get deduplication for free
and do not need to reprocess the list for deduplication later.
* Fix deduplication to account for CLI args marker.
* Fix corner case with CLI arg flag handling during deduplication.
* Add 'deduplication' to custom Pyenchant dict.

Closes pylint-dev#6242
Closes pylint-dev#4053

Co-authored-by: Eric McDonald <emcd@users.noreply.github.com>
Co-authored-by: Pierre Sassoulas <pierre.sassoulas@gmail.com>
Pierre-Sassoulas added a commit that referenced this pull request Nov 23, 2022
* Use dict for 'expand_modules' result rather than list.
With 'path' as the key, we get deduplication for free
and do not need to reprocess the list for deduplication later.
* Fix deduplication to account for CLI args marker.
* Fix corner case with CLI arg flag handling during deduplication.
* Add 'deduplication' to custom Pyenchant dict.

Closes #6242
Closes #4053

Co-authored-by: Eric McDonald <emcd@users.noreply.github.com>
Co-authored-by: Pierre Sassoulas <pierre.sassoulas@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Too-many-branches scanning returns different values if directories to scan are explicitly specified.
4 participants