Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented new check use-maxsplit-arg #4469

Merged
merged 27 commits into from May 22, 2021
Merged

Conversation

yushao2
Copy link
Collaborator

@yushao2 yushao2 commented May 12, 2021

Steps

  • Add a ChangeLog entry describing what your PR does.
  • If it's a new feature or an important bug fix, add a What's New entry in
    doc/whatsnew/<current release.rst>.
  • Write a good description on what the PR does.

Description

Implemented a new checker, consider-using-str-partition, which checks for instances where
only the first or last element of a str.split() is accessed

Type of Changes

Type
✨ New feature

Related Issue

Closes #4440

Notes:

It seems like the check has difficulty detecting if something is a string or not when iterating over an iterable of strings that is returned from a function. See:

#4469 (comment)

#4469 (comment)

#4469 (comment)

That being said, I'm not too sure how to resolve these specific cases, or if the checker is alright as it is -- I'm just worried that it might be a bit too "weak" for the cases mentioned. Thanks!

@yushao2 yushao2 marked this pull request as draft May 12, 2021 06:56
@yushao2
Copy link
Collaborator Author

yushao2 commented May 12, 2021

Leaving this as draft as a moment because I'm having difficulty with a particular test case:

import os
path = "/home/"
dirs = os.listdirs(path)
extensions = [d.split(".")[-1] for d in dirs if "." in d]   # should fire an error

It seems like the d in this case can't be inferred as a string, and the .split() is also uninferrable -- however we know that os.listdirs returns a list of string.

Therefore I'm somewhat stuck at the moment -- will experiment around and see if I could fix it

@coveralls
Copy link

coveralls commented May 12, 2021

Coverage Status

Coverage decreased (-0.003%) to 91.723% when pulling 7e640f5 on yushao2:checkers-4440 into 55d3cb7 on PyCQA:master.

@yushao2 yushao2 changed the title Implemented new check consider-using-str-partition Implemented new check consider-using-str-partition May 12, 2021
@yushao2
Copy link
Collaborator Author

yushao2 commented May 12, 2021

Leaving this as draft as a moment because I'm having difficulty with a particular test case:

import os
path = "/home/"
dirs = os.listdirs(path)
extensions = [d.split(".")[-1] for d in dirs if "." in d]   # should fire an error

It seems like the d in this case can't be inferred as a string, and the .split() is also uninferrable -- however we know that os.listdirs returns a list of string.

Therefore I'm somewhat stuck at the moment -- will experiment around and see if I could fix it

It seems like some output of functions might not be infer-able, during my debugging process I found this and it seems like the return type information is not available for os.listdir():

image

I'm not entirely sure if there's a utility function to check, but for now I will be changing this PR to be ready for review

@yushao2
Copy link
Collaborator Author

yushao2 commented May 12, 2021

Leaving this as draft as a moment because I'm having difficulty with a particular test case:

import os
path = "/home/"
dirs = os.listdirs(path)
extensions = [d.split(".")[-1] for d in dirs if "." in d]   # should fire an error

It seems like the d in this case can't be inferred as a string, and the .split() is also uninferrable -- however we know that os.listdirs returns a list of string.
Therefore I'm somewhat stuck at the moment -- will experiment around and see if I could fix it

It seems like some output of functions might not be infer-able, during my debugging process I found this and it seems like the return type information is not available for os.listdir():

image

I'm not entirely sure if there's a utility function to check, but for now I will be changing this PR to be ready for review

Further investigation further confirming difficulties with determining type of iterable returned from functions:

list_of_strs = ["a", "b", "c", "d", "e", "f"]
for s in list_of_strs:
    print(s.split(" ")[0])  # [consider-using-str-partition]
    print(s.split(" ")[-1])  # [consider-using-str-partition]
    print(s.split(" ")[-2])

my_dict = {"a":0, "b": 0, "c": 0, "d": 0, "e": 0, "f":0}
for s in my_dict:
    print(s.split(" ")[0])  # [consider-using-str-partition]  (NOT FIRED)
    print(s.split(" ")[-1])  # [consider-using-str-partition] (NOT FIRED)
    print(s.split(" ")[-2])

@yushao2 yushao2 marked this pull request as ready for review May 12, 2021 09:30
@cdce8p cdce8p self-requested a review May 12, 2021 10:59
Copy link
Member

@cdce8p cdce8p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments.

--

import os
path = "/home/"
dirs = os.listdirs(path)
extensions = [d.split(".")[-1] for d in dirs if "." in d]   # should fire an error

It seems like the d in this case can't be inferred as a string, and the .split() is also uninferrable -- however we know that os.listdirs returns a list of string.

Therefore I'm somewhat stuck at the moment -- will experiment around and see if I could fix it

I won't worry too much about this. It's probably best to just ignore everything that can't easily be inferred.

ChangeLog Outdated Show resolved Hide resolved
doc/whatsnew/2.9.rst Outdated Show resolved Hide resolved
pylint/checkers/refactoring/recommendation_checker.py Outdated Show resolved Hide resolved
pylint/checkers/refactoring/recommendation_checker.py Outdated Show resolved Hide resolved
pylint/checkers/refactoring/recommendation_checker.py Outdated Show resolved Hide resolved
pylint/checkers/refactoring/recommendation_checker.py Outdated Show resolved Hide resolved
pylint/checkers/refactoring/recommendation_checker.py Outdated Show resolved Hide resolved
pylint/checkers/refactoring/recommendation_checker.py Outdated Show resolved Hide resolved
@cdce8p cdce8p self-assigned this May 12, 2021
@cdce8p cdce8p added Checkers Related to a checker Enhancement ✨ Improvement to a component labels May 12, 2021
@yushao2 yushao2 marked this pull request as draft May 12, 2021 14:59
@yushao2
Copy link
Collaborator Author

yushao2 commented May 13, 2021

@cdce8p could I get another initial review on this? Have re-implemented as discussed -- but still in the midst of performing some refactoring (moving some code into utils)

@yushao2 yushao2 requested a review from cdce8p May 20, 2021 05:38
pylint/checkers/refactoring/recommendation_checker.py Outdated Show resolved Hide resolved
pylint/checkers/refactoring/recommendation_checker.py Outdated Show resolved Hide resolved
pylint/checkers/refactoring/recommendation_checker.py Outdated Show resolved Hide resolved
pylint/checkers/refactoring/recommendation_checker.py Outdated Show resolved Hide resolved
pylint/checkers/refactoring/recommendation_checker.py Outdated Show resolved Hide resolved
pylint/checkers/refactoring/recommendation_checker.py Outdated Show resolved Hide resolved
@cdce8p cdce8p changed the title Implemented new check consider-using-str-partition Implemented new check consider-using-maxsplit-arg May 20, 2021
@yushao2 yushao2 requested a review from cdce8p May 21, 2021 03:34
Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice checker, and the non-opinionated performance checks are always a pleasure to merge. Thanks a lot for handling all the comments during the review. The design on this one got better and better thanks to your perseverance. Probably the main feature in 2.8.3 !

@Pierre-Sassoulas Pierre-Sassoulas added this to the 2.8.3 milestone May 21, 2021
yushao2 and others added 2 commits May 21, 2021 15:23
@yushao2
Copy link
Collaborator Author

yushao2 commented May 21, 2021

Very nice checker, and the non-opinionated performance checks are always a pleasure to merge. Thanks a lot for handling all the comments during the review. The design on this one got better and better thanks to your perseverance. Probably the main feature in 2.8.3 !

thanks for the kind words :)

@yushao2 yushao2 changed the title Implemented new check consider-using-maxsplit-arg Implemented new check use-maxsplit-arg May 21, 2021
Copy link
Member

@cdce8p cdce8p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks so good.. Unfortunately I noticed one small issue during the final test run. (See comment below)

FYI: I also pushed a small commit with formatting changes. It wouldn't have been worth it to mention these changes individually.

pylint/checkers/refactoring/recommendation_checker.py Outdated Show resolved Hide resolved
Copy link
Member

@cdce8p cdce8p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They might be a bit contrived but I found two more cases that won't work properly:

"1,2,3".split('split')[-1]
"1,2,3".rsplit('rsplit')[0]

With the current logic, the separator will be replaced instead of the function name.

Some ideas

  • Use node.func.as_string() and call rsplit on it instead
  • I didn't know that myself, but whereas utils.get_argument_from_call(node, 0, "sep").value returns the unescaped separator, you can use utils.get_argument_from_call(node, 0, "sep").as_string()

tests/functional/u/use/use_maxsplit_arg.py Show resolved Hide resolved
Copy link
Member

@cdce8p cdce8p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's done! Tremendous work you did here @yushao2 🥇
The review process wasn't the easiest but you still did a really good job getting this one over the finish line. The next release will be awesome in no small part thanks to you 🐬

Will have to look at #4485 now I guess 😉 Can't be that you only contributed two new checkers for one release.

@cdce8p cdce8p merged commit 605afdb into pylint-dev:master May 22, 2021
@cdce8p cdce8p removed this from the 2.8.3 milestone May 22, 2021
@yushao2 yushao2 deleted the checkers-4440 branch January 4, 2023 13:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Checkers Related to a checker Enhancement ✨ Improvement to a component
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Suggest partition() or rpartition() over specific uses of split()
4 participants