
[flake8-pyi] Implement PYI059 (generic-not-last-base-class) #11233

Merged
merged 18 commits into astral-sh:main from tusharsadhwani:pyi-059 on May 7, 2024

Conversation

tusharsadhwani
Contributor

@tusharsadhwani tusharsadhwani commented May 1, 2024

Summary

Implements Y059 from flake8-pyi: typing.Generic[] should be the last base in a class' bases list.

If multiple Generic[] bases are found, we don't generate a fix.

Test Plan

cargo test and cargo insta review.

@tusharsadhwani
Contributor Author

Is it okay to add the autofix in the same PR?

@zanieb
Member

zanieb commented May 1, 2024

Just wondering, what happens if a class has two Generic[] bases?

@zanieb
Member

zanieb commented May 1, 2024

Is it okay to add the autofix in the same PR?

Yep!

@tusharsadhwani
Contributor Author

what happens if a class has two Generic[] bases?

It's a runtime error:

>>> import typing
>>> T = typing.TypeVar('T')
>>> U = typing.TypeVar('U')
>>> class C(typing.Generic[T], typing.Generic[U]):
...   ...
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/Cellar/python@3.12/3.12.2_1/Frameworks/Python.framework/Versions/3.12/lib/python3.12/typing.py", line 1102, in _generic_init_subclass
    raise TypeError(
TypeError: Cannot inherit from Generic[...] multiple times.

@zanieb
Member

zanieb commented May 1, 2024

Good to know. We should make sure the rule doesn't behave poorly in that case regardless; e.g., it could cause an infinite fix loop.

Contributor

github-actions bot commented May 1, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+2 -0 violations, +0 -0 fixes in 1 projects; 43 projects unchanged)

rotki/rotki (+2 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ rotkehlchen/db/filtering.py:577:25: PYI059 [*] `Generic[]` should always be the last base class
+ rotkehlchen/types.py:977:34: PYI059 [*] `Generic[]` should always be the last base class

Changes by rule (1 rules affected)

code   | total | + violation | - violation | + fix | - fix
PYI059 |     2 |           2 |           0 |     0 |     0

@tusharsadhwani
Contributor Author

We should make sure the rule doesn't behave poorly on that case regardless

Should the multiple-Generic[] case not raise the issue at all then?

Member

@AlexWaygood AlexWaygood left a comment

Nice, thank you! Overall this looks good. Left some inline comments below :-)

@AlexWaygood
Member

AlexWaygood commented May 3, 2024

Should the multiple-Generic[] case not raise the issue at all then?

I'd say probably not. The behaviour we implemented in flake8-pyi is to ignore this rule if Generic[] appears multiple times in the bases tuple, and I think it makes sense to do the same thing here. If you have something like class Foo(int, Generic[T], Generic[S]): pass, I think it's honestly just confusing for a user if they get a message about Generic[] "not being the last base class". There's a much more significant issue in their code (you can't have more than one Generic[] in the bases tuple), and if they solve the more significant issue, then Generic[] not being the last base class would naturally be solved as a side effect.

(We shouldn't flag multiple Generic[]s in this check; that could possibly be a separate check.)

Comment on lines 122 to 139
fn generate_fix(bases: &Arguments, generic_base_index: usize, locator: &Locator) -> Fix {
    let last_base = bases.args.last().expect("Last base should always exist");
    let generic_base = bases
        .args
        .get(generic_base_index)
        .expect("Generic base should always exist");
    let next_base = bases
        .args
        .get(generic_base_index + 1)
        .expect("Generic base should never be the last base during auto-fix");

    let deletion = Edit::deletion(generic_base.start(), next_base.start());
    let insertion = Edit::insertion(
        format!(", {}", locator.slice(generic_base.range())),
        last_base.end(),
    );
    Fix::safe_edits(insertion, [deletion])
}
Member

What happens for pathological cases like

class Foo(  # comment about the bracket
    # Part 1 of multiline comment 1
    # Part 2 of multiline comment 1
    Generic[T]  # comment about Generic[T]
    # another comment?
    ,  # comment about the comma?
    # part 1 of multiline comment 2
    # part 2 of multiline comment 2
    int,  # comment about int
    # yet another comment?
):  # and another one for good measure
    pass

Contributor Author

It will technically work, but comments before int and after Generic[T] get eaten up. That is fixable, but the fix is subjective.

What's ruff's general stance in such cases? I know libcst has the ability to try to identify which comment groups belong to which expression. But the easier thing would be to avoid moving comments at all and just preserve them where they are.

Member

Different rules take different stances; we don't have a unified policy yet. Several of our autofixes will currently happily delete your comments (e.g. #9779); I think we all agree that this is bad, but we haven't yet decided how bad it is (see #9790). Other rules approach comments with a perhaps-excessive level of caution, either using the raw token stream for their autofix (RUF022, RUF023) or using libcst. Both are much slower than simple text replacements; libcst especially, I think.

the fix is subjective.

Absolutely! But I think that's always the case for something like this. I don't mind that much if some comments end up in the "wrong place", but I do think we should try to avoid deleting any comments if it's not too hard or costly to do.

Contributor Author

I'm trying to figure out how to get the start and end ranges of the comma after the Generic base. I tried using libcst to obtain the Comma node itself, but does it have position information?

Member

I would probably just read one character at a time after the end of the generic base until you see a comma — does that make sense here? I'd avoid LibCST unless it's absolutely necessary (it's slow).

Member

I would probably just read one character at a time after the end of the generic base until you see a comma — does that make sense here? I'd avoid LibCST unless it's absolutely necessary (it's slow).

Agree -- memchr is probably the best tool for that (you can grep in the crates/ruff_linter directory for existing uses). Other possibilities are using libCST (but as Zanie says, it's really slow), or using the raw tokens (also can be slow, and probably more complicated here than using memchr).

If you look at https://play.ruff.rs/b2b834e8-0247-47fe-995a-c0ade0e6b240, with the following snippet:

class Foo(int, str): pass

On the AST tab on the right, we can see:

  • The ExprName node for the int base has range 10..13
  • The ExprName node for the str base has range 15..18

On the tokens tab on the right, we can see that:

  • The comma in between the two nodes has range 13..14
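
As an illustration of the memchr approach suggested above, here is a minimal, self-contained sketch. It is not code from this PR; the function name and offsets are purely illustrative and mirror the class Foo(int, str): pass example (it requires the memchr crate).

// Hypothetical sketch: locate the comma that separates two base classes by
// scanning the raw source bytes with memchr, starting from the end offset of
// the preceding base node.
fn comma_after(source: &str, base_end: usize) -> Option<usize> {
    memchr::memchr(b',', source[base_end..].as_bytes()).map(|relative| base_end + relative)
}

fn main() {
    let source = "class Foo(int, str): pass";
    // The `int` base spans 10..13, so the comma token should start at offset 13 (range 13..14).
    assert_eq!(comma_after(source, 13), Some(13));
}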

Contributor Author

First try at this: I've used .find() (which I suppose can be swapped out for memchr) and .position() (which I'm not sure can be replaced).

tusharsadhwani and others added 2 commits May 3, 2024 22:07
Comment on lines 113 to 117
let last_whitespace = (comma_after_generic_base + 1)
    + locator.contents()[comma_after_generic_base + 1..]
        .bytes()
        .position(|b| !b.is_ascii_whitespace())
        .expect("Non whitespace character must always exist after Generic[]");
Member

I don't really understand what's going on here, I'm afraid :(

Is this the best name for this variable? Would you mind adding some comments to this function to explain how it works (preferably with some examples)?

Contributor Author

I'm trying to find the index of the first non-whitespace character in the source code after the comma that follows Generic[...].

Basically, deleting just Generic[T] and the comma after it would leave trailing whitespace behind (usually a space).

Without the whitespace detection, this code:

class Foo(Generic[T], Bar):

would be autofixed into:

class Foo( Bar, Generic[T]):

i.e. the space before Bar was not deleted.

Contributor Author

I'm honestly not sure if this is the best way to write this; please feel free to refactor or comment.

Contributor Author

I think deleting a positional argument from a function call during an autofix is common enough that, once we settle on how it should be done, this could be lifted into a helper.

Member

Ah... in fact...

/// Generic function to remove arguments or keyword arguments in function
/// calls and class definitions. (For classes `args` should be considered
/// `bases`)
///
/// Supports the removal of parentheses when this is the only (kw)arg left.
/// For this behavior, set `remove_parentheses` to `true`.
pub(crate) fn remove_argument<T: Ranged>(
    argument: &T,
    arguments: &Arguments,
    parentheses: Parentheses,
    source: &str,
) -> Result<Edit> {
    // Partition into arguments before and after the argument to remove.
    let (before, after): (Vec<_>, Vec<_>) = arguments
        .arguments_source_order()
        .map(|arg| arg.range())
        .filter(|range| argument.range() != *range)
        .partition(|range| range.start() < argument.start());
    if !after.is_empty() {
        // Case 1: argument or keyword is _not_ the last node, so delete from the start of the
        // argument to the end of the subsequent comma.
        let mut tokenizer = SimpleTokenizer::starts_at(argument.end(), source);
        // Find the trailing comma.
        tokenizer
            .find(|token| token.kind == SimpleTokenKind::Comma)
            .context("Unable to find trailing comma")?;
        // Find the next non-whitespace token.
        let next = tokenizer
            .find(|token| {
                token.kind != SimpleTokenKind::Whitespace && token.kind != SimpleTokenKind::Newline
            })
            .context("Unable to find next token")?;
        Ok(Edit::deletion(argument.start(), next.start()))
    } else if let Some(previous) = before.iter().map(Ranged::end).max() {
        // Case 2: argument or keyword is the last node, so delete from the start of the
        // previous comma to the end of the argument.
        let mut tokenizer = SimpleTokenizer::starts_at(previous, source);
        // Find the trailing comma.
        let comma = tokenizer
            .find(|token| token.kind == SimpleTokenKind::Comma)
            .context("Unable to find trailing comma")?;
        Ok(Edit::deletion(comma.start(), argument.end()))
    } else {
        // Case 3: argument or keyword is the only node, so delete the arguments (but preserve
        // parentheses, if needed).
        Ok(match parentheses {
            Parentheses::Remove => Edit::range_deletion(arguments.range()),
            Parentheses::Preserve => Edit::range_replacement("()".to_string(), arguments.range()),
        })
    }
}

/// Generic function to add arguments or keyword arguments to function calls.
pub(crate) fn add_argument(
    argument: &str,
    arguments: &Arguments,
    comment_ranges: &CommentRanges,
    source: &str,
) -> Edit {
    if let Some(last) = arguments.arguments_source_order().last() {
        // Case 1: existing arguments, so append after the last argument.
        let last = parenthesized_range(
            match last {
                ArgOrKeyword::Arg(arg) => arg.into(),
                ArgOrKeyword::Keyword(keyword) => (&keyword.value).into(),
            },
            arguments.into(),
            comment_ranges,
            source,
        )
        .unwrap_or(last.range());
        Edit::insertion(format!(", {argument}"), last.end())
    } else {
        // Case 2: no arguments. Add argument, without any trailing comma.
        Edit::insertion(argument.to_string(), arguments.start() + TextSize::from(1))
    }
}

Member

I'm so sorry, I completely forgot that we had those helpers!

Contributor Author

I also didn't find them, so that's partly on me :P It uses the SimpleTokenizer, which is pretty clever.

Contributor Author

I'll go ahead and refactor it.

@AlexWaygood
Member

(Everything looks great to me now except the autofix code, which I'm still not quite sure about -- both in terms of code readability and in terms of the number of .expect() and .unwrap() calls in it :-)

@AlexWaygood
Member

@tusharsadhwani, I've just pushed 6fce0fd, which gets rid of the remaining .expect() and .unwrap() calls from generate_fix(). Instead, the errors are bubbled up by returning an anyhow::Result<Fix> from generate_fix(). In combination with the Diagnostic::try_set_fix() method, that means that, rather than panicking, we'll gracefully log the error and move on if we fail to construct a Fix. That's usually preferable for this kind of thing, since it's not 100% essential that a Fix is constructed, but panicking would be pretty bad.
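
For illustration, here is a self-contained sketch of that pattern. The Fix and Diagnostic types below are simplified stand-ins rather than ruff's actual types; only the overall shape (bubbling errors up as anyhow::Result and logging instead of panicking) follows the description above.

use anyhow::{Context, Result};

#[derive(Debug)]
struct Fix(String);

#[derive(Default)]
struct Diagnostic {
    fix: Option<Fix>,
}

impl Diagnostic {
    // Mirrors the idea behind Diagnostic::try_set_fix: if constructing the fix
    // fails, log the error and keep the diagnostic without a fix instead of panicking.
    fn try_set_fix(&mut self, func: impl FnOnce() -> Result<Fix>) {
        match func() {
            Ok(fix) => self.fix = Some(fix),
            Err(err) => eprintln!("Failed to create fix: {err}"),
        }
    }
}

// Errors are bubbled up with `?` rather than unwrapped with .expect()/.unwrap().
fn generate_fix(bases: &[&str]) -> Result<Fix> {
    let last = bases.last().context("Expected at least one base class")?;
    Ok(Fix(format!("move Generic[...] after `{last}`")))
}

fn main() {
    let mut diagnostic = Diagnostic::default();
    diagnostic.try_set_fix(|| generate_fix(&["int", "Generic[T]"]));
    println!("{:?}", diagnostic.fix);
}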

@tusharsadhwani
Contributor Author

Okay, this is much cleaner.

Member

@AlexWaygood AlexWaygood left a comment

Thanks! :D

@AlexWaygood AlexWaygood changed the title [flake8-pyi] Implement PYI059 [flake8-pyi] Implement PYI059 (generic-not-last-base-class) May 7, 2024
@AlexWaygood AlexWaygood enabled auto-merge (squash) May 7, 2024 10:02
@AlexWaygood AlexWaygood merged commit bc3f4fa into astral-sh:main May 7, 2024
19 checks passed
@tusharsadhwani tusharsadhwani deleted the pyi-059 branch May 7, 2024 18:30