[Python Formatter] Preserve empty lines #10639

robincaloudis · 2024-03-27T22:15:22Z

Summary

#9958 reported that code like

b = """
    This looks like a docstring but is not in a notebook because notebooks can't be imported as a module.
    Ruff should leave it as is
""";


a = "another normal string"

gets formatted as

b = """
    This looks like a docstring but is not in a notebook because notebooks can't be imported as a module.
    Ruff should leave it as is
""";
a = "another normal string"

This is unexpected: the formatter should preserve the empty lines between b and a.

What

Modified lines_after so that it respects the semicolon. In my opinion, the extension belongs in this function, as a semicolon should not break the counting of lines.
Test call sites
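To illustrate the intent, here is a minimal Python model of the line counting. It is a sketch only, not the Rust implementation in crates/ruff_python_trivia: the `skip_semicolon` parameter is hypothetical and exists just to compare the old and new behaviour, and `offset` is assumed to point at the end of the statement, just before the `;`.

```python
def lines_after(offset: int, code: str, skip_semicolon: bool = True) -> int:
    """Count the newlines that follow `offset`, stopping at the first
    character that is neither whitespace nor (optionally) a semicolon."""
    newlines = 0
    i = offset
    while i < len(code):
        c = code[i]
        i += 1
        if c == "\n":
            newlines += 1
        elif c == "\r":
            # Treat "\r\n" as a single line break.
            if i < len(code) and code[i] == "\n":
                i += 1
            newlines += 1
        elif c in " \t\f":
            continue
        elif c == ";" and skip_semicolon:
            # Sketch of the change: a semicolon no longer ends the count.
            continue
        else:
            break
    return newlines


code = 'b = """\n    text\n""";\n\n\na = "another normal string"\n'
end_of_statement = code.index(";")

with_change = lines_after(end_of_statement, code)  # 3
without_change = lines_after(end_of_statement, code, skip_semicolon=False)  # 0
```

Without skipping the semicolon, the count stops immediately at `;` and the empty lines that follow are never seen, which matches the trimming reported in #9958.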

I intentionally did not modify lines_before, as standalone semicolons are not allowed by the Python syntax anyway, e.g.

b = """
    This looks like a docstring but is not in a notebook because notebooks can't be imported as a module.
    Ruff should leave it as is
""";

;
a = "another normal string"
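The claim that a standalone `;` is rejected can be checked with the built-in compile(). The helper name `is_valid_python` and the shortened string contents are only for illustration.

```python
def is_valid_python(source: str) -> bool:
    """Return True if `source` compiles as a module, False on a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# A trailing semicolon after a statement is accepted.
trailing = 'b = "text";\n\n\na = "another normal string"\n'

# A semicolon on a line of its own is not.
standalone = 'b = "text";\n\n;\na = "another normal string"\n'

trailing_ok = is_valid_python(trailing)      # True
standalone_ok = is_valid_python(standalone)  # False
```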

Test Plan

Correct tests
Add tests

Closes #9958.

robincaloudis · 2024-03-27T22:21:53Z

crates/ruff_python_formatter/tests/snapshots/format@notebook_docstring.py.snap

@@ -33,7 +44,18 @@ source_type                = Ipynb
    This looks like a docstring but is not in a notebook because notebooks can't be imported as a module.
    Ruff should leave it as is
 """
+


Interestingly, this empty line break was previously checked in.

I think the existing behavior is correct because the source type is a Jupyter Notebook, which doesn't have a concept of a module-level docstring. So, it's just a string followed by another string.

Hello @dhruvmanila, thanks for the review! Shouldn't the formatter keep the two empty lines between the two strings even though the concept of a module-level docstring is missing? From what I understood so far, the existing behavior is not correct, and the formatter should preserve the empty lines as is.

github-actions · 2024-03-27T22:42:37Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

robincaloudis · 2024-03-28T08:31:00Z

Some formatting has changed in the ruff-ecosystem. It seems that the backslash (`\`) is not recognized correctly with this change. lines_after_ignoring_end_of_line_trivia probably needs to be adapted so that it interprets SimpleTokenKind::Continuation correctly. I will look into that.

MichaReiser

Thanks for looking into this.

I'm a bit afraid of merging this without more extensive test coverage because the empty lines logic is complicated and using the right combinations of lines_after and lines_before variants is essential to avoid instabilities.

I recommend adding more tests that also include comments, as well as tests that cover suppressed statements with trailing semicolons.

For example, the following comment is misplaced:

if True:
	a = test; # comment

	# trailing end of block comment

c

Gets reformatted to

if True:
    a = test  # comment

# trailing end of block comment

c
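One way to narrow this down is to check that the semicolon changes nothing for the parser. The sketch below uses Python's standard ast module (not Ruff's parser) and compares the two variants of the input; if the trees are equal, the different output presumably comes from comment and blank-line handling rather than from the statement structure.

```python
import ast

with_semicolon = (
    "if True:\n"
    "\ta = test; # comment\n"
    "\n"
    "\t# trailing end of block comment\n"
    "\n"
    "c\n"
)
without_semicolon = with_semicolon.replace(";", "")

# ast.dump omits line and column attributes by default, so only the
# structure of the two trees is compared.
same_ast = ast.dump(ast.parse(with_semicolon)) == ast.dump(ast.parse(without_semicolon))
```

Here `same_ast` is True: both variants parse to the same `if` statement followed by the expression statement `c`.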

crates/ruff_python_formatter/src/statement/suite.rs

crates/ruff_python_trivia/src/tokenizer.rs

dhruvmanila · 2024-03-28T09:37:53Z

Thank you for working on this! I agree with Micha that blank lines are hard to get right even if it seems to be working. Even if the implementation works in isolation, it can break when mixed with other things (speaking from personal experience ;)). As suggested, testing this out thoroughly would be really helpful for us to evaluate the change.

zanieb · 2024-03-29T03:35:45Z

I'm surprised this never occurs in the current ecosystem checks. Is there a project that uses this syntax that we could test on for a real world case?

robincaloudis · 2024-04-04T19:42:25Z

Hi @dhruvmanila and @MichaReiser, thank you both for your review. I agree. I rethought my approach of extending lines_after_ignoring_end_of_line_trivia and decided to extend lines_after instead. Note that I intentionally did not extend lines_before; the details are in the PR description. This new approach should not break any corner cases, as it does not touch the current combination of lines_after and lines_before. @MichaReiser, would you mind re-reviewing? Thank you.

MichaReiser

Thanks for following up on this PR.

Would you mind taking a look at why

if True:
	a = test; # comment

	# trailing end of block comment

c

gets formatted differently when the semicolon is present? We don't need to fix it as part of this PR but I would like to understand the reason for it (to know if it makes sense to address it in a separate PR).

Is there any GitHub project that uses semicolons in Python? I would feel more comfortable if we could test the impact of this change on a larger code base.

MichaReiser · 2024-04-08T13:19:27Z

crates/ruff_python_trivia/src/tokenizer.rs

@@ -76,6 +76,7 @@ pub fn lines_after(offset: TextSize, code: &str) -> u32 {
                cursor.eat_char('\n');
                newlines += 1;
            }
+            ';' => continue,


The function comment is outdated.

lines_after_ignoring_trivia is now no longer a more specific version of lines_after because it doesn't implement the "skip over semicolons" behavior. I think we should keep the two in sync. This possibly also applies to lines_after_ignoring_end_of_line_trivia, but I would need to dig deeper into the actual usage.
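The divergence can be shown with a simplified Python model of the two helpers. This is a sketch under assumptions, not the Rust code: `count_lines_after` and its keyword flags are hypothetical, the regex is a stand-in for the real tokenizer, and it assumes that the trivia-ignoring variant resets its count when it sees a comment.

```python
import re

# Hypothetical mini-tokenizer: line breaks, whitespace runs, comments,
# semicolons, and any other single character.
_TOKEN = re.compile(r"\r\n|\r|\n|[ \t\f]+|#[^\r\n]*|;|.", re.DOTALL)


def count_lines_after(offset, code, *, ignore_comments, skip_semicolons):
    newlines = 0
    for match in _TOKEN.finditer(code, offset):
        token = match.group()
        if token in ("\r\n", "\r", "\n"):
            newlines += 1
        elif token[0] in " \t\f":
            continue
        elif token == ";" and skip_semicolons:
            continue
        elif token.startswith("#") and ignore_comments:
            newlines = 0  # assumption: a comment resets the count
        else:
            break
    return newlines


def lines_after(offset, code):
    # Models lines_after with the `';' => continue` arm from this PR.
    return count_lines_after(offset, code, ignore_comments=False, skip_semicolons=True)


def lines_after_ignoring_trivia(offset, code, skip_semicolons=False):
    # Models the trivia-ignoring variant; by default it stops at `;`.
    return count_lines_after(offset, code, ignore_comments=True, skip_semicolons=skip_semicolons)


code = "a = 1;\n\n\nb = 2\n"
offset = code.index(";")

plain = lines_after(offset, code)                                      # 3
out_of_sync = lines_after_ignoring_trivia(offset, code)                # 0
in_sync = lines_after_ignoring_trivia(offset, code, skip_semicolons=True)  # 3
```

On input without comments the two helpers would be expected to agree; in this model they only do so once both skip the semicolon.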

robincaloudis added 4 commits March 27, 2024 22:51

Fix test

a1e8423

Even though there is a clear new line between the lines, the test was written such that it tests the formatter trimming away empty lines after `;`. This is not expected. Changing the test such that I have something to test against.

Recognize semi correctly

bc72cfa

Test compound level as well

4544bb4

Handle semi in compound level

0618b33

robincaloudis requested a review from MichaReiser as a code owner March 27, 2024 22:15

Change function name

af95c55

Fix formatting

7b6d7ab

robincaloudis changed the title ~~Formatter: Preserve empty lines~~ [Pyton Formatter] Preserve empty lines Mar 27, 2024

robincaloudis mentioned this pull request Mar 27, 2024

Formatter trims empty lines after ; #9958

Open

Test function change

bb04cb1

charliermarsh added the formatter Related to the formatter label Mar 28, 2024

MichaReiser requested changes Mar 28, 2024

crates/ruff_python_formatter/src/statement/suite.rs Outdated

crates/ruff_python_trivia/src/tokenizer.rs Outdated

crates/ruff_python_trivia/src/tokenizer.rs Outdated

Revert changes

345e5a7

This makes the additional test fail again.

robincaloudis force-pushed the rc-semi branch from 677458b to 7f92ee8 Compare March 28, 2024 20:54

Handle semicolon

dc5180a

robincaloudis force-pushed the rc-semi branch from 7f92ee8 to dc5180a Compare March 28, 2024 21:00

zanieb changed the title ~~[Pyton Formatter] Preserve empty lines~~ [Pyhton Formatter] Preserve empty lines Mar 29, 2024

zanieb changed the title ~~[Pyhton Formatter] Preserve empty lines~~ [Python Formatter] Preserve empty lines Mar 29, 2024

robincaloudis added 7 commits April 2, 2024 22:54

Correct string content

3f09cc5

Test trailing semi in module level

41c143c

Test docstring with trailing semi

8dbed40

Handle semi in docstring

e736bfc

Change string content

0b652d6

Make strings fit to module scope

fab1c78

Remove assertion line

011c131

robincaloudis requested review from MichaReiser and dhruvmanila April 5, 2024 12:38

MichaReiser reviewed Apr 8, 2024

MichaReiser added the bug Something isn't working label Apr 8, 2024
