Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Python Formatter] Preserve empty lines #10639

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

robincaloudis
Copy link
Contributor

@robincaloudis robincaloudis commented Mar 27, 2024

Summary

#9958 reported that code like

b = """
    This looks like a docstring but is not in a notebook because notebooks can't be imported as a module.
    Ruff should leave it as is
""";


a = "another normal string"

gets formatted as

b = """
    This looks like a docstring but is not in a notebook because notebooks can't be imported as a module.
    Ruff should leave it as is
""";
a = "another normal string"

This is unexpected and the expected behaviour of the formatter should preserve the empty lines between b and a.

What

  • Modified lines_after such that it respects the semicolon. In my opinion, the extension makes a lot of sense in this function as it should not break counting of lines due to a semicolon
  • Test call sides
  • I intentionally did not modify lines_before as alone standing ;'s are not allowed by the Python syntax anyway, e.g.
    b = """
        This looks like a docstring but is not in a notebook because notebooks can't be imported as a module.
        Ruff should leave it as is
    """;
    
    ;
    a = "another normal string"

Test Plan

  • Correct tests
  • Add tests

Closes #9958.

Even though there is a clear
new line between the lines, the
test was written such that it
tests the formatter trimming
away empty lines after `;`. This
is not expected. Changing the
test such that I have something
to test against.
@@ -33,7 +44,18 @@ source_type = Ipynb
This looks like a docstring but is not in a notebook because notebooks can't be imported as a module.
Ruff should leave it as is
"""

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, this empty like break was previously checked in.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the existing behavior correct because the source type is a Jupyter Notebook which doesn't have a concept of module-level docstring. So, it's just a string followed by another string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @dhruvmanila, thanks for the review! Shouldn't the formatter keep the two empty lines between the two strings even though the concept of module-level docstring is missing? From what I understood so far, the existing behavior is not correct and should respect the empty lines as is.

@robincaloudis robincaloudis changed the title Formatter: Preserve empty lines [Pyton Formatter] Preserve empty lines Mar 27, 2024
Copy link

github-actions bot commented Mar 27, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@charliermarsh charliermarsh added the formatter Related to the formatter label Mar 28, 2024
@robincaloudis
Copy link
Contributor Author

robincaloudis commented Mar 28, 2024

Some formatting has changed in the ruff-ecosystem. Seems like the backslash("\") is not recognized correctly with this change. Probably lines_after_ignoring_end_of_line_trivia needs to be adapted such that it interprets SimpleTokenKind::Continuation correctly. I will look into that.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this.

I'm a bit afraid of merging this without more extensive test coverage because the empty lines logic is complicated and using the right combinations of lines_after and lines_before variants is essential to avoid instabilities.

I recommend you to add more tests that also include comments, as well as tests that cover suppressed statements with trailing semicolons

For example, the following comment is misplaced:

if True:
	a = test; # comment

	# trailing end of block comment

c

Get's reformated to

if True:
    a = test  # comment

# trailing end of block comment

c

crates/ruff_python_formatter/src/statement/suite.rs Outdated Show resolved Hide resolved
crates/ruff_python_trivia/src/tokenizer.rs Outdated Show resolved Hide resolved
crates/ruff_python_trivia/src/tokenizer.rs Outdated Show resolved Hide resolved
@dhruvmanila
Copy link
Member

Thank you for working on this! I agree with Micha that blank lines are hard to get right even if it seems to be working. Even if the implementation work in isolation, it breaks when mixed with other things (speaking from personal experience ;)). As suggested, testing this out thoroughly would be really helpful for us to evaluate the change.

This makes the additional test
fail again.
@zanieb
Copy link
Member

zanieb commented Mar 29, 2024

I'm surprised this never occurs in the current ecosystem checks. Is there a project that uses this syntax that we could test on for a real world case?

@zanieb zanieb changed the title [Pyton Formatter] Preserve empty lines [Pyhton Formatter] Preserve empty lines Mar 29, 2024
@zanieb zanieb changed the title [Pyhton Formatter] Preserve empty lines [Python Formatter] Preserve empty lines Mar 29, 2024
@robincaloudis
Copy link
Contributor Author

robincaloudis commented Apr 4, 2024

Hi @dhruvmanila and @MichaReiser, thank you both for your review. I agree. I rethought my current approach of extending lines_after_ignoring_end_of_line_trivia and decided to extend lines_after instead. Note that I intentionally did not extend lines_before. Find details in the PR description. This new approach should not mess up any corner cases, as it does not touches the current combination of lines_after and lines_before. @MichaReiser, do you mind to re-review? Thank you.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for following up on this PR.

Would you mind taking a look at why

if True:
	a = test; # comment

	# trailing end of block comment

c

gets formatted differently when the semicolon is present? We don't need to fix it as part of this PR but I would like to understand the reason for it (to know if it makes sense to address it in a separate PR).

Is there any github project that has some usage of semicolons in Python? I would feel more comfortable when we could test the impact of this change on a larger code base.

@@ -76,6 +76,7 @@ pub fn lines_after(offset: TextSize, code: &str) -> u32 {
cursor.eat_char('\n');
newlines += 1;
}
';' => continue,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function comment is outdated.

The lines_after_ignoring_trivia is now no longer a more specific version of lines_after because it doesn't implement the "skip over commas". I think we should keep the two in sync. This possibly also applies to lines_after_ignoring_end_of_line_trivia but I would need to dig deeper into the actual usage.

@MichaReiser MichaReiser added the bug Something isn't working label Apr 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working formatter Related to the formatter
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Formatter trims empty lines after ;
5 participants