Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Visualize CR and LF symbols #465

Closed
vlsi opened this issue Oct 1, 2019 · 12 comments · Fixed by #485
Closed

Visualize CR and LF symbols #465

vlsi opened this issue Oct 1, 2019 · 12 comments · Fixed by #485

Comments

@vlsi
Copy link
Contributor

vlsi commented Oct 1, 2019

CR and LF are invisible in Spotless output.

For instance:

      src/core/src/main/java/org/apache/jmeter/SplashScreen.java
          @@ -52,7 +52,7 @@
                   setLocationRelativeTo(null);\n
               }\n
           \n
          -    private int x=2;\r\n
          +    private int x=2;\n
           \n
               /**\n
                * Show screen\n

So far so good. Spotless shows that \n should not be there (which is true because I'm on macOS, and lines should end with LF).

However, if I add one more line with "trailing whitespace", then the output becomes

      src/core/src/main/java/org/apache/jmeter/SplashScreen.java
          @@ -53,7 +53,7 @@
           ····}

           ·····private·int·x=2;
          -·····private·int·y=3;··
          +·····private·int·y=3;

           ····/**
           ·····*·Show·screen

Note: x=2 still has the wrong line ending, however, spotless does not print a warning on that.

spotlessApply corrects both issues (newline and the trailing whitespace).

Have you considered to use unicode characters like
SYMBOL FOR CARRIAGE RETURN, ␍
SYMBOL FOR LINE FEED, ␊
?

Then the output could be like

      src/core/src/main/java/org/apache/jmeter/SplashScreen.java
          @@ -53,7 +53,7 @@
           ····}

          -·····private·int·x=2;␍␊
          -·····private·int·y=3;··␊
          -·····private·int·x=2;␊
          +·····private·int·y=3;␊

           ····/**
           ·····*·Show·screen
@nedtwigg
Copy link
Member

nedtwigg commented Oct 1, 2019

I don't remember exactly, but it must be that we do some fancy "the only difference is line-endings" detection to decide whether to show them. Your suggestion is simpler in that we don't have to do that. So it seems like a good idea to me, I'd be happy to merge a PR for this.

My only concern is if the ␍ and ␊ symbols have any compat problems (e.g. if Travis or some other popular CI system garbles them, then the old approach seems better to me).

@vlsi
Copy link
Contributor Author

vlsi commented Oct 1, 2019

What do you think of "print dots only for the lines that have differences"?

@nedtwigg
Copy link
Member

nedtwigg commented Oct 1, 2019

So, what we're doing now, but just replace \n with linefeed symbol? I guess so, but that seems like a lateral change.

  • code more simple = good
  • UI more clear = good

Changing \n to linefeed character is just the teensiest bit clearer, and a teensy bit more fragile for unicode / ascii bugs. Seems like a wash unless we're also getting simpler code as a side-effect. But I guess I don't have an objection.

@vlsi
Copy link
Contributor Author

vlsi commented Oct 1, 2019

So, what we're doing now, but just replace \n with linefeed symbol?

For now, there's a logic that flips between the representations depending on the nature of the diff.

As you see, it prints "diff without dots, but with \r\n when only EOL differs", and it prints "diff with dots, but without \r\n when whitespace differs".
I have not opened the code yet.

However, I just thought it might be a nice improvement to print dots only for those lines that have whitespace issues.
At the same time, it might make sense to print fancy CRLF symbols only for the lines that have EOL issues.

That would keep code more clear, and it would be easier to spot the difference.

@nedtwigg
Copy link
Member

nedtwigg commented Oct 1, 2019

I thought it might be a nice improvement to print dots only for those lines that have whitespace issues.

Code less simple, UI more clear. I think this is an overall improvement.

fancy CRLF symbols vs \n and \r\n

This seems like bling rather than meat. If it simplifies the code, then it becomes meat. But if we're also adding your whitespace comment above, then it's just bling. And if there's a user using a tool with unicode problems, then they probably don't appreciate the bling.


Taking a step back for a minute, our fancy "diff detection" seems neat, but users don't know how it works. We think "oh we're being so helpful we only show X when there's an X problem". But it's totally possible that every day there are a thousand people looking at a Spotless error message who say "well I can see I used the wrong case here, but why isn't it showing me my whitespace anymore? Oh well I'll fix the case." So they fix the case manually, then Spotless says "WRONG! Look at this whitespace problem" so they fix that and then it says "WRONG! Look at this line-ending problem". And the user says "why can't they just show it all to me in the first place!"

Code has two stakeholders - builders and users. And they each benefit from simple code - builders are less likely to screw it up, and users are more likely to understand what's happening.

My instinct on this is that it's hard to read if every line ends with \r\n, but not so bad if every line ends with ␍␊. If the programmer tools ecosystem actually works with unicode, then it would be smart to simplify the code and replace \t\r\n with their fancy-shmancy unicode symbols, and make the diff simpler. But if we're going to do the fancy-shmancy logic to only show the changed part of the diff, then we might as well stay ASCII so that we're resilient to unicode bugs.

@vlsi
Copy link
Contributor Author

vlsi commented Oct 1, 2019

We think "oh we're being so helpful we only show X when there's an X problem".

Frankly speaking, dots in diffs make code harder to read for me.
Typical IDEs and code browsers don't really use dots instead of whitespace, so showing dots everywhere looks interesting, but it does disturb.

@nedtwigg
Copy link
Member

nedtwigg commented Oct 1, 2019

Sounds like "print dots only for those lines that have whitespace issues." is where we have agreement.

@vlsi
Copy link
Contributor Author

vlsi commented Oct 2, 2019

I see that Spotless does not really support CR-based (==\r only) end of lines.

If that is the case, then there's no need to visualize ␊

@nedtwigg
Copy link
Member

nedtwigg commented Oct 2, 2019

The problem is user-education. If Spotless says this line ends with \r, the user will have a hard time guessing that Spotless only shows CR, and even then it only shows that when there's a mismatch.

@vlsi
Copy link
Contributor Author

vlsi commented Nov 5, 2019

Running Spotless with file.encoding=ISO-8859-1 produces something like

17:01:43            �*���http://www.apache.org/licenses/LICENSE-2.0
17:01:43            �*
17:01:43            �*�Unless�required�by�applicable�law�or�agreed�to�in�writing,�software
17:01:43           -�*�distributed��under�the��License�is�distributed�on�an�"AS�IS"�BASIS,
17:01:43           -�*�WITHOUT��WARRANTIES�OR�CONDITIONS��OF�ANY�KIND,�either��express��or
17:01:43           -�*�implied.
17:01:43           -�*
17:01:43           +�*�distributed�under�the�License�is�distributed�on�an�"AS�IS"�BASIS,
17:01:43           +�*�WITHOUT�WARRANTIES�OR�CONDITIONS�OF�ANY�KIND,�either�express�or�implied.
17:01:43            �*�See�the�License�for�the�specific�language�governing�permissions�and
17:01:43            �*�limitations�under�the�License.
17:01:43           +�*

In other words, Spotless should probably refrain from using characters that are not encodable with file.encoding.

In other words, there should probably be two styles of "whitespace visualization", and the default one should be selected based on the encoding capabilities.

vlsi added a commit to vlsi/spotless that referenced this issue Nov 8, 2019
context lines, added-only lines, and removed-only lines are shown as usual in the diffs.

fixes diffplug#465
vlsi added a commit to vlsi/spotless that referenced this issue Nov 10, 2019
context lines, added-only lines, and removed-only lines are shown as usual in the diffs.

fixes diffplug#465
vlsi added a commit to vlsi/spotless that referenced this issue Nov 11, 2019
context lines, added-only lines, and removed-only lines are shown as usual in the diffs.

fixes diffplug#465
vlsi added a commit to vlsi/spotless that referenced this issue Nov 11, 2019
context lines, added-only lines, and removed-only lines are shown as usual in the diffs.

fixes diffplug#465
@nedtwigg
Copy link
Member

Released in x.26.0

@vlsi
Copy link
Contributor Author

vlsi commented Nov 12, 2019

Migrated to autostyle/autostyle#2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants