Preserve unusual unicode whitespace #1658
This is probably not ready to merge yet, I just wanted to put it here to gather feedback.
This is way out of my comfort zone. Who do you think should review this? If no one, I'd be happy to just merge it if you believe this is the right thing to do :)
This is only about JSX, right?
Yep, just for text in JSX! I haven't tested
@vjeux This is also a little out of my comfort zone 😀 I'm pretty sure it is the right thing to do, I just want to test that my assumptions about JSX are true by testing the behaviour of a range of white space characters. Once I've done that then this should be good to merge!
What is a "normal JSX string"? Where did you find the
This test file suggests that only sequences of
<!DOCTYPE html>
<meta charset="utf8">
<title>space test</title>
<p>space b</p>
<p>\fb</p>
<p>\n
b</p>
<p>\r
b</p>
<p>\t b</p>
<p>\v�b</p>
<p>\u00a0 b</p>
<hr>
<p>space b</p>
<p>\f b</p>
<p>\n
b</p>
<p>\r
b</p>
<p>\t b</p>
<p>\v � b</p>
<p>\u00a0 b</p> |
Thanks, that is the test case I was about to whip up! I got the character range by taking all the white space chars that match the I suspected I didn't have the correct range, so wanted to test before requesting a review of this PR. I'll update the PR based on your tests!
"normal JSX strings" is a typo (adding these comments from my phone) it was meant to say "normal JS strings"!
I’ve read up a bit on whitespace handling in JSX. Apparently, it is not part of the JSX spec: facebook/jsx#19 (comment). Unlike JS whitespace, which is discarded while parsing, JSX whitespace is preserved in the AST nodes containing JSX text. For example, try this in the REPL and you’ll see it:
require("babylon").parse("<div>\n three spaces: end\n</div>")
It is instead up to the tool, such as Babel, that transforms JSX AST nodes to JS AST nodes (function calls) to do stuff with the whitespace. Playing around in the Babel REPL confirms that Babel does not treat Also, Babel does not collapse spaces between words. Prettier 1.3.1 respects that, but latest master collapses it into 1 space. Perhaps I should open a separate issue about this?
So in theory we can’t change JSX formatting at all without changing the meaning of the program. But it would be super sad to give up all our super nice JSX formatting! After all, JSX is most commonly used with Babel anyway.
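To make the point about transform-time whitespace handling concrete, here is a minimal sketch of the kind of line-based cleaning a JSX transform might apply to a text child. It assumes that only spaces and tabs next to line breaks are trimmed and that everything else is left alone. `cleanJsxText` is a hypothetical name, and this is not Babel's actual source.

```javascript
// Hypothetical helper, loosely modelled on the line-based cleaning a JSX
// transform may apply to text children. Not Babel's actual implementation.
function cleanJsxText(value) {
  const lines = value.split(/\r\n|\n|\r/);

  // Index of the last line that contains something other than spaces/tabs.
  let lastNonEmptyLine = 0;
  lines.forEach((line, i) => {
    if (/[^ \t]/.test(line)) lastNonEmptyLine = i;
  });

  let result = "";
  lines.forEach((line, i) => {
    // Tabs are treated like ordinary spaces.
    let trimmed = line.replace(/\t/g, " ");
    // Drop spaces that touch a line break, but keep spaces between words.
    if (i !== 0) trimmed = trimmed.replace(/^ +/, "");
    if (i !== lines.length - 1) trimmed = trimmed.replace(/ +$/, "");
    if (trimmed) {
      // Lines are joined with a single space.
      if (i !== lastNonEmptyLine) trimmed += " ";
      result += trimmed;
    }
  });
  return result;
}

console.log(JSON.stringify(cleanJsxText("\n  three   spaces\n")));
console.log(JSON.stringify(cleanJsxText("\u00a0\n")));
```

Under these assumptions, spaces between words on the same line survive, and a non-breaking space is never trimmed because only the plain space and tab characters are matched.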
Interesting stuff! I didn't know that the JSX spec treated whitespace as significant! I'm not sure what that means for Prettier or this PR. Are we happy for Prettier to keep treating JSX whitespace as something that can be modified? I've updated the PR to only count
This PR currently removes duplicate spaces between JSX elements. Based on @lydell's findings it looks like we will probably need to keep them. I'll update the PR when I have a moment, but until then consider this unsafe to merge!
This no longer converts unusual unicode whitespace characters (such as a non-breaking space) into normal spaces.
I've updated this PR to keep the existing behaviour of maintaining multiple whitespaces between tags. It is now ready for review and merge 😀
Happy to take a look at #1581. Might be a few days until I have the time, but I don't think it should be too difficult to fix.
Heh. So #1581 looks like a case where "we can’t change JSX formatting at all without changing the meaning of the program", but it looks fixable without being so drastic.
tests/jsx-whitespace/test.js
@@ -0,0 +1,22 @@
// Should collapse
Same here.
// Jest Snapshot v1, https://goo.gl/fbAQLP

exports[`test.js 1`] = `
// Should collapse
These shouldn't collapse. If it's out of scope for this PR, we could just change the comment.
Good catch!
I've tweaked the comments to remove any mention of collapsing.
@@ -46,7 +46,7 @@ raw_amp = <span>foo & bar</span> | |||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |||
many_nbsp = <div> </div>; | |||
single_nbsp = <div> </div>; | |||
many_raw_nbsp = <div> </div>; | |||
many_raw_nbsp = <div> </div>; |
These have turned from regular spaces back to non-breaking spaces, right? That's great!
Yay, awesome work on this :)
This PR preserves unusual unicode whitespace characters.
Currently any whitespace that matches a unicode whitespace character gets converted to a normal space. This means we lose whitespace characters with special meanings (such as non-breaking spaces and zero width spaces).
In this PR we no longer treat anything other than newline, tab, and space characters as whitespace. This allows us to preserve these unusual unicode whitespace characters as is.
With this change, I believe we no longer need to preserve multiple space characters when formatting JSX. Because they can no longer contain unusual whitespace, we know that they will collapse down to a single space when rendering, so it is safe to do that in Prettier.
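The difference between the two whitespace definitions in the description can be sketched roughly as follows. Both helper names are hypothetical, and this is an illustration of the idea rather than Prettier's actual implementation. Treating `\r` as a line break is also an assumption.

```javascript
// Hypothetical helpers for illustration; not Prettier's actual code.

// Broad definition: \s matches every Unicode whitespace character,
// including the non-breaking space (U+00A0).
function splitWordsBroad(text) {
  return text.split(/\s+/).filter(Boolean);
}

// Narrow definition: only space, tab, and line breaks separate words.
// Including \r among the line breaks is an assumption.
function splitWordsNarrow(text) {
  return text.split(/[ \t\r\n]+/).filter(Boolean);
}

const text = "10\u00a0km  per\n\thour";

// The broad split sees four words, so rejoining them with " " would
// replace the non-breaking space with a normal space.
console.log(splitWordsBroad(text).length);

// The narrow split keeps "10\u00a0km" together as one word, so the
// non-breaking space survives reformatting.
console.log(splitWordsNarrow(text).length);
```

With the narrow definition, a non-breaking space is just another character inside a word, which is why it passes through unchanged.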