Skip to content

Latest commit

 

History

History
104 lines (78 loc) · 4.13 KB

File metadata and controls

104 lines (78 loc) · 4.13 KB

[HIGHLIGHT] Improve handling of whitespace for Chinese, Japanese, and Korean (#11597 by @tats-u)

Stop inserting spaces between Chinese or Japanese and Western characters

Previously, Prettier would insert spaces between Chinese or Japanese and Western characters (letters and digits). While some people prefer this style, it isn’t standard, and is in fact contrary to official guidelines. Please see here for more details. We decided it’s not Prettier’s job to enforce a particular style in this case, so spaces aren’t inserted anymore, while existing ones are preserved. If you need a tool for enforcing spacing style, consider textlint-ja or lint-md (rules space-round-alphabet and space-round-number).

The tricky part of this change were ambiguous line breaks between Chinese or Japanese and Western characters. When Prettier unwraps text, it needs to decide whether such a line break should be simply removed or replaced with a space. For that Prettier examines the surrounding text and infers the preferred style.

<!-- Input -->
漢字
Alphabetsひらがな12345カタカナ67890

漢字 Alphabets ひらがな 12345 カタカナ 67890

<!-- Prettier stable -->
漢字 Alphabets ひらがな 12345 カタカナ 67890

漢字 Alphabets ひらがな 12345 カタカナ 67890

<!-- Prettier main -->
漢字Alphabetsひらがな12345カタカナ67890

漢字 Alphabets ひらがな 12345 カタカナ 67890
Comply to line breaking rules in Chinese and Japanese

There are rules that prohibit certain characters from appearing at the beginning or the end of a line in Chinese and Japanese. E.g., full stop characters , , and . shouldn’t start a line whereas shouldn’t end a line. Prettier now follows these rules when it wraps text, that is when proseWrap is set to always.

<!-- Input -->
HTCPCPのエラー418は、ティーポットにコーヒーを淹(い)れさせようとしたときに返されるステータスコードだ。

<!-- Prettier stable with --prose-wrap always --print-width 8 -->
HTCPCP の
エラー
418 は、
ティーポ
ットにコ
ーヒーを
淹(い)
れさせよ
うとした
ときに返
されるス
テータス
コードだ
。

<!-- Prettier main with the same options -->
HTCPCPの
エラー
418は、
ティー
ポットに
コーヒー
を淹
(い)れ
させよう
としたと
きに返さ
れるス
テータス
コード
だ。
Do not break lines inside Korean words

Korean uses spaces to divide words, and an inappropriate division may change the meaning of a sentence:

  • 노래를 못해요.: I’m not good at singing.
  • 노래를 못 해요.: I can’t sing (for some reason).

Previously, when proseWrap was set to always, successive Hangul characters could get split by a line break, which could later be converted to a space when the document is edited and reformatted. This doesn’t happen anymore. Korean text is now wrapped like English.

<!-- Input -->
노래를 못해요.

<!-- Prettier stable with --prose-wrap always --print-width 9 -->
노래를 못
해요.

<!-- Prettier stable, subsequent reformat with --prose-wrap always --print-width 80 -->
노래를 못 해요.

<!-- Prettier main with --prose-wrap always --print-width 9 -->
노래를
못해요.

<!-- Prettier main, subsequent reformat with --prose-wrap always --print-width 80 -->
노래를 못해요.

A line break between Hangul and non-Hangul letters and digits is converted to a space when Prettier unwraps the text. Consider this example:

3분 기다려 주지.

In this sentence, if you break the line between “3” and “분”, a space will be inserted there when the text gets unwrapped.