Don’t recognize U+2028 as a newline character #581
As of https://github.com/tc39/proposal-json-superset (ES2019), U+2028 is allowed in ECMAScript string literals, just like it already was in JSON strings. Splitting on U+2028 breaks the layout of a Markdown file containing a code example:

https://markdown-it.github.io/#md3=%7B%22source%22%3A%22a%5Cn%5Cn%5Cn%60%60%60js%5Cn%2F%2F%20A%20JavaScript%20object%20%28or%20array%2C%20or%20string%29%20representing%20some%20data.%5Cnconst%20data%20%3D%20%7B%5Cn%20%20LineTerminators%3A%20%27%5C%5Cn%5C%5Cr%E2%80%A8%E2%80%A9%27%2C%20%2F%2F%20%27%5C%5Cn%5C%5Cr%5C%5Cu2028%5C%5Cu2029%27%5Cn%7D%3B%5Cn%60%60%60%5Cn%5Cnb%22%2C%22defaults%22%3A%7B%22html%22%3Afalse%2C%22xhtmlOut%22%3Afalse%2C%22breaks%22%3Afalse%2C%22langPrefix%22%3A%22language-%22%2C%22linkify%22%3Atrue%2C%22typographer%22%3Atrue%2C%22_highlight%22%3Atrue%2C%22_strict%22%3Afalse%2C%22_view%22%3A%22html%22%7D%7D

This patch removes U+2028 as a newline character, to align markdown-it’s behavior more closely with the JavaScript string literal grammar.
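To make the problem concrete, here is a minimal sketch of how a parser that treats U+2028 as a line break can split a single line of code in two. The function `splitLinesLoose` and its character set are hypothetical, written only for illustration; they are not copied from markdown-it's source.

```javascript
// Hypothetical line splitter (an assumption for illustration, not markdown-it's
// actual code): it treats U+0085, U+2424 and U+2028 as line breaks in addition
// to CR and LF.
function splitLinesLoose(source) {
  return source.replace(/\r\n?|[\u0085\u2424\u2028]/g, '\n').split('\n');
}

// A raw U+2028 is valid inside a JSON string, and since ES2019 also inside
// an ECMAScript string literal.
const parsed = JSON.parse('"\u2028"');
console.log(parsed.length); // 1

// One line of JavaScript source with a raw U+2028 inside its string literal.
const codeLine = "const s = '\u2028';";

// The loose splitter cuts the line in the middle of the string literal.
const looseLines = splitLinesLoose(codeLine);
console.log(looseLines.length); // 2
```

Inside a fenced code block, that extra break is what distorts the rendered layout.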
Hi Mathias! As far as I can see, the CommonMark spec mentions only 0x0A and 0x0D: https://spec.commonmark.org/0.29/#preliminaries. You are right, we need to clarify the condition. What do you think about the rest of the characters?
Ooh, good point: it would be nice to match CommonMark exactly, which would be even simpler! Happy to tweak this patch accordingly if you decide so. Could you share any background on why those other characters (U+0085, U+2424) were added to markdown-it in the first place? Was there user demand for them somehow?
To be honest, I don't remember. I probably imagined myself to be cleverer than the spec authors :). I need to search the CM forum to see whether anyone made such requests. What is your personal advice, based on your experience? Remove everything and leave only 0x0A and 0x0D?
I’d personally prefer to only do what CM requires, indeed. My PR was initially a smaller change (just removing U+2028 from the set) because I assumed there would be opposition to the other removals. I’ve just updated the patch accordingly. Please take a look. Thanks!
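For reference, a rough sketch of what "only what CM requires" could look like: CommonMark defines a line ending as LF, CR, or CR followed by LF, so normalizing those to LF is enough. The helper names below are hypothetical, and the regular expression is an assumption about one way to implement this, not a quote from the patch.

```javascript
// CommonMark line endings: LF (0x0A), CR (0x0D), or CR followed by LF.
// Hypothetical helper: normalize all of them to a single LF.
function normalizeLineEndings(source) {
  return source.replace(/\r\n?/g, '\n');
}

// Hypothetical helper: split into lines after normalization.
function splitLinesStrict(source) {
  return normalizeLineEndings(source).split('\n');
}

// CRLF, lone CR and LF each end a line; U+2028 and U+0085 stay in the text.
const sample = 'a\r\nb\rc\nd\u2028e\u0085f';
const lines = splitLinesStrict(sample);
console.log(lines.length); // 4
console.log(lines[3] === 'd\u2028e\u0085f'); // true
```

With this approach, a raw U+2028 inside a code example stays on its original line.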
Thank you, merged. Should I release ASAP, or is it not urgent?
I’d love to see this in a release soon :) But don’t worry, if it’s too much trouble, I’ll just npm install the master branch for now.
Released.
Thank you for the quick turnaround! |