Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Touching italic/bold doesn't render right (v1.1.1 regression) #1754

Closed
fredck opened this issue Sep 3, 2020 · 10 comments · Fixed by #1755
Closed

Touching italic/bold doesn't render right (v1.1.1 regression) #1754

fredck opened this issue Sep 3, 2020 · 10 comments · Fixed by #1755
Labels
category: inline elements category: mixed content L1 - broken Valid usage causes incorrect output OR a crash AND there is no known workaround for the issue released

Comments

@fredck
Copy link

fredck commented Sep 3, 2020

When italic and bold are touching, italic is ignored:

// Input
_te_**st**

// Expected output
<p><em>te</em><strong>st</strong></p>

// Actual output (wrong)
<p>_te_<strong>st</strong></p>

This was working with v1.1.0 and is not working with v1.1.1.

@styfle styfle added category: mixed content L2 - annoying Similar to L1 - broken but there is a known workaround available for the issue category: inline elements labels Sep 3, 2020
@styfle
Copy link
Member

styfle commented Sep 3, 2020

Thanks for reporting this issue! It looks like it was fixed in v0.6.0 and broke in v1.1.1 🤔

This was mentioned in #1353 years ago but I'll elevate this to L1 since it was working for a couple years and regression in the latest release 👍

Would you like to create a PR to fix it? 🙂

@styfle styfle added L1 - broken Valid usage causes incorrect output OR a crash AND there is no known workaround for the issue and removed L2 - annoying Similar to L1 - broken but there is a known workaround available for the issue labels Sep 3, 2020
@calculuschild
Copy link
Contributor

I suspect this can be fixed in this line by adding an asterisk to the "punctuation" group. Haven't tested it, but probably the place to look.

endUnd: /[^\s]_(?!_)(?:(?=[punctuation\s])|$)/ // last char can't be a space, and final _ must preceed punct or \s (or endline)

endUnd: /[^\s]_(?!_)(?:(?=[*punctuation\s])|$)/ // last char can't be a space, and final _ must preceed punct or \s (or endline) 

Any PR should include tests for all valid permutations of this since evidently this has been fixed and broken over the years without being detected. i.e.: (all of these seem to generate valid tags on both sides within github)

  1. _te_**st** test
  2. _te_*st* test
  3. *te*__st__ test
  4. *te*_st_ test
  5. __te__*st* test
  6. **te**_st_ test
  7. **te**__st__ test
  8. __te__**st** test

@fredck
Copy link
Author

fredck commented Sep 4, 2020

I suspect this can be fixed in this line by adding an asterisk to the "punctuation" group.

Found a few lines later:

marked/src/rules.js

Lines 191 to 192 in a41d8f9

// list of punctuation marks from common mark spec
// without * and _ to workaround cases with double emphasis

@fredck
Copy link
Author

fredck commented Sep 4, 2020

Would you like to create a PR to fix it? 🙂

I was hoping for @UziTech to jump in, as this regression may have been introduced by their PR (#1686). It may be more difficult for me, but I may try to check this in the future.

@UziTech
Copy link
Member

UziTech commented Sep 4, 2020

I can take a look at it this weekend. Thanks for the test cases @calculuschild.

@UziTech
Copy link
Member

UziTech commented Sep 4, 2020

I started PR #1755 to fix this with @calculuschild's suggestion.

There are a few other cases I think we should test that are still failing such as _te___st__ => <em>te___st</em>_

Or we could just consider these unspecified behavior.

I'll have to look closer at the spec to see if these cases should pass.

@calculuschild
Copy link
Contributor

such as te___st_ => te___st_

I don't see this case exactly in the GFM spec; the closest being Example #424. There's the comment below Test #420 that says an emphasis delimiter can't exist if the size of the delimiter runs containing the opening and closing delimiters sums to a multiple of 3, so in this case:

1te234st__ means we could have <em>te</em><strong>st</strong> but not 1te___st23 => <em>te___st_</em> which is what we have now.

I guess the best answer is to look at Babelmark which shows literally every implementation has a different answer. Commonmark is probably the "most accurate" though, and they get around the multiples of 3 rule by excluding the final underscore as per Example #467. Markdown 1.1.1 is probably the second best out of that list though :).

@UziTech
Copy link
Member

UziTech commented Sep 4, 2020

I would like to get marked to follow common mark on all of these examples.

demo

markdown commonmark
_te_*st* <em>te</em><em>st</em>
_te_**st** <em>te</em><strong>st</strong>
_te__st_ <em>te__st</em>
_te___st__ <em>te___st</em>_
*te**st* <em>te**st</em>
*te***st** <em>te</em><strong>st</strong>
*te*_st_ <em>te</em><em>st</em>
*te*__st__ <em>te</em><strong>st</strong>
__te__*st* <strong>te</strong><em>st</em>
__te__**st** <strong>te</strong><strong>st</strong>
__te___st_ _<em>te___st</em>
__te____st__ <strong>te____st</strong>
**te***st* <strong>te</strong><em>st</em>
**te****st** <strong>te****st</strong>
**te**_st_ <strong>te</strong><em>st</em>
**te**__st__ <strong>te</strong><strong>st</strong>

@UziTech
Copy link
Member

UziTech commented Sep 4, 2020

#1755 already passes all of the tests in #1754 (comment) but to pass the rest might require a change in the tokenizer.

@github-actions
Copy link

🎉 This issue has been resolved in version 1.1.2 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: inline elements category: mixed content L1 - broken Valid usage causes incorrect output OR a crash AND there is no known workaround for the issue released
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants