Bare URLs support behind a new flag #695

Open
Hywan opened this issue Jul 26, 2023 · 7 comments

Comments

@Hywan

Hywan commented Jul 26, 2023

Hey!

Let's assume this test:

#[test]
fn foobar() {
    let original = r##"https://foo.bar/_/A and https://baz.qux/_/B"##;
    let expected = r##"<a href="https://foo.bar/_/A">https://foo.bar/_/A</a> and <a href="https://baz.qux/_/B">https://baz.qux/_/B</a>"##;

    test_markdown_html(original, expected, false, false, false);
}

This test is failing:

thread 'suite::spec::foobar' panicked at 'assertion failed: `(left == right)`
  left: `"<a href=\"https://foo.bar/_/A\">https://foo.bar/_/A</a> and <a href=\"https://baz.qux/_/B\">https://baz.qux/_/B</a>"`,
 right: `"<p>https://foo.bar/<em>/A and https://baz.qux/</em>/B</p>"`', tests/lib.rs:41:5

The underscores in the first and second links are treated as emphasis delimiters. It seems that links have a lower priority than emphasis.

Testing this here with GitHub's Markdown renderer: https://foo.bar/_/A and https://baz.qux/_/B; the links are parsed correctly. Honestly, I don't know what the spec says about priorities; I believe it's implementation-dependent.

@jplatte

jplatte commented Jul 26, 2023

What I think other Markdown parsers tend to do is interpret _ as emphasis only if the opening one is right before a word and the closing one is right after a word. So none of the following should work (let's test what GitHub does):

  • a_b c_d: a_b c_d
  • a_ _b: a_ _b
  • _a _b: _a _b
  • a_ b_: b_ a_

… while _a b_ should produce emphasis: a b

@ehuss
Contributor

ehuss commented Jul 26, 2023

pulldown_cmark does not support detection of bare URLs. That is an extension that GitHub's parser has. On dingus, you can see that it parses the same as pulldown_cmark. The URLs need to be enclosed in angle brackets, like <https://foo.bar/_/A>.
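Until such detection exists, one possible workaround is to wrap bare URLs in angle brackets before handing the text to the parser. The following is only a naive sketch using the standard library: `wrap_bare_urls` is a hypothetical helper, not part of pulldown_cmark, and it assumes a URL is any whitespace-delimited token starting with `http://` or `https://`. It does not strip trailing punctuation and does not skip code spans or code blocks.

```rust
/// Hypothetical pre-processing helper (not a pulldown_cmark API):
/// wraps whitespace-delimited `http://` / `https://` tokens in `<...>`
/// so that a CommonMark parser sees them as autolinks.
fn wrap_bare_urls(input: &str) -> String {
    let mut out = String::with_capacity(input.len() + 8);
    // `split_inclusive` keeps the whitespace character that ends each piece,
    // so the original spacing is preserved.
    for piece in input.split_inclusive(char::is_whitespace) {
        let token = piece.trim_end_matches(char::is_whitespace);
        let trailing_ws = &piece[token.len()..];
        if token.starts_with("http://") || token.starts_with("https://") {
            out.push('<');
            out.push_str(token);
            out.push('>');
        } else {
            // Tokens that are already wrapped (they start with `<`) land here
            // and are left untouched.
            out.push_str(token);
        }
        out.push_str(trailing_ws);
    }
    out
}
```

With the input from the original test, `wrap_bare_urls` returns `<https://foo.bar/_/A> and <https://baz.qux/_/B>`, which a CommonMark parser treats as two autolinks.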

@Martin1887
Collaborator

Therefore, supporting bare URLs should be a new feature behind a flag, a non-urgent enhancement.

@Martin1887 changed the title from "2 links with underscores are generating broken HTML" to "Bare URLs support behind a new flag" on Jul 26, 2023
@jplatte

jplatte commented Jul 26, 2023

But what about the more general question of underscores within words being interpreted as emphasis? That's independent of link detection, as I showed above. Is that just a fundamental difference between CommonMark and GFM, i.e. not subject to configuration?

@Martin1887
Collaborator

Your example works in pulldown-cmark. That is the CommonMark behaviour, and pulldown-cmark produces the same output as dingus:

[Screenshots: dingus output; pulldown-cmark output]

@jplatte

jplatte commented Jul 27, 2023

Oh I see, so in the original example it's the _s not being parts of words (there are /s on both sides in the URLs) that causes them to be interpreted as emphasis.
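To make that concrete, here is a rough sketch of the CommonMark flanking rules for a single `_`. It is a simplification written for this thread, not pulldown-cmark's actual code: `underscore_roles` is a hypothetical name, only runs of exactly one underscore are considered, and punctuation is approximated with `is_ascii_punctuation` rather than the spec's full Unicode definition.

```rust
/// Returns `(can_open, can_close)` for the single `_` at char index `idx`,
/// following a simplified reading of the CommonMark flanking rules.
/// Hypothetical helper for illustration only.
fn underscore_roles(text: &str, idx: usize) -> (bool, bool) {
    let chars: Vec<char> = text.chars().collect();
    assert_eq!(chars[idx], '_');

    // The start and end of the input count as whitespace.
    let before = if idx == 0 { ' ' } else { chars[idx - 1] };
    let after = chars.get(idx + 1).copied().unwrap_or(' ');

    let (ws_before, ws_after) = (before.is_whitespace(), after.is_whitespace());
    // Assumption: ASCII punctuation only (the spec uses Unicode categories).
    let (p_before, p_after) = (before.is_ascii_punctuation(), after.is_ascii_punctuation());

    let left_flanking = !ws_after && (!p_after || ws_before || p_before);
    let right_flanking = !ws_before && (!p_before || ws_after || p_after);

    // `_` has stricter rules than `*`: inside a word it can neither open nor close.
    let can_open = left_flanking && (!right_flanking || p_before);
    let can_close = right_flanking && (!left_flanking || p_after);
    (can_open, can_close)
}
```

For the `_` in `a_b` this gives `(false, false)`, so no emphasis. For each `_` in `/_/` it gives `(true, true)`, because the neighbouring `/` characters are punctuation, so the first underscore can open and the second can close an emphasis span.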

@Hywan
Author

Hywan commented Jul 27, 2023

Thanks for the details.


4 participants