Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enh(parser) Warn if unescaped HTML is present #3057

Merged
merged 4 commits into from Mar 22, 2021

Conversation

joshgoebel
Copy link
Member

Closes #2886.

Note: This cannot be merged until V11.

Changes

If unescaped HTML is detected inside the element that has been requested
to highlight then it will be stripped and a warning will be logged to
the console.

This warning can be supressed with the option ignoreUnescapedHTML.

This is done AFTER the before plugin hooks fire, allowing for a plugin
to potentially grab the HTML before hand, replace it with text, and
later do something special - say perhaps merging it back it.

Currently this can happen without touching the ignoreUnescapedHTML
option as the element is passed to the plugin BEFORE the security checks
are performed.

An alternative would be to always do the security checks first - and if
a user wishes to use a plugin that allows HTML then they would have to
both install that plugin AND turn on ignoreUnescapedHTML.

Hard to say if one might be preferrable over the other?

Checklist

  • Added markup tests, or they don't apply here because...
  • Updated the changelog at CHANGES.md

@joshgoebel
Copy link
Member Author

Screen Shot 2021-03-18 at 7 29 53 PM

@allejo
Copy link
Member

allejo commented Mar 20, 2021

An alternative would be to always do the security checks first - and if
a user wishes to use a plugin that allows HTML then they would have to
both install that plugin AND turn on ignoreUnescapedHTML.

I would personally prefer this option. Making it tedious to opt-in to this behavior should make people think, "why is this so tedious to enable?" For example, in React, if you want to bypass the React sanitization and such, you have to use the convoluted dangerouslySetInnerHTML prop. It's convoluted by design to actively discourage this and only if you actually know what you're doing, then you should be doing it.

@joshgoebel
Copy link
Member Author

joshgoebel commented Mar 20, 2021

I would personally prefer this option. Making it tedious to opt-in to this behavior should make people think, "why is this so tedious to enable?"

Isn't it kind of double annoying though if the line right before/after it essentially says the same?

hljs.options({ignoreUnescapedHTML: true});
hljs.registerPlugin(MergeUnescapedHTML);

A little redundant, no?

The warning would STILL show if the plugin didn't strip the HTML from the element... so a plugin that wanted to do this properly (silently) would do:

save_for_later = el.innerHTML
el.innerHTML = el.textContent

So then when the security check runs AFTER all plugins run it would see that the plugin already handled the HTML and hence no need for a security warning... does that make sense?

It'd be nice if plugins could handle this seamlessly, is the idea. I'm still willing to be persuaded but I'm leaning like 60/40 to doing the warning AFTER plugins have a chance to tweak things.

@joshgoebel joshgoebel added this to the 11.0 milestone Mar 20, 2021
@allejo
Copy link
Member

allejo commented Mar 21, 2021

hljs.options({ignoreUnescapedHTML: true});
hljs.registerPlugin(MergeUnescapedHTML);

A little redundant, no?

Redundant? Yes. But the way I see it is the first line is to disable the warning emitted by hljs. And the second line is to actually handle the behavior.

Is it be possible for a plugin to change the hljs options? i.e. have the MergeUnescapedHTML plugin set ignoreUnescapedHTML to true.

So then when the security check runs AFTER all plugins run it would see that the plugin already handled the HTML and hence no need for a security warning... does that make sense?

But I would agree that doing the security check after the plugin runs would be a good idea since it'd allow plugins to fix/silence security warnings.

@joshgoebel
Copy link
Member Author

But I would agree that doing the security check after the plugin runs would be a good idea since it'd allow plugins to fix/silence security warnings.

Then I will leave it as it is for now - the plugin runs first. So a properly structured plugin can fix the security issue - making adding HTML merging back a one-liner.

Is it be possible for a plugin to change the hljs options?

No, and I'm not sure that'd be a good idea. :) It did cross my mind. Hopefully we'd have so few options in core that this type of thing wouldn't be necessary (plugins reconfiguring the core engine).

@joshgoebel
Copy link
Member Author

Can this be approved then (the functionality)? The tests won't go green until this is rebased on top of V11... so it'll probably be held for a bit, but I'd like to know if it's good to go or not.

src/highlight.js Outdated
fire("before:highlightElement",
{ el: element, language: language });

if (!options.ignoreUnescapedHTML && element.innerHTML !== element.textContent) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<div id="foo">
  I'm just a &lt;code&gt;sample&lt;/code&gt; block.
</div>

<div id="innerHTML"></div>
<div id="textContent"></div>
<div id="result"></div>

<script>
document.getElementById("innerHTML").innerText = document.getElementById("foo").innerHTML;
document.getElementById("textContent").innerText = document.getElementById("foo").textContent;
document.getElementById("result").innerText = document.getElementById("foo").innerHTML == document.getElementById("foo").textContent;
</script>

https://jsfiddle.net/uj5qzn2e/

Am I doing something wrong? Or does innerHTML need to be have HTML entities transformed before they can be compared with textContent

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, my test cases locally didn't include escaped HTML... that certainly breaks the comparison approach. I think instead perhaps we just need to check node.children...

Copy link
Member

@allejo allejo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

element.children.length > 0 works much better 👍

@joshgoebel joshgoebel changed the base branch from main to version_11 March 22, 2021 20:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Detect HTML/JS injection attacks and warn users pro-actively
2 participants