XMLParser fails for script tag with JS-Code that looks like a start tag #504

Closed
baranor opened this issue Jun 13, 2022 · 4 comments · Fixed by #505
Labels
bug Something isn't working high priority

Comments

baranor commented Jun 13, 2022

I ran into a problem where adding the following HTML code (simplified):

<html>
<head>
    <title>Title</title>
</head>
<body>
  <script type="text/javascript">
    var vars = []
    for (var i=0;i<vars.length;i++) {}
  </script>
</body>
</html>

via document.write(...) (with script execution) or document.documentElement.outerHTML = ... (without script execution).

After some debugging I found the XMLParser is at fault here, as it tries to parse the <vars condition in the for-loop as a new tag.

The problem had a different impact depending on the actual HTML structure:

  • in the original code, the parser eventually recovered, but the textContent of the script tag contained most of the HTML document after that point
  • in the above example, it kept allocating memory until the Node.js process was killed

If I understand the current parser implementation correctly, there is already handling for the similar case of actual tags within the script-tag, but it doesn't catch this one.

I originally stumbled upon this in happy-dom@4.1.0, but it still happens in happy-dom@5.2.0.

Contributor

Mas0nShi commented Jun 14, 2022

Hi @baranor 😀, the HTML parser (XMLParser) has been buggy for a long time...
e.g. #376
We will try to fix it in our spare time. Thank you for your feedback.

Contributor

Mas0nShi commented Jun 14, 2022

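The reason for this bug:

Only one script tag is matched, as the list of matched tag names below shows.

The regex can't correctly match </script>: it treats <vars.length as a start tag and consumes everything up to the next >, which swallows the closing tag.

Incorrect match result: <vars.length;i++) {}</script>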

html
head
title
title
head
body
script
vars.length
body
html
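
For illustration, the tag sequence above can be reproduced with a simplified stand-in pattern. The `tagRegexp` below is an assumption made for this sketch, not the actual `markupRegexp` from happy-dom; it only shows how a generic tag pattern can read `<vars.length` as a start tag and swallow the closing `</script>`.

```javascript
// Hypothetical tag-matching pattern for illustration only; NOT happy-dom's regex.
// Group 1: "/" for an end tag, group 2: the tag name.
const tagRegexp = /<(\/?)([a-zA-Z][\w.:-]*)[^>]*>/g;

// The simplified document from the original report.
const html = [
  '<html>',
  '<head>',
  '    <title>Title</title>',
  '</head>',
  '<body>',
  '  <script type="text/javascript">',
  '    var vars = []',
  '    for (var i=0;i<vars.length;i++) {}',
  '  </script>',
  '</body>',
  '</html>'
].join('\n');

// Collect every match of the generic pattern over the whole document.
const matches = [...html.matchAll(tagRegexp)];
const tagNames = matches.map((match) => match[2]);

console.log(tagNames.join(' '));
// html head title title head body script vars.length body html

// The bogus "tag" runs from "<vars.length" to the ">" of "</script>".
const bogus = matches.find((match) => match[2] === 'vars.length');
console.log(JSON.stringify(bogus[0]));
```

Because `</script>` ends up inside the bogus match, a loop that waits for an end tag named `script` never sees one.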

if (ChildLessElements.includes(tagName)) {
    let childLessMatch = null;
    while ((childLessMatch = markupRegexp.exec(data))) {
        if (childLessMatch[2] === match[2] && childLessMatch[1]) {
            markupRegexp.lastIndex -= childLessMatch[0].length;
            break;
        }
    }
}
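
One possible way to avoid this class of problem, sketched here under stated assumptions, is to stop running the generic tag pattern inside raw-text elements and search directly for the matching end tag instead. This is not necessarily what the fix in #505 does; `TAG_PATTERN`, `RAW_TEXT_ELEMENTS`, and `tokenize` are hypothetical names for this example, and the end-tag search is a simplification of what the HTML specification allows.

```javascript
// Assumption: simplified stand-in for a generic tag pattern, not happy-dom's markupRegexp.
const TAG_PATTERN = '<(\\/?)([a-zA-Z][\\w.:-]*)[^>]*>';
// Hypothetical list of elements whose content is treated as raw text.
const RAW_TEXT_ELEMENTS = ['script', 'style'];

function tokenize(source) {
  const tagRegexp = new RegExp(TAG_PATTERN, 'g');
  const tokens = [];
  let textStart = 0;
  let match;

  while ((match = tagRegexp.exec(source)) !== null) {
    // Emit any text between the previous tag and this one.
    if (match.index > textStart) {
      tokens.push({ type: 'text', value: source.slice(textStart, match.index) });
    }
    const isEndTag = match[1] === '/';
    const name = match[2].toLowerCase();
    tokens.push({ type: isEndTag ? 'end' : 'start', name });
    textStart = tagRegexp.lastIndex;

    if (!isEndTag && RAW_TEXT_ELEMENTS.includes(name)) {
      // Jump straight to the matching end tag instead of scanning the content for tags.
      const endTagRegexp = new RegExp('</' + name + '\\s*>', 'gi');
      endTagRegexp.lastIndex = textStart;
      const endTag = endTagRegexp.exec(source);
      const rawEnd = endTag ? endTag.index : source.length;
      if (rawEnd > textStart) {
        tokens.push({ type: 'text', value: source.slice(textStart, rawEnd) });
      }
      // Resume generic matching at the end tag (or at the end of the input).
      textStart = rawEnd;
      tagRegexp.lastIndex = rawEnd;
    }
  }

  if (textStart < source.length) {
    tokens.push({ type: 'text', value: source.slice(textStart) });
  }
  return tokens;
}

const sample = '<body><script>for (var i=0;i<vars.length;i++) {}</script></body>';
const tokens = tokenize(sample);
const tagNames = tokens
  .filter((token) => token.type !== 'text')
  .map((token) => (token.type === 'end' ? '/' : '') + token.name);

console.log(tagNames.join(' '));
// body script /script /body
```

With this approach the `i<vars.length` comparison stays inside a single text token, and an unterminated script simply consumes the rest of the input once instead of looping.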

@capricorn86
Owner

Thanks for reporting @baranor! 🙂
This should be quite high priority. Will look into it as soon as possible.

capricorn86 added a commit that referenced this issue Jun 14, 2022
…-with-tags

#504@patch: Fixes XmlParser error parse with tags.
@capricorn86
Owner

Thanks to @Mas0nShi we now have a fix in place 🙂

You can read more about the release here:
https://github.com/capricorn86/happy-dom/releases/tag/v5.3.1

3 participants