Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

marked 0.3.14 hangs on the attached file, worked in 0.3.9 #1058

Closed
paulkoerbitz opened this issue Feb 18, 2018 · 16 comments · Fixed by #1387
Closed

marked 0.3.14 hangs on the attached file, worked in 0.3.9 #1058

paulkoerbitz opened this issue Feb 18, 2018 · 16 comments · Fixed by #1387
Labels
L0 - security A security vulnerability within the Marked library is discovered

Comments

@paulkoerbitz
Copy link

Expectation

example.txt
should be converted to html. This worked with marked 0.3.9.

Result

With marked 0.3.14 marked "hangs" when given the file.

What was attempted

npm install marked@0.3.14
node_modules/marked/bin/marked

@Feder1co5oave
Copy link
Contributor

I'm investigating this.

Commit a8f2d7f introduced this behavior.

It is caused by this piece of markdown:

~~~
move.rs:9:9: 9:22 error: cannot move out of borrowed content
move.rs:9         &MyEnum::Y(y) => println!("{}", *y)  // <-- Error, y cannot be moved out of a reference
                  ^~~~~~~~~~~~~
move.rs:9:20: 9:21 note: attempting to move value to here
move.rs:9         &MyEnum::Y(y) => println!("{}", *y)  // <-- Error, y cannot be moved out of a reference
                             ^
move.rs:9:20: 9:21 help: to prevent the move, use `ref y` or `ref mut y` to capture value by reference
move.rs:9         &MyEnum::Y(y) => println!("{}", *y)  // <-- Error, y cannot be moved out of a reference
                             ^
~~~

@Feder1co5oave
Copy link
Contributor

Feder1co5oave commented Feb 19, 2018

This causes a hang:

> marked.InlineLexer.rules.tag
/^<!--[\s\S]*?-->|^<\/?[a-zA-Z0-9\-]+(?:"[^"]*"|'[^']*'|\s[^<'">\/]*)*?\/?>/

> marked.InlineLexer.rules.tag.exec(`<-- Error, y cannot be moved out of a reference
...                              ^`)

I don't really understand the reason underlying this.

Feder1co5oave added a commit to Feder1co5oave/marktex that referenced this issue Feb 19, 2018
@paulkoerbitz
Copy link
Author

Wow, quick turn around. My guess is that the regex backtracks and that it hangs because of this (Kind of like here: http://stackstatus.net/post/147710624694/outage-postmortem-july-20-2016). Can't fully pinpoint the issue though. More info here: https://www.regular-expressions.info/catastrophic.html

@Feder1co5oave
Copy link
Contributor

Thanks @paulkoerbitz, I too thought immediately to catastrophic backtracking but that group (?: ... ) should not be affected by it because its branches are mutually exclusive.
I've worked a patch for this but I wanna be sure it doesn't break anything before posting it.
I'll post it here in a bit and any help would be welcome, since I cannot figure out why that works and the current one doesn't.

@paulkoerbitz
Copy link
Author

@Feder1co5oave thank you for the quick replies. I can't really pinpoint it either (but it seems you know a lot more about this than me anyway ;) ) - I've worked around the issue by pinning my dependency to 0.3.9, so from my side the need for a fix is not super urgent.

@Feder1co5oave
Copy link
Contributor

@paulkoerbitz and thank you for reporting and helping. Unfortunately regexes are tricky, and there is no way to determine they run smoothly on every possible input. What I'm trying to work on is to widen our test base.

I think I found it is in fact exponential backtracking. I wrote a script to test this:

const regex = /^<!--[\s\S]*?-->|^<\/?[a-zA-Z0-9\-]+(?:"[^"]*"|'[^']*'|\s[^<'">\/]*)*?\/?>/;
var src = '<-- Error, y cannot be moved out of a reference';

for (let i = 0; i < 18; i++) {
	let begin = process.hrtime();
	regex.exec(src);
	end = process.hrtime(begin);
	console.log(`Matching '${src}' Took ${end[0] * 1000 + end[1] / 10**6} milliseconds`);
	src += ' ';
}

this prints out:

Matching '<-- Error, y cannot be moved out of a reference' Took 0.28264 milliseconds
Matching '<-- Error, y cannot be moved out of a reference ' Took 0.099293 milliseconds
Matching '<-- Error, y cannot be moved out of a reference  ' Took 0.094701 milliseconds
Matching '<-- Error, y cannot be moved out of a reference   ' Took 0.102704 milliseconds
Matching '<-- Error, y cannot be moved out of a reference    ' Took 0.156877 milliseconds
Matching '<-- Error, y cannot be moved out of a reference     ' Took 0.291632 milliseconds
Matching '<-- Error, y cannot be moved out of a reference      ' Took 0.516443 milliseconds
Matching '<-- Error, y cannot be moved out of a reference       ' Took 0.993562 milliseconds
Matching '<-- Error, y cannot be moved out of a reference        ' Took 2.872591 milliseconds
Matching '<-- Error, y cannot be moved out of a reference         ' Took 3.970916 milliseconds
Matching '<-- Error, y cannot be moved out of a reference          ' Took 8.493158 milliseconds
Matching '<-- Error, y cannot be moved out of a reference           ' Took 16.972238 milliseconds
Matching '<-- Error, y cannot be moved out of a reference            ' Took 35.138719 milliseconds
Matching '<-- Error, y cannot be moved out of a reference             ' Took 65.447452 milliseconds
Matching '<-- Error, y cannot be moved out of a reference              ' Took 140.613324 milliseconds
Matching '<-- Error, y cannot be moved out of a reference               ' Took 263.678523 milliseconds
Matching '<-- Error, y cannot be moved out of a reference                ' Took 504.905385 milliseconds
Matching '<-- Error, y cannot be moved out of a reference                 ' Took 1031.746128 milliseconds

with any additional space added, it takes roughly double the time and gets terribly slow pretty soon. So that group needs to be rewritten.

@joshbruce joshbruce added the L2 - annoying Similar to L1 - broken but there is a known workaround available for the issue label Feb 20, 2018
@Feder1co5oave Feder1co5oave self-assigned this Feb 20, 2018
@Feder1co5oave Feder1co5oave added L0 - security A security vulnerability within the Marked library is discovered and removed L2 - annoying Similar to L1 - broken but there is a known workaround available for the issue labels Feb 22, 2018
@davisjam
Copy link
Contributor

davisjam commented Feb 22, 2018

there is no way to determine they run smoothly on every possible input

This is a good starting place: https://github.com/NicolaasWeideman/RegexStaticAnalysis
Some setup required.

@Feder1co5oave
Copy link
Contributor

@davisjam Thanks! I've been looking for something like this for a while!

@Feder1co5oave
Copy link
Contributor

Two very different errors were revealed by this issue:

This code fence should not end at line 3 because ^~~~~~~~~~~~~ is not a valid closing fence (must be on a new line). So the rest of the code block is parsed as a paragraph.

~~~
move.rs:9:9: 9:22 error: cannot move out of borrowed content
move.rs:9         &MyEnum::Y(y) => println!("{}", *y)  // <-- Error, y cannot be moved out of a reference
                  ^~~~~~~~~~~~~
move.rs:9:20: 9:21 note: attempting to move value to here
move.rs:9         &MyEnum::Y(y) => println!("{}", *y)  // <-- Error, y cannot be moved out of a reference
                             ^
move.rs:9:20: 9:21 help: to prevent the move, use `ref y` or `ref mut y` to capture value by reference
move.rs:9         &MyEnum::Y(y) => println!("{}", *y)  // <-- Error, y cannot be moved out of a reference
                             ^
~~~

Then, <-- Error, y cannot be moved out of a reference causes the inline html rule to do catastrophic backtracking and marked hangs.

@joshbruce
Copy link
Member

joshbruce commented Feb 24, 2018

Created #1074 to give us a place to discuss tool recommendations and whatnot. Think this might also be a good opportunity to revisit the concept of projects to help with prioritization and tracking; otherwise we could end up with a lot of half-finished work.

@Feder1co5oave, @UziTech, and @styfle: Can you see the projects and move cards now?

https://github.com/markedjs/marked/projects/2?

@Feder1co5oave
Copy link
Contributor

I can edit projects but I'm not used to them (I've been using Trello for some time though 😆) so I don't have a defined workflow.

@davisjam
Copy link
Contributor

See my analysis of the html.closing regex here.

davisjam added a commit to davisjam/marked that referenced this issue Feb 26, 2018
@davisjam
Copy link
Contributor

I believe this is fixed in #1083.

@davisjam
Copy link
Contributor

I tried the sample from the initial report. It hangs on master but not with the fix in #1083.

If the root cause analysis was correct (I haven't checked) then the test case I added in #1083 should suffice to cover this case. But @Feder1co5oave perhaps you still want to add this input file as a standalone test case?

davisjam added a commit to davisjam/marked that referenced this issue Feb 26, 2018
Problem:
Four regexes were vulnerable to catastrophic backtracking.
This leaves markdown servers open to a potential REDOS attack.

Solution:
Refactor the regexes.

For two similar regexes (html) I didn't change the language.
For two similar regexes (noline) I slightly changed the language:

![[[[[[[[[[[]] was accepted by the old noline pattern.
It is now rejected.

All tests pass, though I'm not sure if I've broken something that
was untested.

This addresses markedjs#1070 (with markedjs#1058 along the way).
davisjam added a commit to davisjam/marked that referenced this issue Feb 27, 2018
Problem:
Four regexes were vulnerable to catastrophic backtracking.
This leaves markdown servers open to a potential REDOS attack.

Solution:
Refactor the regexes.

For two similar regexes (html) I didn't change the language.
For two similar regexes (noline) I slightly changed the language:

![[[[[[[[[[[]] was accepted by the old noline pattern.
It is now rejected.

All tests pass, though I'm not sure if I've broken something that
was untested.

This addresses markedjs#1070 (with markedjs#1058 along the way).

Bonus: rename a stray test to use _ instead of -.
@Feder1co5oave Feder1co5oave removed their assignment Feb 27, 2018
@Feder1co5oave
Copy link
Contributor

Fixed in #1083

7korobi pushed a commit to 7korobi/marked that referenced this issue Mar 13, 2018
Problem:
Four regexes were vulnerable to catastrophic backtracking.
This leaves markdown servers open to a potential REDOS attack.

Solution:
Refactor the regexes.

For two similar regexes (html) I didn't change the language.
For two similar regexes (noline) I slightly changed the language:

![[[[[[[[[[[]] was accepted by the old noline pattern.
It is now rejected.

All tests pass, though I'm not sure if I've broken something that
was untested.

This addresses markedjs#1070 (with markedjs#1058 along the way).

Bonus: rename a stray test to use _ instead of -.
@Feder1co5oave
Copy link
Contributor

This fence issue is still not fixed: demo

@styfle styfle reopened this Dec 7, 2018
Feder1co5oave added a commit to Feder1co5oave/marktex that referenced this issue Dec 8, 2018
zhenalexfan pushed a commit to zhenalexfan/MarkdownHan that referenced this issue Nov 8, 2021
Problem:
Four regexes were vulnerable to catastrophic backtracking.
This leaves markdown servers open to a potential REDOS attack.

Solution:
Refactor the regexes.

For two similar regexes (html) I didn't change the language.
For two similar regexes (noline) I slightly changed the language:

![[[[[[[[[[[]] was accepted by the old noline pattern.
It is now rejected.

All tests pass, though I'm not sure if I've broken something that
was untested.

This addresses markedjs#1070 (with markedjs#1058 along the way).

Bonus: rename a stray test to use _ instead of -.
zhenalexfan pushed a commit to zhenalexfan/MarkdownHan that referenced this issue Nov 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
L0 - security A security vulnerability within the Marked library is discovered
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants