Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

security: fix REDOS vulnerabilities #1083

Merged
merged 2 commits into from Feb 27, 2018
Merged

Conversation

davisjam
Copy link
Contributor

@davisjam davisjam commented Feb 26, 2018

Demonstrate and fix catastrophic backtracking in two regexes.

For #1070.

@davisjam
Copy link
Contributor Author

Uhm. The tests will presumably not pass here because they are demonstrating a DoS. Hope there's a timeout on Travis.

@davisjam
Copy link
Contributor Author

Aww poor Travis.

"Ran for 22 min 45 sec
Total time 45 min 9 sec"

@joshbruce
Copy link
Member

joshbruce commented Feb 26, 2018

@davisjam: Thank you so much for the contribution! Nah, good Travis, now we know the test works.

I can confirm it crapped out on the DOS test; so, I'm gonna call that good enough evidence that we have the right test.

Not sure what's up with that since the solution should be in the PR itself though?? Maybe something needs to change with the test (tagging @UziTech and gonna put up a review notice for the rest of the active committers).

screen shot 2018-02-26 at 2 16 12 pm

@davisjam
Copy link
Contributor Author

@joshbruce I fixed the html (that's test # 52 redos-html-closing) but not the nolink rule. So one of the attacks still works.

@joshbruce
Copy link
Member

@davisjam: I may not be following. Ah! I think I see.

Yes, 52 completed successfully; so, that's good (I don't think our tests log when they have started - only when they succeed or fail). Therefore, my hypothesis is that #53 is the test/new/redos-nolink.html one, which I missed somehow - thought there was only one test.

Maybe rename the test A-redos-nolink.html - the tests seem to run in alphabetical order?? (Seems like the cheapest experiment to test this hypothesis given our current setup.)

@joshbruce joshbruce added the L0 - security A security vulnerability within the Marked library is discovered label Feb 26, 2018
@joshbruce joshbruce added this to To Do in No known issues via automation Feb 26, 2018
@davisjam
Copy link
Contributor Author

No need for a hypothesis -- if I remove the nolink test, all tests pass on my fork.

66bc339 adds a time check -- tests that take more than 1 second will fail now.

@joshbruce
Copy link
Member

@davisjam: Interesting solution, I like it. Definitely gets us moving in the right direction on performance testing. We should be able to adjust this up and down, yeah? (Not getting too sidetracked here, just want to make sure I have the information for adding to #963 and incorporating conversations @UziTech, @Feder1co5oave, and I have had re benchmarking.)

Travis CI is still hung on 53. Thinking the timer solution isn't working because the function never returns. It's almost like we need to have the succeed-fail timer be an asynchronous thing??

@joshbruce
Copy link
Member

Specifically the testFile(engine, file, filename, i + 1); function.

@davisjam
Copy link
Contributor Author

@joshbruce Right, 66bc339 won't abort long-running tests. It just adds another test for correctness -- tests shouldn't take more than one second.

To abort long-running tests stuck on REDOS, you would need to either (1) run in a Node VM (it's a core Node module), or (2) run in a child process.

@joshbruce
Copy link
Member

@davisjam: Yeah, would probably want to go with the child process.

So, how do we validate that test 54 is the redos-nolink one?? Or, will we just assume it is, at which point this only partially corrects the REDOS issues, yeah??

@davisjam
Copy link
Contributor Author

@joshbruce

  1. Tests in child processes: yes, maybe open that as a separate issue.
  2. Test ordering: I just pushed 9a11030, which adds a 'print for beginning a test'. Merge if you want. Log tests as soon as they start #1084.
  3. I've slightly tweaked the language parsed by the noline regex. See notes in my commit message.

Both REDOS issues (actually all 4, but two pairs of related regexes) are fixed by this PR. Again, not sure if I broke anything, so definitely needs review for the nolink pattern.

@joshbruce
Copy link
Member

joshbruce commented Feb 26, 2018

@davisjam: We're gonna be breaking all the things before this is resolved. Good thing it's a GitHub repository. 😂

I can't add this comment in the PR itself...sorry.

/test/index.js - line 190:

// before
console.log('#%d. %s completed.', index, filename);

// after
console.log('completed %s', filename);

The log is difficult to read to verify the tests actually completed - failed and CI failure is awesome because it blows things up.

The failing test here is the only one allowed. (I'm updating templates to make that obvious.)

screen shot 2018-02-26 at 3 57 10 pm

test/index.js Outdated
@@ -143,6 +148,8 @@ function testFile(engine, file, filename, index) {
delete marked._original;
}

console.log('#%d. Beginning %s.', index, filename);

if (opts.length) {
marked._original = marked.defaults;
marked.defaults = {};
Copy link
Member

@joshbruce joshbruce Feb 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/test/index.js - line 190:

// before
console.log('#%d. %s completed.', index, filename);

// after
console.log('completed %s', filename);

Copy link
Contributor Author

@davisjam davisjam Feb 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in d1a85c6.

@davisjam davisjam force-pushed the REDOSTests branch 3 times, most recently from 6389ccb to d1a85c6 Compare February 26, 2018 21:13
@joshbruce
Copy link
Member

I suppose I could make the change myself, if you're okay with me hopping into your branch directly...??

@Feder1co5oave
Copy link
Contributor

@joshbruce I'm getting a little annoyed at you merging essential code changes without my review.

@Feder1co5oave
Copy link
Contributor

Text fixtures in this PR are wrong. They only pass because the comparing algorithm is broken.

@davisjam
Copy link
Contributor Author

Yes, I know the html files don't match the md. I didn't know what to put for the expected content. Since the test suite didn't complain I did not pursue it further.

Sounds like a problem in the test harness? Is there an issue for this?

@UziTech
Copy link
Member

UziTech commented Feb 27, 2018

I agree. We really need to slow things down and get them right.

@davisjam I think #1017 should fix it.

@joshbruce
Copy link
Member

joshbruce commented Feb 27, 2018

@Feder1co5oave and all: I can see that. Having said that, I stated what the plan was. There was no pushback regarding that plan.

The comparing algorithm being wrong did not seem to be part of the review (unless we're referring to the lack of HTML output, of which there was not an alternative presented) - only that the file names did not follow convention, tabs vs spaces, and so on. The evidence presented by @davisjam seemed sound, thorough, and I watched the DOS happen on Travis.

If the comparing algorithm is wrong, let's open an issue and discuss again.

Further, when @karenyavine from snyk.io reached out she said, "I'll wait a bit for publishing a publish vulnerability advisory, do you think there is an ETA for this?" (see comment). When I asked what the normal protocol was, she said, "our protocol says that if there is a public issue, we can add it." (The time being given was most likely based on the project just now coming back to life and her not wanting to discourage us from continuing to try.) I thanked her for the consideration, explained that sticking to normal snyk.io protocol is fine (nature of open source software development), and did ask for at least 24 hours to attempt a solution before submitting us back to snyk.io - she was kind enough to do so...that's about the time @davisjam stepped in, to give you an idea of how close we are to being back on the list.

To make marked an inviting community for contributors that is sustained and owned by that community it cannot be a one-man show...not even if that "one-man" is me. As I have said elsewhere, I was asked to be here and to take the project on - right now, we don't have a public way of changing that.

I took the project on via the channels and methods afforded to us. We're all doing the best we can to make marked awesome with the knowledge and skills we have. (See also CODE_OF_CONDUCT PR).

Been catching a lot of heat lately, which is admittedly part of my job as a publisher but seriously, how did we go from commenting on the awesomeness of working with one another to storming mode? (The code of conduct should help us establish explicit norms though - I do appreciate the NodeJS one linked from the GitHub docs.)

@davisjam
Copy link
Contributor Author

@UziTech

I think #1017 should fix it.

Sounds good. I wonder if it will turn up other test failures, too?

Anyway, once #1017 is in, fixing the test failures for the test cases I added will be straightforward -- just copy/paste the output from marked into the html file. For these test cases, "any output is good output". The goal is just for the input not to hang things.

I agree. We really need to slow things down and get them right.

I agree that "move fast and break things" isn't the best philosophy for a major library (1M+ downloads/month). In @joshbruce's defense, my PR did not introduce regressions as defined by the test suite, and my tweaks were minimal enough that I think any regressions will be corner cases. Since the issue was already publicly disclosed, I think getting the fixes in place was overall a good move.

However, the past is the past, and a code review of the changes is still in order. @Feder1co5oave, the change in the source code was exclusively to regexes. I believe one change was innocuous (new regex matches same language), but I'd urge a review of the second change, described in the commit message.

@joshbruce

our protocol says that if there is a public issue, we can add it

I'll push back a little on this stance -- I know you're new to this so I mean this constructively, not critically, I promise! Anyway, with the potential security implications of these issues, handling them privately (i.e. not announcing the issue until there's a fix) would have permitted more room to be sure the fix didn't introduce regressions. Open source means the code is open, but it doesn't require that everything happen publicly. In case you don't know, "critical infrastructure"-type projects will often wait until a fix is ready before announcing that there was a vulnerability with a public issue, though of course if they delay too long this becomes problematic.

@joshbruce
Copy link
Member

joshbruce commented Feb 27, 2018

@davisjam and @UziTech:

Anyway, once #1017 is in, fixing the test failures for the test cases I added will be straightforward

Leaving this one entirely to y'all.

@davisjam:

I agree that "move fast and break things" isn't the best philosophy for a major library (1M+ downloads/month).

Agreed. I'm generally a pretty slow mover, to be honest - much to the chagrin of my employers and teammates sometimes - perfectly willing to wait until there is enough internal and external pull before taking action. Really appreciate Leading Lean Software Development on that score with the East versus West observation:

“Our study tour group was being given a lesson in cultural differences at the Danish embassy in Japan. Masayuki Yamaguchi projected a picture of how decision making in Japan differs from decision making in the West (see Figure 5-2). In Japan, decisions take time, but once agreement is achieved, execution is very fast. In the West, decisions are made quickly—so quickly that later rethinking and reevaluation are considered essential. Viewed from an overall perspective, the Japanese model delivers results more quickly.” Excerpt From: Mary Poppendieck & Tom Poppendieck. “Leading Lean Software Development: Results Are Not the Point.” iBooks

One of my favorite mantras is: We cannot change the past, only determine how to progress into the future.

@davisjam (cont.) and all:

I mean this constructively, not critically, I promise!

Funny story. I was a fine arts major in college while doing development in my free time. I've become pretty practiced at discerning constructive from destructive criticism (and putting something out there almost for the express purpose of having people tear it apart - 3 studio courses a quarter, with deliverables due roughly every two weeks, for four years...woof). Appreciate the consideration and you making the implicit explicit. 😊

Anyway, with the potential security implications of these issues, handling them privately...

I hear you. That will take some getting used to for me. Been around enough coaches and things so long I'm something of a transparency fiend: "Hey, we messed up. We're working to correct and improve. And here's what we're doing in the short-term to get there" (Apple and Antennagate).

Having said that, there's a fine line between admitting there's a problem before other people find it and rushing to put it out there for the world to see, I suppose. Maybe publicly stating what our protocols are under certain extreme circumstances would help curb that for me?? (Still concerned about potential anti-patterns that can result from "working in the dark" on certain things. Makes me question the credibility of the person or institution if it happens too often. Also meant constructively, not critically.)

(Actually just found out GitHub has private conversation capabilities - very accustomed to face to face communication. Thanks to @UziTech for educating me on that score.)

@davisjam
Copy link
Contributor Author

Maybe publicly stating what our protocols are under certain extreme circumstances would help curb that for me?

Always good to get this stuff written down, especially for folks like me who don't always know what the preferences are. Something along the lines of "Please disclose potential security issues by email to the project maintainers. We will provide an initial assessment of security reports within 48 hours and should apply patches within 2 weeks."

Still concerned about potential anti-patterns that can result from "working in the dark" on certain things.

Right, don't want to sit on a vulnerability for year(s) of course.

@Feder1co5oave
Copy link
Contributor

Yes, I know the html files don't match the md. I didn't know what to put for the expected content. Since the test suite didn't complain I did not pursue it further.
Sounds like a problem in the test harness? Is there an issue for this?

@davisjam don't worry, I really appreciated you stepping in to help us in dealing with these security issues and this contribution was really important. I was looking forward to investigate the tools you suggested.
Those empty html fixture are very easily solvable with

bin/marked test/new/redos_html_closing.md > test/new/redos_html_closing.html
bin/marked test/new/redos_nolink.md > test/new/redos_nolink.html

That the test suite makes those tests pass is our fault, not yours. I explained why in #1017:

the current text comparison algorithm is broken: whenever the expected html is shorter than the generated one, and matches everything up until its end, the test is marked as complete, even though marked outputted something more.

edit: I see that you already figured that out. Anyway being that #1017 uses an external library (which is not well maintained) it might bring some issues on its own in the future.

I wanted to check the change you made to the html rule, not because of the change itself, but because the original rule was "naive" to say the least. The nolink one never bothered me.

I realize #1058 was critical and a fix should have been released as soon as possible. I was working on it, but also trying to comply more with commonmark on raw html, so we wouldn't have to do the job twice.
I'm pretty sure although that other regexes need to be checked for backtracking vulnerabilities; I suggest you have a look at

  • block._title
  • block.table.nptable
  • block.table.table
  • inline._inside
  • inline.gfm._backpedal

because at a rapid inspection those are the ones with a star height > 1.


@joshbruce I admittedly missed that comment from karen. At one point I had to turn off email notifications: I received a total of 260 of them since #1058 had been opened (feb 18), until feb 26. I really couldn't follow everything that was going on.
Btw that Tuckman thing was a nice read.

@joshbruce
Copy link
Member

@Feder1co5oave: Hey brother, no worries, we miss things sometimes. It's hard to keep up with conversations sometimes. It's also hard to trust that other people "got this" (or not) when you haven't worked with them for long periods of time. (That lack of face to face though - man it's killing me, not gonna lie.)

Yeah. The Tuckman model is interesting, I'm glad you appreciated it. There's debate over its validity (as with most anything involving human interaction). But, I do keep seeing it over and over again; so, I keep telling people about it. 😊

I really appreciate game theory as well, specifically cooperative games (I don't do well in building competitive games). It's fascinating how simple rules can lead to some really interesting group dynamics. I will say the "skipping the storming" bit can usually be helped by having a group get together to establish those norms. I facilitated a session like that for the team I'm on for my day job when we first got together...I love that team. I would probably honestly die for that team if it came to that...I'm not kidding or being hyperbolic. And, we are finally getting the influence to do some really cool things - took us three years to get there - but still. 😝


I tend not to assume the level of knowledge in various areas of other people; so, sorry if you know some of this next bit...

On the notification front, I don't know what email client you have, but here are my rules for dealing with all things GitHub (it has saved my skin! - and sanity) - subscriptions across the board actually (this is Mail for macOS):

  1. I don't have email notifications turned on - I go to it when I want to, not when it gets all up in my face. See Randy Pausch on Time Management. "Notifications are interruptions" [and I got stuff to do]. 😉
  2. Here's the rule (if all of the following are true):
  • If "from" contains "@github.com" (all the GitHub emails)
  • If "subject" contains "[account]/[repo]" (all the ones from that repo)
  • If "Message content" does NOT contain "@joshbruce" (not explicitly for me, woohoo!)
  • Then "Move message" to mailbox: "[repo specific mailbox]"
  1. I have a day where I do most of my planning and whatnot per week, as part of that, I go through and catch up on all the conversations I might have missed that I may or may not want to comment on. See The Seven Habits of Highly Effective People (specifically "sharpening the saw" - but it's a good read - at least I thought so), Getting Things Done, and The Pomodoro Technique (I have the PDF for when this was still being market tested before it became a commercial product, if you would like).

Feder1co5oave added a commit to Feder1co5oave/marktex that referenced this pull request Mar 8, 2018
zhenalexfan pushed a commit to zhenalexfan/MarkdownHan that referenced this pull request Nov 8, 2021
zhenalexfan pushed a commit to zhenalexfan/MarkdownHan that referenced this pull request Nov 8, 2021
security: fix REDOS vulnerabilities
zhenalexfan pushed a commit to zhenalexfan/MarkdownHan that referenced this pull request Nov 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
L0 - security A security vulnerability within the Marked library is discovered
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

4 participants