
Added test for exponential backtracking #2590

Merged
merged 23 commits into PrismJS:master from test-exp-backtracking on Nov 28, 2020

Conversation

RunDevelopment
Member

This was branched from #2584. Blocked by #2584.


I added a small test that checks all of Prism's regexes for exponential backtracking (EB) in under 1 second. It only checks for the two most common causes of EB (most common in my experience).

  1. Non-disjoint alternatives. If the alternatives of a group aren't disjoint, then the RE engine might have to do exponentially much backtracking to reject an input.
    Example: /(?:a|a)+b/.test("a".repeat(30)).
  2. Ambiguous repeats (idk what else to call this). If the quantified element of a Kleene star is not disjoint with repetitions of itself, then it's EB again. (See a comment in code for more details)
    Example: /(?:a+)+b/.test("a".repeat(30)).

(The examples might take a few seconds to return false.)
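Both failure modes can be reproduced in Node with smaller inputs. The sketch below (using a hypothetical timing helper, not part of the test suite) shows that both patterns reject, while the time to reject roughly doubles with every extra `a`:

```javascript
// Measure how long a regex takes to accept or reject an input.
function timeMatch(re, input) {
  const start = process.hrtime.bigint();
  const result = re.test(input);
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { result, ms };
}

const as = n => "a".repeat(n);

// Cause 1: non-disjoint alternatives (a|a)
console.log(timeMatch(/(?:a|a)+b/, as(20))); // result: false
// Cause 2: ambiguous repeats ((a+)+)
console.log(timeMatch(/(?:a+)+b/, as(20))); // result: false
```

At n = 20 the engine has to explore on the order of 2^20 paths before it can return false; every extra character doubles that.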

As usual for the pattern tests, I implemented long and descriptive error messages, each explaining the problem, the way to fix it, and an example. I also included a message to commit/make a PR even if this test fails. Rewriting regexes to fix EB can be extremely hard (some of the fixes in #2584 took me almost an hour (per regex!)), so it's unfair to make this a requirement for contributors.

The test successfully detected all vulnerabilities reported in #2583 (except for the non-EB one) and a simpler previous version of the test was used in #2268.

Limitations:

The test only detects those 2 causes of EB and no others. This is a huge limitation, but it allows the test to be fast, which it has to be (if each regex took on average 1 second to check, the whole test would take over 40 minutes).

There is also another, more technical, limitation. Backreferences, assertions, and lookarounds can't be converted to NFAs. My library (refa; more on that later) works around this by replacing all elements it can't convert with what is essentially the empty character set. (E.g. /foo$|bar/ will be transformed to /foo[]|bar/ == /bar/ which will then be converted to an NFA.) This means that the check will not be able to fully analyze regexes with those advanced regex features. Emphasis on fully. Even the partial analysis is quite good (many of the regexes fixed in #2584 contain backreferences, assertions, and lookarounds).
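The empty-character-set replacement can be observed directly in JS, where [] is a character class that matches no character at all:

```javascript
// [] matches nothing, so the whole 'foo[]' alternative can never match
// and the regex behaves exactly like /bar/.
const simplified = /foo[]|bar/;
console.log(simplified.test("foo")); // false
console.log(simplified.test("bar")); // true
```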

Despite those limitations, it's still able to perform meaningful analysis on all of Prism's regexes, and to do so extremely fast.

refa:

The whole test is implemented using refa, my library for constructing and modifying finite automata and regular expressions. The library is still in development and I'm making breaking API changes all the time (hence the version is pinned in package.json).

Going forward:

The ultimate goal regarding EB should be to use a third-party detector instead of implementing our own. However, right now, there are no suitable detectors available that can be easily run locally and are fast enough to be used on Prism's scale.

@RunDevelopment RunDevelopment mentioned this pull request Oct 25, 2020
@joshgoebel

joshgoebel commented Nov 20, 2020

This is completely amazing work and has been super helpful. I've been running it against our library, but I wonder if I've found a bug or false positive:

The quantifier `(([a-zA-Z_]\w*(?:<.*?>)?)[\*&\s]+)+` ambiguous for all words "A<>\tA<>\t".repeat(n) for any n>1. This will cause exponential backtracking.

So what we have here is:

(
 IDENT, // [a-zA-Z_]\w*
 TEMPLATE optional // <.*?>? ,
 [* & or space] at least once
) at least once

In Node executes immediately:

> /(([a-zA-Z_]\w*(?:<.*?>)?)[\*&\s]+)+/.exec("A<>\tA<>\t".repeat(10000));
null

It really doesn't seem to like the template piece, but I've been staring at it and I can't see the issue; all the groups look like they are disjoint to me. I can't see how the parser could go down different routes. Am I missing something?

@joshgoebel

joshgoebel commented Nov 20, 2020

Ok, when I break it down to just (\w*(<.*?>)?)+ it becomes clear that the + and * are competing. :-) Just had to add a $ to the end and then add a character it will never match. Awesome.

Update: But wait, that's after I removed the [\*&\s]+... it seems with that there that the + and * wouldn't be in conflict? Every IDENT should be separated by a space, *, &, etc...

Sure enough in regex101:

(\w*(<.*?>)?[\*&\s]+)+$          -- fine
(\w*(<.*?>)?)+$          -- catastrophic

@joshgoebel

joshgoebel commented Nov 20, 2020

I believe the problem is that [ IDENT, TEMPLATE, DELIMITER] is being treated as a union (of alternatives) when it is not... it's a sequence of elements (which is obviously very different).

Pretty sure checkDisjointAlternatives simply doesn't handle this? To properly handle a concatenation comparison such as:

original = [A,B,C] 
twoStar = [A,B,C][A,B,C]
isDisjoint(original,twoStar)

It seems you'd want to make sure A -> B and B -> C were disjoint (though perhaps this gets into polynomials...), and then you'd want to check that C -> A was disjoint. Am I on the right track here? Or is this type of checking out of scope?

@joshgoebel

joshgoebel commented Nov 20, 2020

I'm not certain quantify has correct behavior either.

// NFA ORIGINAL 
{
  type: 'Concatenation',
  elements: [
    {
      type: 'Quantifier',
      alternatives: [Array], // which will contain `elements` on the first item: [Quantifier, Quantifier]
      max: Infinity,
      min: 1
    }
  ]
}
//TWOSTAR (after calling quantify)
{
  type: 'Concatenation',
  elements: [
    { type: 'Alternation', alternatives: [Array] },
    { type: 'Alternation', alternatives: [Array] }
  ]
}

It's not obvious this is the correct behavior - though perhaps it is within the scope of what NFA is trying to accomplish? I'm pretty sure these are not at all equivalent if the goal was to only modify the original quantity from * to 2 while otherwise leaving everything else the same... And of course this is in addition to the fact that I think we'd need more code just to handle this type of analysis.

Ok, this is REALLY hard to debug when the logger only shows a few levels deep... 🤨 So TWOSTAR does seem built correctly after all: the alternative on each is a single Concatenation node with the correct list of elements. So my final answer is that either the very complex isDisjointWith is buggy, or I'm missing something here about why this regex is still exponential.

@RunDevelopment
Member Author

RunDevelopment commented Nov 20, 2020

I've been running it against our library, but I wonder if I've found a bug or false positive:

It's a true positive. To get an attack string from the output of the program, we need to find a tuple (x,y,z) for which all strings xy*z are rejected, but this program only gives us the y component. The y component is always accepted by the analyzed loop, so just repeating it isn't enough. We still have to find a suffix z which causes the regex to reject the input string. Here's an example with x="", y="A<>\tA<>\t", z="#":

/(([a-zA-Z_]\w*(?:<.*?>)?)[\*&\s]+)+$/.test("A<>\tA<>\t".repeat(13) + "#")

The $ is necessary because the regex will find and accept a substring of the input otherwise.
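Assuming the components above, the exponential growth can be sketched by increasing the repeat count (keep n small; around n = 13 a single call already takes seconds):

```javascript
const re = /(([a-zA-Z_]\w*(?:<.*?>)?)[\*&\s]+)+$/;
// x + y.repeat(n) + z with x = "", y = "A<>\tA<>\t", z = "#"
const attack = n => "A<>\tA<>\t".repeat(n) + "#";

for (let n = 6; n <= 9; n++) {
  const start = process.hrtime.bigint();
  re.test(attack(n)); // always false; the work is in proving that
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`n=${n}: ${ms.toFixed(1)} ms`);
}
```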

The reason why A<>\tA<>\t is ambiguous is that it can be matched as either:

[a-zA-Z_] < . . . . > [\*&\s]

or

[a-zA-Z_] < > [\*&\s] [a-zA-Z_] < > [\*&\s]

A simple fix for this issue is to replace <.*?> with <[^\r\n<>]*>.
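The suggested fix can be sanity-checked: <[^\r\n<>]*> can no longer swallow a > and reach into the next repetition, so the ambiguity disappears and the attack string is rejected in linear time (a sketch, assuming the regex above):

```javascript
const fixed = /(([a-zA-Z_]\w*(?:<[^\r\n<>]*>)?)[\*&\s]+)+$/;

// Still accepts well-formed input:
console.log(fixed.test("A<int>* B\t")); // true
// The attack string is now rejected immediately, even for huge inputs:
console.log(fixed.test("A<>\tA<>\t".repeat(10000) + "#")); // false
```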

It's important to note that in terms of exponential backtracking, lazy quantifiers are meaningless. If the input string doesn't contain any matching substrings, lazy quantifiers and greedy quantifiers will behave the same.


You perfectly illustrated why we need tooling that can not only detect but also fix exponential backtracking. These issues are extremely hard to understand and debug.

@joshgoebel

joshgoebel commented Nov 20, 2020

It's important to note that in terms of exponential backtracking, lazy quantifiers are meaningless. If the input string doesn't contain any matching substrings, lazy quantifiers and greedy quantifiers will behave the same.

Yes, this is where I was getting hung up... I flagged the .* once or twice but kept coming back to its being lazy... which (if I understand correctly) the problem with its laziness is that in the worst case (no match)... it will FIRST exhaust the lazy route and then get more and more greedy (less and less lazy) until it's exhausted all possibilities... right? I.e., becoming greedy at the very end. Is that the overall idea?

You perfectly illustrated why we need tooling that can not only detect but also fix exponential backtracking. These issues are extremely hard to understand and debug.

To my credit I correctly grasped 9 out of 10 or so. :-) Practice makes perfect I've been told. :) But yes, totally agree. Auto-correct for this would be nice. :)

@RunDevelopment
Member Author

Your explanation is correct.

The only difference between greedy and lazy quantifiers is the order in which the regex engine will try out possible paths. Greedy quantifiers will try to repeat the quantified element first and try the element after the quantifier second. It's the reverse for lazy quantifiers. The easiest example for this is a? vs a??. a? == (a|) vs a?? == (|a). It's a little bit more complex for infinite quantifiers, but I hope that it's understandable: a* == (a(a(a(a(a(...)|)|)|)|)|) vs a*? == (|a(|a(|a(|a(|a(...)))))).

The regex engine will always return the first path that accepts the input but in the case of exponential backtracking, none of them will. This forces the regex engine to try out all possible paths and if there are exponentially many of those, then we'll get exponential runtime. Greedy vs lazy only determines in which order the regex engine will try out paths, but since we have to go through all of them, their order doesn't matter.
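The order difference, and its irrelevance under rejection, can be seen in a few lines (a sketch of the explanation above):

```javascript
// Greedy tries the quantified element first; lazy tries the empty path first.
console.log(/a?/.exec("a")[0]);  // "a" (greedy takes the 'a')
console.log(/a??/.exec("a")[0]); // ""  (lazy succeeds with the empty match)

// But when the regex must reject, both variants do the same exponential work:
console.log(/(?:a+)+b/.test("a".repeat(20)));   // false
console.log(/(?:a+?)+?b/.test("a".repeat(20))); // false
```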

You perfectly illustrated why we need tooling that can not only detect but also fix exponential backtracking. These issues are extremely hard to understand and debug.

To my credit I correctly grasped 9 out of 10 or so. :-) Practice makes perfect I've been told. :)

Really good, but it still takes time... I.e. some of the issues in #2584 took me 15 minutes to understand and I wrote the detector...

@joshgoebel

It's hard to believe no one sees this as a regex engine issue. I assume the PHP engine is more fully featured, since regex101 can use it to do the really advanced iteration-counting stuff. If we could pass a maximum number of iterations (or a max execution time) or some such into the JS engine, that would solve a lot of unknown/undetected runaway cases and allow people to set reasonable "safety rails".

@RunDevelopment
Member Author

Max execution time is how most implement safeguards and should be available in JS.

There are non-backtracking regex engines (e.g. RE2 or Rust's regex) but they are less powerful. Backreferences can only be implemented with backtracking and general lookarounds are almost impossible to implement with backtracking.

There are also memoization techniques that prevent exponential runtime, but they usually slow down the regex engine. However, this is probably not the case for some new memoization techniques (e.g. this one), but they are quite new.

I assume the PHP engine is more fully featured

PHP uses PCRE. This is the regex dialect targeted by most regex tools.

@joshgoebel

joshgoebel commented Nov 21, 2020

Max execution time is how most implement safeguards and should be available in JS.

You mean should be as in "already are and I'm not knowing of them" or should be as in "they really SHOULD be added ASAP"? :-)

PHP uses PCRE. This is the regex dialect targeted by most regex tools.

Yes, but I was really talking about the engine, not the dialect. I only meant that evidently it has more robust mitigation tools. I was just talking to the PHP Highlight.js port maintainer and learned a lot. PHP's built-in backtracking limits are evidently quite tight. You can't even get above 8 in that sample you provided before it shuts you down with a backtracking error (going over 1_000_000 iterations).

But then I wanted to see what browser are doing these days and came away scratching my head:

  • Node.js (for comparison): exponentially slow and apparent at 13+... I let it CPU spin for a minute then killed it.
  • Chrome (latest-ish): exponentially slow and apparent at 13+... (I didn't try to kill it, I have 100000 tabs open - though I hope it would just crash a single process or show an error after 10s or something)
  • Safari 14: I couldn't find a limit. I was over repeat(2_000_000) with no issues (though I'm unsure if it's actually working or just short circuiting internally). Didn't seem slow either.
  • Edge (latest): Up around 400_000-500_000 repeat it takes a few seconds, and if you go much higher it immediately throws a "Maximum call stack size exceeded" error.

I'm guessing Safari and Edge have mitigations/backtracking limits in place, but Googling isn't finding anything useful. Do you know anything more here? Perhaps the risk in browsers isn't as bad as I thought initially (though Chrome's behavior and popularity is an obvious problem). I'm not sure how you could ensure predictable behavior by just silently hiding the error rather than throwing, though; it also seems kind of sketchy for the implementation to lie if it didn't actually finish the whole evaluation.

https://news.ycombinator.com/item?id=20310669

This mentions Safari, but only to suggest ~"they silently fail, also", much like PHP does if you don't check whether an error was generated. But I was looking for some sort of official announcement from Edge or Safari regarding their behavior. I'd think it'd be something you'd document or announce as a security win.

@RunDevelopment
Member Author

You mean should be as in "already are and I'm not knowing of them" or should be as in "they really SHOULD be added ASAP"? :-)

MUST be added ASAP. :)

But then I wanted to see what browser are doing these days

Quick notes:

  • Chrome, Firefox, and Node all use v8's regex engine.
  • Chromium Edge also uses v8. Idk what old Edge (<= 18) uses.
  • Safari (and older versions of FF) use(s) the YARR regex engine.

So there are 2 regex engines to focus on and with improvements to v8, the problem will go away for about 85% of users.

Also, I found the backtracking limit in WebKit. It will stop after 1M backtracking steps (kinda; the actual metric is disjunctions). I was also able to verify that YARR silently fails, because it only checks for the match return code and ignores all error codes. Is silently failing, and therefore incorrectly implementing the JS spec, a good solution? Idk, but that's what it does.

@joshgoebel

Chromium Edge also uses v8. Idk what old Edge (<= 18) uses.

I was definitely referring to the latest Edge - we don't care (as much) about the former [pre-Chromium engine]. It definitely has VERY, VERY different behavior from Node.js and Chrome (in my testing) for some reason. I'd guess they added a limit like Safari, but that's pure speculation based on the runtime behavior.

Also, I found the backtracking limit in Webkit.

Thanks! I assumed some of this might be answerable since a lot of the engines here are OSS but had no idea where to start looking. :-)

Is silently failing and therefore incorrectly implementing the JS spec a good solution?

No, I think it's pretty terrible... :-) But perhaps it's better than the entire internet exploding? :-)

@RunDevelopment
Member Author

But perhaps it's better than the entire internet exploding? :-)

Well, Safari and other WebKit-based browsers are just that, browsers. But v8 and Node.js are a different issue. There are many bindings for v8 to other programming languages (e.g. Rust; see Deno), many of which are used to run servers, so fixing it there is somewhat of a priority. The v8 team seems to be interested in reducing backtracking, but not specifically exponential backtracking.

@joshgoebel

Well, Safari and other Webkit-based browsers are just that, browsers. But v8 and NodeJS are a different issue.

For sure. I'd just like to see a knob (like in PHP) regardless of the default, so someone can tune it depending on their use case, or even write an adaptive process that tried two different settings, threw "weird" jobs onto a different queue, etc... It would all depend on what you were doing, of course. But more flexibility is usually better than less.

@RunDevelopment
Member Author

@mAAdhaTTah This one might be the most important to review for now. We just had another detectable case of exponential backtracking in #2659.

@mAAdhaTTah
Member

@RunDevelopment Looking at this now.

The ultimate goal regarding EB should be to use a third-party detector instead of implementing our own.

Is this something that could go into refa? There is a bunch of code here in Prism that feels like lib code, and it would be awesome to be able to import like a checkBacktracking function that maybe provides a list of hints or something that we could then use to assert as you have. I'm not necessarily going to hold this up for that, but it might be worth considering.

Member

@mAAdhaTTah mAAdhaTTah left a comment

Seems legit. Nice work!

@RunDevelopment
Member Author

RunDevelopment commented Nov 28, 2020

Is this something that could go into refa?

No. refa is just a math library implementing FA operations and the conversion between REs and NFAs.

There is a bunch of code here in Prism that feels like lib code

It kinda is. The only reason why I don't publish this as its own library is that the analysis isn't "correct". There are just too many false negatives for this to be used in the wild, and I don't want people to rely on this analysis.

This is very much experimental code but it's kinda the best thing available for JS right now AFAIK.

@RunDevelopment RunDevelopment merged commit 05afbb1 into PrismJS:master Nov 28, 2020
@RunDevelopment RunDevelopment deleted the test-exp-backtracking branch November 28, 2020 23:12
@mAAdhaTTah
Member

There are just too many false negatives for this to be used in the wild, I don't want people to rely on this analysis.

A library with those caveats would be useful to people. It's useful to us :). Just a thought, obviously you don't have to spend time building a lib you think is a bad idea.

@RunDevelopment
Member Author

RunDevelopment commented Nov 29, 2020

you don't have to spend time building a lib you think is a bad idea.

I do that all the time. What I don't want is to support and maintain my bad ideas.

@joshgoebel

Seconding the external library idea. I think it would be super useful even with the caveat that it can't detect everything.

But for now I can copy and paste.
