Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding an option to change the regex engine to RE2 #1683

Closed
efebarlas opened this issue Jul 9, 2021 · 10 comments
Closed

Adding an option to change the regex engine to RE2 #1683

efebarlas opened this issue Jul 9, 2021 · 10 comments

Comments

@efebarlas
Copy link
Contributor

efebarlas commented Jul 9, 2021

What problem do you want to solve?
The regex engine of Node.js uses a backtracking algorithm which may result in slow regex validation times for schemas containing unsafe regexes, possibly resulting in ReDoS if the web service waits for the validation to be completed.
What do you think is the correct solution to problem?
We can add an option to Ajv to determine the regex engine to use (RE2 or Node.js). RE2 is a regex engine which guarantees worst-case linear time complexity (although it runs slower than a backtracking algorithm for some regexes). It's best if this option would default to Node.js for reasons I will detail below.

The vast majority of regexes behave the same in RE2 as in Node.js (the RE2 test suite works by performing regression testing with a traditional backtracking algorithm and a variety of regex engine flavors). However, there are known, and acknowledged, mismatches between RE2 and Node.js, for example the inclusion of vertical tabs as a whitespace character. There could be other, unknown mismatches as well. It is known that different regex engines may interpret regexes slightly differently, as detailed in this post.

Then, a developer could opt-in to using RE2 if they're sure their regex works correctly in RE2. A try-catch block would try to compile the regex in RE2. If it doesn't - meaning the regex contains extended features such as backreferences, which RE2 cannot support -, we can fallback on the Node.js engine and use self.logger.log to display a brief warning about the fallback.
Will you be able to implement it?
Yes.

There are other options for integrating RE2, and I can explain them if this solution isn't found suitable.

Also related: ajv-validator/ajv-formats#7

@epoberezkin
Copy link
Member

epoberezkin commented Jul 16, 2021

See #1684 (comment)

The option regExp would have type:

type RegExpEngine = (pattern: string) => RegExpLike

interface RegExpLike {
  test: (s: string) => boolean
}

@efebarlas
Copy link
Contributor Author

efebarlas commented Jul 16, 2021

Should the interface support a 'test' function or should it be something like this:

interface RegExpLike { 
  compile: (s: string) => boolean
  match: (s: string) => boolean 
}

This would allow for separation of regex compilation in ajv.compile from regex validation in ajv.validate, right?

@epoberezkin
Copy link
Member

Needs to be double checked but I don’t think Ajv uses anything other than test at the moment

@efebarlas
Copy link
Contributor Author

I'm having difficulties with running the RegExpEngine function as part of standalone validation code:

export function usePattern({gen, it: {opts}}: KeywordCxt, pattern: string): Name {
const u = opts.unicodeRegExp ? "u" : ""
return gen.scopeValue("pattern", {
key: pattern,
ref: new RegExp(pattern, u),
code: _`new RegExp(${pattern}, ${u})`,
})
}

Here, what might be the value for 'code'? I tried useFunc, but this function is used with import statements, so I'm not sure how I can supply the code of a function to codeGen.

@epoberezkin
Copy link
Member

It’s the same problem as with formats - there is an option code.formats to supply code for formats, and ajv-formats quietly populates it.

so there would have to be a separate option code.regexp to pass code snippet to import regexp engine (that would default to the require of function in runtime folder that uses normal RegExp internally).

@efebarlas
Copy link
Contributor Author

efebarlas commented Aug 12, 2021

I made changes to the original pull request: opts.code now has a 'regExp' property which can be used to provide an engine. I included two lib/runtime files for importing the default engine & the re2 engine. I also added re2 to the list of developer dependencies.

I can prepare documentation + test files if necessary.

@epoberezkin
Copy link
Member

getting there - commented in the PR. Yes - please add tests/docs - I will release. Thank you!

@efebarlas
Copy link
Contributor Author

@epoberezkin I added some light documentation and added tests using the test cases in extras/pattern.json. Please let me know if anything is needed before a release. Thanks!

@davisjam
Copy link

Fixed by #1684?

@epoberezkin Would you like some prose to update the description here? https://ajv.js.org/security.html#redos-attack

@epoberezkin
Copy link
Member

Correct.

some prose to update the description here? https://ajv.js.org/security.html#redos-attack

@davisjam good idea - thank you!

epoberezkin added a commit that referenced this issue Dec 15, 2021
* Update ReDoS section of security.md

* Update docs/security.md

* Update docs/security.md

* Update docs/security.md

Co-authored-by: Evgeny Poberezkin <2769109+epoberezkin@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

3 participants