Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bugfix] Fix redos in preprocessRFC2822 regex #6015

Merged

Conversation

vovikhangcdv
Copy link
Contributor

@vovikhangcdv vovikhangcdv commented Jun 8, 2022

Fixes: #6012

Using the "look-ahead" mechanism regex in preprocessRFC2822() to resolve the problem Regular Expression Denial of Service (ReDoS)#6012

Fixes: [moment#2936](moment#6012)

Directly match the comment tokens in preprocessRFC2822 regex to resolve the problem [Regular Expression Denial of Service (ReDoS)moment#6012](moment#6012)
change the direct matching regex to a local backtracking regex to support all characters in the token comment
@ichernev
Copy link
Contributor

@vovikhangcdv can you explain a bit how this is being fixed?

From the linked issue it looks like any open-close relationship that we have to track (not sure if there are more), has the same issue. Also is this compatible all the way back to Roman Empire JavaScript?

@vovikhangcdv
Copy link
Contributor Author

Hi @ichernev,
Great to see your response. There are some things we need to clarify.


  1. How this is being fixed?

Firstly, look back to the original regex: /\([^)]*\)|[\n\t]/g.
The hotspot is \([^)]*\). The process would be, sequentially:

image

  1. Try to match the ( character.
  2. Try to match all those characters that are not ), as many times as possible (greedy).
  3. Try to match the ) character.

This process will be executed for the full input string and repeated with substring (remove the start character) until match all the three steps or all characters are checked.

Consider the evil payload format: "(".repeat(X) (the worst-case). This payload will exploit the greedy mechanism of step 2 and the backtracking mechanism of the failure matching process.
For each (, step 2 will check all the rest of the substring characters (it costs X - num_of_checked_substring substeps) which certainly fails and the process has to backtrack and repeat for total X substrings. This is a polynomial complexity problem.

Now, look to the suggested fix: /\((?:(?!\().)*\)|[\n\t]/gs.
The hotspot (after remove a non-capture group (?:)) is:\(((?!\().)*\)

image

The matching process would be similar to the old regex, except that step 2 uses non-greedy searching but a "look-ahead" mechanism. It will immediately fail when it matches the second ( characters (cost one step instead of X - num_of_checked_substring substeps like the above example). It means the matching process only costs linear steps based on the length of the input string.


  1. There are a lot of regular expressions used in this project. Are there any similar issues in this project? How we can find and fix it?

The mission of moment is a third-party package that helps other projects and applications achieve their goals. I believe that eliminating all possible security vulnerabilities is crucial work for the community. Give a huge thank you and all the awesome contributors of this project. I see a lot of effort to reduce the risk of ReDos attack vector in the past (#2936, #4163, and other relative commits). After carefully looking at the codebase, this is the only ReDoS vulnerability left I could find. To prevent this kind of vulnerability in the future, we just need to carefully use efficient regexes. The recheck (see the online version) is a useful tool to check the complexity of regex, just make sure that only the safe or linear regexes should be used.


  1. Is this compatible all the way back to Roman Empire JavaScript?

I am not sure what you asking about. But if you are concerned about the compatibility of the fix, I have tested the fix and confirm that it passed 160,555 test cases of unit tests. Or if I missing something, please let me know, and we will figure it out together.


References:

@emer7
Copy link

emer7 commented Jun 22, 2022

Hi is there update to this fix? Thank you

@vovikhangcdv
Copy link
Contributor Author

Hi @ichernev, could you review the PR?

@emer7
Copy link

emer7 commented Jul 5, 2022

Hi @ichernev, any update for this PR? Thanks!

@ichernev
Copy link
Contributor

ichernev commented Jul 5, 2022

Hey, sorry for the delay, I'll release a build in the coming days with the fix, I was a bit worried about the lookahead (if it was supported), but the current implementation uses whitelist, which is better.

@vovikhangcdv
Copy link
Contributor Author

Hey, sorry for the delay, I'll release a build in the coming days with the fix, I was a bit worried about the lookahead (if it was supported), but the current implementation uses whitelist, which is better.

Good to see you back. If you can confirm the issue, could you validate my report on Hunter?. It will help my work too. Thank you, @ichernev.

@ichernev
Copy link
Contributor

ichernev commented Jul 5, 2022

How about we change (/\([^)]*\)|[\n\t]/g to (/\([^()]*\)|[\n\t]/g? As far as I get it you should avoid matching more open brackets, so we can just prevent that.

@ichernev
Copy link
Contributor

ichernev commented Jul 5, 2022

@vovikhangcdv

I can merge the fix with [^()], I can give further attribution to you if necessary (i.e make a security advisory in github), for reporting it. I'll use your excellent description from a few comments above.

update regex to avoid matching more open brackets from @ichernev suggestion
@vovikhangcdv
Copy link
Contributor Author

Hey @ichernev,
I can confirm your solution solves the security issue and work well. I have added a commit change from your suggestion.

@vovikhangcdv
Copy link
Contributor Author

vovikhangcdv commented Jul 6, 2022

@vovikhangcdv

I can merge the fix with [^()], I can give further attribution to you if necessary (i.e make a security advisory in github), for reporting it. I'll use your excellent description from a few comments above.

As a researcher, being credited on moment security advisory or assigned CVE is my pleasure. I would appreciate that. Thank you @ichernev.

@ichernev
Copy link
Contributor

ichernev commented Jul 6, 2022

Reading through the specification of the date format, it looks like formally a comment (stuff in brackets) can have a nested comment inside:

FWS             =       ([*WSP CRLF] 1*WSP) /   ; Folding white space
                        obs-FWS

ctext           =       NO-WS-CTL /     ; Non white space controls

                        %d33-39 /       ; The rest of the US-ASCII
                        %d42-91 /       ;  characters not including "(",
                        %d93-126        ;  ")", or "\"

ccontent        =       ctext / quoted-pair / comment

comment         =       "(" *([FWS] ccontent) [FWS] ")"

CFWS            =       *([FWS] comment) (([FWS] comment) / FWS)

note how comment includes ccontent which might include comment. Also note that CFWS is part of the obs- tokens, short for obsolete, so if this was obsolete in 2001 it might be safe to move on :)

To be fair, the Ruby standard library fails to parse dates with comments, so I'm not 100% sure we should deal with this crap. But if we do, there should be code to track open/closed parenthesis, if there is anything unbalanced the parsing should fail (because there are no brackets allowed in the actual string). This code is linear, so won't introduce bottlenecks.

If we keep the current comment approach (disallow open and close paren inside comment), it will "capture" (and remove) only the inner most comment, and the parsing will fail if there are nested comments (which, given the legacy-ness of all of this might be fine :)).

@vovikhangcdv
Copy link
Contributor Author

Yeh, I had noticed about nested comment in RFC2822 when trying to fix the issue before and was quite struggling with it :)). But looking at the old version of what moment did, I thought that was intended ignoring and done the same way :).

@ichernev ichernev merged commit 7aebb16 into moment:develop Jul 6, 2022
ichernev added a commit that referenced this pull request Jul 6, 2022
ichernev pushed a commit that referenced this pull request Jul 6, 2022
* fix ReDoS in preprocessRFC2822 regex

Fixes: [#2936](#6012)

Disallow nested rfc2822 comments to prevent quadratic regex execution time (i.e each open bracket is considered at most twice).
@vovikhangcdv vovikhangcdv deleted the fix-redos-in-preprocessRFC2822 branch July 6, 2022 16:00
@ichernev
Copy link
Contributor

ichernev commented Jul 6, 2022

Advisory link: GHSA-wc69-rhjr-hc9g

@JamieSlome
Copy link

@ichernev - would it be possible to add the following report URL to the advisory?

jhuckaby added a commit to jhuckaby/Cronicle that referenced this pull request Jul 29, 2022
michaeljmarshall pushed a commit to datastax/pulsar-admin-console that referenced this pull request Aug 30, 2022
eventualbuddha added a commit to votingworks/vxsuite that referenced this pull request Mar 20, 2023
eventualbuddha added a commit to votingworks/vxsuite that referenced this pull request Mar 21, 2023
* chore: upgrade luxon

Mitigation for CVE-2023-22467 (see moment/moment#6015 (comment))

* chore: update `@types/luxon` and fix usage
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Regular Expression Denial of Service (ReDoS)
4 participants