Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent stack overflows by limiting recursion #501

Closed
wants to merge 1 commit into from

Conversation

46bit
Copy link

@46bit 46bit commented May 18, 2022

Addresses #305

Until now it has been possible to trigger a stack overflow by passing crafted inputs (e.g. a string of 1000 (). This is problematic when using the library to parse user input.

This commit adds a recursion limit so long as the library is compiled with std. The limit ensures that malicious inputs cannot trigger a stack overflow, by preventing excessively deep recursion.

Since the SQL parser is so big, I've added this protection to every Parser method that returns a Result. The actual recursive patterns in the AST are more limited, but with such a large parser it's much harder to identify them and put targeted protections in place.

Copy link
Collaborator

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @46bit -- This is quite cool and close I think.

Given how low level the sqlparser crate is, I would like to minimize the dependencies as much as possible. https://crates.io/crates/scopeguard has a bunch of unsafe as well .

The use of Rc and unsafe to check a value that requires annotation everywhere somewhat overly complex to me

I was wondering how hard it would be to make a macro that avoids the need for scope_guard entirely. For example Instead of:

    pub fn parse_msck(&mut self) -> Result<Statement, ParserError> {
        check_recursion_depth!(self);

I wonder if it would be possible to create, via macro functions like

    pub fn parse_msck(&mut self) -> Result<Statement, ParserError> {
        self.remaining_depth -= 1;
        if self.remaining_depth == 0 {
            return Err(ParserError::RecursionLimitExceeded);
        }
        let res = parse_msck_impl(self)?
        self.remaining_depth += 1;
    }
    pub fn parse_msck_impl(&mut self) -> Result<Statement, ParserError> {
      // existing code here
    }

We could make it even more encapsulated perhaps via a RAA struct like:

struct StackChecker<'a> {
 parser: &'a mut Parser,
} 

impl StackChecker {
  pub fn try_new(&mut parser) -> Result<Self> {
    // stack check here, and decrement depth
  }
}

impl Drop for StackChecker {
  fn drop(&mut self) {
   self.parser.remaining_depth -= 1
}
}

If we truly need scopeguard, I think in this case we should contemplate incorporating the code by value (aka copy it in)

examples/parse_select.rs Outdated Show resolved Hide resolved
@46bit
Copy link
Author

46bit commented May 26, 2022

@alamb Thank you for the detailed review, I agree with pretty much all your points :) The overflow protection seems effective (we've been fuzzing it for a solid week now and haven't found a workaround) yet I definitely agree on finding a nicer implementation. I'll revisit this PR in the next few weeks as soon as time allows and work on getting it into a better place.

@alamb
Copy link
Collaborator

alamb commented May 26, 2022

@alamb Thank you for the detailed review, I agree with pretty much all your points :) The overflow protection seems effective (we've been fuzzing it for a solid week now and haven't found a workaround) yet I definitely agree on finding a nicer implementation. I'll revisit this PR in the next few weeks as soon as time allows and work on getting it into a better place.

Awesome -- I can't wait to see it. Thank you for your contribution and sticking with it.

@coveralls
Copy link

Pull Request Test Coverage Report for Build 2352192388

  • 130 of 132 (98.48%) changed or added relevant lines in 2 files are covered.
  • 72 unchanged lines in 5 files lost coverage.
  • Overall coverage decreased (-0.6%) to 89.879%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/parser.rs 125 127 98.43%
Files with Coverage Reduction New Missed Lines %
src/parser.rs 1 83.69%
tests/sqlparser_redshift.rs 1 98.11%
tests/sqlparser_snowflake.rs 2 96.43%
tests/sqlparser_postgres.rs 16 97.76%
tests/sqlparser_common.rs 52 97.04%
Totals Coverage Status
Change from base Build 2328175823: -0.6%
Covered Lines: 8365
Relevant Lines: 9307

💛 - Coveralls

Until now, it's been able to trigger a stack overflow crash by providing a
string with excessive recursion. For instance a string of 1000 left brackets
causes the parser to recurse down 1000 times, and overflow the stack.

This commit adds protection against excessive recursion. It adds a field to
`Parser` for tracking the current recursion depth. Every function that returns
a `Result` gains a recursion depth check. This isn't quite every method on the
`Parser`, but it's the vast majority.

An alternative implemention would be to only protect against AST recursions,
rather than recursive function calls in `Parser`. That isn't as easy to implement
because the parser is so large.
@alamb
Copy link
Collaborator

alamb commented Oct 1, 2022

Closing as stale -- please reopen if you still plan to work on it

@alamb
Copy link
Collaborator

alamb commented Dec 14, 2022

I have an proposal, based partly on this PR, here: #764

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants