fuzz: compiling '\P{any}' panics by tripping an assertion in the compiler #722

Closed
BurntSushi opened this issue Oct 19, 2020 · 0 comments
BurntSushi commented Oct 19, 2020

Specifically, this one:

assert!(!ranges.is_empty());

Normally, regexes like [^\w\W] with empty classes are banned at translation time. But it looks like \P{any} (which is empty) slipped through. So we should just improve the ban to cover that case.
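To illustrate why \P{any} ends up empty, here is a minimal sketch of class negation over a sorted list of ranges. The `Range` type, the `MAX` constant and the `negate` function are hypothetical names for this illustration, not the regex-syntax API, and the sketch ignores the surrogate gap for simplicity. Since \p{any} covers every code point, its complement has nothing left to cover.

```rust
/// Inclusive range of code points, stored as u32 for simplicity.
/// (Hypothetical type for illustration; not the regex-syntax representation.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Range {
    start: u32,
    end: u32,
}

/// Largest Unicode code point.
const MAX: u32 = 0x10FFFF;

/// Complement of a sorted, non-overlapping list of ranges within 0..=MAX.
fn negate(ranges: &[Range]) -> Vec<Range> {
    let mut out = Vec::new();
    // First code point not yet accounted for by the input ranges.
    let mut next = 0u32;
    for r in ranges {
        if r.start > next {
            // The gap before this range belongs to the complement.
            out.push(Range { start: next, end: r.start - 1 });
        }
        next = r.end.saturating_add(1);
    }
    if next <= MAX {
        // Whatever remains after the last range also belongs to the complement.
        out.push(Range { start: next, end: MAX });
    }
    out
}
```

With these definitions, `negate(&[Range { start: 0, end: MAX }])` returns an empty vector, which is exactly the kind of input an assertion like `!ranges.is_empty()` would reject.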

However, empty character classes are occasionally useful constructs for injecting a "fail" sub-pattern into a regex, typically in the context of cases where regexes are generated. Indeed, the NFA compiler in regex-automata handles this case fine:

$ regex-cli debug nfa thompson '\P{any}' -B
      parse time:  48.809µs
  translate time:  17.48µs
compile nfa time:  18.638µs
   pattern count:  1

thompson::NFA(
>000000: alt(2, 1)
 000001: \x00-\xff => 0
^000002: sparse()
 000003: MATCH(0)
)

Here, it's impossible to ever move past state 2. Arguably, it might be nicer if it were an explicit "fail" instruction, but an empty sparse instruction (a state with no outgoing transitions) serves the purpose just as well.
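The behavior of the empty sparse state can be sketched with a toy simulation. This is not the regex-automata implementation: the `State` enum, `closure`, `is_match` and `never_match_nfa` are hypothetical names, and the sketch assumes state 0 of the dump is the start state for an unanchored search. It mirrors the four states printed above and shows that the match state is unreachable.

```rust
use std::collections::BTreeSet;

/// Toy NFA states loosely modeled on the debug output above (assumed shapes).
enum State {
    /// Epsilon transitions to two other states.
    Alt(usize, usize),
    /// A single byte range leading to `next`.
    Range { start: u8, end: u8, next: usize },
    /// Zero or more byte ranges, each with its own target state.
    Sparse(Vec<(u8, u8, usize)>),
    Match,
}

/// Add `start` and everything reachable from it via epsilon transitions.
fn closure(nfa: &[State], start: usize, set: &mut BTreeSet<usize>) {
    let mut stack = vec![start];
    while let Some(id) = stack.pop() {
        if !set.insert(id) {
            continue; // already visited
        }
        if let State::Alt(a, b) = &nfa[id] {
            stack.push(*a);
            stack.push(*b);
        }
    }
}

/// Returns true if the match state is reached anywhere in the haystack.
fn is_match(nfa: &[State], start: usize, haystack: &[u8]) -> bool {
    let has_match =
        |set: &BTreeSet<usize>| set.iter().any(|&id| matches!(nfa[id], State::Match));
    let mut cur = BTreeSet::new();
    closure(nfa, start, &mut cur);
    for &byte in haystack {
        if has_match(&cur) {
            return true;
        }
        let mut next = BTreeSet::new();
        for &id in &cur {
            match &nfa[id] {
                State::Range { start: lo, end: hi, next: target } => {
                    if *lo <= byte && byte <= *hi {
                        closure(nfa, *target, &mut next);
                    }
                }
                State::Sparse(transitions) => {
                    // An empty list means this loop never runs: a dead end.
                    for &(lo, hi, target) in transitions {
                        if lo <= byte && byte <= hi {
                            closure(nfa, target, &mut next);
                        }
                    }
                }
                State::Alt(..) | State::Match => {}
            }
        }
        cur = next;
    }
    has_match(&cur)
}

/// The four states from the dump: state 2 is the empty sparse state.
fn never_match_nfa() -> Vec<State> {
    vec![
        State::Alt(2, 1),
        State::Range { start: 0x00, end: 0xFF, next: 0 },
        State::Sparse(vec![]),
        State::Match,
    ]
}
```

Calling `is_match(&never_match_nfa(), 0, b"abc")` returns `false` for any haystack, because nothing ever transitions into state 3. Giving the sparse state even one transition into state 3 makes matches possible again, which is why an empty sparse state works as an implicit "fail".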

So once #656 is done, we should be able to relax this restriction.

This bug was found by OSS-Fuzz.

@BurntSushi BurntSushi added the bug label Oct 19, 2020