Mirror Onigmo quirk for escaped number chains ...
Any escaped digit above 7 (i.e. `\8` or `\9`) is treated as a literal escape (i.e. a redundant escape) if and only if it is followed by another digit.

```ruby
/\99/ # matches "99"
/9\9/ # SyntaxError
```

This is because Onigmo enters an octal-parsing branch for `\\\d{2,}` and then bails out if any digit is greater than 7.
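The quirk can be checked directly in Ruby. The following is a minimal sketch; the helper name `compile` is made up for illustration, and it builds the patterns with `Regexp.new` so that the invalid one surfaces as a rescuable `RegexpError` at runtime instead of a `SyntaxError` at load time.

```ruby
# Hypothetical helper (not part of regexp_parser): compile a pattern source
# at runtime and return either the Regexp or the error that was raised.
def compile(source)
  Regexp.new(source)
rescue RegexpError => e
  e
end

# Escaped 9 followed by another digit: accepted as a redundant escape.
chained = compile('\99')

# Escaped 9 NOT followed by a digit: treated as a reference to group 9,
# which does not exist here, so compilation fails.
lonely = compile('9\9')

puts chained.inspect
puts chained.match?('99')
puts lonely.class
```

Single-quoted strings are used so that the backslash reaches the regexp engine unchanged.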

fixes #87
jaynetics committed Oct 10, 2023
1 parent 9e62735 commit 104ed92
Showing 5 changed files with 22 additions and 0 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Fixed

- handle a corner case where parsing redundant number escapes raised an error
  * e.g. `parse(/\99/)`, which in Ruby is a valid Regexp that matches `99`
  * thanks to [Markus Schirp](https://github.com/mbj) for the report

## [2.8.1] - 2023-06-10 - Janosch Müller

### Fixed
7 changes: 7 additions & 0 deletions lib/regexp_parser/scanner/scanner.rl
@@ -267,6 +267,13 @@
fret;
};

[8-9] . [0-9] { # special case, emits two tokens
text = copy(data, ts-1, te)
emit(:escape, :literal, text[0, 2])
emit(:literal, :literal, text[2])
fret;
};

meta_char {
case text = copy(data, ts-1, te)
when '\.'; emit(:escape, :dot, text)
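
The two-token split performed by the new `[8-9] . [0-9]` rule can be sketched in plain Ruby. This is an illustration only: `split_out_of_bound_escape` and its `offset` parameter are hypothetical and not part of regexp_parser, and the token layout `[type, token, text, start, end]` is assumed from the scanner specs in this commit.

```ruby
# Hypothetical helper (not part of regexp_parser): mimics what the new
# scanner rule emits for a backslash, an 8 or 9, and one further digit.
# Returns nil for any other input.
def split_out_of_bound_escape(text, offset = 0)
  return nil unless text.match?(/\A\\[89][0-9]\z/)

  [
    # The backslash plus the first digit become a redundant literal escape.
    [:escape,  :literal, text[0, 2], offset,     offset + 2],
    # The trailing digit becomes an ordinary literal.
    [:literal, :literal, text[2],    offset + 2, offset + 3],
  ]
end

tokens = split_out_of_bound_escape('\80')
tokens.each { |token| puts token.inspect }
```

For `'\80'` this yields the same two tokens that the added examples in `spec/scanner/escapes_spec.rb` expect.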
3 changes: 3 additions & 0 deletions spec/parser/refcalls_spec.rb
@@ -77,6 +77,9 @@

specify('parse invalid reference') do
expect { RP.parse('\1') }.to raise_error(/Invalid reference/)
expect { RP.parse('1\1') }.to raise_error(/Invalid reference/)
expect { RP.parse('\8') }.to raise_error(/Invalid reference/)
expect { RP.parse('8\8') }.to raise_error(/Invalid reference/)
expect { RP.parse('(a)\2') }.to raise_error(/Invalid reference/)
expect { RP.parse('\k<1>') }.to raise_error(/Invalid reference/)
expect { RP.parse('\k<+1>') }.to raise_error(/Invalid reference/)
4 changes: 4 additions & 0 deletions spec/scanner/escapes_spec.rb
@@ -27,6 +27,10 @@
include_examples 'scan', 'a\0124', 1 => [:escape, :octal, '\012', 1, 5]
include_examples 'scan', '\712+7', 0 => [:escape, :octal, '\712', 0, 4]

# special case: "out-of-bound octal escapes" are not treated as backrefs
include_examples 'scan', '\80', 0 => [:escape, :literal, '\8', 0, 2]
include_examples 'scan', '\80', 1 => [:literal, :literal, '0', 2, 3]

include_examples 'scan', 'a\xA', 1 => [:escape, :hex, '\xA', 1, 4]
include_examples 'scan', 'a\x24c', 1 => [:escape, :hex, '\x24', 1, 5]
include_examples 'scan', 'a\x0640c', 1 => [:escape, :hex, '\x06', 1, 5]
2 changes: 2 additions & 0 deletions spec/scanner/sets_spec.rb
@@ -40,6 +40,8 @@
include_examples 'scan', '[\7]', 1 => [:escape, :octal, '\7', 1, 3]
include_examples 'scan', '[\77]', 1 => [:escape, :octal, '\77', 1, 4]
include_examples 'scan', '[\777]', 1 => [:escape, :octal, '\777', 1, 5]
include_examples 'scan', '[\8]', 1 => [:escape, :literal, '\8', 1, 3]
include_examples 'scan', '[\88]', 1 => [:escape, :literal, '\8', 1, 3]
include_examples 'scan', '[\\[]', 1 => [:escape, :set_open, '\[', 1, 3]
include_examples 'scan', '[\\]]', 1 => [:escape, :set_close, '\]', 1, 3]
include_examples 'scan', '[a\-]', 2 => [:escape, :literal, '\-', 2, 4]