Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid autocorrection with Style/RedundantRegexpCharacterClass #8948

Closed
tas50 opened this issue Oct 26, 2020 · 6 comments
Closed

Invalid autocorrection with Style/RedundantRegexpCharacterClass #8948

tas50 opened this issue Oct 26, 2020 · 6 comments
Labels

Comments

@tas50
Copy link
Contributor

tas50 commented Oct 26, 2020

Expected behavior

Regex autocorrect is a valid regex.

Actual behavior

The offense is partially corrected and an invalid regex is produced. This is a similar problem to the one that was fixed in #8913, but the issue still happens against current master with that fix.

Steps to reproduce the problem

Autocorrect:

MATCHES_CHEF_GEM ||= %r{/chef-[\d]+\.[\d]+\.[\d]+}.freeze

Correct is to:

MATCHES_CHEF_GEM ||= %r{/che-[\d+.[\d+.[\d+}.freeze

RuboCop version

master
@marcandre
Copy link
Contributor

@ysakasin did you want to have a look?

@marcandre marcandre added the bug label Oct 26, 2020
@ysakasin
Copy link
Contributor

@marcandre Yes, I look it.

@amatsuda
Copy link
Contributor

Let me just throw in another simple broken example.
While this correctly auto-corrects,
/OK[\d]/

this fails to get the position of [
/だめ[\d]/

because Regexp::Expression::CharacterSet#ts returns the value based on bytesize instead of String length in Ruby.

@marcandre
Copy link
Contributor

because Regexp::Expression::CharacterSet#ts returns the value based on bytesize instead of String length in Ruby.

Ohhh. That's a different issue though, and should probably be fixed in the regexp parser gem...

@ysakasin
Copy link
Contributor

Examples

ysakasin@mac:~/s/rubocop|master⚡?
➤ cat a.rb
/[\d]+/
%r{[\d]+}

/abc[\d]+/
%r{abc[\d]+}

/OK[\d]/
/だめ[\d]/
ysakasin@mac:~/s/rubocop|master⚡?
➤ bundle exec exe/rubocop a.rb -a
Inspecting 1 file
An error occurred while Style/RedundantRegexpCharacterClass cop was inspecting /Users/ysakasin/src/rubocop/a.rb:8:0.
To see the complete backtrace run rubocop -d.
E

Offenses:

a.rb:1:2: C: [Corrected] Style/RedundantRegexpCharacterClass: Redundant single-element character class, [\d] can be replaced with \d.
/[\d]+/
 ^^^^
a.rb:2:1: C: [Corrected] Style/RegexpLiteral: Use // around regular expression.
%r{[\d]+}
^^^^^^^^^
a.rb:2:2: C: [Corrected] Style/RedundantRegexpCharacterClass: Redundant single-element character class, r{[\d] can be replaced with {[\d.
%r{[\d]+}
 ^^^^^^
a.rb:4:5: C: [Corrected] Style/RedundantRegexpCharacterClass: Redundant single-element character class, [\d] can be replaced with \d.
/abc[\d]+/
    ^^^^
a.rb:5:1: E: Lint/Syntax: premature end of char-class: /ac[\d+/
(Using Ruby 2.4 parser; configure using TargetRubyVersion parameter, under AllCops)
%r{ac[\d+}
^^^^^^^^^^
a.rb:5:1: C: [Corrected] Style/RegexpLiteral: Use // around regular expression.
%r{abc[\d]+}
^^^^^^^^^^^^
a.rb:5:5: C: [Corrected] Style/RedundantRegexpCharacterClass: Redundant single-element character class, bc[\d] can be replaced with c[\d.
%r{abc[\d]+}
    ^^^^^^
a.rb:7:4: C: [Corrected] Style/RedundantRegexpCharacterClass: Redundant single-element character class, [\d] can be replaced with \d.
/OK[\d]/
   ^^^^

1 file inspected, 8 offenses detected, 7 offenses corrected

1 error occurred:
An error occurred while Style/RedundantRegexpCharacterClass cop was inspecting /Users/ysakasin/src/rubocop/a.rb:8:0.
Errors are usually caused by RuboCop bugs.
Please, report your problems to RuboCop's issue tracker.
https://github.com/rubocop-hq/rubocop/issues

Mention the following information in the issue report:
1.0.0 (using Parser 2.7.2.0, rubocop-ast 1.0.1, running on ruby 2.7.1 x86_64-darwin19)
ysakasin@mac:~/s/rubocop|master⚡?
➤ cat a.rb
/\d+/
%{[\d+}

/abc\d+/
%r{ac[\d+}

/OK\d/
/だめ[\d]/

@amatsuda
Copy link
Contributor

Ohhh. That's a different issue though, and should probably be fixed in the regexp parser gem...

Right. It seems like their problem.
But I guess regexp_parser behaves this way by design so that they could correctly handle grapheme thing (I mean, I just guess so).
And I suppose that it'd be hard to request a scanner to hold both byte-based and string-based sizes...

ysakasin added a commit to ysakasin/rubocop that referenced this issue Oct 28, 2020
…terClass with %r

This commit uses `RegexpNode::RegexpParser#body` (rubocop#8960) instead of `Parser::Source::Range#adjust` in Style/RedundantRegexpCharacterClass.

When regular expression are written by %r literal, RedundantRegexpCharacterClass made incorrect loc as following.
```
%r{abc[\d]+}
    ^^^^^^
```

This bug are caused by calling`node.loc.begin.adjust` to get range of character class with %r regexp.
`Parser::Source::Range#adjust` adjust based on `Parser::Source::Range#begin_pos` but it with %r literal doesn't indicate start of regular expression source, point to location of '%'.
So, RedundantRegexpCharacterClass makes incorrect loc if %r literal are used in code.

```
r = %r{abc}
    ^      # `node.loc.begin.begin_pos`
```

```
r = %r{abs}
       ^   # `node.loc.begin.end_pos`
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants