New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Fixer for missing unicode flag in no-misleading-character-class #15279
Conversation
Hi @mathiasvr!, thanks for the Pull Request The pull request title isn't properly formatted. We ask that you update the message to match this format, as we use it to generate changelogs and automate releases.
Read more about contributing to ESLint here |
Thanks for the PR. I’m not sure this fix is safe. I’m general, we don’t do fixes that change how code works because it can be difficult for end users to identify if it creates a problem. In this case, I fear that the change may fall into that category. @eslint/eslint-tsc what do others think? |
I agree. People often run autofix on save or as a commit hook, so we want to be extra careful about fixers. This seems like a good fit for a suggested fix. |
Auto fixing code as a commit hook sounds like a terrible idea since you generally want your commits to be in the state you committed them. |
Good point @btmills. @mathiasvr can you change this to a suggestion? |
@nzakas Do I just change the type to
Edit: Ah, I think I understand, change the fix to suggest. |
This reverts commit 36396ba.
Okay I changed it to a suggestion. Just wondering, why would you enable this rule and not want an auto fix but only a suggestion? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation looks good! Can you add tests that verify the suggestion output is correct? For example:
{
code: "var r = /[👍]/",
errors: [{
messageId: "surrogatePairWithoutUFlag",
suggestions: [{ messageId: "suggestUnicodeFlag", output: "var r = /[👍]/u" }]
}]
},
why would you enable this rule and not want an auto fix but only a suggestion?
For cases like this that would alter behavior, we want to bring them to the developer's attention as a suggestion instead of invisibly making the change in autofix, even if we're pretty sure the change is the one they want.
Add tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! I like how you also added tests where there was already a non-u
flag to make sure adding the Unicode flag didn’t clobber the existing flag(s).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have two concerns about this change.
First, adding the u
flag can produce syntactically invalid regular expressions. For example, this is valid without the flag:
var r = /[👍]\a/;
When user applies the suggestion, it will cause a parsing error because \a
isn't allowed with the u
flag:
var r = /[👍]\a/u; // Parsing error: Invalid regular expression: /[👍]\a/: Invalid escape
Parsing errors will be noticeable right away, but in the case of RegExp constructors, users might become aware of the errors caused by suggestions from this rule only when they run the code.
var r = new RegExp("[👍]\\a", "u"); // no parsing errors, but this will throw in runtime
Second, suggestions are allowed to change the behavior of the code, but that should generally be limited to the reported problem. Consider the following example:
// K is U+212A
var regex = /^(?:[👍]|\W!)$/i;
console.log(
regex.test("👍") // false
);
console.log(
regex.test("K!") // true
);
Adding the u
flag fixes the problem with [👍]
, but since the flag applies to the whole regular expression, it also changes the behavior of \W
, which isn't related to the reported problem:
// K is U+212A
var regex = /^(?:[👍]|\W!)$/iu;
console.log(
regex.test("👍") // true
);
console.log(
regex.test("K!") // false!
);
For that reason, I think that suggestions that add the u
flag would be more suitable for the require-unicode-regexp
rule, as proposed in #15089.
@mdjermanovic I don't see a problem with allowing a suggestion to add the flag as changes might be necessary in either case. If wanting to allow this behaviour the rule should not be enabled. If not having suggestion for this rule, I think the |
My concern about this suggestion is its scope, because it adds a flag that applies to the whole regular expression, not just to the reported character class. The flag correctly fixes a specific problem reported by this rule, but it can also have side effects on the rest of the pattern, so it could introduce new problems that the user may not be aware of.
That's a good point, maybe we shouldn't provide suggestions in that rule either, except when we're sure that the suggested fix doesn't require any further manual fixes (i.e., the behavior with and without the flag is same). |
I’m not sure this is significant enough to change the path forward here. Suggestions were created specifically for situations where the fixes will potentially cause a change in behavior. In this case, we can’t limit the scope of the change, but it seems like the suggestion is still valid. |
This fix is way beyond the scope of the reported problem. The suggestion message should at least indicate that the fix may have side effects on behavior that is unrelated to the reported problem, and also that it can affect the validity of the regular expression (assuming that we're not checking if the regex will be valid with the |
It seems like we aren’t quite connecting on this. All suggestions have possible side effects, so it doesn’t make sense to mention that in a suggestion. It sounds like you might be saying that the fix could create an invalid regular expression? Can you give an example of that so we can discuss? @btmills you also approved this PR. What do you think? |
By side effects, I mean changes in behavior not related to the reported problem. I wouldn't expect that from a suggestion. Consider this example, user wants a regex that matches const regex = /^(?:[👍]|.{3})$/;
regex.test("👍"); // false. This is a bug, the problem is [👍]
regex.test("abc"); // true
regex.test("👶👶👶"); // false After the suggestion is applied: const regex = /^(?:[👍]|.{3})$/u;
regex.test("👍"); // true, the bug is fixed
regex.test("abc"); // true
regex.test("👶👶👶"); // true! This is a side effect, .{3} now works differently
I left one example in #15279 (review), here's a similar one: /[👍]\R/; // syntactically valid
/[👍]\R/u; // syntax error Additionally, the second version could become valid in the future (proposal-regexp-r-escape), but with completely different semantics than the original one. |
@mdjermanovic I think this is why we now do it as a suggestion and not an automatic fix, so the side effects doesn't happen without explicitly applying it. If a user enable this rule they should also want all the side effects as well, since they will ultimately have to disable it if not, no matter if it is auto-fixable, suggestion or just a warning. |
Not necessarily. There are other ways to fix the problem, for example
Yes, but there's no notice about side effects. In the |
The problem of not matching the emoji cannot be fixed without the u flag right? We agree that the suggestion of adding that flag may not be enough to fix the whole regex, but how will you avoid a lint error without disabling the rule or adding the flag? |
I don’t see the introduction of side effects as a problem here. As already discussed, suggestions are allowed to have side effects and it’s up to the end user to verify that their code still works. Autofixes cannot have side effects. If there is a more narrow fix to suggest, we should also suggest that. Part of the benefit of suggestions is that we can specify multiple options. We definitely shouldn’t suggest something that is syntactically invalid. I was under the impression that the earlier comment had already been addressed. If that’s not the case, we should fix that. |
TSC Summary: This PR seeks to add suggestions to no-misleading-character-class, specifically adding the “u” flag if it would result in a valid regular expression. Both nzakas and btmills have approved the PR, mdjermanovic has concerns about the side effects of the suggestions. We do allow side effects in suggestions in core rules. TSC Question: Do we want to change our expectations around suggestions? (https://eslint.org/docs/developer-guide/working-with-rules#providing-suggestions) Can we merge this? |
It can be fixed by moving the character out of the class. const regex = /^(?:👍|.{3})$/;
regex.test("👍"); // true, the bug is fixed
regex.test("abc"); // true
regex.test("👶👶👶"); // still false This fix changes behavior related to the reported problem, but without side effects on things that are not considered problems by this rule. There is a similar plugin rule |
Per 16-December-2021 TSC Meeting, this PR is accepted. The conclusion is that suggestions are allowed to change the behavior of the code even when the said behavior isn't directly related to the reported problem, and that there is no need to add a note about possible side effects in the suggestion message. On the other hand, suggestions that produce syntactically invalid code should be avoided. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per the previous comment, we should check if the regex would be syntactically valid with the u
flag.
For example, rule shouldn't provide the u
flag suggestion for the following regex literal:
/^[👍]\a$/;
For that purpose, we can use regexpp.RegExpValidator
. Here's an example in prefer-regex-literals
rule.
if (node.arguments.length === 1) { | ||
return fixer.insertTextAfterRange(patternNode.range, ', "u"'); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First argument may be parenthesised:
new RegExp(("[👍]")); // regex is /[👍]/
In that case, the suggested fix wouldn't work as intended:
new RegExp(("[👍]", "u")); // regex is /u/
This was also discussed in the TSC meeting, and we agreed that it isn't necessary for this PR. It's fine to implement only the |
@mathiasvr are you still working on this? |
Co-authored-by: Milos Djermanovic <milos.djermanovic@gmail.com>
@nzakas Sorry, I don't have time to look into finishing the requested changes at the moment. |
Continued in #15867 |
Prerequisites checklist
What is the purpose of this pull request? (put an "X" next to an item)
[ ] Documentation update
[ ] Bug fix (template)
[ ] New rule (template)
[ ] Changes an existing rule (template)
[x] Add autofixing to a rule
[ ] Add a CLI option
[ ] Add something to the core
[ ] Other, please explain:
What changes did you make? (Give an overview)
Added autofixing for missing unicode "u" flag in case regex contains surrogate pair characters.
Is there anything you'd like reviewers to focus on?