Issue #14747: fix ParenPad to not flag unsupported Tokens #14792

mahfouz72 · 2024-04-13T16:38:09Z

Resolves: #14747
Resolves: #4175

Removed the deep scan from processExpression; it now scans only the first level of the given node (its direct children, not the whole subtree).
Added some tokens that were no longer covered once the deep scan was removed.

Diff Regression config: https://gist.githubusercontent.com/mahfouz72/2b480a919e3a98f5609aeab17fb79b1b/raw/23cfa105f73bc7149494f1e0712be2376ad13fbd/parenpadbase.xml

Diff Regression patch config: https://gist.githubusercontent.com/mahfouz72/79f7d3b270b52b91da18557854ce6e60/raw/9d6c487e7b1df6d511ec2a808566db21759b91e1/parenpadpatch.xml

mahfouz72 · 2024-04-13T16:42:02Z

src/main/java/com/puppycrawl/tools/checkstyle/checks/whitespace/ParenPadCheck.java

            }
-            else if (currentNode.hasChildren() && !isAcceptableToken(currentNode)) {
-                // Traverse all subtree tokens which will never be configured
-                // to be launched in visitToken()
-                currentNode = currentNode.getFirstChild();
-                continue;
-            }
-
-            // Go up after processing the last child
-            while (currentNode.getNextSibling() == null && currentNode.getParent() != ast) {
-                currentNode = currentNode.getParent();
-            }
            currentNode = currentNode.getNextSibling();
        }


Removed the deep scan; the loop now checks only the first child and all of its siblings.

PS D:\test> cat src/Test3.java
public class Test3 {
    int i = (( (5*4) + ( 4 + 2)));
}
PS D:\test> java -jar checkstyle-10.14.2-all.jar -t src/Test3.java
COMPILATION_UNIT -> COMPILATION_UNIT [2:0]
`--CLASS_DEF -> CLASS_DEF [2:0]
    |--MODIFIERS -> MODIFIERS [2:0]
    |   `--LITERAL_PUBLIC -> public [2:0]
    |--LITERAL_CLASS -> class [2:7]
    |--IDENT -> Test3 [2:13]
    `--OBJBLOCK -> OBJBLOCK [2:19]
        |--LCURLY -> { [2:19]
        |--VARIABLE_DEF -> VARIABLE_DEF [3:4]
        |   |--MODIFIERS -> MODIFIERS [3:4]
        |   |--TYPE -> TYPE [3:4]
        |   |   `--LITERAL_INT -> int [3:4]
        |   |--IDENT -> i [3:8]
        |   |--ASSIGN -> = [3:10]
        |   |   `--EXPR -> EXPR [3:13]
        |   |       |--LPAREN -> ( [3:13]   <-- this and all its siblings only checked
        |   |       |--LPAREN -> ( [3:14]
        |   |       |--PLUS -> + [3:22]
        |   |       |   |--LPAREN -> ( [3:16]   <-- no deep scan so this will not be picked
        |   |       |   |--STAR -> * [3:18]
        |   |       |   |   |--NUM_INT -> 5 [3:17]
        |   |       |   |   `--NUM_INT -> 4 [3:19]
        |   |       |   |--RPAREN -> ) [3:20]
        |   |       |   |--LPAREN -> ( [3:24]
        |   |       |   |--PLUS -> + [3:28]
        |   |       |   |   |--NUM_INT -> 4 [3:26]
        |   |       |   |   `--NUM_INT -> 2 [3:30]
        |   |       |   `--RPAREN -> ) [3:31]
        |   |       |--RPAREN -> ) [3:32]
        |   |       `--RPAREN -> ) [3:33]
        |   `--SEMI -> ; [3:34]
        `--RCURLY -> } [4:0]
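
To make the difference concrete, here is a minimal sketch that counts the LPAREN nodes reached by a direct-children scan versus a whole-subtree scan on the EXPR subtree printed above. It is an illustration only: `Node`, `ScanDepthSketch`, `countShallow`, `countDeep` and `sampleExpr` are hypothetical names, `Node` is a stand-in rather than Checkstyle's DetailAST, and `countDeep` simplifies the old loop (it does not model the isAcceptableToken skip).

```java
import java.util.ArrayList;
import java.util.List;

final class ScanDepthSketch {

    /** Hypothetical stand-in for an AST node; this is not Checkstyle's DetailAST. */
    static final class Node {
        final String type;
        final List<Node> children = new ArrayList<>();

        Node(String type, Node... kids) {
            this.type = type;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    /** Counts LPAREN nodes among the direct children only (no deep scan). */
    static int countShallow(Node parent) {
        int count = 0;
        for (Node child : parent.children) {
            if ("LPAREN".equals(child.type)) {
                count++;
            }
        }
        return count;
    }

    /** Counts LPAREN nodes in the whole subtree (simplified deep scan). */
    static int countDeep(Node parent) {
        int count = 0;
        for (Node child : parent.children) {
            if ("LPAREN".equals(child.type)) {
                count++;
            }
            count += countDeep(child);
        }
        return count;
    }

    /** Builds the EXPR subtree of the tree printed above. */
    static Node sampleExpr() {
        Node star = new Node("STAR", new Node("NUM_INT"), new Node("NUM_INT"));
        Node innerPlus = new Node("PLUS", new Node("NUM_INT"), new Node("NUM_INT"));
        Node plus = new Node("PLUS",
            new Node("LPAREN"), star, new Node("RPAREN"),
            new Node("LPAREN"), innerPlus, new Node("RPAREN"));
        return new Node("EXPR",
            new Node("LPAREN"), new Node("LPAREN"), plus,
            new Node("RPAREN"), new Node("RPAREN"));
    }

    public static void main(String[] args) {
        Node expr = sampleExpr();
        System.out.println("shallow LPAREN count: " + countShallow(expr));
        System.out.println("deep LPAREN count: " + countDeep(expr));
    }
}
```

The two LPAREN nodes under PLUS are only reached by the deep variant, which is why PLUS (and similar operators) would have to become tokens of the check once the deep scan is gone.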

mahfouz72 · 2024-04-13T16:43:32Z

src/main/java/com/puppycrawl/tools/checkstyle/checks/whitespace/ParenPadCheck.java

+            TokenTypes.TYPECAST,
+            TokenTypes.STAR,
+            TokenTypes.PLUS,
+            TokenTypes.MINUS,
+            TokenTypes.DIV,
+            TokenTypes.MOD,
+            TokenTypes.LAND,
+            TokenTypes.LOR,
+            TokenTypes.LNOT,


Added some tokens that were missed now that there is no deep scan. We may need to add more, but these were the obvious ones from the test and input files.

mahfouz72 · 2024-04-13T16:47:12Z

src/main/java/com/puppycrawl/tools/checkstyle/checks/whitespace/ParenPadCheck.java

-     */
-    private boolean isAcceptableToken(DetailAST ast) {
-        return acceptableTokens.get(ast.getType());
-    }
-


We don't need this anymore. It was needed to avoid checking the same node twice: during the deep scan, when we found a subtree rooted at an acceptable token, we did not descend into it because it would be picked up by visitToken() later.

Now that there is no deep scan of tokens, this check is unnecessary. I removed this method, the related class fields that are no longer needed, and the corresponding test.

mahfouz72 · 2024-04-13T16:49:06Z

Github, generate report

mahfouz72 · 2024-04-13T16:59:16Z

Github, generate site

github-actions · 2024-04-13T17:03:26Z

https://checkstyle-diff-reports.s3.us-east-2.amazonaws.com/0f44b8f_2024170214/index.html

https://checkstyle-diff-reports.s3.us-east-2.amazonaws.com/0f44b8f_2024170214/checks/whitespace/parenpad.html#Properties

github-actions · 2024-04-13T17:40:02Z

https://checkstyle-diff-reports.s3.us-east-2.amazonaws.com/0f44b8f_2024173934/reports/diff/index.html

mahfouz72 · 2024-04-13T18:14:14Z

there are huge differences.
I want to know if I'm on the right track before proceeding to add more tokens.

romani · 2024-04-13T22:43:51Z

Please extend the link-check suppression files with the lines that CI suggests.

mahfouz72 · 2024-04-14T00:29:19Z

@romani done. Should I add all the tokens that were missed and identified in the regression report in this PR?

romani · 2024-04-14T02:54:27Z

I do not recommend rushing to add tokens; some of them were left out for good reason. Let's add them one by one, with full attention to the regression diff report.

mahfouz72 · 2024-04-14T03:08:02Z

One by one in separate PRs, or in this PR?
And what about the tokens added so far? All of them were added to cover the parens in the input files that are no longer checked without the deep scan.

nrmancuso · 2024-05-01T13:01:49Z

> I do not recommend rushing to add tokens; some of them were left out for good reason. Let's add them one by one, with full attention to the regression diff report.

@romani I don't think we can do this without a bunch of hacks; as soon as we stop "deep scanning", we need to extend the tokens for this check to keep the behavior consistent.

@mahfouz72 can you help us to understand what other tokens we may need to add here, and how you are discovering them?

mahfouz72 · 2024-05-01T13:55:55Z

@nrmancuso we need to add every token that can appear under an EXPR token. For example, all mathematical, logical, and bitwise operators should be added. For now I have added only some of them; I discovered those from the unit tests that failed once the deep scan was removed. There are more tokens that our input files do not use.

Examples:

bitwise: & , | , ~ , ^ , << , >>
comparison: < <= > >= == !=
assignments: += -= = /= *= etc.

I am discovering them from the regression report. In general, as I stated above, I need to think of any token that can be used in an expression and can be surrounded by parens.

What do you think, should I continue working this way, paying full attention to the report to keep the behaviour consistent?

rnveach · 2024-05-01T23:16:19Z

@nrmancuso @romani Could this be a sign we need a ParenPadExpression check? Will someone ever want different spacing for different expression tokens?

mahfouz72 · 2024-05-01T23:41:10Z

> Will someone ever want different spacing for different expression tokens?

IMO, no. No one will need a violation on x = (3+5) but not on x += (3+5). That's why I see this step (#14792 (comment)) as making the check weird and an unnecessary mess of tokens (we are talking about 15 to 20 new tokens).

But at the same time, how do we pick up the cases that are missed without the deep scan, without adding all those tokens?

I don't know if this is a good design, but can we leave the deep scan and, while scanning, skip tokens that are not any of the tokens mentioned here #14792 (comment)? We would have
isMathematicalOperator(), isBitwiseOperator(), isLogicalOperator(), etc., and skip the validation during the deep scan based on the type of token.
So basically this solution is almost the same as #14792 (comment), but we won't explicitly add them as new tokens of the check; we will check them internally while deep scanning and ignore all other tokens.
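
One possible reading of this proposal, as a rough sketch: keep a recursive scan, but descend only through operator tokens on an internal list and skip every other subtree. Everything here is an assumption made for illustration: `Node` is a stand-in rather than DetailAST, the helper names follow the ones floated in the comment, the operator sets are deliberately incomplete, and the sample tree in `main` is artificial.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class InternalListScanSketch {

    /** Hypothetical stand-in for an AST node; this is not Checkstyle's DetailAST. */
    static final class Node {
        final String type;
        final List<Node> children = new ArrayList<>();

        Node(String type, Node... kids) {
            this.type = type;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    // Internal, non-configurable lists (assumed and incomplete).
    static final Set<String> MATHEMATICAL = Set.of("PLUS", "MINUS", "STAR", "DIV", "MOD");
    static final Set<String> LOGICAL = Set.of("LAND", "LOR", "LNOT");
    static final Set<String> BITWISE = Set.of("BAND", "BOR", "BXOR", "BNOT", "SL", "SR", "BSR");

    static boolean isMathematicalOperator(Node node) {
        return MATHEMATICAL.contains(node.type);
    }

    static boolean isLogicalOperator(Node node) {
        return LOGICAL.contains(node.type);
    }

    static boolean isBitwiseOperator(Node node) {
        return BITWISE.contains(node.type);
    }

    static boolean isParen(Node node) {
        return "LPAREN".equals(node.type) || "RPAREN".equals(node.type);
    }

    /** Counts the parens that would be validated under the given node. */
    static int countCheckedParens(Node parent) {
        int count = 0;
        for (Node child : parent.children) {
            if (isParen(child)) {
                count++;
            }
            else if (isMathematicalOperator(child)
                    || isLogicalOperator(child)
                    || isBitwiseOperator(child)) {
                // Descend only through operators on the internal lists.
                count += countCheckedParens(child);
            }
            // Any other token (for example RECORD_PATTERN_DEF) is skipped with its subtree.
        }
        return count;
    }

    public static void main(String[] args) {
        Node expr = new Node("EXPR",
            new Node("LPAREN"),
            new Node("PLUS", new Node("LPAREN"), new Node("NUM_INT"), new Node("RPAREN")),
            new Node("RECORD_PATTERN_DEF", new Node("LPAREN"), new Node("RPAREN")),
            new Node("RPAREN"));
        System.out.println("parens validated: " + countCheckedParens(expr));
    }
}
```

In this sketch the parens under PLUS are validated while those under RECORD_PATTERN_DEF are not, without either token being exposed as a configurable token of the check.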

rnveach · 2024-05-02T00:01:07Z

> but can we leave the deep scan and while scanning we skip tokens that are not any of the tokens mentioned here

What is the difference between leaving the deep scanning as it is (dropping the issue) and leaving the deep scanning while checking this internal-only list? My understanding was that you built the list from the deep scanning (and/or regression).

I was thinking along the lines of breaking this check apart. We drop expression support here and create an expression-only check (we want to do this for another check, #5945), specify all the tokens you suggested, but make them non-configurable (it must stop on them) with no deep scanning. I can't really imagine people wanting different configs for different expression tokens. If someone does, we have this isolated check to make it easier to explain.

The problem with deep scanning is that it's easy to lose control, and it's harder to understand what exactly it is looking at. We had issues before where a check went beyond its boundaries in scanning. We are having a similar discussion for UnusedLocalVariableCheck regarding pitest. Other examples are #5234 and #5124, where I specifically mentioned that branchContains is a dangerous method to use all the time, since there is no restriction on how deep it can search.

mahfouz72 · 2024-05-02T00:23:17Z

> What is the difference between leaving the deep scanning (dropping the issue) or leaving the deep scanning while checking this internal only list ?

If we leave the deep scan and check this internal list only, we will avoid validating tokens that definitely should not be validated, for example RECORD_PATTERN_DEF, which I have in the issue related to this PR.

One of the main problems with the deep scan was that we were checking tokens that are not in the configuration. If we enforce checking only this internal list, we will avoid this problem.

> We drop expression support here and create an expression only check

This is a good solution and I am leaning toward it, if we really want to remove the deep scan rather than leave it in place and use this list hack to make it check only specific tokens.

mahfouz72 force-pushed the remove-deepscan branch from b1677ed to 0f44b8f Compare April 13, 2024 16:47

mahfouz72 force-pushed the remove-deepscan branch from 0f44b8f to c89bd82 Compare April 13, 2024 17:07

mahfouz72 force-pushed the remove-deepscan branch from c89bd82 to a3f279c Compare April 13, 2024 18:53

mahfouz72 force-pushed the remove-deepscan branch from a3f279c to c16e3bc Compare April 14, 2024 00:00

Issue checkstyle#14747: fix ParenPad to not flag unsupported Tokens

5a8389d

mahfouz72 force-pushed the remove-deepscan branch from c16e3bc to 5a8389d Compare April 14, 2024 00:07

nrmancuso self-assigned this May 7, 2024

rnveach added the high demand label May 8, 2024
