Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #14726: new check: ConstructorsDeclarationGroupingCheck #14749

Closed
wants to merge 1 commit into from

Conversation

@Zopsss
Copy link
Contributor Author

Zopsss commented Apr 6, 2024

hey @romani @nrmancuso @rnveach the new module implementation is almost done, all the CIs are passing expect semaphore. It was complaining about creating a new PR at contribution repo for adding the new module and something about pitest.

I have made the PR about adding the new module in the contribution repo and added its link in the PR description but I'm not familiar with the Pitest error it is giving. Can you help me with this?

Here is the error link: https://checkstyle.semaphoreci.com/jobs/c43bc698-c72f-472b-82e6-ceadb034f92e

Edit: I'll generate the regression report once the semaphore CI error is resolved.

@Zopsss Zopsss force-pushed the ctor-14726 branch 2 times, most recently from 7a3c06b to f8722e4 Compare April 8, 2024 08:45
@romani
Copy link
Member

romani commented Apr 8, 2024

[INFO] --- spotbugs-maven-plugin:4.8.3.1:check (default) @ checkstyle --- com.puppycrawl.tools.checkstyle.checks.coding.ConstructorsDeclarationGroupingCheck.visitToken(DetailAST) checks a map with containsKey(), before using get() [com.puppycrawl.tools.checkstyle.checks.coding.ConstructorsDeclarationGroupingCheck] At ConstructorsDeclarationGroupingCheck.java:[line 94] MUI_CONTAINSKEY_BEFORE_GET

please fix it, and rebase on most recent master.


from Semathore:

List of missing files in pitest profiles: src/main/java/com/puppycrawl/tools/checkstyle/checks/coding/ConstructorsDeclarationGroupingCheck.java

add our class to

checkstyle/pom.xml

Lines 3261 to 3268 in 571e547

<targetClasses>
<param>com.puppycrawl.tools.checkstyle.checks.coding.AbstractSuperCheck*</param>
<param>com.puppycrawl.tools.checkstyle.checks.coding.ArrayTrailingCommaCheck*</param>
<param>com.puppycrawl.tools.checkstyle.checks.coding.AvoidDoubleBraceInitializationCheck*</param>
<param>com.puppycrawl.tools.checkstyle.checks.coding.AvoidInlineConditionalsCheck*</param>
<param>com.puppycrawl.tools.checkstyle.checks.coding.AvoidNoArgumentSuperConstructorCallCheck*</param>
<param>com.puppycrawl.tools.checkstyle.checks.coding.CovariantEqualsCheck*</param>
<param>com.puppycrawl.tools.checkstyle.checks.coding.DeclarationOrderCheck*</param>

@Zopsss Zopsss force-pushed the ctor-14726 branch 2 times, most recently from 93dd081 to c2eae4f Compare April 8, 2024 17:54
@Zopsss
Copy link
Contributor Author

Zopsss commented Apr 8, 2024

add our class to

Done


please fix it, and rebase on most recent master.

to fix that error, I did the following changes:

Before:

final DetailAST parent = ast.getParent().getParent();
if (parents.containsKey(parent)) {

After:

final DetailAST parent = ast.getParent().getParent();
final Integer previousCtorLineNo = parents.get(parent);
if (previousCtorLineNo != null) {

but now Pitest / test (coding - 1) CI is failing. Before changing the code it was passing but after the change it started giving error. Here is the error link: https://github.com/checkstyle/checkstyle/actions/runs/8604545703/job/23578894912?pr=14749#step:6:499

there are also many new mutation survivals, but this was not happening before the code change. Can you please explain me the reason behind failing Pitest CI when the code is not changed that much and its logic is also the same as before and also why there are so many new surviving mutations?. @romani

Copy link
Member

@romani romani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Items

src/xdocs/checks/coding/index.xml Outdated Show resolved Hide resolved
*/

@FileStatefulCheck
public class ConstructorsDeclarationGroupingCheck extends AbstractCheck {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make it stateless?
It will work well when we make checkstyle multithreading.

Copy link
Contributor Author

@Zopsss Zopsss Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Firstly I tried making it stateless using other ways but the mvn clean verify was giving errors so I have added the suppressions to make it work.

"com.puppycrawl.tools.checkstyle.checks.coding.ConstructorsDeclarationGroupingCheck"
+ ".parents"

"com.puppycrawl.tools.checkstyle.checks.coding.ConstructorsDeclarationGroupingCheck"

After adding the suppressions, mvn clean verify was running perfectly


Update: pitest coding-1 has still many surviving mutations. Should I suppress them? Because it is showing surviving mutators for those lines which are required to make the check work properly and mutating them would make test cases fail.

Copy link
Member

@mahfouz72 mahfouz72 Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mutating them would make test cases fail.

@Zopsss how is that? if the tests fail when hardcoding the mutation then how did they survive in the first place maybe you hardcode the mutation in the wrong way

Update: Add the Test class to the TargetTests of the pitest-coding-1 profile in pom.xml

Copy link
Contributor Author

@Zopsss Zopsss Apr 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is that?

NVM it was showing that many surviving mutations because I didn't added the Test class in the TargetTests of pitest-coding-1. Thanks for helping me out. Now it is showing only 3 surviving mutations :)

@romani Looking into the surviving mutations now 🫡


Update: all the surviving mutations has been killed. Now generating the diff report.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am "unresolving" this item to continue discussion. We should be able to make this check stateless by using all class definitions as required tokens (CLASS_DEF_, ENUM_DEF, etc...), then checking class members. This should allow us to make this check stateless, perhaps at the cost of a slightly larger memory footprint. @Zopsss @romani thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to make this check stateless by using all class definitions as required tokens (CLASS_DEF_, ENUM_DEF, etc...), then checking class members.

do you mean we should use an approach like this?

public void visitToken(DetailAST ast) {
final int parentType = ast.getParent().getType();
final int[] tokenTypes = {
TokenTypes.CLASS_DEF,
TokenTypes.ENUM_DEF,
TokenTypes.INTERFACE_DEF,
TokenTypes.LITERAL_NEW,
TokenTypes.RECORD_DEF,
};
if (TokenUtil.isOfType(parentType, tokenTypes)) {
checkOverloadMethodsGrouping(ast);
}
}

The OverloadMethodsDeclarationOrderCheck uses this approach and it is stateless however the visitToken() gets called every time for class, enum, interface, etc... Then we're travelling the whole block to check that all the methods are grouped together. By this approach we can make the check stateless but then we will need to travel the whole class, enum, records, etc, we will be checking each and every tokens of the block, this is a time consuming method as we're travelling through the whole block and checking every tokens.

We're currently calling the Check for only CTOR and COMPACT_CTOR tokens only, by using this implementation we're saving a lot of time instead of travelling the whole block. This is an optimized method to check that all constructors are grouped together or not.

We're also clearing the HashMap every time before calling the visitToken(), so the HashMap does not contain any unnecessary data:

@Override
public void beginTree(DetailAST rootAst) {
allObjBlocks.clear();
}

If we make the current implementation stateless then we cannot use this optimized approach. The check will take more time to execute. So I guess if we're able to make the Check optimized then we should be using the current approach. The Check implementation will also become complex if we use the OverloadMethodsDeclarationOrder's approach. Correct me if I'm wrong.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optimized

Consider: https://softwareengineering.stackexchange.com/questions/80084/is-premature-optimization-really-the-root-of-all-evil

by using this implementation we're saving a lot of time instead of travelling the whole block

Can you prove this matters?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider: https://softwareengineering.stackexchange.com/questions/80084/is-premature-optimization-really-the-root-of-all-evil

by using this implementation we're saving a lot of time instead of travelling the whole block

Can you prove this matters?

I gone through the page tried to understand how premature optimization should be avoided in some places if the optimization is not necessarily required. However I would like to give some light to the following paragraph:

What this means is that, in the absence of measured performance issues you shouldn't optimize because you think you will get a performance gain. There are obvious optimizations (like not doing string concatenation inside a tight loop) but anything that isn't a trivially clear optimization should be avoided until it can be measured.

The biggest problems with "premature optimization" are that it can introduce unexpected bugs and can be a huge time waster.

In the last line, it says that premature optimization can introduce unexpected bugs and it can be a time waster. In our case the current implementation is bug free, easy to understand and does not contain complex code. The other approach you're suggesting might be difficult to implement and understand, it can also introduce new bugs which might take quite a time to solve.


#14749 (comment)

Here, Mr. Roman suggested that if we didn't find any way to make the Check stateless then it is okay to keep them stateful. I indeed agree that the Stateless approach you're suggesting will work but again it will be difficult to understand, might take some time to implement, might contain bugs and will perform lots of travelling in the Tree.

But I'm not sure how much operating resources does a stateful check and a stateless check requires. I'm not sure how much they can affect on the performance of the Checkstyle.

If you think I'm wrong somewhere, please let me know. I'm not experienced as you're, this is my first time having a conversation on optimizing the system performance.

Copy link
Member

@rnveach rnveach Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't read all this but just a few comments. We aren't multithreaded any time soon, so its not a big deal if we can't change the type of check here. Since we don't have an implementation of multithreading there is no telling how big a performance there may be doing either type. I don't see stateless being a big performance improvement. It just means we don't have to instantiate the check for each new file it processes. While new instances are cheap, it really depends on how we will do that instantiation and populate any property values which will decide which type of check is better for performance as well as the impact from the check not having any memory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading @rnveach's comment, do we still need to transition to the stateless check or is it okay to keep this check stateful? @nrmancuso @romani please tell me what do you think about this?

@Zopsss Zopsss force-pushed the ctor-14726 branch 4 times, most recently from de6049e to 7452e65 Compare April 10, 2024 17:53
@Zopsss
Copy link
Contributor Author

Zopsss commented Apr 10, 2024

GitHub, generate report

Copy link
Contributor

Report generation failed on phase "make_report",
step "Generate report".
Link: https://github.com/checkstyle/checkstyle/actions/runs/8636195498

@Zopsss
Copy link
Contributor Author

Zopsss commented Apr 11, 2024

GitHub, generate report

Copy link
Contributor

Report generation failed on phase "make_report",
step "Generate report".
Link: https://github.com/checkstyle/checkstyle/actions/runs/8640443207

@Zopsss
Copy link
Contributor Author

Zopsss commented Apr 11, 2024

GitHub, generate report

Copy link
Contributor

Report generation failed on phase "make_report",
step "Download files".
Link: https://github.com/checkstyle/checkstyle/actions/runs/8640859071

@Zopsss
Copy link
Contributor Author

Zopsss commented Apr 11, 2024

GitHub, generate report

Copy link
Contributor

Report generation failed on phase "make_report",
step "Generate report".
Link: https://github.com/checkstyle/checkstyle/actions/runs/8642972336

@romani
Copy link
Member

romani commented Apr 11, 2024

OOM - https://github.com/checkstyle/checkstyle/actions/runs/8642972336/job/23695060359#step:9:2443
please use smaller projects for validation.

you can make two project list files, and run report generation two times, one after another.

Copy link
Member

@romani romani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Items

config/checkstyle-non-main-files-suppressions.xml Outdated Show resolved Hide resolved
/**
* Specifies different Object Blocks scope.
*/
private final Map<DetailAST, Integer> allObjBlocks = new HashMap<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a state.
If we can not avoid state ( fields of class) , we should clean/reset fields at the begging of each file validation. Same instance of Check is reused over all files to validate.

If you see no way to make it stateless, it is ok. You can just explain a reason on why stateless is not possible, and stay with stateful.

Copy link
Contributor Author

@Zopsss Zopsss Apr 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a state. If we can not avoid state ( fields of class) , we should clean/reset fields at the begging of each file validation. Same instance of Check is reused over all files to validate.

Done. Used the beginTree() method to achieve this:

@Override
public void beginTree(DetailAST rootAst) {
allObjBlocks.clear();
}

Edit: looking into the surviving mutation.


If you see no way to make it stateless, it is ok. You can just explain a reason on why stateless is not possible, and stay with stateful.

I guess we can't make this check stateless because the check requires tracking the scope of the overloaded constructors, so we need a global variable to keep track of this. A local variable will not work in this situation. Here is the example of what I'm trying to say:

private enum Testing {
one;
int x;
Testing() {}
Testing(String f) {}
String str;
Testing(int x) {} // violation
private abstract class Inner {
Inner() {}
int x;
String neko;
Inner(String g) {} // violation
}
Testing(double d) {} // violation
}

@romani

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

Copy link
Member

@romani romani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Items

Copy link
Member

@romani romani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Items

@Override
public void visitToken(DetailAST ast) {
final DetailAST currObjBlock = ast.getParent();
final Integer previousCtorLineNo = allObjBlocks.get(currObjBlock);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we can not simply check previous sibling on each ctor?
We might need some collection to keep track if ctor is first or not. This collection will just track that ctor is detected, so all other ctors should be in siblings.

Copy link
Contributor Author

@Zopsss Zopsss Apr 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will work in most of the situations. We can keep track if the current CTOR is first or not and check for its previous sibling, but if there is another block between the constructors then this approach will not work. For example:

private enum Testing {
one;
int x;
Testing() {}
Testing(String f) {}
String str;
Testing(int x) {} // violation
private abstract class Inner {
Inner() {}
int x;
String neko;
Inner(String g) {} // violation
}
Testing(double d) {} // violation
}

Here, the Testing(double d) {} occurs after an inner class and based on our check's implementation we will loose track of the previous CTOR and its position if we use a normal variable like a boolean or something to keep track of the previous constructor as it will get reset. That's why I used HashMap to keep track of all blocks and see if there is a constructor occurred in current block or not. We're getting the previous CTOR's line number of the current block and checking if it is null or not, null means there is no previous CTOR in the current block. If there is no CTOR occurred, then put the current block in the HashMap and store the constructor's line number. But if there is a constructor already occurred in the current block, then we check for the previous sibling, if the previous sibling is not a constructor then we will log the violation and if the previous sibling is a constructor then we will update the previous constructor's line number to current constructor's line number.

And we have provided the last occurred constructor's line number in the violation so user can know where the previous constructor is located.

This is the very early implementation of this check: https://github.com/checkstyle/checkstyle/blob/feba1977f846e7cb4fbed8ee76239ca9ee7a5140/src/main/java/com/puppycrawl/tools/checkstyle/checks/coding/ConstructorsDeclarationGroupingCheck.java

In this, I used a simple boolean variable to keep track of the previous CTOR, but this type of approach will fail for the example I mentioned above.

Here is an example in the diff report: https://checkstyle-diff-reports.s3.us-east-2.amazonaws.com/7452e65_2024163113/reports/diff/spotbugs/index.html#A15

Our check would not able to detect this if we simply check the previous CTOR without tracking the block of the constructor.

@romani

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

Copy link
Member

@nrmancuso nrmancuso Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this approach might also mitigate the issue here. In this case we would always operate, statelessly, in the context of a single class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah by using that approach we will be operating the check statelessly but that approach will be more time consuming, am I right? I've given my opinion here

@Zopsss Zopsss force-pushed the ctor-14726 branch 3 times, most recently from f545db5 to 32f4b3b Compare April 14, 2024 20:42
@Zopsss
Copy link
Contributor Author

Zopsss commented Apr 14, 2024

The mvn clean verify was not working properly locally, in this branch as well as master branch, Idk why that was happening. So I have pushed the changes without running it, I added some new examples in the template file but didn't generate those examples in the xml file, so I guess that will give some errors. I will fix those soon once mvn clean verify issue is resolved.

@romani
Copy link
Member

romani commented Apr 15, 2024

Did you share with us how it is failing on your local?

Copy link
Member

@romani romani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Items

@Zopsss
Copy link
Contributor Author

Zopsss commented Apr 15, 2024

GitHub, generate site

@Zopsss
Copy link
Contributor Author

Zopsss commented Apr 15, 2024

Did you share with us how it is failing on your local?

There was some issue with Maven. I restarted the laptop and then it started working fine.

Copy link
Member

@romani romani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Items

config/checkstyle-non-main-files-suppressions.xml Outdated Show resolved Hide resolved
@romani
Copy link
Member

romani commented Apr 16, 2024

GitHub, generate report

@Zopsss
Copy link
Contributor Author

Zopsss commented Apr 17, 2024

GitHub, generate report

I've generated the regression reports above, here are those:

#14749 (comment)
#14749 (comment)

there were no false positives/negatives found. The report was fine. @romani

this is the generated documentation report link: #14749 (comment)

@Zopsss
Copy link
Contributor Author

Zopsss commented Apr 23, 2024

@romani can you please review the PR?

@Zopsss
Copy link
Contributor Author

Zopsss commented Apr 26, 2024

@nrmancuso @rnveach If you have time can also you please review this PR? it would be really helpful if you or @romani reviews this PR

@nrmancuso nrmancuso self-assigned this Apr 27, 2024
@Zopsss
Copy link
Contributor Author

Zopsss commented Apr 28, 2024

Hey @nrmancuso thanks for self assigning the PR... Here are the regression reports and documentation reports:

#14749 (comment)

I needed to divide the project list into 2 parts because it was giving Out Of Memory error for single project list.

Here is the explanation of how we're keeping track the previous CTOR & it's line number:

#14749 (comment)

I thought this would help you to understand the working of the module easily

Comment on lines +16 to +18
public MyRecord(int x, int y, int z) {
this(x+y, z);
}
Copy link
Member

@nrmancuso nrmancuso Apr 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Zopsss @romani it looks like we are emulating the behavior of OverloadMethodsDeclarationOrder where a user may need to run the check multiple times in order to make checkstyle happy. Is there some technical limitation that prevents us from using the following logic:

  • First constructor is where constructors should live
  • If there is something that is not a constructor after the first one, all other constructors are flagged

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we can do this, I agree with this. Waiting for @romani's opinion. If he agrees then I'll start working on the new implementation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nrmancuso @romani If this is the way to want to move, we should probably create an issue for the other check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nrmancuso I've changed the implementation according to your new idea. Should I push the new changes or wait for other maintainers approval?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nrmancuso I've changed the implementation according to your new idea. Should I push the new changes or wait for other maintainers approval?

If you’ve already done the work, you can checkout a new branch from what you’ve done, and open a draft PR. Then you can reset this branch from the origin. This way you have both versions preserved and we can see what the other looks like.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Here is the draft PR: #14844

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Here is the draft PR: #14844

This new PR makes it even more obvious that we should do this, IMO.

This is my vision:

  1. Create a list of all children of the class/enum/whatever def
  2. Create sublist from first constructor , taking while the AST is a constructor
  3. Take the difference of the original list and the sublist
  4. Filter out non-constructors
  5. log all remaining items

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made the implementation stateless on that PR, can you please check that out? and sorry for the late reply I wasn't active for few days

@Zopsss
Copy link
Contributor Author

Zopsss commented May 8, 2024

The work on this module will continue here: #14844

The module implementation has been changed based on this: #14749 (comment) #14749 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants