Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #7921: Resolve Pitest Issues - UnusedImportsCheck #7937

Merged
merged 1 commit into from May 17, 2020
Merged

Issue #7921: Resolve Pitest Issues - UnusedImportsCheck #7937

merged 1 commit into from May 17, 2020

Conversation

AmrDeveloper
Copy link
Contributor

Issue #7921

Description:
We don't need the variable collect to check if we should start collecting or not because it will always true because every java fill will contain package keyword or class that will make collect = true

Reports:
Diff Report : https://amrdeveloper.github.io/checkstyle-reports/7921/diff/
pitest-after : https://amrdeveloper.github.io/checkstyle-reports/7921/pitest-after
pitest-before : https://amrdeveloper.github.io/checkstyle-reports/7921/pitest-before/

@rnveach
Copy link
Member

rnveach commented Mar 22, 2020

Diff Report : https://amrdeveloper.github.io/checkstyle-reports/7921/diff/

Diff is not enough. Only 1 instance was used while check has multiple properties.
See https://github.com/checkstyle/contribution/tree/master/checkstyle-tester#checkstyle-pitest-regression

We don't need the variable collect

Prove to me without a doubt that the best way forward is to remove the variable.

Did you check the history of the variable and see why it was added in the first place?

every java fill will contain package keyword

Not every Java class will have a package.

@rnveach rnveach self-requested a review March 22, 2020 14:21
@rnveach rnveach self-assigned this Mar 22, 2020
@AmrDeveloper
Copy link
Contributor Author

@rnveach

Diff is not enough. Only 1 instance was used while check has multiple properties.

Report Updated

Not every Java class will have a package.

You are right

Prove to me without a doubt that the best way forward is to remove the variable.

@Override
public void visitToken(DetailAST ast) {
      if (ast.getType() == TokenTypes.IDENT) {
            if (collect) {
                 processIdent(ast);
            }
       }
       else if (ast.getType() == TokenTypes.IMPORT) {
           processImport(ast);
       }
       else if (ast.getType() == TokenTypes.STATIC_IMPORT) {
           processStaticImport(ast);
       }
       else {
           collect = true;
           if (processJavadoc) {
               collectReferencesFromJavadoc(ast);
           }
       }
  }

The code will run processIdent(ast); only if the current type is IDENT and collect is true
We are sure that in else branch the collect will always be true so we can just collect in else branch when type is IDENT

@Override
public void visitToken(DetailAST ast) {
    if (ast.getType() == TokenTypes.IMPORT) {
        processImport(ast);
    }
    else if (ast.getType() == TokenTypes.STATIC_IMPORT) {
        processStaticImport(ast);
    }
    else {
        if (ast.getType() == TokenTypes.IDENT) {
            processIdent(ast);
        }
        if (processJavadoc) {
            collectReferencesFromJavadoc(ast);
        }
     }
}

So the logic now is the same but I moved first if branch to process in the right place without variable

@rnveach
Copy link
Member

rnveach commented Mar 23, 2020

Diff is good now.

Did you check the history of the variable and see why it was added in the first place?

This was not answered.

We are sure that in else branch the collect will always be true

You can make that assumption but you have to remember the order of the ASTs coming in. The first 3 checks are IDENT, IMPORT, STATIC IMPORT and everything else falls in the else. To me it seems the purpose of collect is to prevent us from collecting IDENTs inside the first 3 checks and to wait until we hit the main class of the file, which is what the else is. IMPORTs have IDENTs for the text of the import and it seems we don't want to pick them up as part of the IDENT process.

Example:

import test;

IMPORT -> import [1:0]
|--IDENT -> test [1:7]
`--SEMI -> ; [1:11]

You see what I am saying?

I am still not convinced. My statement above, in my eye, shows the logic for having the variable and it's check.

Now I want to know why regression didn't find anything. What happens in the mutated branch when we executed with ASTs that would have been blocked in the unmutated branch such that a difference in violations don't occur.

@romani
Copy link
Member

romani commented Apr 5, 2020

@AmrDeveloper , please rebase to resolve conflict.

@AmrDeveloper
Copy link
Contributor Author

@rnveach
Sorry for delay

You are right in your example

import test;

and we can see the difference between old code and new code
because in old code will return true that test in the referenced but in java
because CommonUtil.baseClassName will return the input if it can't find . in the import
but in java every import must have at last one . because the format is

iimport PackageName.ClassName 

or

i import PackageName.*

so if we changed the implementation to make baseClassName return for example "" if it can't find at last one dots
the new code will work correctly even with your example but the new code problem is that it collect unneeded data in referenced list

so the problem now how we can make UT that can show the difference between two solutions like import.test and make regression fail with new code

@rnveach
Copy link
Member

rnveach commented Apr 5, 2020

I would not focus on differentiating between import and import static. The purpose of this is to prove why we need collect.

To me it seems the purpose of collect is to prevent us from collecting IDENTs inside the first 3 checks and to wait until we hit the main class of the file, which is what the else is. IMPORTs have IDENTs for the text of the import and it seems we don't want to pick them up as part of the IDENT process.

I would focus on this from my above statement.

What happens if we collect an INDENT we shouldn't be collecting? How can we use that to kill the mutation with a new UT?

@AmrDeveloper
Copy link
Contributor Author

@rnveach
Now I think that we shouldn't remove collect from the solution because it will collect many unneeded IDENT in memory
and we should find another solution it will be better if we can find UT that solve Pitest Issues

@rnveach
Copy link
Member

rnveach commented Apr 6, 2020

Yes. I would next look at what IDENTs it is collecting and how it uses IDENTs and does it use them to determine if a violation is done or not.

@romani
Copy link
Member

romani commented Apr 22, 2020

@AmrDeveloper , are you stuck , do you need help ?

@AmrDeveloper
Copy link
Contributor Author

@romani Sorry for delay i will try again to find another solution without take more memory resource

@AmrDeveloper
Copy link
Contributor Author

@rnveach
I have tried to change visitToken Implementation but the problem is that the only difference between remove collect or not is the size of the referenced list
We can only see the different with corner case import statement with only one word without any dots
so I added new UI with non-compiled file that contains the corner case

@rnveach rnveach assigned romani and unassigned rnveach May 17, 2020
@rnveach rnveach requested a review from romani May 17, 2020 03:46
@romani romani merged commit 274740d into checkstyle:master May 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants