Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change ValueRangeAnalysis to a data-flow analysis #2073

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

baloghadamsoftware
Copy link
Contributor

Currently, SpotBugs includes a value-range analysis which is only used
for detecting redundant conditions. However, it could be used for
detecting other kind of errors, such as arithmetic errors as well. This
PR refactors this analysis to a data-flow based one while keeping the
functionality of the existing redundant condition analysis based on top
of it.


Make sure these boxes are checked before submitting your PR -- thank you!

  • Added an entry into CHANGELOG.md if you have changed SpotBugs code

Ádám Balogh added 2 commits May 27, 2022 18:23
Currently, SpotBugs includes a value-range analysis which is only used
for detecting redundant conditions. However, it could be used for
detecting other kind of errors, such as arithmetic errors as well. This
PR refactors this analysis to a data-flow based one while keeping the
functionality of the existing redundant condition analysis based on top
of it.
@baloghadamsoftware
Copy link
Contributor Author

An example for the usefulness of this PR is finding arithmetic errors such as division by zero: baloghadamsoftware#3

@baloghadamsoftware
Copy link
Contributor Author

I tested this on large projects. No crashes and no speed drop.

@baloghadamsoftware
Copy link
Contributor Author

I found a true positive (redundant condition) in an open-source project using this new approach: True positive in MATSim-Libs

@baloghadamsoftware baloghadamsoftware marked this pull request as ready for review June 8, 2022 14:01
Copy link
Member

@iloveeclipse iloveeclipse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no time to do a real review, here just a formal one. In general, with enough test coverage, I think it makes sense.

private static final Map<String, TypeLongRange> typeRanges;

static {
typeRanges = new HashMap<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be Collections.unmodifiableMap()?
Also I would expect it to be UPPER_CASE as it is a constant map.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. This is also from the old code, but now now this whole map is constant so I made it unmodifiable.

return typeRanges.containsKey(signature);
}

private SortedMap<Long, Long> map = new ConcurrentSkipListMap<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please give some meaningful name, like rangesMap and explanation what is the key and what is the value.

Is there a good reason to use concurrent map here? Do we really expect the set to be modified by multiple threads? If so, class javadoc should explain that this is supposed to be thread safe data structure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this was a misunderstanding. I got several ConcurrentModificationExceptions but they were because I tried to merge the same instance into itself. A check was missing to prevent this.

import java.util.Set;

public class TypeLongRange {
long min, max;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the fields be final? If not, something should be done to avoid unintentional change of values for cases where they are used as constant values (like in LongRangeSet.typeRanges)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

@@ -0,0 +1,65 @@
package edu.umd.cs.findbugs.ba;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and in all other classes, please add license headers.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the classes got license headers now, except the VariableData class and the ones in the spotbugs-tests and spotbugsTestCases directories.


private void add(Long start, Long end) {
if (map.isEmpty()) {
map.put(start, end);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this "if" block is not thread safe, and two threads can enter the condition at same time. If this type is supposed to be thread safe, this is a bug.

Actually entire method is not thread safe, in almost all cases below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment above: this class does not need to be thread safe.

branch = null;
}

public void meetWith(ValueRangeMap other) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meetWith is unusual. Did you mean mergeWith ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw meetWith() in LockSet which is the fact for LockAnalysis. Thus since ValueRangeMap is the fact for ValueRangeAnalysis I used the same name. However, I can change it if you think mergeWith is a better name.

@@ -116,7 +116,7 @@ public void sawOpcode(int seen) {
fieldSummary.addWrittenOutsideOfConstructor(fieldOperand);
}
}
} else if (seen == Const.PUTSTATIC && !Const.STATIC_INITIALIZER_NAME.equals(getMethodName())) {
} else if (!Const.STATIC_INITIALIZER_NAME.equals(getMethodName())) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this from unrelated fix? Please don't mix multiple fixes in one PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not entirely. For some reasons RedundantConditions can now find more true positives. This is one of them. If I do not fix it then the build fails because of SpotBugs errors.

@@ -170,7 +170,7 @@ public void visit(Code code) {
if (returnUnknown > 0) {
priority++;
}
if (returnNew > 0 && priority > NORMAL_PRIORITY) {
if (priority > NORMAL_PRIORITY) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this from unrelated fix? Please don't mix multiple fixes in one PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the other true positive.

@baloghadamsoftware
Copy link
Contributor Author

Thank you for your comments, @iloveeclipse! They were really useful and I did agree with all of them. Now I am considering adding test cases but I could not find any test case for analysises yet. We would need some kind of greybox testing here. Some framework where we can check facts of different data flows at some given points of the CFG. Do you know about one?

@iloveeclipse
Copy link
Member

Do you know about one?

Not really, I would have to search in the code.

Note: I will be mostly offline for a week from tomorrow, so don't expect any feedback from me in that time.

@baloghadamsoftware
Copy link
Contributor Author

Do you know about one?

Not really, I would have to search in the code.

Note: I will be mostly offline for a week from tomorrow, so don't expect any feedback from me in that time.

No problem, you already helped a lot. I think I found something that I could maybe use for testing.

}

@Override
public void meetInto(ValueRangeMap fact, Edge edge, ValueRangeMap result) throws DataflowAnalysisException {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I should split the logic of this method to edgeTransfer() and meetInto()?

@baloghadamsoftware
Copy link
Contributor Author

@KengoTODA Could you take a look on this one, please? This PR could be a huge improvement which could pave the way to more complex detectors based on the actual possible values of the variables.

@baloghadamsoftware
Copy link
Contributor Author

baloghadamsoftware commented Mar 1, 2023

@KengoTODA? and/or @iloveeclipse?

Copy link
Collaborator

@JuditKnoll JuditKnoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The primitive type signatures are used in more classes (LongRangeSet, ValueRangeAnalysis, RedundantConditions) more times, wouldn't it be worth to use constants instead of the hardcoded signatures?
I think this feature seems really promising, and could improve the capabilities of Spotbugs in a great sense.



/**
* This class describes an edge in the CFG which belongs to a branch statemen: IFCMP edges and the fallback edges
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a missing 't' from the end of the word "statement".

import edu.umd.cs.findbugs.classfile.Global;
import edu.umd.cs.findbugs.classfile.MethodDescriptor;

public class ValueRangeAnalysis extends ForwardDataflowAnalysis<ValueRangeMap> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be really nice, if this class had a short javadoc explaining the purpose of the class. However, I'm not sure, what exactly decides whether a class needs javadoc or the licence header.

@@ -193,7 +398,7 @@ private int getPriority(MethodDescriptor methodDescriptor, RedundantCondition co
return priority;
}

/**
/*
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the deletion of this '*' intentional?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants