False positives after upgrade from 3.3.4 -> 4.0.0 #1580

Closed
pethers opened this issue Nov 23, 2018 · 10 comments

@pethers

pethers commented Nov 23, 2018

Lots of false positives after upgrade to 4.0.0

Full report at https://www.hack23.com/jenkins/view/SystemQualityAssesment/job/Citizen-Intelligence-Agency-Complete-sonar-report/555/dependency-check-jenkins-pluginResult/

aws-java-sdk-core-1.11.455.jar (cpe:/a:cce-interact:interact:1.11.455, com.amazonaws:aws-java-sdk-core:1.11.455, cpe:/a:interact:interact:1.11.455) : CVE-2006-1643, CVE-2006-1642, CVE-2007-4177, CVE-2006-1644

quartz-2.3.0.jar (org.quartz-scheduler:quartz:2.3.0, cpe:/a:jenkins:jenkins:2.3) : CVE-2018-1000169, CVE-2017-2610, CVE-2017-2611, CVE-2017-1000504, CVE-2017-2609, CVE-2017-2601, CVE-2017-2602, CVE-2017-2603, CVE-2017-2604, CVE-2017-2606, CVE-2017-2607, CVE-2017-2608, CVE-2017-1000354, CVE-2017-1000398, CVE-2017-1000355, CVE-2017-1000399, CVE-2017-1000396, CVE-2017-1000353, CVE-2017-1000356, CVE-2018-6356, CVE-2017-2612, CVE-2017-1000391, CVE-2017-2613, CVE-2017-1000394, CVE-2017-1000395, CVE-2018-1000170, CVE-2017-1000392, CVE-2017-1000393, CVE-2018-1000067, CVE-2017-2598, CVE-2018-1000068, CVE-2017-1000400, CVE-2017-2599, CVE-2017-1000401, CVE-2017-17383, CVE-2017-2600, CVE-2016-9299, CVE-2018-1999043, CVE-2018-1999042, CVE-2018-1000195, CVE-2018-1999005, CVE-2018-1999004, CVE-2018-1000193, CVE-2018-1999007, CVE-2018-1000194, CVE-2018-1999006, CVE-2018-1999001, CVE-2018-1999045, CVE-2018-1000192, CVE-2018-1999044, CVE-2018-1999003, CVE-2018-1999047, CVE-2018-1999002, CVE-2018-1999046

json-20180813.jar (cpe:/a:light:light:-, org.json:json:20180813, cpe:/a:all-for-one:all_for_one:-) : CVE-2018-12056

commons-lang-2.6.0.redhat-7.jar (cpe:/a:linux:util-linux:2.6.0, commons-lang:commons-lang:2.6.0.redhat-7) : CVE-2011-1677, CVE-2011-1676, CVE-2011-1675

@pethers
Author

pethers commented Nov 23, 2018

Looks like it's enough if the version matches, even when the other identifiers are different.

@malejpavouk

malejpavouk commented Nov 27, 2018

I can confirm the behavior that it is enough if the version matches... I have plenty of FPs, such as:
micrometer-core-1.1.0.jar -> cpe:/a:git:git:1.1.0, cpe:/a:git_project:git:1.1.0

@GFriedrich

Having the same issue after switching from 3.X to 4.X:

netflix-eventbus-0.3.0.jar: ids:(cpe:/a:git_project:git:0.3.0, com.netflix.netflix-commons:netflix-eventbus:0.3.0, cpe:/a:git:git:0.3.0) : CVE-2008-5516, CVE-2010-2542, CVE-2010-3906, CVE-2013-0308, CVE-2014-9938, CVE-2015-7082, CVE-2015-7545, CVE-2017-14867
archaius-core-0.7.4.jar: ids:(com.netflix.archaius:archaius-core:0.7.4, cpe:/a:git:git:0.7.4, cpe:/a:git_project:git:0.7.4) : CVE-2008-5516, CVE-2010-2542, CVE-2010-3906, CVE-2013-0308, CVE-2014-9938, CVE-2015-7082, CVE-2015-7545, CVE-2017-14867
servo-core-0.10.1.jar: ids:(cpe:/a:docker:docker:0.10.1, com.netflix.servo:servo-core:0.10.1) : CVE-2014-0047, CVE-2014-5277, CVE-2014-6407, CVE-2014-9358, CVE-2015-3627, CVE-2015-3630, CVE-2015-3631, CVE-2016-3697, CVE-2017-14992, CVE-2017-7297
netflix-infix-0.3.0.jar: ids:(cpe:/a:git_project:git:0.3.0, cpe:/a:git:git:0.3.0, com.netflix.netflix-commons:netflix-infix:0.3.0) : CVE-2008-5516, CVE-2010-2542, CVE-2010-3906, CVE-2013-0308, CVE-2014-9938, CVE-2015-7082, CVE-2015-7545, CVE-2017-14867
jersey-apache-client4-1.19.1.jar: ids:(cpe:/a:oracle:oracle_client:1.19.1, com.sun.jersey.contribs:jersey-apache-client4:1.19.1) : CVE-2006-0550

Unfortunately I require a working 4.X version and can't move back to 3.X as I moved on to Gradle 5.

@GFriedrich

GFriedrich commented Dec 5, 2018

I've had some time to dig deeper into the problem and found the following issue:
With version 4.0.0 of dependency-check, the Lucene library was updated.
This is the place where the Lucene index is queried to get the matching CPEs for a vendor/product:

final TopDocs docs = cpe.search(searchString, MAX_QUERY_RESULTS);
for (ScoreDoc d : docs.scoreDocs) {
    if (d.score >= 0.08) {

As you can see, a check is then applied so that only matches with a score of at least 0.08 are kept.
I've tested the library "com.netflix.hystrix:hystrix-core:1.5.12" and the results were:

  • With the old Lucene version I got 25 results, all below 0.08 (so all of them were dropped)
  • With the new Lucene version I got 25 results, but now all above 0.08

In fact it was a drastic score increase: the max score with the old version was "0.038690303", but with the new version it is "41.03853".
So the question is now: what triggered the increase of the Lucene result score?!

@GFriedrich

GFriedrich commented Dec 5, 2018

After reading up on the changes in Lucene 6 and 7, I can tell you what's going on here:

With Lucene 6 the scoring algorithm was changed from TF*IDF to BM25, so it is expected that the scores now look different.

Additionally, with Lucene 7 the query normalization was dropped. That means that all versions before Lucene 7 tried to bring the score to a value between 0 and 1, but with the new version the score can have any value.

And finally, the Lucene developers mention that it is discouraged (and already was for several Lucene versions) to compare the score against absolute thresholds or against the results of other queries. The score should only be used to sort the resulting documents of a single query.

This in turn means: the current approach is broken and needs a different solution than before. Simply comparing the score to "0.08" will not work any longer.
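
To illustrate what the dropped normalization means in practice, here is a stand-alone sketch against a toy in-memory index (this is not dependency-check code; the class name, the "product" field and the sample documents are made up, and it needs lucene-core plus lucene-queryparser on the classpath):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.RAMDirectory;

public class ScoreDemo {
    public static void main(String[] args) throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer();
        RAMDirectory dir = new RAMDirectory(); // in-memory index, good enough for a demo
        try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
            Document d1 = new Document();
            d1.add(new TextField("product", "netflix hystrix core", Field.Store.YES));
            writer.addDocument(d1);
            Document d2 = new Document();
            d2.add(new TextField("product", "git git project", Field.Store.YES));
            writer.addDocument(d2);
        }
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader); // BM25 is the default since Lucene 6
            TopDocs docs = searcher.search(
                    new QueryParser("product", analyzer).parse("netflix hystrix"), 10);
            for (ScoreDoc d : docs.scoreDocs) {
                // raw BM25 scores: no longer squeezed into [0, 1], and on a large index
                // (like the CPE index) they can reach values such as the 41.03853 seen above
                System.out.println(d.doc + " -> " + d.score);
            }
        }
    }
}

With a fixed cut-off of 0.08, essentially everything a query returns now passes the filter, which matches the hystrix observation above.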

@malejpavouk

malejpavouk commented Dec 7, 2018

I did some checks to see if it would be possible to downgrade Lucene (from the user's build). Unfortunately it's not possible, as there are Lucene-version-related changes inside Engine.java. For a quick fix, the easy way is to revert this commit in dependency-check and run a release: e0d644b

(That Lucene version is vulnerable according to the git message. I do not think it's that big an issue considering how long Lucene actually runs...)

@jeremylong Can you please provide your suggestion as the author of the plugin? We Gradle 5 users do not really have any option to go forward or backward...


Edit: it seems that to make it "work" again, we just need to normalize the score manually. It's rather simple to do using the maxScore reported by TopDocs (see the sketch after this edit):
https://lucene.apache.org/core/7_0_0/core/org/apache/lucene/search/TopDocs.html#getMaxScore--

It would still be a broken approach, but it would at least make it as functional* as it was before.

*provided that the 0.08 constant still works reasonably for the changed algorithm... hard to guess; no idea why it's 0.08 in the first place
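
A minimal sketch of that idea, applied to the loop quoted above (same cpe, searchString and MAX_QUERY_RESULTS as in the earlier excerpt; the 0.08 threshold is simply carried over and may itself need tuning):

final TopDocs docs = cpe.search(searchString, MAX_QUERY_RESULTS);
final float maxScore = docs.getMaxScore();
for (ScoreDoc d : docs.scoreDocs) {
    // divide by the best score of this query so the values fall back into [0, 1]
    if (maxScore > 0 && d.score / maxScore >= 0.08f) {
        // ...keep the candidate CPE as before
    }
}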


Edit 2: more investigation. It's not that simple. The issue is that with maxScore normalization we only get scores normalized within a single query, so hystrix still ends up with values way above 0.08 (when you search and nothing is relevant, the best hit is still scaled to 1 even though all of the results are irrelevant). It seems that the root cause is here:

Query normalization's goal was to make scores comparable across queries, which was only implemented by the ClassicSimilarity. Since ClassicSimilarity is not the default similarity anymore, this functionality has been removed. Boosts are now propagated through Query#createWeight.

https://lucene.apache.org/core/7_5_0/MIGRATE.html

In my experiment I have also added usage of ClassicSimilarity after this line:

The results are still the same (FPs still present). It seems to me that the only quick fix is to downgrade back to Lucene 5 or 6 (and set ClassicSimilarity).
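
For reference, the query-time part of that experiment boils down to something like this (sketch only; reader stands in for whatever IndexReader is opened over the CPE index):

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.similarities.ClassicSimilarity;

final class ClassicScoringExperiment {
    // build a searcher that scores with the old TF*IDF (ClassicSimilarity) formula
    static IndexSearcher classicSearcher(IndexReader reader) {
        IndexSearcher searcher = new IndexSearcher(reader);
        searcher.setSimilarity(new ClassicSimilarity());
        return searcher;
    }
}

As described above, this did not remove the FPs, which is consistent with the migration note: the query normalization step that used to bring scores into a comparable range was removed in Lucene 7 regardless of the similarity in use.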

@jeremylong
Owner

I have done additional testing on this and should have a solution in the near future. I also plan on building a set of test cases so that this issue does not occur again (i.e. automated test cases should fail if it does). I apologize for the 4.0.0 release - it was done fairly quickly due to the published vulnerabilities in some of the dependencies.

@GFriedrich the score is being used to filter out completely erroneous matches. Yes, the use of the score for more than sorting is discouraged - however, I could not come up with an effective way to filter the results. If you tell Lucene to give you 20 results, you will get 20 results, even if the query:

antlr org.antlr http://www.antlr.org Antlr 3.4 Runtime antlr-runtime

returns

getid3

The filter of 0.08 no longer works because the scores are, as you pointed out, no longer between 0 and 1. In some initial testing, a minimum score of 30 appears to work, but I have several more rounds of testing to do to confirm this. I am also researching other mechanisms for filtering the results.
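
In terms of the loop quoted earlier in this thread, that initial test amounts to something like the following (30 is only the value the testing above suggests, not a final constant):

final TopDocs docs = cpe.search(searchString, MAX_QUERY_RESULTS);
for (ScoreDoc d : docs.scoreDocs) {
    // raised cut-off for the unnormalized BM25 scores produced by Lucene 7
    if (d.score >= 30) {
        // ...treat the CPE as a candidate match
    }
}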

@malejpavouk

I can verify that I no longer see any issues in 4.0.1.

@GFriedrich

I've "reopened" the issue via #1637 as there are still several false positives.

@lock

lock bot commented Jan 19, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Jan 19, 2019