Use SAP/project-kb MSR2019 as a source of Java CVEs #68
Comments
I have a concern about the MSR2019 dataset: I'm not sure it is a good dataset to support first when adding Java to ossf-cve-benchmark. While I think it's a good source of CVEs for the Java ecosystem, I don't think it represents well what developers will introduce themselves in their own code. I studied the MSR2019 dataset in 2019 when it was released, and it was made up of 1282 entries:
I am not planning to import it. But perhaps someone else will open a big PR.
Meta: I think all CVE sources will be biased in some way; hopefully the CVE selectors can be used to select a reasonable subset of CVEs that are relevant for a given comparison. Relatedly, we already have some biasing in the form of
I was not aware of the MSR2019 contents, but that does seem overly biased towards certain kinds of code. I have not done a similar analysis for the benchmark entries of this repository, but I hope they are less biased since they were simply sourced from a stream of GitHub security advisories, with reasonable effort limits on finding commits and weakness locations.
Perhaps the CVE entries should have additional meta-information about the project in question:
As suggested in #67 (comment).
(Remember to check licensing for the data set)