feat: Add binary version parsing for nodejs, php and python #6524
Levente Kovacs seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it.
I signed the CLA, but I'm not sure why it isn't registering as signed. Every test passed on my forked branch, so the tests should pass here too if you trigger them.
Hmm, while all tests pass, I don't see the new binary results when I build Trivy from source and run the same experiment as in #6457, and I'm not sure why. EDIT: I realized that I need to implement not only the parsing but also the analyzer logic as part of fanal. I'll try that.
I have spent some time on this. The Python version is now detected correctly by Trivy and added to the SBOM in a way that xeol detects properly for my use case. I also looked into implementing vulnerability scanning for standalone binaries, but I think having them in SBOMs alone is already a step forward. There are a few problems with implementing this for standalone binaries:
The only clean way I see to implement this is to add CPE support and write a Get function for the NVD vulnsrc linked in the 3rd point. The alternative is to hack together an NVD get interface, since it would be used only by the scanner as a fallback when a generic binary is detected, not for regular vulnerability scanning. Now that I know how to implement these things, I think I can put together the remaining parts (Java, PHP and NodeJS binaries as part of package detection and SBOMs) in the coming days, and then the PR will be ready.
I have added support for PHP and Nodejs standalone binaries, and it works like a charm with the SBOMs. I'm dropping support for Java because I found it harder to correctly determine the package type (JRE vs. JDK), and I won't go down that rabbit hole since it's not part of my use case. The only issue, which I discovered yesterday, is that this will break lots of integration tests, which I'll have to adjust accordingly. Once the integration tests are adjusted, I think this PR can be merged.
Hello @kovacs-levent
Thanks for this PR.
I left some comments. Some of these comments apply equally to nodejs, php and python.
Also, the parsers are very similar.
I think we can merge them into one package, something like this:
parser:
├── executable
│   ├── exe.go
│   ├── nodejs
│   │   ├── parse.go
│   │   ├── parse_test.go
│   │   └── testdata
│   ├── python
│   │   ├── parse.go
│   │   ├── parse_test.go
│   │   └── testdata
│   └── php
│       ├── parse.go
│       ├── parse_test.go
│       └── testdata
About analyzers:
It looks like we can add the new logic here - https://github.com/aquasecurity/trivy/blob/main/pkg/fanal/analyzer/executable/executable.go
If a php/python/npm binary is found, use its parser and add the library.
wdyt?
@@ -124,6 +124,7 @@ require (
github.com/alecthomas/chroma v0.10.0
github.com/antchfx/htmlquery v1.3.0
github.com/apparentlymart/go-cidr v1.1.0
github.com/aquasecurity/go-dep-parser v0.0.0-20240213093706-423cd04548a5
go-dep-parser is in Trivy now - https://github.com/aquasecurity/trivy/tree/main/pkg/dependency/parser
return nil, err
}

if bytes.HasPrefix(data, []byte("\x7FELF")) {
What about Windows and macOS formats?
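One way to extend the ELF check to the other formats is to compare the leading magic bytes. This is only a minimal sketch: `detectFormat` is a hypothetical helper that is not part of the PR, the PE check looks only at the DOS `MZ` stub, and the fat Mach-O magic `CAFEBABE` is shared with Java class files, so a real implementation would probably need to parse further (for example with Go's `debug/pe` and `debug/macho`).

```go
package main

import (
	"bytes"
	"fmt"
)

// detectFormat is a hypothetical helper (not from the PR). It classifies a
// file by its leading magic bytes and returns "elf", "pe", "macho" or "".
func detectFormat(data []byte) string {
	switch {
	case bytes.HasPrefix(data, []byte("\x7FELF")):
		return "elf"
	case bytes.HasPrefix(data, []byte("MZ")):
		// DOS stub that starts every PE file; other MZ executables match too.
		return "pe"
	case bytes.HasPrefix(data, []byte("\xFE\xED\xFA\xCE")), // 32-bit, big-endian
		bytes.HasPrefix(data, []byte("\xFE\xED\xFA\xCF")), // 64-bit, big-endian
		bytes.HasPrefix(data, []byte("\xCE\xFA\xED\xFE")), // 32-bit, little-endian
		bytes.HasPrefix(data, []byte("\xCF\xFA\xED\xFE")), // 64-bit, little-endian
		bytes.HasPrefix(data, []byte("\xCA\xFE\xBA\xBE")): // fat binary (also Java class files)
		return "macho"
	}
	return ""
}

func main() {
	samples := [][]byte{
		[]byte("\x7FELF\x02\x01\x01"),
		[]byte("MZ\x90\x00"),
		[]byte("\xCF\xFA\xED\xFE\x07\x00"),
		[]byte("#!/bin/sh\n"),
	}
	for _, s := range samples {
		fmt.Printf("%q\n", detectFormat(s))
	}
}
```

Each format could then be routed to its own reader before the version string is searched for.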
}

// Python's version pattern is [NUL]3.11.2[NUL]
re := regexp.MustCompile(`^\d{1,4}\.\d{1,4}\.\d{1,4}[-._a-zA-Z0-9]*$`)
Can you add a comment with a link where you get this regex?
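For context, here is a rough sketch of how a NUL-delimited version string could be pulled out of a binary with this regex. The helper `findVersion` and the sample bytes are made up for illustration and are not the PR's code. Note that any NUL-delimited string that looks like a version (for example, one belonging to a bundled library) would match as well, which is one reason a comment on the regex's origin would help.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
)

// Same pattern as in the diff above.
var versionRe = regexp.MustCompile(`^\d{1,4}\.\d{1,4}\.\d{1,4}[-._a-zA-Z0-9]*$`)

// findVersion is a hypothetical helper: it splits the data on NUL bytes and
// returns the first field that matches the version pattern in full.
func findVersion(data []byte) (string, bool) {
	for _, field := range bytes.Split(data, []byte{0}) {
		if versionRe.Match(field) {
			return string(field), true
		}
	}
	return "", false
}

func main() {
	// Invented sample: a few NUL-separated strings as they might sit in .rodata.
	data := []byte("GCC\x00/usr/lib\x003.11.2\x00tail")
	v, ok := findVersion(data)
	fmt.Println(v, ok) // prints: 3.11.2 true
}
```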
var libs []types.Library
libs = append(libs, types.Library{
ID: packageID(name, vers),
You use dependent.ID once, so you can just use it here.
- ID: packageID(name, vers),
+ ID: dependency.ID(ftypes.PythonGeneric, name, version),
}
}

return "python", vers
Is there a way to determine the package name?
It looks like it might be confusing if all detected binaries are named "python".
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
a := pythonBinaryAnalyzer{}
fileInfo, err := os.Lstat(tt.filePath)
Looks like we can use 1 file and change names, permissions, etc. in test.
Then we can reduce the number of copies of test files.
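As an illustration of the single-fixture idea, here is a sketch that stages one blob under different names and permissions and then checks a file filter against it. Both `required` and `requiredFor` are hypothetical stand-ins rather than the analyzer from the PR, and the permission checks assume a Unix-like filesystem. In a real test, `t.TempDir()` would replace the manual temporary directory handling.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// required is a hypothetical stand-in for the analyzer's file filter:
// it accepts regular files that have at least one executable bit set.
func required(info os.FileInfo) bool {
	return info.Mode().IsRegular() && info.Mode().Perm()&0o111 != 0
}

// requiredFor stages the shared fixture under the given name and mode in a
// fresh temporary directory and reports what the filter decides for it.
func requiredFor(fixture []byte, name string, mode os.FileMode) (bool, error) {
	dir, err := os.MkdirTemp("", "exe-fixture")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, name)
	if err := os.WriteFile(path, fixture, mode); err != nil {
		return false, err
	}
	// Chmod explicitly so the process umask cannot alter the requested bits.
	if err := os.Chmod(path, mode); err != nil {
		return false, err
	}
	info, err := os.Lstat(path)
	if err != nil {
		return false, err
	}
	return required(info), nil
}

func main() {
	fixture := []byte("\x7FELF placeholder") // one shared blob instead of several copies
	cases := []struct {
		name string
		mode os.FileMode
	}{
		{"python3.11", 0o755},
		{"python3.11", 0o644},
		{"renamed-binary", 0o700},
	}
	for _, c := range cases {
		ok, err := requiredFor(fixture, c.name, c.mode)
		fmt.Println(c.name, c.mode, ok, err)
	}
}
```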
@@ -65,7 +65,6 @@ func (s *scanner) Scan(target types.ScanTarget, _ types.ScanOptions) (types.Resu
}

logger := log.WithPrefix(string(app.Type))
Looks like we don't need this change.
@@ -0,0 +1,82 @@
// Ported from https://github.com/golang/go/blob/e9c96835971044aa4ace37c7787de231bbde05d9/src/cmd/go/internal/version/exe.go
It looks like Go 1.22 doesn't have this file. Can you please update the link and double check - we may need to add some changes.
pythonLibNameRegex := regexp.MustCompile("^libpython[0-9]+(?:[.0-9]+)+[a-z]?[.]so.*$")
pythonBinaryNameRegex := regexp.MustCompile("(?:.*/|^)python(?P<version>[0-9]+(?:[.0-9]+)+)?$")
It would be great if you added a comment about these regexes (e.g. a link to the docs).
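To show what such a comment could document, here is a small sketch that runs the two patterns from the diff against a few sample paths. The helper `binaryVersion` and the sample paths are invented for illustration. One thing it surfaces: the optional `version` group needs at least two characters, so a bare `python3` does not match, while `python` and `python3.11` do.

```go
package main

import (
	"fmt"
	"regexp"
)

// Both patterns are copied verbatim from the diff above.
var (
	pythonLibNameRegex    = regexp.MustCompile("^libpython[0-9]+(?:[.0-9]+)+[a-z]?[.]so.*$")
	pythonBinaryNameRegex = regexp.MustCompile("(?:.*/|^)python(?P<version>[0-9]+(?:[.0-9]+)+)?$")
)

// binaryVersion is a hypothetical helper: it reports whether the path looks
// like a python binary and returns the version captured from its name, if any.
func binaryVersion(path string) (version string, matched bool) {
	m := pythonBinaryNameRegex.FindStringSubmatch(path)
	if m == nil {
		return "", false
	}
	return m[pythonBinaryNameRegex.SubexpIndex("version")], true
}

func main() {
	paths := []string{
		"/usr/bin/python3.11",        // matches, version "3.11"
		"python",                     // matches, no version in the name
		"/usr/bin/python3",           // no match: the version group needs two or more characters
		"/usr/bin/python3.11-config", // no match: trailing suffix after the version
	}
	for _, p := range paths {
		v, ok := binaryVersion(p)
		fmt.Printf("%-28s matched=%t version=%q\n", p, ok, v)
	}
	fmt.Println(pythonLibNameRegex.MatchString("libpython3.11.so.1.0"))
}
```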
TypePip Type = "pip"
TypePipenv Type = "pipenv"
TypePoetry Type = "poetry"
TypePythonGeneric Type = "python"
It looks like we can use executable for the new binaries.
Thanks for the feedback. This week has been busy, but I'll make the changes when I have the time.
Description
Implement binary version parsing for at least the key binaries. The feature should be extended later, but in this first round I'm focusing on what matters for my use case (PHP, NodeJS, Python)... I'm keeping Java binary parsing since it has already been implemented by laurentdelosieresmano in the related PR👍 Since go-dep-parser was moved to this repo, I'm reopening the PR and adding Python dep parsing.
Related issues
Related PRs
Checklist