New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨ Add Check for published SBOM #3903
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Allen Shearin <allen.p.shearin@gmail.com>
…into check-for-published-sbom
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #3903 +/- ##
==========================================
- Coverage 74.94% 70.28% -4.67%
==========================================
Files 223 232 +9
Lines 16046 16723 +677
==========================================
- Hits 12026 11754 -272
- Misses 3253 4230 +977
+ Partials 767 739 -28 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR!
A few questions:
- Should the probe be aware of cases where there's no release assets? For example, a GitHub release may have no assets, if it's only source code release. Is an SBOM needed in this case? Note that the release could contain a container release as a GitHub package or something else. We could (maybe?) update the code to detect the former case, but the latter seems much harder.
- How do we determine if the project should create an SBOM or not, depending on the type of release (application, library, ?) - see https://blog.deps.dev/zillions-of-sboms/. Is there a document describing when an SBOM makes sense in each ecosystem?
- Remediation depends on point (2) above. We need a good story to tell folks how to remediate. Ideally SBOMs should come for free out of the package managers, which can build enough context to decide if an SBOM is needed or not (or always generate an empty SBOM?). @steiza @di @woodruffw @JoelMarcey is this something you intend to build for npm / pypi / homebrew / rust?
- Since GH has an API to generate SBOMs, why should maintainers generate an SBOM themselves?
Somewhat related but broader questions: Are registries interested in accepting SBOMs for applications. @steiza @di @woodruffw @JoelMarcey how are you thinking about this from npm / pypi / homebrew / rust side?
checker/raw_result.go
Outdated
|
||
const ( | ||
// sources of license information used to assert repo's license. | ||
SbomOriginationTypeAPI SbomOriginationType = "repositoryAPI" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can you add a comment what each type represents? API seems to be GitHub / GitLab API? Not sure what the others are
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I renamed one to be more clear and added comments. lmk what you think
clients/githubrepo/tarball.go
Outdated
@@ -28,7 +28,7 @@ import ( | |||
"strings" | |||
"sync" | |||
|
|||
"github.com/google/go-github/v53/github" | |||
"github.com/google/go-github/v59/github" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need this update in this PR? If not, please revert to reduce the noise in the PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated this because the sbom api stuff wasn't in v53. Looking again, it was added in v54 of the package. I could back it down to that for less of a jump.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming nothing breaks, might as well go to latest. Since it's required for the change, I'm fine with it here or in a separate PR.
//go:embed *.yml | ||
var fs embed.FS | ||
|
||
const Probe = "sbomCICDArtifactExists" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: There's some ongoing discussion about whether we should consolidate similar probes in a single one #3824. Just an FYI, out of scope for this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I've been seeing activity to that end. Fine with consolidating/changing if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so far i've mainly been targeting situations where there are X different tools to satisfy something (fuzzer, SAST, dependency update)
Maintained, and License both still have multiple probes which check for slightly related things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see a good argument for consolidating the release and CICD artifact checks into one, but open to suggestions
clients/gitlabrepo/sbom.go
Outdated
|
||
func (handler *sbomHandler) checkCICDArtifacts(latestpipelines []graphqlPipelineNode) error { | ||
// Originally intended to check artifacts from latest release pipeline, | ||
// but changed to check latest default branch pipeline to align with github check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could also follow what the Signed-Release check does, ie check the last X releases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mention in the notes for this check, but initially I wanted to check pipeline artifacts for the latest release which made the most sense to me. However in an effort to match the check across clients (as much as possible), I moved to the latest pipeline on the default branch.
If we're going to stay with checking pipelines on the default branch for GL, it may be better to check a certain #, since sbom jobs may not run on certain pipeline types.
return findings, Probe, nil | ||
} | ||
|
||
for i := range sbomFiles { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this loop match the SBOM to its corresponding artifact, as done for https://github.com/ossf/scorecard/blob/main/probes/releasesAreSigned/impl.go#L61?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated this and probes/sbomReleaseArtifactExists/impl.go
to add the value to the finding.
Both clients as written handle the possibility that there are no release assets, in which case it just continues on with the remainder of the check. For a source only release I'd expect an associated SBOM, as ideally each released version would have an associated SBOM for downstream users (assuming the project should be creating an SBOM). For releases that include a container, we could potentially check for an SBOM layer in the container, but the effort involved in that may not be worth it. This would be a situation where the workflow that builds the container in preparation for release would ideally also generate an SBOM for it and save it as an artifact (at the very least) to be picked up by our checks, and to be available for downstream users.
I'm not sure about this. I know applicability is one of the sticking points for this check in general. There was talk of a Maintainers Annotation feature which could help here, but I'm not sure the status of that. I'm unaware of any such document describing when an SBOM makes sense in each ecosystem. @idunbarh do you have any insight on this?
This I actually forgot to mention in the notes doc I posted, I have since added it. I do hit the SBOM api endpoint for Github repos and, assuming a good response, count that generated SBOM as a release artifact and award appropriate points. As far as generating an SBOM themselves in GH repos, I would leave it up to the maintainers to decide if that is necessary for their project. i.e. if they need to generate one for a release container or in another format. I'm hesitant to award full points for the endpoint BOM, as it may not be comprehensive. I haven't seen anything from GL yet, but would expect to see a similar SBOM endpoint eventually. |
The I don't believe the OpenSSF SBOM Everywhere SIG has provided any guidance on what ecosystems should or should not generate SBOMs. The SBOM Naming Convention Doc would be the ideal place for clarification. |
Signed-off-by: Allen Shearin <allen.p.shearin@gmail.com>
Signed-off-by: Allen Shearin <allen.p.shearin@gmail.com>
Yeah we really need a doc on this, per ecosystem, per-use case. What we found in https://blog.deps.dev/zillions-of-sboms/, is that most open-source projects are libraries, so often times an SBOM is not needed (there are nuances, of course). Keep me in the loop if this works starts. |
For a first integration, one possibility could be to not include these probes in a default check. Users who want to detect the presence of SBOMs in a repo could then run the probes / checks they care about, using the context and knowledge they have about the repo. For example, organizations (@UlisesGascon @netomi) could use their own policy to decide whether to run this probe or not on their projects. For arbitrary projects, I think we need more investigation to reduce noise (false positives, score decrease) |
Yeah, I was going to bring this up. Most projects that'll ever be handled by Scorecard likely don't need an SBOM (simple libraries).
Alternatively, make this a "bonus points" check: 10/10 if the latest release has an SBOM, $INCONCLUSIVE otherwise. |
Coming from the Security Tooling WG discussions, one of the desired outcomes is to measure the impact other parts of OpenSSF are having around SBOM adoption. In this case the Security Tooling WG's SBOM Naming Conventions and future SBOM Strike Force that will be working directly with large projects to implement SBOMs. I see Scorecard as an mechanism to help adoption and also measure adoption. I'm all for making it a bonus points check over a check that would be disabled by default. |
Signed-off-by: Allen Shearin <allen.p.shearin@gmail.com>
SBOMs for the Rust ecosystem, particularly around crates.io, is on our radar, but at this point not our highest priority. The Rust Secure code working group [1] is looking at this, Microsoft has a concerted effort happening [2] and there are other tools [3] and protocols [4] out there. But there is still a lot of questions in the air on how best to generate SBOMs. What is the standard? We have [1] https://www.rust-lang.org/governance/wgs/wg-secure-code |
To extend what @JoelMarcey said: We've spent months revamping cyclonedx-rust-cargo as part of a funded project by the german Sovereign Tech Fund. We are currently writing a PoC for the RFC around having native "SBOM support-supprort" in Rust itself. This is coordinated with @arlosi from Microsoft who started the RFC based on an initial conversation we had around this last year. This is a direct result of our work on the CycloneDX generator where we saw which gaps there currently are (e.g. parsing Cargo.toml alone is not sufficient, neither is cargo.lock so we currently call I obviously hope that crates.io and other tools will be able to benefit from this when this is done. And with cargo-auditable we have a tool to embed SBOMs in generated binary artifacts directly and as part of the project we extended it to write CycloneDX as well. All of this is very high priority here in Europe due to the upcoming Cyber Resilience Act which will make SBOMs more or less mandatory for every digital product as of 2027 or so (depending on when they finally sign it).
In short: Stuff is happening for the Rust ecosystem at least. |
I want to ++ this: I think many (especially smaller) projects already struggle to consume scores and treat them as a faulty dimension reduction of the project's actual security posture, so giving them an additional negative weight based on the absence of data rather than the presence of negative data will probably make it harder to justify adoption 🙂 |
This pull request has been marked stale because it has been open for 10 days with no activity |
Signed-off-by: Allen Shearin <allen.p.shearin@gmail.com>
…into check-for-published-sbom
Signed-off-by: Allen Shearin <allen.p.shearin@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ashearin I have a few high level questions in order of importance before I review implementation specifics (you can ignore the nits for now)
- Determining what probes we want after our OSS NA conversation. I believe we settled on
hasSBOM
or similar. And based on your code here, we may want to distinguish betweenhasSourceSBOM
vshasReleaseSBOM
. It may also possible to make use offinding.Location.Type
if we just want a generichasSBOM
probe. There are various file types which distinguish between source files vs something that lives at a URL
Lines 30 to 43 in 8789bbb
const ( // FileTypeNone must be `0`. FileTypeNone FileType = iota // FileTypeSource is for source code files. FileTypeSource // FileTypeBinary is for binary files. FileTypeBinary // FileTypeText is for text files. FileTypeText // FileTypeURL for URLs. FileTypeURL // FileTypeBinaryVerified for verified binary files. FileTypeBinaryVerified ) - Seeing if we need to add
ListSBOM
to the repo interface or not. If all we're doing (for now?) is looking at source files and release artifacts, I don't think we need to add the method to the interface. (see specificclients/repo_client.go
conversation for details) - Even if we don't expose a check, we might need to have the raw-data collection side of things for now (as should probes have their own raw data and supported request types? #3718 is still open)
clients/repo_client.go
Outdated
ListStatuses(ref string) ([]Status, error) | ||
ListSboms() ([]Sbom, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same naming comment (ListSBOM
). but also a discussion:
I think given all the external discussion around SBOMs, i could see this being an interface method. But we could also potentially get this information through ListFiles
, ListReleases
, and ListSuccessfulWorkflowRuns
(which may need generalized for workflow runs regardless of status).
There is potentially data from other sources, which wouldn't be accessible from the other methods. As your implementation uses elsewhere, GitHub has an SBOM endpoint, and GitLab has a feature request for it.
Is an auto-generated SBOM something that is worth fetching? If it's just an API that anyone can call on any github/gitlab repo, wouldn't every repo have an SBOM then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is an auto-generated SBOM something that is worth fetching? If it's just an API that anyone can call on any github/gitlab repo, wouldn't every repo have an SBOM then?
Thought about this last week after we talked. If what we want to track is adoption, I think fetching the auto generated boms would obviously greatly skew those numbers. Plus, for the groups that care about seeing sboms in repos, the platform generated ones are easily dismissed as not enough (at least for the time being). I can remove the github function to fetch/check there until it's needed.
I think given all the external discussion around SBOMs, i could see this being an interface method. But we could also potentially get this information through
ListFiles
,ListReleases
, andListSuccessfulWorkflowRuns
(which may need generalized for workflow runs regardless of status).
I considered this when making the first draft. Assumed the preference was to keep things separate to allow for more configuration for the users/future dev work. I could rework this to get the info from existing methods if preferred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it can be done purely as ListFiles
and ListReleases
, I view that as a simpler first step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think ListFiles
and ListReleases
can replace most of the functionality, but the tricky one is ListSuccessfulWorkflowRuns
. Since GIthub doesn't associate workflow artifacts with a workflow directly, I'd have to make an api call for each workflow returned to check for artifacts. Unless I update it to use graphql and return more info.
For Gitlab, the pipeline info returned by the Listcheckruns could be used to get pipeline artifacts. (The List workflowRuns might also work I'd have to double check).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The graphql approach could be possible for fetching more data. It's also possible to just focus on source and release SBOMs first, and address CI/CD artifacts in a followup
probes/sbomExists/def.yml
Outdated
implementation: > | ||
The implementation checks whether an sbom file is present in the source code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should the probe name be more specific if its only checking for SBOMs managed in the source?
hasSourceSBOM
or similar?
Signed-off-by: Allen Shearin <allen.p.shearin@gmail.com>
Signed-off-by: Allen Shearin <allen.p.shearin@gmail.com>
Signed-off-by: Allen Shearin <allen.p.shearin@gmail.com>
URL string // SBOM Asset URL | ||
File File // SBOM File Object |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
technically File has a Type
field which includes. Does it make sense to combine it?
// FileTypeURL for URLs.
FileTypeURL
case finding.OutcomeNotApplicable: | ||
dl.Info(&checker.LogMessage{ | ||
Type: finding.FileTypeSource, | ||
Offset: 1, | ||
Text: f.Message, | ||
}) | ||
case finding.OutcomeTrue: | ||
switch f.Probe { | ||
case hasSBOM.Probe: | ||
dl.Info(&checker.LogMessage{ | ||
Type: finding.FileTypeSource, | ||
Path: f.Message, | ||
Text: existsMsg, | ||
}) | ||
score += scoreProbeOnce(f.Probe, m, 5) | ||
case hasReleaseSBOM.Probe: | ||
dl.Info(&checker.LogMessage{ | ||
Type: finding.FileTypeURL, | ||
Path: f.Message, | ||
Text: releaseMsg, | ||
}) | ||
score += scoreProbeOnce(f.Probe, m, 5) | ||
default: | ||
e := sce.WithMessage(sce.ErrScorecardInternal, "unknown probe results") | ||
return checker.CreateRuntimeErrorResult(name, e) | ||
} | ||
case finding.OutcomeFalse: | ||
switch f.Probe { | ||
case hasSBOM.Probe: | ||
dl.Warn(&checker.LogMessage{ | ||
Type: finding.FileTypeSource, | ||
Path: f.Message, | ||
Text: "SBOM file not found in project", | ||
}) | ||
existsMsg = f.Message | ||
case hasReleaseSBOM.Probe: | ||
dl.Warn(&checker.LogMessage{ | ||
Type: finding.FileTypeURL, | ||
Path: f.Message, | ||
Text: "SBOM file not found in release artifacts", | ||
}) | ||
releaseMsg = f.Message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally the logging in other checks is done via:
checker.LogFinding(dl, f, logLevel)
|
||
message := fmt.Sprintf("%s. %s.", existsMsg, releaseMsg) | ||
return checker.CreateResultWithScore(name, message, score) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generally the check reason is a single reason. Any of the individual reasons for exact score are exposed via the details.
No SBOM:
"Project does not have a SBOM file"
SBOM:
"Project has a SBOM file"
Release SBOM:
"Project publishes an SBOM file as part of a release or CICD"
{ | ||
name: "Negative outcome = Min Score", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
terminology might need updating after the true / false positive/negative change.
"no SBOM is min score" or something like that?
{ | ||
name: "Exists in Source: Positive outcome.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
terminology might need updating after the true / false positive/negative change.
"source only SBOM half credit" or something like that?
|
||
type SBOMHandler struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sbomHandler
to not expose implementation detail
outcome: | ||
- If SBOM artifacts are found, the probe returns OutcomePositive for each SBOM artifact. | ||
- If an SBOM artifact is not found, the probe returns a single OutcomeNegative. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update for OutcomeTrue and OutcomeFalse
loc := &finding.Location{ | ||
Type: SBOMFile.File.Type, | ||
Path: SBOMFile.File.Path, | ||
LineStart: &SBOMFile.File.Offset, | ||
LineEnd: &SBOMFile.File.EndOffset, | ||
Snippet: &SBOMFile.File.Snippet, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should be able to save some lines with:
loc := SBOMFile.File.Location()
outcome: | ||
- If an SBOM file(s) is found, the probe returns OutcomePositive for each SBOM file. | ||
- If an SBOM file is not found, the probe returns a single OutcomeNegative. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar True/False update
loc := &finding.Location{ | ||
Type: SBOMFile.File.Type, | ||
Path: SBOMFile.File.Path, | ||
LineStart: &SBOMFile.File.Offset, | ||
LineEnd: &SBOMFile.File.EndOffset, | ||
Snippet: &SBOMFile.File.Snippet, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same loc := SBOMFile.File.Location()
optimization is possible.
Allen also mentioned gating the check behind experimental flag |
The probe code is in a good spot. You'll also need to add the raw data collection to support running scorecard/pkg/scorecard_result.go Lines 250 to 251 in c11d89b
|
What kind of change does this PR introduce?
Adds check for published sboms. This PR is still a draft as there is more discussion on applicability and implementation to be had. This PR is meant to spur those conversations.
What is the current behavior?
N/A
What is the new behavior (if this is a feature change)?**
*Tests were not added as the implementation may change, tests will be added prior to merge.
Adds a multi probe check for a Software Bill of Materials for a scanned repository.
More information regarding implementation and considerations for this check can be found here
Which issue(s) this PR fixes
Fixes #3574, #1476
Special notes for your reviewer
Does this PR introduce a user-facing change?
For user-facing changes, please add a concise, human-readable release note to
the
release-note
(In particular, describe what changes users might need to make in their
application as a result of this pull request.)