BUG: Issues with contributor scoring #3996

siralmat · 2024-04-03T11:28:27Z

Describe the bug

The Contributors check does not work as described in the docs:

Risk: Low (lower number of trusted code reviewers)
This check tries to determine if the project has recent contributors from multiple organizations (e.g., companies). [...]
The check looks at the Company field on the GitHub user profile for authors of recent commits. To receive the highest score, the project must have had contributors from at least 3 different companies in the last 30 commits; each of those contributors must have had at least 5 commits in the last 30 commits.

Looking at 30 recent commits for the ossf/scorecard repo, there are two contributors with >1 commit (dependabot and @spencerschrock). The expected output would be one organisation (Google) and a score of 3/10.

However, the actual report shows 49 organizations and a score of 10/10.

So the main issue is that the check isn't using the 30 recent commits retrieved from the GraphQL endpoint. Instead it sends a separate query to api.github.com/repos/ossf/scorecard/contributors, which returns a list of the top 30 contributors across the lifetime of the project - not the authors of the 30 most recent commits.
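To make the gap concrete, here is a minimal Python sketch of the scoring as the docs describe it, computed over the authors of the 30 most recent commits. The function name, the input shapes, and the logins are hypothetical, and the proportional rounding (one company out of three giving 3/10) is an assumption chosen to match the expected score above, not a statement of how Scorecard computes it.

```python
from collections import Counter


def documented_contributor_score(recent_authors, company_of,
                                 min_commits=5, companies_for_max=3):
    """Score the way the docs describe it (hypothetical helper).

    recent_authors: author logins of recent commits, newest first.
    company_of: mapping of login -> 'Company' profile field (may be missing).
    """
    window = recent_authors[:30]  # only the 30 most recent commits count
    commits_per_author = Counter(window)
    companies = {
        company_of[login].strip().lower()
        for login, n in commits_per_author.items()
        if n >= min_commits and company_of.get(login)
    }
    # Assumed proportional score with floor rounding: 3 companies -> 10/10.
    score = min(10, 10 * len(companies) // companies_for_max)
    return companies, score


# Illustrative data only: two authors with many commits, one of them a bot
# with no company set, plus three one-off contributors.
authors = ["alice"] * 12 + ["build-bot"] * 15 + ["bob", "carol", "dave"]
companies, score = documented_contributor_score(
    authors, {"alice": "Google", "bob": "Acme"}
)
print(companies, score)  # {'google'} 3
```

Feeding the same function a list of lifetime top contributors instead of recent commit authors would answer a different question, which is the mismatch described above.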

After spotting this, I realized there are actually a few other things going on that I thought were worth raising. I've tried to summarise them as concisely as possible below. (Apologies for abandoning your bug report template!)

Conflating GitHub Organization membership with the Company profile field

The docs say that the check is based on the 'Company' profile field, but it also checks a user's public GitHub Organization memberships. This seems intentional, so I assume the docs need updating - though for reasons outlined further down, I don't think it's good to treat these interchangeably.

Contributors (and organizations) may be counted more than once

As I understand it, the contributors check essentially flattens the list of contributor orgs+companies then counts the number of unique entries (with some deduplication). This can lead to surprising results, like a project with a single contributor that scores 10/10 because the user is a member of 3+ organizations.

Often these organizations are not meaningfully distinct either, as seen in the list of organizations on the ossf/scorecard report: e.g. chainguard/chainguard-dev/chainguard-images shouldn't really be treated as 3 separate entities.
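A small Python sketch of the effect, assuming the flattening behaviour described above. This is an illustration of that description, not Scorecard's actual code; the record shape, function names, and the `solo-dev` login are hypothetical.

```python
def flattened_entity_count(contributors):
    """Count unique company/org names across all contributors (flattened)."""
    entities = set()
    for c in contributors:
        if c.get("company"):
            # Light normalisation only; similar names remain distinct.
            entities.add(c["company"].strip().lstrip("@").lower())
        for org in c.get("orgs", []):
            entities.add(org.strip().lower())
    return len(entities)


def contributors_with_affiliation(contributors):
    """Count people who have any affiliation, for comparison."""
    return sum(1 for c in contributors if c.get("company") or c.get("orgs"))


# One hypothetical contributor who belongs to three closely related orgs.
solo = [{
    "login": "solo-dev",
    "company": None,
    "orgs": ["chainguard", "chainguard-dev", "chainguard-images"],
}]

print(flattened_entity_count(solo))         # 3
print(contributors_with_affiliation(solo))  # 1
```

Under the flattened count this single contributor already reaches the three-entity threshold, even though only one person, from what is arguably one entity, is involved.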

No safeguards against spoofing

There are several ways to create a false or misleading contributor list on a scorecard:

The company profile field is a free text field. If I declare myself an OSSF member on my profile, the scorecard will report this as though I were a verified member of the OSSF GitHub organization.
Anybody can create a new organization with a name that is deceptively similar to an existing organization.
Many orgs use RBAC to provide limited GitHub access to contractors, interns, or other temporary/unvetted users, and these users are indistinguishable from trusted users.
It's trivial to spoof commits that are attributed to other users, meaning you can force virtually any organization to appear on your project's scorecard. (I don't think there's any way to block or detect this except by limiting the tool to checking verified commits.)
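As a rough sketch of the "verified commits only" idea from the last point: the function below keeps only authors of commits that GitHub marks as verified. It assumes input shaped like items from the REST "list commits" response, where each item carries `commit.verification.verified` and a possibly null `author`. The function name and sample data are hypothetical, and a verified signature says nothing about whether a person's claimed company or org is genuine.

```python
def verified_author_logins(commits):
    """Return logins of authors whose commits are marked verified."""
    logins = []
    for item in commits:
        verification = item.get("commit", {}).get("verification") or {}
        author = item.get("author") or {}  # null when no account matches
        if verification.get("verified") and author.get("login"):
            logins.append(author["login"])
    return logins


# Hypothetical sample: one signed commit, one unsigned commit attributed to
# another account, and one signed commit with no linked account.
sample = [
    {"commit": {"verification": {"verified": True}}, "author": {"login": "alice"}},
    {"commit": {"verification": {"verified": False}}, "author": {"login": "spoofed-user"}},
    {"commit": {"verification": {"verified": True}}, "author": None},
]

print(verified_author_logins(sample))  # ['alice']
```

This would only address the commit-attribution problem; the free-text company field and lookalike organizations would still pass through unchanged.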

For a quick example: if you generate a scorecard for my demo repo, you'll see that @ossf and @google are active contributors :)

Closing thoughts

The goal of the Contributors check is to evaluate the number of 'trusted code reviewers', with the implication that projects are more trustworthy when a higher number of organizations contribute to them. If we accept that premise (I personally have some reservations), I still don't see how you can reliably report on this using the signals that GitHub offers.

I think the scorecard is a great concept and see a lot of value in surfacing quick, practical summaries of project risk indicators. But I'm concerned that embedding unreliable or low-value indicators in a security tool will lead people to make misinformed risk decisions, while adding to alert fatigue and stress for overloaded OSS maintainers and security teams.

I'd really love to hear OpenSSF's position on this and how it relates to the long-term vision for this project.

adonm · 2024-04-03T11:40:06Z

As an individual contributor showing up as part of 2 orgs for nbdev-squ, I can confirm the above issue (a GitHub org doesn't map directly to a legal entity; one entity may have several orgs) and would appreciate a more formal mechanism for verifying contributor counts and provenance (like signed commits or similar).

siralmat added the kind/bug Something isn't working label Apr 3, 2024