Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Code-Review missing review markers #4038

Closed
emaste opened this issue Apr 17, 2024 · 7 comments · Fixed by #4086
Closed

BUG: Code-Review missing review markers #4038

emaste opened this issue Apr 17, 2024 · 7 comments · Fixed by #4086
Labels
check/Code-Review kind/bug Something isn't working

Comments

@emaste
Copy link

emaste commented Apr 17, 2024

Describe the bug
Code-Review misses review indicators used by FreeBSD.

Reproduction steps
Steps to reproduce the behavior:

  1. Inspect Code-Review for freebsd-src https://securityscorecards.dev/viewer/?uri=github.com/freebsd/freebsd-src
  2. Review recent FreeBSD commits https://github.com/freebsd/freebsd-src/commits/main/
  3. Observe Reviewed by: tags and/or Phabricator URLs indicating review took place, e.g. freebsd/freebsd-src@1b74875

Expected behavior

  • Match Reviewed by: as a synonym for Reviewed-by: as evidence of review
  • Consider following Phabricator review links and extracting review state

Additional context
I suspect FreeBSD may be one of the more significant remaining open source users of Phabricator, and for us review evidence should be included in the commit message directly (via Reviewed by:). It may or may not be worth the effort to parse Phabricator state for other projects that use Phabricator and do not indicate the review state in the commit message. LLVM used to be an example of such a project, but they use a completely GitHub-based workflow now.

@emaste emaste added the kind/bug Something isn't working label Apr 17, 2024
@spencerschrock
Copy link
Contributor

We do try and detect Phabricator review links:

// Given m, a commit message, find the Phabricator revision ID in it.
func getPhabricatorRevisionID(c *clients.Commit) string {
m := c.Message
match := rePhabricatorRevID.FindStringSubmatch(m)
if match == nil || len(match) < 2 {
return ""
}
return match[1]
}

Where the regex we currently use is:

var (
rePhabricatorRevID = regexp.MustCompile(`Differential Revision:\s*(\w+)`)

I see many commits with content like:

Differential Revision: https://reviews.freebsd.org/D44814
Differential Revision: https://reviews.freebsd.org/D44815

We do get matches on some of the commits, including freebsd/freebsd-src@1b74875

2024/04/30 13:26:54 1b7487592987c91020063a311a14dc15b6e58075 {Phabricator https} Phabricator https

But our regex doesn't handle the link, so all of these changes count as 1 revision ("https"). So unrelated commits are getting grouped as one, leading to an under reporting of freebsd's code review practices.

@emaste
Copy link
Author

emaste commented May 1, 2024

Phabricator wants the full URL for auto-closing - see e.g. LLVM's (now obsolete) Phabricator documentation: https://releases.llvm.org/14.0.0/docs/Phabricator.html, so the regex should be changed to match any non-space characters.

Unfortunately I couldn't find a canonical reference for the full URL with a quick look. We used just the D#### tag when we first started using Phabricator but Phabricator was only adding a reference to a commit, not auto-closing the review.

FWIW our commit message template has:

...
# Pull Request: <https://github.com/freebsd/<repo>/pull/###>
# Differential Revision:        <https://reviews.freebsd.org/D###>
#
# "Pull Request" and "Differential Revision" require the *full* GitHub or
# Phabricator URL.  The commit author should be set appropriately, using
# \`git commit --author\` if someone besides the committer sent in the change.

@spencerschrock
Copy link
Contributor

spencerschrock commented May 2, 2024

What do you think about:

Differential Revision:[^\r\n]*(D\d+)

So it matches the D### on the same line as Differential Revision

Which when run against the repo detects more code review activity:

go run main.go --repo freebsd/freebsd-src --checks Code-Review --format json --show-details | jq
{
  "date": "2024-05-02T12:39:59-07:00",
  "repo": {
    "name": "github.com/freebsd/freebsd-src",
    "commit": "4510f2ca9170927309a423274e03f1eb8e27da27"
  },
  "scorecard": {
    "version": "devel",
    "commit": "unknown"
  },
  "score": 5.0,
  "checks": [
    {
      "details": null,
      "score": 5,
      "reason": "Found 16/29 approved changesets -- score normalized to 5",
      "name": "Code-Review",
      "documentation": {
        "url": "https://github.com/ossf/scorecard/blob/main/docs/checks.md#code-review",
        "short": "Determines if the project requires human code review before pull requests (aka merge requests) are merged."
      }
    }
  ],
  "metadata": null
}

@emaste
Copy link
Author

emaste commented May 2, 2024

That RE lgtm, and 16/29 seems plausible for the fraction of recent commits with evidence of review based on a quick look in the repo.

@emaste
Copy link
Author

emaste commented May 2, 2024

I'm curious how multiple commits with the same phabricator review link are handled, though -- it's possible that the same code review may be shared by multiple commits (e.g., there may be a bugfix and new functionality that are closely related and make sense to review altogether, but the two parts are committed separately).

@spencerschrock
Copy link
Contributor

I'm curious how multiple commits with the same phabricator review link are handled, though -- it's possible that the same code review may be shared by multiple commits (e.g., there may be a bugfix and new functionality that are closely related and make sense to review altogether, but the two parts are committed separately).

Under the hood they get grouped into the same changeset, in this case by Phabricator revision ID

for i := range commits {
rev := detectCommitRevisionInfo(&commits[i])
if rev.ID == "" {
rev.ID = commits[i].SHA
}
if changeset, ok := changesetsByRevInfo[rev]; !ok {
newChangeset := checker.Changeset{
ReviewPlatform: rev.Platform,
RevisionID: rev.ID,
Commits: []clients.Commit{commits[i]},
}
if rev.Platform == checker.ReviewPlatformGitHub {
newChangeset.Reviews = getGithubReviews(&commits[i])
newChangeset.Author = getGithubAuthor(&commits[i])
}
changesetsByRevInfo[rev] = newChangeset
} else {
// Part of a previously found changeset.

So if you have 4 commits, 2 of which come from the same Phabricator review, and 2 commits which don't have reviews.

this will get grouped into 3 groups, (phabricator revision, unreviewed commit 1, unreviewed commit 2), and be considered 1/3 reviewed.

@spencerschrock
Copy link
Contributor

spencerschrock commented May 2, 2024

Another scenario is when a commit references more than one review (like the commit you linked):

Differential Revision: https://reviews.freebsd.org/D44814
Differential Revision: https://reviews.freebsd.org/D44815

I think the regex would only associate it with D44814, but either way it counts as reviewed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
check/Code-Review kind/bug Something isn't working
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants