Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What does "Checkout pull request HEAD commit instead of merge commit" mean? #426

Open
sunshowers opened this issue Jan 16, 2021 · 16 comments · May be fixed by #1627
Open

What does "Checkout pull request HEAD commit instead of merge commit" mean? #426

sunshowers opened this issue Jan 16, 2021 · 16 comments · May be fixed by #1627

Comments

@sunshowers
Copy link

sunshowers commented Jan 16, 2021

Edit 2023-10-23: I no longer believe in what I wrote below -- I've seen data which has changed my belief into checking out the merge commit is the right default. I'll leave this open since there's a lot of passion here, and maybe this deserves clearer documentation.

Hi there!

The README has a section saying how to "[c]heckout pull request HEAD commit instead of merge commit". It isn't really clear what this means, and I tried looking at the source code and it didn't really make sense to me either.

Could you add some more information about this to the README? What is a "merge commit" in this situation?

Some information that might be useful:

  • a table with the default behavior and with the behavior with ref set to ${{ github.event.pull_request.head.sha }}
  • an example commit DAG pointing at the result of this action with and without this option

My experience with CI has led me to believe that there's just one way to check out a pull request (the most recent commit on the branch at the time the PR was initiated or last updated) so when I came across this option I was really surprised.

@ericsciple
Copy link
Contributor

The docs below may help, let me know

https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/merging-a-pull-request

https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-request-merges

By default a merge commit is checked out. If you look at the SHA that is checked out, it doesn't match your local commit SHA. Rather it's the SHA of a merge commit (your branch + base branch)

@sunshowers
Copy link
Author

Thanks for the response!

By default a merge commit is checked out. If you look at the SHA that is checked out, it doesn't match your local commit SHA. Rather it's the SHA of a merge commit (your branch + base branch)

  1. What if the repository is not configured to have merges? The codebases I manage all maintain a linear history, and either use github's rebase UI or a land queue service.
  2. What if there are merge conflicts between the base branch and the PR? What commit gets checked out then?
  3. I will probably end up always using the head commit, but more broadly, in my personal opinion, this is not the right default. I believe the code that should be tested against is exactly what the PR author put up. What evidence and/or arguments would it take to convince a switch away from the current default?
  4. I think the answers to the above should be documented in the readme. This is genuinely surprising behavior to me and I've never experienced a CI system which does this.

@sunshowers
Copy link
Author

Would there be any interest in revisiting this decision at all?

@jasonkarns
Copy link

jasonkarns commented Mar 3, 2021

To further clarify: the merge commit is checked out for pull_request events. This action does checkout the head of a branch for push events, so that option is available for workflows without needing to configure the checkout action. This is the primary reason that I generally prefer running my builds on both push and pull_request events; a failure on only one of the builds provides valuable information in determining the cause of failure.

in my personal opinion, this is not the right default.

The default action for merging of a Pull Request is a merge commit, so (IMO) I think it is the appropriate default for the checkout action to run against what the pr "merge" button will generate (by default).

For the historical github flow (before squash and rebase merges were available on github), it would be more surprising (IMO) to have the HEAD run CI, since that doesn't provide any safety or guarantees which are the purpose of the CI jobs. Imagine a green CI check that fails immediately upon merge; that's a much more dangerous scenario since it actively promoting false negatives.

This is genuinely surprising behavior to me

Agreed.

and I've never experienced a CI system which does this.

FWIW, this is also the default behavior for travisci. (actually, I believe the default for travis is to run two builds: one for push and one for pr; but the pr build is built against the merge commit)

@einarpersson
Copy link

einarpersson commented Mar 27, 2023

I just spent 4 hours scratching my head trying to understand why I got different sha for HEAD and HEAD^ when running locally and when running in a github workflow. This explains it. Using git ls-remote was also helpful here.

I really wish this information would be more visible and the implications of it (especially for non-experts in git/github). I naively thought that running git --quiet diff HEAD~1 -- db/ would give the same result locally and when running in a pull request workflow. I wanted to compare the current and last commit. It worked locally, and in the PR user interface I can see my commit hashes. Now I understand why it didn't work (because of the hypothetical merge commit approach) but yeah, it took some time to figure out...

@einarpersson
Copy link

einarpersson commented Mar 27, 2023

Maybe something here about why you would like to do this (and explain why this isn't done by default). I understand that this might be "common knowledge" to some, but for me this was the first time I encountered this

image

I would like a warning sign ⚠️ 😄 or ℹ️

@ctcampbell
Copy link

  1. What if there are merge conflicts between the base branch and the PR? What commit gets checked out then?

Just want to highlight this is mentioned in the docs now:

Note: Workflows will not run on pull_request activity if the pull request has a merge conflict. The merge conflict must be resolved first.

@jamescrosswell
Copy link

For the historical github flow (before squash and rebase merges were available on github), it would be more surprising (IMO) to have the HEAD run CI, since that doesn't provide any safety or guarantees which are the purpose of the CI jobs. Imagine a green CI check that fails immediately upon merge; that's a much more dangerous scenario since it actively promoting false negatives.

So that's actually exactly what I just bumped into with the current behaviour.

Our repository has submodules and we have some bots that tell us when these need to be updated. In our main branch these have been moving forward so the main branch is a couple of versions ahead on some of these submodules.

In the mean time, we're preparing a major release on a separate version 4.0.0 branch. We've got lots of smaller PRs being merged into that v4 branch. For one of those PRs, the tests were all green and so we merged it into the v4 branch. As soon as we did that, things failed in the v4 branch (same code and tests that just passed in the PR).

The only difference between the two runs in CI was the merge target... first time round the merge target was the v4 branch (which didn't have any of the submodule updates applied). Second time round, the merge target was our main branch (which did have submodule updates applied)... so different results.

Specifically (and this is maybe too much detail) in our case the submodules are stuff running in other languages. As part of our build process, we generate some wrappers to do marshalling etc. so that we can make use of native code in those modules and we check to make sure that that generated code is the same, after build, as what was checked into source control. If any of the files are dirty, CI fails as it knows something funky has happened. If we instead checked out the PR head, I think the only risk would be if some code in the PR relied on generated/marshalling code that had undergone a breaking change - but that would imply a major version bump for one of the submodules we reply on...

So at least in our specific case, it seems like we'd be better off using ref: ${{ github.event.pull_request.head.sha }}... I'm not sure if there are couter-examples where that would result in merges that fail immediately.

@sunshowers
Copy link
Author

sunshowers commented Oct 23, 2023

FWIW, since my earlier post I've reversed my opinion -- I've seen proprietary data which has led me to believe that checking out the merge commit, while not as pure as always checking out the PR head, results in better outcomes overall.

@jamescrosswell
Copy link

@sunshowers I'd be really interested to hear specific examples of how/where ref: ${{ github.event.pull_request.head.sha }} could get us in trouble... if you've got any.

@sunshowers
Copy link
Author

It's not a problem per se, it's just that the data showed that using the merge conmit provided more relevant CI results.

@Dmitry1987
Copy link

@sunshowers I'd be really interested to hear specific examples of how/where ref: ${{ github.event.pull_request.head.sha }} could get us in trouble... if you've got any.

of course it can, for example in our CI we must run tests on content of feature branch of the PR merged with content of the dev trunk. if we run on feature branch content only, it will pass, but the moment it gets merged to trunk and code is mixed among them, it'll fail. I am actually wondering how running a CI on incoming branch only, can be useful as a quality gate for merge (i mean, after all we're usually validating that after the merge to trunk or some 'main', the code doesn't break the 'main' right? the tests run and all that. but if we don't run it on a merge commit where code is joined of both source and destination, those tests are invalid and won't protect the main branch from breaking imho). I hope I understood the discussion correctly =) .

@Dmitry1987
Copy link

but anyways, I came here while breaking my head searching for a solution of how to checkout the merge commit from pre-existing repo that I am caching in the ec2 AMI in order to not checkout many GBs of LFS and 180k+ files monorepo each time (so not doing a fresh checkout of PR during workflow, but manual 'git checkout' in cli, to the $GITHUB_SHA commit, which is the merge commit in pull_request event actions run). I don't understand why GITHUB_SHA leads to "not in tree" (doesn't exist) even if I fetch, and do 'pull --all' and anything else I was able to find online, nothing works like that commit is nonexistent. but if it's not, then why env variable is populated with such SHA? it should be the merge commit but I cannot check it out :( , does anyone have a suggestion? can I use this action in order to 'pull' merge commit inside existing folder? I tried to point 'path' to existing cached git repo but the action deletes it and re-clones fresh from scratch, but I need to keep existing files (at least the submodule with giant amount of LFS data) but it deletes everything when I use 'path'.

@Dmitry1987
Copy link

hahaaa i figured this out 🎉

          if [ "${GITHUB_EVENT_NAME}" == "pull_request" ]; then
            PR_NUMBER="${GITHUB_REF_NAME%%/*}"
            git fetch --no-tags --depth=1 origin +${GITHUB_REF}:refs/remotes/origin/pr/${PR_NUMBER}
            git checkout origin/pr/$PR_NUMBER
          else 
            git fetch --no-tags --depth=1 origin ${GITHUB_REF}
            git checkout $GITHUB_SHA
          fi

the fetch step with correct format of +${GITHUB_REF}:refs/remotes/origin/pr/${PR_NUMBER} was the key 💃

@vlad-obrizum
Copy link

a "fresh" green CI check cannot fail immediately upon merge

This is a powerful reason for the current behaviour,

Nonetheless, it is important to bear in mind that there is a substantial distinction between a "green" checkmark in the most recent pull request and one from a week ago. Ideally, I hope that GitHub could integrate this awareness into the icon to clearly convey this difference.

@tornike
Copy link

tornike commented Mar 6, 2024

@Dmitry1987 but if we don't run it on a merge commit where code is joined of both source and destination, those tests are invalid and won't protect the main branch from breaking imho.

What if base branch is updated after pull_request workflow checks out merge commit and tests are running, PR workflow will succeed but I think these tests will also be invalid.

I don't see options other then to also run tests on push event on the base branch.

github-merge-queue bot pushed a commit to Kitware/CDash that referenced this issue Apr 2, 2024
#2095 switched our CI provider from CircleCI to GitHub Actions, but
broke our integration with CDash.

The checkout action provided by GitHub runs workflows triggered by the
`pull_request` event against a [merge
commit](actions/checkout#426) instead of the
PR's head ref. This causes issues with the CDash integration since the
git hash CDash receives does not match the most recent hash on the pull
request. I was not able to resolve the callback issue, but the changes
in this PR improve the CDash build name being submitted, and assign runs
to appropriate build groups.

I plan to take another look at this issue at a future point to restore
our CDash callback.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants