Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PR creation failing due to fetch failing #2764

Closed
kierans opened this issue Feb 15, 2024 · 43 comments · Fixed by #2867
Closed

PR creation failing due to fetch failing #2764

kierans opened this issue Feb 15, 2024 · 43 comments · Fixed by #2867

Comments

@kierans
Copy link

kierans commented Feb 15, 2024

Subject of the issue

My attempt to create a PR is failing with the error message. My use case is when the generated Swagger doc changes for my project, a PR is opened to merge the changes.

Create or update the pull request
  Attempting creation of pull request
  Error: fetch failed

This was working, but has recently stopped. The branch is created and pushed successfully. It’s just the PR step that is failing.

Steps to reproduce

jobs:
  documentation:
    name: Documentation
    runs-on: [self-hosted]
    permissions:
      contents: write
      pull-requests: write
    env:
      swaggerFile: ./swagger.json
    steps:
      - name: Checkout repo
        uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 18

      - name: Install project dependencies
        run: |
          npm ci

      - name: Update Swagger doc
        run: |
          npm run swagger-autogen

      - name: Check for changes
        id: changes
        run: |
          git diff --exit-code || echo "::set-output name=changed::true"

      - name: Create Pull Request
        id: cpr
        uses: peter-evans/create-pull-request@v6
        if: steps.changes.outputs.changed == 'true'
        with:
          base: main
          commit-message: 'Updated swagger doc'
          branch: docobot/githib_actions/swagger-doc
          committer: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
          author: ${{ github.actor }} <${{ github.actor_id }}+${{ github.actor }}@users.noreply.github.com>
          delete-branch: true
          title: 'Update Swagger Doc'
          body: |
            Update Swagger doc due to changes in ${{ github.sha }}
          labels: |
            documentation
@peter-evans
Copy link
Owner

Hi @kierans

I need more information to understand what is happening here.

Things that would help:

  • A complete workflow
  • A more complete snippet of the logs
  • A link to the run (if public)

@kierans
Copy link
Author

kierans commented Feb 16, 2024

@peter-evans Thanks for looking at this.

This issue is in the context of a corporate project. I’ve updated the description with more info about the workflow job. I’ve uploaded logs of the run. I’ve had to redact some of the logs, but I think there’s enough there for you to see what’s happening in the run. The GHA run is not public I so can’t give you the link.

@peter-evans
Copy link
Owner

Everything looks fine, so my guess is that this problem is caused by some issue with running on self-hosted. I think if you tried this on hosted (github.com) it would work. That could be a way you can narrow down the problem.

@kierans
Copy link
Author

kierans commented Feb 19, 2024

This was working with v5. Dependabot upgraded the action to v6. I’ve reverted back to v5

Looking at a comparison between v5 and v6 I’m wondering if the change to the way git fetch is invoked might be the cause? Maybe it’s a git version issue?

@peter-evans
Copy link
Owner

This was working with v5. Dependabot upgraded the action to v6. I’ve reverted back to v5

Can you confirm that it works now after reverting back. If that's the case I can try to narrow this down.

I’m wondering if the change to the way git fetch is invoked might be the cause?

Looking at the log, though, it's already past that point and created the PR branch successfully. The point where it fails is when it's calling the GitHub API to create the PR.

@peter-evans
Copy link
Owner

I think it's this change: main...infer-urls#diff-149194b9a050dce062d6db2902041d8d9b9c28245428d7435ed58a87fc25b3dc

I added a feature to infer the URLs of repositories so that PRs could be made across different GitHub platforms.

Are you using GHES or GHEC? It looks like the repo has a https://github.com/$repo URL, so perhaps not. Can you confirm your situation. The repo is hosted normally on github.com and you are using self-hosted runners?

This is the logic I added, which a suspect may not be working for your use case.

    if (githubServerHostname !== 'github.com') {
      options.baseUrl = `https://${githubServerHostname}/api/v3`
    } else {
      options.baseUrl = 'https://api.github.com'
    }

@kierans
Copy link
Author

kierans commented Feb 20, 2024

This was working with v5. Dependabot upgraded the action to v6. I’ve reverted back to v5

Can you confirm that it works now after reverting back. If that's the case I can try to narrow this down.

It is.

@kierans
Copy link
Author

kierans commented Feb 20, 2024

Are you using GHES or GHEC? It looks like the repo has a https://github.com/$repo URL, so perhaps not. Can you confirm your situation. The repo is hosted normally on github.com and you are using self-hosted runners?

Correct. The repo URL is https://github.com/$ORG/$REPO, and the organisation is using self hosted runners.

@peter-evans
Copy link
Owner

Correct. The repo URL is https://github.com/$ORG/$REPO, and the organisation is using self hosted runners.

I think it's unlikely that the "infer-urls" feature caused the issue then. 🤔
Just to confirm, it would be great if you could execute this on your self-hosted runner and let me know what the output is.

- run: echo $GITHUB_API_URL

@kierans
Copy link
Author

kierans commented Feb 21, 2024

> Run echo $GITHUB_API_URL
https://api.github.com/

@peter-evans
Copy link
Owner

> Run echo $GITHUB_API_URL
https://api.github.com/

This is what I would expect, so I don't think the "infer-urls" feature is the problem.

Perhaps this is proxy related. From the log it looks like you are behind a proxy for https requests. Please try setting the proxy explicitly like this:
https://github.com/peter-evans/create-pull-request?tab=readme-ov-file#proxy-support

It's possible that a library I was using has updated and somehow no longer automatically captures your proxy settings.

@kierans
Copy link
Author

kierans commented Feb 22, 2024

@peter-evans I tried setting the https_proxy. Unless I’m doiong something wrong, the proxy isn’t the issue.

Run peter-evans/create-pull-request@v6
  with:
    base: main
    ...
 env:
    httpProxy: http://172.17.0.1:3128
    https_proxy: http://172.17.0.1:3128
    
Create or update the pull request
  Attempting creation of pull request
  Error: fetch failed

@peter-evans
Copy link
Owner

peter-evans commented Feb 22, 2024

The workflow you posted starts with jobs:. Please can you show me the full workflow. Are you setting any environment variables at the top?

Also, please could you confirm that when you test v6 there are no other changes to the workflow, you are literally just changing the version of the action.

I'm leaning towards this being a proxy handling connection issue due to some changes to a library I'm using, but I'll try to figure this out! 😄

@peter-evans
Copy link
Owner

Not sure if this will fix it, but it's worth a shot. Please try this version of the action.

uses: peter-evans/create-pull-request@bump-proxy-agent

@kierans
Copy link
Author

kierans commented Feb 26, 2024

The workflow you posted starts with jobs:. Please can you show me the full workflow. Are you setting any environment variables at the top?

I am not setting anything extra in the workflow.

on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - '**'
jobs:
  documentation:

...

The runner is setting the $http_proxy and $https_proxy env vars in the shell that the run is being run in.

As you can see from the logs the action is being run with https_proxy and httpProxy being set to a value

I’ve added the following step that creates the httpProxy var from the $https_proxy shell var to pass to the action ie:

 - name: Set up job
    run: |
        echo "httpProxy=${https_proxy}" >> $GITHUB_ENV
uses: peter-evans/create-pull-request@bump-proxy-agent
...
env:
    https_proxy: ${{ env.httpProxy }}

Also, please could you confirm that when you test v6 there are no other changes to the workflow, you are literally just changing the version of the action.

That is correct.

I'm leaning towards this being a proxy handling connection issue due to some changes to a library I'm using, but I'll try to figure this out! 😄

I’ve tried the bump-proxy-agent version and still getting “Fetch failed”.

Perhaps it’s worth trying to get more information from the error in the console to continue to aid the debugging efforts.

@peter-evans
Copy link
Owner

Perhaps it’s worth trying to get more information from the error in the console to continue to aid the debugging efforts.

Yes I think we need to try and narrow this down and make a minimal example that reproduces the problem. This must a rare edge case, because no other users have reported this (so far). This action is used a lot, so I would expect most issues to effect many users.

I'll have a think about how we can narrow this down.

@kierans
Copy link
Author

kierans commented Feb 28, 2024

Could it be a trust issue? The proxy will be issuing certs signed by the corporate CA. However, if the proxy/http library you’re using to make the call to create the PR doesn’t trust the proxy, that’s why the fetch could be failing. I’ve had this issue with other tools and have had to setup a trust store.

We might want to provide an option in the action to pass a trust store (PEM file) to the relevant lib to test this.

@peter-evans
Copy link
Owner

If it helps to track this down, I'm actually testing proxy support for the action in my test suite here:
https://github.com/peter-evans/create-pull-request-tests/blob/master/.github/workflows/test-command.yml#L1087-L1134

The test uses this image:
https://github.com/peter-evans/forward-proxy

Which is based on:
https://github.com/nadoo/glider

It seems to work fine. I don't really know much about proxies and trust stores so I'm not sure if that could be the issue.

@SatwaniGovind
Copy link

Hi @peter-evans any update on this?
I started facing this issue when I update to v6.
I was using v5, but as reported in #2790 , the PR creation step was failing.

@peter-evans
Copy link
Owner

Hi @SatwaniGovind

No, there is no update because I'm unable to reproduce this, and don't have a good idea of what the problem could be.

Perhaps you could help by explaining your use case in detail.

  • How are you using the action? (Ideally show workflow / link to runs / logs)
  • What runners are you using? (Github hosted, self-hosted, etc)
  • Are you using a proxy?
  • Any other details that would help us identify this issue

@SatwaniGovind
Copy link

  1. Due to security concerns, I can't share the workflow, but I can share the logs for the failure. They are as follows:

Create or update the pull request
Attempting creation of pull request
Error: fetch failed

  1. I am using self-hosted runners for the workflow that this action is a part of.
  2. No, the workflow is not using any proxy.

@raphperrin
Copy link

Am having the same issue with self-hosted runners.
Proxy is set in the runner itself

name: create-pr
on:
  workflow_dispatch:
  push:
    branches:
      - 'main'

permissions:
  contents: write
  pull-requests: write

jobs:
  updates:
    timeout-minutes: 10
    runs-on: [selfhosted-runner]
    steps:
    - name: Checkout
      uses: actions/checkout@v4.1.1

    - name:
	  run: echo "something" > README.md

    - name: Create PR
      uses: peter-evans/create-pull-request@v6.0.1
      with:
        token: ${{ secrets.GITHUB_TOKEN }}
        commit-message: "chore(deps): update readme"
        branch: feat/update-readme
        title: "chore(deps): update readme"
        labels: dependencies
        delete-branch: true

@SatwaniGovind
Copy link

So, if a non-self-hosted runner is used, will the wf run successfully?
@raphperrin

@raphperrin
Copy link

@SatwaniGovind yes or rollback to v5

@peter-evans
Copy link
Owner

To give us more time to figure this out, I've backported the fix for this issue to v5. So feel free to use v5 if it works for you as a workaround for this issue.

@peter-evans
Copy link
Owner

It looks like all of you are using self-hosted. So I think it's likely that this is affecting some particular setup for self-hosted instances.

Please could you give me information about the version of the Actions runner you are using. Also, if you could provide me with a Dockerfile for your self-hosted instance, that might be even better.

I'm still thinking that this could be proxy related. So any further details about that setup/configuration on the runner would also be great.

@peter-evans
Copy link
Owner

I've just tested my own self-hosted runner here: https://github.com/peter-evans/create-pull-request-tests-self-hosted

It works fine with v6. See run: https://github.com/peter-evans/create-pull-request-tests-self-hosted/actions/runs/8360055456

So this issue is not affecting all self-hosted setups, just some specific configurations it seems. Please share further details so we can narrow this down.

@kierans
Copy link
Author

kierans commented Mar 20, 2024

Thanks @peter-evans for your efforts here. I’ll try to share what I can, but I think most people who use self-hosted runners are corporates so it’s hard to share material. Chicken meet egg.

@SatwaniGovind
Copy link

Thanks @peter-evans for the fix in v5.0.3 . I just noticed one thing when I switched to v5.0.3 -- If a PR already exists from the same branch and the wf tries to update the same PR, the wf doesn't fail (like it was in v5) but it is not reflecting the new changes as well.

@peter-evans
Copy link
Owner

I just noticed one thing when I switched to v5.0.3 -- If a PR already exists from the same branch and the wf tries to update the same PR, the wf doesn't fail (like it was in v5) but it is not reflecting the new changes as well.

I've not made any other changes to v5 apart from backporting the fix, so any issue you may have is completely unrelated. Let's keep the discussion here on topic.

@0xbe7a
Copy link

0xbe7a commented Apr 5, 2024

I also have this problem in one of my runner environments. I have bisected the fault back to 21d8ea0. My other working environment runs within a Kubernetes cluster, but interacts with the same GitHub Enterprise server. In the working environment it is not required to use a proxy and my GHES URL is included in noProxy, whereas in my malfunctioning environment a proxy must be used to reach GHES. http_proxy, https_proxy, HTTP_PROXY, HTTPS_PROXY, no_proxy and NO_PROXY are correctly set. Using the node16 based action works, while running the commit introducing the node20 runtime triggers the failure.

Edit 1: I have captured all outgoing network traffic and for the final "create pull request" api call the action appears to not be respecting my proxy configuration but instead attempts to directly connect to my GHES which is not possible in this environment. All other interactions with my GHES before the failing api request are done using my HTTP proxy.

@0xbe7a
Copy link

0xbe7a commented Apr 5, 2024

If it helps to track this down, I'm actually testing proxy support for the action in my test suite here: https://github.com/peter-evans/create-pull-request-tests/blob/master/.github/workflows/test-command.yml#L1087-L1134

The test uses this image: https://github.com/peter-evans/forward-proxy

Which is based on: https://github.com/nadoo/glider

It seems to work fine. I don't really know much about proxies and trust stores so I'm not sure if that could be the issue.

Does this actually test if all requests are made using the proxy. I think you would need to block all direct connections to github.com for this test to catch the issue i am observing. I can also reproduce this without a GHES but instead with a self-hosted runner and the normal github.com instance.

@sdolender
Copy link

also noticed same issue on self-hosted runners ( behind proxy).

Maybe this case apply here:
Yes, in v8 of @octokit/request we switched to the Fetch API instead of node-fetch.

All usages of NodeJS based http(s).Agent will not work as they are incompatible with the fetch API.

octokit/rest.js#43

@JannikWibkerQC
Copy link

Octokit changed the behaviour of proxies as detailed in octokit/rest.js#43. Switching to undici should work:

(src/octokit-client.ts)

import {ProxyAgent, fetch as undiciFetch} from 'undici'

const proxyFetch = (proxyUrl: string): typeof undiciFetch =>
  (url, opts) => {
    return undiciFetch(url, {
      ...opts,
      dispatcher: new ProxyAgent({
        uri: proxyUrl,
        keepAliveTimeout: 10,
        keepAliveMaxTimeout: 10,
      })
    })
  }

export const getOctokit = (options: OctokitOptions & {baseUrl: string}) => {
  const baseUrl = options.baseUrl
  const proxyUrl = getProxyForUrl(baseUrl)
  const OctokitWithPlugins = Core.plugin(paginateRest, restEndpointMethods)

  const allOptions: OctokitOptions = {
    ...options,
    baseUrl,
    request: proxyUrl ? {fetch: proxyFetch(proxyUrl)} : undefined
  }

  return new OctokitWithPlugins(allOptions)
}

I tried creating a proxy setup with mitmproxy using which I could test this out properly but failed at that

@peter-evans
Copy link
Owner

@sdolender @JannikWibkerQC Thank you for pointing out this change in octokit. This is most likely the cause of the problem.

I've made a feature branch to test this. It seems to pass the tests I have for proxy support.
main...proxy-fix

Please test this version of the action and let me know if it solves the issue:

uses: peter-evans/create-pull-request@proxy-fix

@stagg
Copy link

stagg commented Apr 24, 2024

The proxy-fix version resolves the issue for our behind-a-proxy runner 🎉

@raphaelperrin24
Copy link

The fix also works for me

@peter-evans
Copy link
Owner

Thank you for testing!

The fix is released as v6.0.5 / v6.

@peter-evans
Copy link
Owner

I can also reproduce this without a GHES but instead with a self-hosted runner and the normal github.com instance.

@0xbe7a Please can you show me how you are reproducing the issue. I would like to correct my regression tests so that I can catch similar issues in future.

I tried to block direct connections, as you suggested, but it still seems to pass the test for v6.0.4. peter-evans/create-pull-request-tests@66679f5

@0xbe7a
Copy link

0xbe7a commented Apr 26, 2024

@0xbe7a Please can you show me how you are reproducing the issue. I would like to correct my regression tests so that I can catch similar issues in future.

I have not attempted to reproduce this with a GitHub hosted runner. Our self-hosted runner only has access to our http_proxy and all other outgoing permissions are blocked.

Looking at the test-setup i am not sure if the iptables rules is sufficient here. First from what i understand is that iptables resolves the domain name once at rule-creation time. From the manpage:

Hostnames will be resolved once only, before the rule is submitted to the kernel. Please note that specifying any name to be resolved with a remote query such as DNS is a really bad idea.

As github.com and api.github.com already resolve to different IPs, this is most likely not going to catch all connections.

@peter-evans
Copy link
Owner

I've fixed my proxy support test now. https://github.com/peter-evans/create-pull-request-tests/blob/32f87b10665d76f09808a07b2c4623a678b6bf89/.github/workflows/test-command.yml#L1087-L1118

This test now fails with v6.0.4, reproducing the fetch failed error. The latest version, v6.0.5 passes the test.

Apologies for breaking proxy support. This fix of the test should prevent it from happening again.

@0xbe7a
Copy link

0xbe7a commented Apr 26, 2024

That's pretty neat, thank you for your great effort here ❤️

@kierans
Copy link
Author

kierans commented May 1, 2024

@peter-evans Can confirm the fix works for me too. Thanks for your efforts here. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants