
wrapper fails with obscure error message if terraform validate is killed #276

Open
oscep opened this issue Dec 2, 2022 · 10 comments


oscep commented Dec 2, 2022

Our GitHub Action performing terraform validate fails with an obscure error message on a commit it previously validated successfully:

Run terraform validate -no-color
/home/runner/work/_temp/12cb240b-5a9f-44d7-9df4-0c2e0c97750d/terraform-bin validate -no-color
/home/runner/work/_temp/12cb240b-5a9f-44d7-9df4-0c2e0c97750d/terraform:4211
  core.setOutput('exitcode', exitCode.toString(10));
                                      ^

TypeError: Cannot read properties of null (reading 'toString')
    at /home/runner/work/_temp/12cb240b-5a9f-44d7-9df4-0c2e0c97750d/terraform:4211:39
Error: Process completed with exit code 1.

Here is our .github/workflows/terraform_pr.yml:

name: Terraform PR Check

on: [pull_request]

jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v2

      - name: HashiCorp - Setup Terraform
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: 1.3.4

      - name: Terraform fmt
        id: fmt
        run: terraform fmt -check -recursive

      - name: Terraform Init
        id: init
        run: terraform init -backend=false

      - name: Terraform Validate
        id: validate
        run: terraform validate -no-color

Disabling the wrapper script with

     uses: hashicorp/setup-terraform@v2
     with:
       terraform_version: 1.3.4
       terraform_wrapper: false

and re-running the job led to this error message:

Run terraform validate -no-color
/home/runner/work/_temp/0bb3bce2-0453-4b29-9ccb-19976a647ace.sh: line 1:  1683 Killed                  terraform validate -no-color
Error: Process completed with exit code 137.

Therefore, I believe the error message presented above is caused by terraform being killed.
Is it possible to detect this and improve the error message?

@oscep oscep changed the title Upgrading to setup-terraform 2.x breaks previously working terraform validate invocation Upgrading setup-terraform from 1.3.2 to 1.4.0 or 2.x breaks previously working terraform validate invocation Dec 2, 2022
@oscep oscep changed the title Upgrading setup-terraform from 1.3.2 to 1.4.0 or 2.x breaks previously working terraform validate invocation Upgrading setup-terraform from 1.3.2 to 2.x breaks previously working terraform validate invocation Dec 2, 2022
@oscep oscep changed the title Upgrading setup-terraform from 1.3.2 to 2.x breaks previously working terraform validate invocation Upgrading setup-terraform from 1.3.2 to 1.4.0 (and 2.x) breaks previously working terraform validate invocation Dec 2, 2022

oscep commented Dec 2, 2022

Running the check with debug output enabled leads to this message:

##[debug]Terraform exited with code null.

[screenshot of the debug output attached]

@oscep oscep changed the title Upgrading setup-terraform from 1.3.2 to 1.4.0 (and 2.x) breaks previously working terraform validate invocation wrapper fails with obscure error message if terraform validate is killed Dec 2, 2022

kishaningithub commented May 19, 2023

I am also getting the same error. Any workarounds or fixes?

@kishaningithub

By looking at the code, I see that the exitCode variable is populated by the result of an exec Node.js function call:

const exitCode = await exec(pathToCLI, args, options);

A Node.js exec call can return a null exit code in certain situations, for example when the child process is terminated by a signal.

@bflad
Copy link
Member

bflad commented May 19, 2023

@kishaningithub do you believe your Terraform CLI command would've generated more than 1MB of output? Regardless, it seems reasonable to increase the command execution maxBuffer configuration in this case to reduce the possibility of this type of NodeJS error. I think we'd be happy to review that sort of contribution.

One potential workaround, when the GitHub Actions outputs from the command are not needed, is to disable the NodeJS wrapper with terraform_wrapper: false in the workflow configuration -- the command output is still available as text in the GitHub Actions logs, and a non-zero command exit status will still fail the workflow.


kishaningithub commented May 26, 2023

Upon further analysis, I found that the exec() function being called is not the Node.js exec function but the exec from the actions toolkit, which streams to stdout and stderr via registered handlers. So in my case the issue is not the amount of output being generated.

In my case, terraform was being killed by the OOM killer (I found this by looking at /var/log/kern.log).

IMO it would be great if we could log this error, possibly by adding a try/catch around the await in the async function, so that people facing this don't have to go to the kernel log to confirm that it was indeed the OOM killer that killed the process.

If you agree, I would be happy to raise a PR that adds this logging.
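The logging proposed here could take roughly this shape (a hypothetical sketch, not the wrapper's real code; reportExitCode is an invented name): guard against a null exit code before anything calls toString() on it, and surface the likely cause.

```javascript
// Hypothetical guard: turn a null exit code into an actionable error
// instead of the opaque TypeError from exitCode.toString(10).
function reportExitCode(exitCode, signal) {
  if (exitCode === null) {
    const sig = signal ? ` (signal: ${signal})` : '';
    throw new Error(
      `Terraform terminated without an exit code${sig}. ` +
        'The process was likely killed, e.g. by the kernel OOM killer; ' +
        'check /var/log/kern.log or dmesg on the runner.'
    );
  }
  return exitCode.toString(10); // safe: exitCode is a number here
}

console.log(reportExitCode(0)); // "0"
// reportExitCode(null, 'SIGKILL') would throw with a clear message.
```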


danielkubat commented Oct 31, 2023

I'm facing the same issue as described above (Node 20, setup-terraform v3).

[screenshot of the error attached]

@kishaningithub did you manage to fix this, or are you still facing the issue?

@kishaningithub

@danielkubat Is the above on a public GitHub runner?

@danielkubat

@kishaningithub nope, self-hosted runner on GKE (using ARC) with 8GB request/limit per pod.


kishaningithub commented Oct 31, 2023

@danielkubat Can you check the logs (kern.log, dmesg, etc.) and find out what killed the terraform process?

If you think it was the OOM killer, can you try increasing the specs, at least temporarily?


danielkubat commented Nov 7, 2023

> @kishaningithub nope, self-hosted runner on GKE (using ARC) with 8GB request/limit per pod.

...but I have multiple runner groups and didn't set the runner group correctly in the action. In other words, the issue was related to a lack of memory.

@kishaningithub you were right, thanks for your help.
