Helio Machado edited this page Jun 17, 2022 · 2 revisions

Publishing some old snippets I wrote months ago, moved from https://github.com/iterative/cml/issues/852#issuecomment-1014955752

Terraform with cml runner

The following snippets produce a full trace-level log of the Terraform provider, which helps diagnose hard-to-reproduce bugs related to cml-runner --cloud and cloud instances.

GitLab — .gitlab-ci.yml

debug:
  when: always
  image: iterativeai/cml
  variables:
    TF_LOG: trace
    TF_LOG_PATH: /tmp/terraform.log
  script:
    - cml-runner
      --cloud=aws
      --cloud-region=us-west-1
      --cloud-type=t2.micro
      || true
    - cat "$TF_LOG_PATH"

GitHub — .github/workflows/debug.yml

on: push
env:
  TF_LOG: trace
  TF_LOG_PATH: /tmp/terraform.log
jobs:
  debug:
    runs-on: ubuntu-latest
    steps:
      - uses: iterative/setup-cml@v1
      - run: >-
          cml-runner
          --cloud=aws
          --cloud-region=us-west-1
          --cloud-type=t2.micro
          || true
      - run: cat "$TF_LOG_PATH"
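
Trace-level logs are long, so it can help to narrow the output before reading it. A minimal sketch, assuming the log lines carry the usual Terraform level markers such as `[ERROR]` and `[WARN]`; the helper name `tf_log_filter` is hypothetical and not part of CML or Terraform:

```shell
# Hypothetical helper: print only the lines with the given Terraform log levels.
# Usage: tf_log_filter <log-file> [LEVEL ...]   (defaults to ERROR and WARN)
tf_log_filter() {
  log_file=$1
  shift
  if [ "$#" -eq 0 ]; then
    set -- ERROR WARN
  fi
  # Build an alternation such as "ERROR|WARN" from the remaining arguments.
  levels=$(printf '%s|' "$@")
  levels=${levels%|}
  # -n keeps the original line numbers so matches can be found in the full log.
  # "|| true" mirrors the snippets above: no match should not fail the job.
  grep -n -E "\[(${levels})\]" "$log_file" || true
}
```

In either CI configuration, the final `cat "$TF_LOG_PATH"` step could be followed by `tf_log_filter "$TF_LOG_PATH"` to repeat only the warnings and errors at the end of the job output.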

Debugging GitLab CI/CD with tmate

debug:
  when: always
  script:
    - mkdir -p ~/.ssh && printf 'y\n\n' | ssh-keygen -q -t rsa -N '' -f ~/.ssh/id_rsa
    - apt update && apt install --yes tmate expect
    - TERM=xterm unbuffer tmate -FS /tmp/tmate.sock | cat
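
The job above runs tmate in the foreground (`-F`). An alternative sketch, based on the scripting commands described in the tmate documentation, starts tmate detached and prints only the SSH connection string. The helper name `tmate_ssh_string` is hypothetical, and this sketch does not keep the CI job alive: something else in the script has to block while you are connected.

```shell
# Hypothetical helper: start tmate detached and print its SSH connection string.
# Usage: tmate_ssh_string [socket-path]   (defaults to /tmp/tmate.sock)
tmate_ssh_string() {
  sock=${1:-/tmp/tmate.sock}
  # Launch tmate in the background on the given socket.
  tmate -S "$sock" new-session -d
  # Block until the session has been established.
  tmate -S "$sock" wait tmate-ready
  # Print the "ssh ..." command that connects to the session.
  tmate -S "$sock" display -p '#{tmate_ssh}'
}
```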

Using cml-runner with --cloud-ssh-private

cml runner ··· --cloud-ssh-private="$(cat ~/.ssh/id_rsa)"

To get the instance address, set the TF_LOG and TF_LOG_PATH environment variables as in the Terraform snippets above and search the resulting log for the instance address.
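
The exact wording of the log line that carries the address is not given here, so the following sketch does not depend on it. It assumes the address is IPv4 and lists every IPv4-looking string found in the log; the helper name `tf_log_ipv4` is hypothetical, and four-part version numbers in the log would also match.

```shell
# Hypothetical helper: list the unique IPv4-looking strings in a Terraform log.
# Usage: tf_log_ipv4 <log-file>
tf_log_ipv4() {
  # -o prints only the matching part of each line; sort -u removes duplicates.
  grep -E -o '([0-9]{1,3}\.){3}[0-9]{1,3}' "$1" | sort -u
}

# Example (not executed here): tf_log_ipv4 "$TF_LOG_PATH"
```

Once the address is known, the key passed through --cloud-ssh-private can be used with `ssh -i ~/.ssh/id_rsa`; the login name depends on the machine image and is not stated on this page.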

Debugging cml-runner --cloud=aws

on: push
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: iterative/setup-cml@v1
      - run: >-
          cml-runner
          --labels=test
          --cloud=aws
          --cloud-region=eu-west
          --cloud-type=g4dn.xlarge
          --cloud-spot
        env:
          REPO_TOKEN: ${{ secrets.PERSONAL_ACCESS_TOKEN }}
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
  run:
    needs: deploy
    runs-on:
      - self-hosted
      - test
    steps:
      - run: |
          set -x
          cat /var/log/cloud-init.log || true
          cat /var/log/cloud-init-output.log || true
          journalctl -u cml || true
          nvidia-smi || true
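
The repeated `cat ... || true` lines can also be folded into one small function that labels each file and tolerates missing ones. A minimal sketch; the helper name `dump_logs` is hypothetical:

```shell
# Hypothetical helper: print each log file under a header, skipping files
# that are missing or unreadable instead of failing the step.
# Usage: dump_logs <file> [<file> ...]
dump_logs() {
  for f in "$@"; do
    printf '===== %s =====\n' "$f"
    if [ -r "$f" ]; then
      cat "$f"
    else
      printf '(missing or unreadable)\n'
    fi
  done
}
```

In the job above it would be called as `dump_logs /var/log/cloud-init.log /var/log/cloud-init-output.log`, leaving `journalctl -u cml || true` and `nvidia-smi || true` as separate commands.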