
Debugging CI guidelines

Andrew Nelson edited this page Dec 26, 2023 · 3 revisions

Debugging CI

This is a guide to debugging and dealing with CI-related issues, and specifically to efficient workflows for developing/modifying CI configurations. This matters because debugging CI configurations has several difficulties:

  • it often takes multiple iterations to get a CI configuration right, and you want each iteration to complete quickly to save time.
  • iterating in CI uses expensive resources, e.g. for numpy/scipy, cirrus-ci costs money each time a job starts up.
  • the jobs run on remote resources, so it can be difficult to figure out what went wrong.

You're much better off doing all this iteration locally before resorting to remote CI configuration debugging. When you do start debugging remotely you should:

  1. try running the jobs on your own fork first. This may require a bit of editing of the relevant CI config files so they run on your fork, e.g. change lines like `if: "github.repository == 'numpy/numpy'"` to `if: "github.repository == 'my_gh_username/numpy'"`.
  2. comment out the jobs that you're not working on, so they don't start running.
  3. use commit messages to skip CI entries you're not working on, e.g. `[skip cirrus] [skip actions]`.
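As a sketch of step 1, the fork guard in a GitHub Actions workflow would be edited like this (the job name and layout here are illustrative; the real workflow files differ):

```yaml
jobs:
  full:
    # original guard, which stops the job running anywhere but upstream:
    # if: "github.repository == 'numpy/numpy'"
    # edited so the job runs on your fork (substitute your username):
    if: "github.repository == 'my_gh_username/numpy'"
```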

Direct debugging on your own OS

This is probably the simplest way of debugging CI configuration/test execution, and should be the first port of call for a developer familiar with scipy/numpy building and development. Examine the CI configuration in the source repository, install the same dependencies, and use the same build/test steps. You should typically use a virtualenv/conda environment to isolate the Python setup. However, installing dependencies will affect the state of your computer, e.g. by installing compilers/libraries/packages. This can make reproducibility difficult because it's harder to get back to a pristine OS state. Sometimes you would like a pristine OS state to debug in, i.e. something similar to that used in CI. That requires virtualisation.
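As a minimal sketch, a local debugging session starts from an isolated environment and then replays the config's steps (the commented commands are placeholders; use whatever the CI configuration actually lists):

```shell
# create an isolated environment for the debugging session
python3 -m venv ci-debug
. ci-debug/bin/activate
# then replay the same dependency/build/test steps the CI config lists, e.g.:
# python -m pip install -r build_requirements.txt   # placeholder file name
# python -m pip install . && python -m pytest       # placeholder build/test step
```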

In addition, if you wish to perform cibuildwheel runs it's recommended to use virtualisation, because cibuildwheel requires an official CPython install, etc.

Debugging in a container (virtualisation)

Using docker to debug Linux configurations

docker runs on Linux/macOS/Windows, which allows you to debug Linux-based builds on any of those OSes. In particular, ARM-based Macs can run x86_64 or aarch64 containers (possibly others), which makes them a good platform for debugging.

You first need to install Docker. Once Docker is running (this will be Docker Desktop on macOS/win), then you can start an image running from a terminal. Some relevant invocations for starting the container are:

# run an aarch64 container
docker run -it --platform=linux/aarch64 -v $PWD:/io ubuntu:22.04

# an x86_64 container
docker run -it --platform=linux/amd64 -v $PWD:/io ubuntu:22.04

# a musl x86_64 container
docker run -it --platform=linux/amd64 -v $PWD:/io quay.io/pypa/musllinux_1_1_x86_64

If you're not on a Mac you may have to omit the --platform flag. These commands assume that you're in the root directory of the repo; the -v flag then mounts the repo at the /io directory of the container. Other relevant containers are those used by cibuildwheel; the cibuildwheel images come with a variety of Python versions. Basically, you should use the same type of image that the CI agent runs.

Once you've started the image navigate to /io in the terminal, and issue the same commands that the CI configuration uses in order to debug. If things go wrong simply exit the container and restart it.
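A quick sanity check once a shell opens in the container is to confirm that the platform emulation and image are what you asked for (a sketch; run it inside the container):

```shell
# confirm the emulated CPU architecture:
uname -m                    # x86_64 for linux/amd64, aarch64 for linux/aarch64
# confirm which image/distribution you're in:
head -n 2 /etc/os-release   # e.g. Ubuntu for the ubuntu:22.04 image
```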

Building linux wheels using cibuildwheel inside a docker container

TODO

Using tart to debug macOS in a container

If you're on macOS you can use tart to start a macOS virtual machine:

# use homebrew to install the `tart` virtualisation utility
brew install cirruslabs/cli/tart

# download a relevant image (it's a very large download)
tart clone ghcr.io/cirruslabs/macos-sonoma-xcode:latest sonoma-xcode

The macOS images used by cirrus-ci can be found at https://github.com/orgs/cirruslabs/packages?tab=packages&q=macos.

Then you can start the VM container with:

tart run --dir=$PWD sonoma-xcode &

You should see a Sonoma GUI window appear. The --dir flag specifies a directory on the host to be shared with the VM; here I assume that you're in the root directory of the repo (hence $PWD). The directory is shared to "/Volumes/My Shared Files". You can then debug any macOS CI configuration inside that VM, i.e. follow the same steps listed in the CI configuration. It's also possible to SSH into the VM from the host computer using:

ssh admin@$(tart ip sonoma-xcode)

The password is admin. That particular Sonoma image is exactly the same as that used by numpy/scipy on cirrus-ci (it has Xcode, git, etc, installed).

To recover a pristine VM you will need to discard your changes by deleting and re-cloning the image: `tart delete sonoma-xcode && tart clone ghcr.io/cirruslabs/macos-sonoma-xcode:latest sonoma-xcode`.

This comment may be useful for some points.

Local build of macOS wheels using cibuildwheel in a macOS (tart) VM

Follow the above steps to start a macOS VM.

  • Once the tart UI has started, install the CPython you want to target (e.g. https://www.python.org/ftp/python/3.12.1/python-3.12.1-macos11.pkg). This is because when you eventually run cibuildwheel in the VM it needs an official CPython install; only when cibuildwheel is run in CI will it use other Python installs. (It's possible to run CI configurations locally; that is addressed in a separate section.)
  • Navigate to the CPython folder in Applications and run the Install Certificates command file, otherwise you won't be able to use pip to install anything.
  • SSH into the VM from the host computer.
brew update
brew upgrade
brew install micromamba
/opt/homebrew/opt/micromamba/bin/micromamba shell init -s bash -p ~/micromamba
source ~/.bash_profile
# also installs the necessary compilers (i.e. gfortran) required to build wheels
micromamba create -n dev3 -c conda-forge python compilers
micromamba activate dev3
pip install cibuildwheel

# assuming that you shared the repo directory with the VM
cd /Volumes/My\ Shared\ Files 
# alternatively use the necessary git commands to clone/checkout the repository location/branch you're trying to build
# git clone https://github.com/scipy/scipy

# this needs to be the same cpython version installed from python.org.
export CIBW_BUILD=cp312-macosx_arm64

cibuildwheel --platform macos

Running cirrus-ci configurations locally

One can run the actual cirrus-ci configurations used by scipy/numpy on a local computer using the cirrus-cli. Follow these installation instructions. On macOS it's simply brew install cirruslabs/cli/cirrus. On macOS you should install tart as well (see above). On Linux you need to install docker.

In the root directory of the repo you'll need to create a .cirrus.yml file. When the agent runs on cirrus-ci it uses the .cirrus.star file to dynamically generate a yml configuration, so you'll need to inspect the .cirrus.star file to figure out which yml files it references and extract the relevant code you're trying to run.

For scipy the yml files are at ci/cirrus_general_ci.yml and ci/cirrus_wheels.yml. For numpy they're at tools/ci/cirrus_arm.yml and tools/ci/cirrus_wheels.yml. In the simplest approach you could copy one of the files (depending on what you're debugging), placing it in the root of the repo. Then you use:

cirrus run

# if you're on macOS and you've already downloaded images you can use
cirrus run --tart-lazy-pull
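Putting it together for scipy, a session might look like the following sketch (the guard message is only there so the copy fails loudly when you're not in a checkout):

```shell
# stage the wheels config as the top-level .cirrus.yml that cirrus-cli expects
cp ci/cirrus_wheels.yml .cirrus.yml 2>/dev/null \
  || echo "run this from the root of a scipy checkout"
# cirrus run   # then executes the tasks defined in .cirrus.yml
```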

Running github actions configurations locally

Here one uses the act utility. On macOS you can install it with brew install act; there are install and usage instructions on the act repo. You will also need to have docker installed (see above).

I'm still figuring out how act works, so this is a work in progress.

# the --container-architecture flag is necessary if you're on macOS with an M* (ARM) chip, because the GH runners are x86_64 machines.
# the -W flag specifies which GHA workflow you want to use
# the -j flag specifies which job ID you want to run
# the "on: push" tells act what kind of event is occurring. You could use "on: pull_request", "on: workflow_dispatch", "on: schedule" instead.
# the --remote-name flag sets the remote name used to retrieve the url of the git repo (default "origin"). You can list the remote names with `git remote -v`.
act "on: push" -W .github/workflows/linux.yml -v -j full --container-architecture linux/amd64 --remote-name numpy

# you can specify the GITHUB_REPOSITORY using an env var instead of using `--remote-name`. You can change other github context values (https://docs.github.com/en/actions/learn-github-actions/contexts#github-context) in this manner. e.g. `GITHUB_SHA`, ...
act "on: push" -W .github/workflows/linux.yml -v -j full --container-architecture=linux/amd64 --env GITHUB_REPOSITORY=numpy/numpy

It's also possible to specify the GH event information using the -e flag. I don't know how to create the json payload yet. This json payload contains variables such as github.repository, etc. If those variables are necessary for the action to run then we'll have to work on understanding this event payload more completely.
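Purely as a guess (untested), following the field names of GitHub's webhook payload schema, a minimal push-event payload saved as e.g. event.json might look like:

```json
{
  "repository": {
    "full_name": "numpy/numpy",
    "default_branch": "main"
  },
  "ref": "refs/heads/main"
}
```

which would then be passed with `act push -e event.json ...`; whether act actually picks up these particular fields still needs to be verified.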
