Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support qemu in GitHub Actions #482

Merged
merged 28 commits into from
Jan 1, 2021

Conversation

asfaltboy
Copy link
Contributor

@asfaltboy asfaltboy commented Dec 12, 2020

Looks like it works generally, tests are failing as we need to update expected versions, and some more

We should also discuss if we want to enable all architectures by default, or set the default to only x86/64 architectures, as this may considered a type of "breaking change".

fixes #364

- currently just prints out envs
@asfaltboy asfaltboy changed the title Support quemu on GitHub Support quemu in GitHub Actions Dec 12, 2020
@henryiii
Copy link
Contributor

Isn’t this just #469, but without the setup step? I am liking the idea of letting users activate this manually, then just expanding the set of arch’s. Can it be detected when enabled?

@asfaltboy
Copy link
Contributor Author

asfaltboy commented Dec 13, 2020

Actually missed that that work exists 😅 ...

Yes, we can detect supported architectures and installed emulators by running Tonis' latest binfmt:

docker run --privileged --rm tonistiigi/binfmt

In the latest version binfmt allows to install support for all or some architectures, as well as uninstalling, and showing what's installed.

This is what I did in this PR, i.e without first adding the support (whether via the github action or directly installing support), we would only build wheels for existing platforms on our linux machine.

As an example, if I uninstall all platforms on my Mac (docker desktop installs many platforms via binfmt_misc on bootstrap), running the docker container without arguments returns:

docker run -it --rm --privileged tonistiigi/binfmt:latest
{
  "supported": [
    "linux/amd64",
    "linux/386"
  ],
  "emulators": null
}

Which means this code would only build for 64/32 bit architectures.

uses: docker/setup-qemu-action@v1
with:
platforms: all
if: runner.os == 'Linux'
Copy link
Contributor Author

@asfaltboy asfaltboy Dec 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user can select which (additional linux) platforms they want to support with the above action.

In other CIs, they can simply execute docker run --privileged --rm tonistiigi/binfmt --install all (or any other platform selection), as per binfmt docs

This section, and more, can be added to the README for explanation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We aren't running a test that uses this yet, though. Should we add a build with ARM or PowerPC? Possibly trying to not add too much, since it's probably 5-10x slower?

@asfaltboy asfaltboy changed the title Support quemu in GitHub Actions Support qemu in GitHub Actions Dec 13, 2020
@joerick
Copy link
Contributor

joerick commented Dec 16, 2020

Hello @asfaltboy! Thanks for this approach! It is similar to #469, but yes, a little lighter.

Due to the existence of Github Actions like docker/setup-qemu-action and Docker images like binfmt, I'm thinking that it perhaps does make sense for us to keep things simple here, and offload the qemu setup onto the CI config.

If we do that, all that cibuildwheel needs to do is to provide a way to change what archs are built. As discussed in #469, I'm in favour of an explicit control here, so we can raise errors when things are misconfigured. I'm thinking something like:

$ cibuildwheel --help
...
optional arguments:
  --archs {x86_64,i686,aarch64,ppc64le,s390x}
                        Comma-separated list of CPU architectures to build for.
                        If unspecified, builds the architectures natively supported
                        on this machine. Set this option to build an architecture
                        via emulation, for example, using binfmt_misc and qemu.
...

I think the changes to our code would be pretty minimal then, but we could still have e.g. aarch64 manylinux builds in our CI using the docker/setup-qemu-action action, and we can provide example configs in our docs to help people set it up.

@asfaltboy
Copy link
Contributor Author

I tend to agree, in that case we would be able to remove the lines here, and replace them with updated documentation explaining how to set qemu up.

Then the current tests should pass as is, although I'd probably like to add some additional tests for the new --archs arg.

If you're in favour of this, I can start tackling this later today perhaps

@joerick
Copy link
Contributor

joerick commented Dec 16, 2020

Yes that sounds good! As long as @henryiii or @YannickJadoul don't have any issues with the above approach?

@henryiii
Copy link
Contributor

I think this is basically what I suggested in #469 (comment) under 2) - yes, I think this is best. Seems to be the simplest and most flexible solution.

target_archs = architectures or [Architecture(platform.machine())]
# x86_64 machines can run i686 docker containers
if Architecture.i686 not in target_archs and Architecture.x86_64 in target_archs:
target_archs.append(Architecture.i686)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this only be appended if this is not specified manually?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the Architecture.i686 not in target_archs part of the condition checks that it wasn't specified manually. I found this way is more explicit and allows to specify both, or i686 (without x86), or x86 (which includes both) on their own.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I do cibuildwheel . then I get the native archs (x86_64 and i686) but if I specify manually cibuildwheel --archs "x86_64" ., I should only get the x86_64 build. So adding the i686 build should only happen if the user doesn't specify archs in their options.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "native", which is both the default and allowed in the list would be better. native would expand to i686 and x86_64. Then you can do native + something else, or you can specify just one of these and not get the other.

Copy link
Contributor

@joerick joerick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good so far...

I'm also wondering, for completeness, how this --archs option would relate to macOS/Windows, too. BUILD/SKIP has that pretty well covered, right now, but it might be easier to understand if --archs works the same across all platforms?

If we start with Windows, since the macOS situation is still evolving (#484). I think a sensible approach would be to keep x86_64 and i686 as the canonical descriptions of the archs, and map them to win_amd64 and win32 inside the script.

(of course, I don't really imagine anyone would use --archs on macos/windows, it's mostly just for neatness i'm thinking about it)

The other option would just be that we document that this option only works on linux.

cibuildwheel/linux.py Outdated Show resolved Hide resolved
cibuildwheel/__main__.py Outdated Show resolved Hide resolved
@Czaki
Copy link
Contributor

Czaki commented Dec 17, 2020

I'm also wondering, for completeness, how this --archs option would relate to macOS/Windows, too. BUILD/SKIP has that pretty well covered, right now, but it might be easier to understand if --archs works the same across all platforms?

As I read comments in this thread I start thinking if it is possible to go back to an environment variable. Environment variables are simpler to use in the matrix of jobs.

Because of my preference to use a job matrix for a simple package, I also prefer to use a colon separate list of arch instead of multiple add arguments like cibuildwheel --arch x86_64 --arch i686 --arch ppc64le.

I also like CIBW_ARCHITECTURES idea with the default value 686 x86_64 universal2.

@henryiii
Copy link
Contributor

henryiii commented Dec 17, 2020

I don't really imagine anyone would use --archs on macos/windows

In an ideal world, if we could control this, it would be nice to have --arch x86_64 universal2 arm64 options to enable these builds, with auto being the native binaries.

Would it make sense to have auto or maybe better yet native be the default? Then you could do --arch=native,ppc64le to combine native (32 & 64 bit on non-macOS) with an emulated build without having to list out both.

@henryiii
Copy link
Contributor

@asfaltboy let me know if you need help. :) Looks like #484 will end up building on this.

@ogrisel
Copy link

ogrisel commented Dec 18, 2020

If --arch means building in an emulated environment, what flag would be used for cross-compiling without emulation?

For information, conda-forge has been using crossenv successfully to cross-compile conda binary packages to target the macos/arm64 platform on macos/ix86_64 host:

If cibuildwheel was to add support for cross-compiling, one option would be to use another dedicated flag such as --xarch for instance to make it possible to switch between emulation and crossenv.

Pros and cons of emulation vs cross-compiling:

  • cross compiling makes it possible to support architectures for which there is no free CI service (e.g. macos/arm64 from a macos/x86_64 host on Azure Pipelines / Github Actions)
  • cross compiling is fast but cannot execute the test suite
  • QEMU on works on linux hosts and AFAIK not possible to build windows or macos wheels with it (because one does not have access to the runtime libs for those)
  • emulation is slow but can execute the test suite (assuming it is not too long to avoid a timeout)

@Czaki
Copy link
Contributor

Czaki commented Dec 18, 2020

If --arch means building in an emulated environment, what flag would be used for cross-compiling without emulation?

What is the difference for user if compilation is done with, or without emulation? Emulation is needed rather for testing.

I do not think that there is a need to create a separate method to distinguish method to achieve build wheel for a given architecture.

@ogrisel
Copy link

ogrisel commented Dec 18, 2020

But we need a way for the maintainer whether to use cross-compiling or QEMU emulation when running cibuildwheel on a CI host, no?

@henryiii
Copy link
Contributor

henryiii commented Dec 18, 2020

The speed here is horrible for compiling (a 5 minute compile becomes 50 minutes). If we could cross compile and then emulate only for testing, that would be a huge win. Also would be clearer with macOS - we can cross-compile, but not emulate on Intel, so that's why testing is unavailable. For Apple Silicon, we can cross compile and emulate, so that's why testing can be used. And why you only need the arch -x line for the tests, and not the compile.

@henryiii
Copy link
Contributor

henryiii commented Dec 18, 2020

But I think that's orthogonal to this; here we are only providing a way to override the listed supported architectures, and then assuming you can run the docker containers. Currently, we only support native docker containers. I assume you'd have to have to use the normal manylinux docker container for cross compiling, since you are running the builder's arch not the host arch. Then you'd need to "trick" it inside the docker container with something like the package linked above. If you give cibuildwheel a docker image for CIBW_MANYLINUX_AARCH64_IMAGE that is x86, force the --archs as seen above, then all you'd need is the new build procedure with the trick above (which might be something you could "burn-in" to the docker image itself, possibly? Or you'd need a cross compile flag/var and a new procedure.). I'd like to see PyPA provide cross manylinux images, I think.

It's more relevant with the macOS change - as that's not tricking the system into running a different docker container, so is really much more inline with the "cross-compile" change than this one.

Using `--archs` for the flag, comma separated, also accepting CIBW_ARCHS.
Supports "auto" as the default that can also expand, matching
CIBW_PLATFORM.
@joerick
Copy link
Contributor

joerick commented Dec 31, 2020

I have the changes for using a set on windows in a branch in my fork:
mayeut/cibuildwheel@b5e6207 + mayeut/cibuildwheel@3958a51

I can push those commits in the PR if required.

I don't know about anyone else, but the discussion around this PR is getting a little confusing (and there is a lot of it!). Would you be amenable to merging this just supporting Linux, and we could open a new PR to extend support for Windows? We'd have docs updates to do for that too.

@henryiii
Copy link
Contributor

I'd be fine with that, but let's get #502 in and make a new release before we hit merge on this one!

@mayeut
Copy link
Member

mayeut commented Dec 31, 2020

Would you be amenable to merging this just supporting Linux, and we could open a new PR to extend support for Windows? We'd have docs updates to do for that too.

@joerick, I'm fine with that option.

@henryiii
Copy link
Contributor

henryiii commented Jan 1, 2021

I'm ready for a merge when you are, @joerick :)

@janaknat
Copy link

@henryiii Is this available in the latest PyPI cibuildwheel release?

@henryiii
Copy link
Contributor

Nope, only in master for the moment. We are working on Apple Silicon support and minor adjustments to the interface that affect --archs slightly. I don't think it's horribly far off, but for now, you can use a recent commit?

@janaknat
Copy link

Ok. I'm looking to add aarch64 support for markupsafe using GHA and PyPI cibuildwheel. I'll give master a shot. Any ideas on when it'll be available in the PyPI version?

@joerick
Copy link
Contributor

joerick commented Jan 14, 2021

It's a good question. #484 needs a bit more work - mostly docs - but is also waiting on a few upstream releases (packaging, pip and virtualenv), which could be two weeks or more away. Seems a bit of a shame to hold this for so long... options here-

  1. wait for upstream and merge Universal2 wheels on macOS #484 without workarounds
  2. merge Universal2 wheels on macOS #484 with workarounds and release soon (in the next week, perhaps)
  3. release now without merging Universal2 wheels on macOS #484, since we pretty much know what's happening design-wise with CIBW_ARCHS now, and the way it works in this PR seems compatible with Universal2 wheels on macOS #484.

I'd be happy with (3), I think, and it would be the quickest route to getting this on PyPI. Any thoughts @YannickJadoul @henryiii ?

@henryiii
Copy link
Contributor

henryiii commented Jan 14, 2021

I think we can aim for emulation in 1.8, and Universal2 in 1.9, that sounds good to me. Leads up perfectly to 2.0 :)

I'd like to get the change to ARCH we discussed (auto, native, all) in before we release. I think #496 is ready, though doesn't really affect the release much, it would keep the diff down for #484, as it will be affected. (See #528)

The other two changes I would have liked, but could be done later: the requires-python flag (since the sooner users start specifying that, the fewer changes will be needed when we up the "default" minimum), and the addition of {} syntax.

My wife is in the hospital for high blood pressure due to pregnancy; so any time in the next 7 weeks I might mostly drop out for a while.

@YannickJadoul
Copy link
Member

2 notes on this:

  • Does that include docs?
  • Do we want to include a caveat about there being a (reasonably small) chance the archs part will still change?

@henryiii
Copy link
Contributor

Just pushed #535.

@joerick
Copy link
Contributor

joerick commented Jan 14, 2021

the change to ARCH we discussed (auto, native, all) in before we release

Ah yes, excellent point. I had forgotten about that.

I think #496 is ready, though doesn't really affect the release much, it would keep the diff down for #484, as it will be affected. (See #528)

I'm just reviewing #496 now.

@janaknat
Copy link

@joerick Any consensus on when the CIBW_ARCH option will be available on PyPI?

@henryiii
Copy link
Contributor

Current plan is "this week", I think we are still on target for that.

@joerick
Copy link
Contributor

joerick commented Jan 21, 2021

Yes. Probably tomorrow :)

@russkel
Copy link

russkel commented Jan 21, 2021

🥳

@brettlangdon
Copy link

I just set this up on our project to try out before it is released. It appears to be working as intended!

One thing I noticed is that aarch64 jobs take considerably longer than x86_64/i686 builds.

https://github.com/DataDog/dd-trace-py/actions/runs/502345587

aarch64 on linux taking 24m
x86_64/i686 on linux taking 5m
x86_64/i686 on mac-os or windows about 10-15m

I do not have a ton of experience building on aarch64, is this expected or is there something I may have misconfigured?

Our GitHub action if it helps: https://github.com/DataDog/dd-trace-py/blob/4a456dd839d1c5bdb052a17f669561f479c5c3c9/.github/workflows/build_deploy.yml#L76-L110

We build on Python 2.7, 3.5, 3.6, 3.7, 3.8, 3.9

@russkel
Copy link

russkel commented Jan 21, 2021

Emulating another architecture will be quite slow, I don't imagine there's a huge amount of CPU power behind the GHA runners.

@brettlangdon
Copy link

@russkel yeah, exactly what I was thinking as well. Didn't seem alarming, but figured I'd share 👍🏻

@henryiii
Copy link
Contributor

Emulating an arch is slow. For boost-histogram, it takes about 50 mins per Python version instead of 5-10 mins (unless I accidentally build NumPy, which takes more than 50 mins by itself), pushing me close to the 6 hour limit.

Cross compiling is fast, but support for that is pretty spotty in the Python Linux ecosystem currently. Maybe in the future cibuildwheel will be able to support that too, though it will always probably take a little work from the package as well. Setuptools burns in the wrong python executable for the console entry points, for example.

@brettlangdon
Copy link

@henryiii good insights, it seems at least for our package 25m for all python versions on aarch64 isn't too bad then. Thanks!

@asfaltboy asfaltboy deleted the support-quemu-on-github branch January 26, 2021 12:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Seamless ARM builds using emulation through qemu docker images