
ci(docker): distribute build across runners #6191

Closed
wants to merge 1 commit

Conversation

crazy-max
Contributor

follow-up #5102 (comment)

Made some changes in the workflow so build for each platform is distributed across runners per docs: https://docs.docker.com/build/ci/github-actions/multi-platform/#with-bake
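For reference, the pattern from the linked docs looks roughly like this: a matrix job builds one platform each and pushes by digest, and a follow-up job merges the digests into a single manifest list. This is an illustrative sketch only; the repository name, target name, and digest plumbing between jobs are hypothetical.

```yaml
# Rough sketch of the distributed bake pattern; names are hypothetical
# and the digest hand-off between jobs is omitted.
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        platform: [linux/amd64, linux/arm64, linux/arm/v7]
    steps:
      - name: 'Set up QEMU'
        uses: docker/setup-qemu-action@v3
      - name: 'Set up Docker Buildx'
        uses: docker/setup-buildx-action@v3
      - name: 'Build one platform and push by digest'
        uses: docker/bake-action@v4
        with:
          targets: image
          set: |
            *.platform=${{ matrix.platform }}
            *.output=type=image,name=example/image,push-by-digest=true,push=true

  merge:
    needs: build
    runs-on: ubuntu-latest
    steps:
      # Assumes $DIGESTS was exported by the build jobs (e.g. via artifacts).
      - name: 'Merge platform digests into one manifest list'
        run: >
          docker buildx imagetools create -t example/image:latest
          $(printf 'example/image@%s ' $DIGESTS)
```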

@squidfunk
Owner

Awesome, thanks for working on this! Just request a review from me once you consider it done 🚀

@crazy-max
Contributor Author

> Awesome, thanks for working on this! Just request a review from me once you consider it done 🚀

I was comparing build times with and without distributed mode and it does not seem to improve much (~2m):

Before:

[screenshot: build times before]

After:

[screenshot: build times after]

I'm surprised arm/v7 takes longer than arm64. It should be almost the same build time (~8m). I'm taking a closer look.

@squidfunk
Owner

@crazy-max thanks for all your work on parallelizing our Docker build. Do you think we can make further improvements? From what I'm reading, the gain is very small compared to the amount of code changes we'd need to maintain, so I'm asking myself whether it's worth pursuing this direction. Note that we can reopen and reconsider at any time if anything improves.

@polarathene
Contributor

ARMv7 (advice: drop support, it's obsolete)

> Do you think we can make further improvements?

Since your biggest bottleneck is the ARMv7 image, perhaps consider dropping it? DMS did this a while back; see the deprecation notice issue I wrote up, which has plenty of details to support its removal, along with a more recent response to someone who requested ARMv7 images:

You may also be able to look at actual pull-activity stats on DockerHub for the published ARMv7 platform images.


I had a quick look through the issue, PR, and discussion history for this project and I don't see any actual requests for ARMv7 support. It seems it was just added in the initial support as a "why not?" 🤷‍♂️

I'd just drop it, or give some deprecation notice to see if there's any interest in keeping it. It's unlikely you'd have many users needing it; ARMv7 is very old (obsoleted almost a decade ago?). Doing this would halve the build time.
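Concretely, dropping the platform would be a one-line change in the build step. A sketch (action version and step names illustrative):

```yaml
# Hypothetical change: dropping ARMv7 is just removing it from the platforms list
- name: 'Build images'
  uses: docker/build-push-action@v5
  with:
    context: .
    platforms: linux/amd64,linux/arm64   # was: linux/amd64,linux/arm64,linux/arm/v7
```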


Workflow for building / publishing docker images (simpler to maintain?)

As for the workflow itself, if you'd like to see what DMS does for building and publishing docker images, we have a separate workflow for each (since we build for the test suite, but only publish the master branch or tagged releases):

Those are a bit verbose due to added commentary to better support maintainers on the rare occasion that they need to understand any of it.

The generic workflow approach allows them to be reused by the main workflows, which are far simpler to grok.

Using generic workflows

Without visiting those links, you can see how easy it is to grok:

# Rough example of a workflow that builds then publishes via the generic workflows:
jobs:
  build-image:
    name: 'Build AMD64 Image'
    uses: docker-mailserver/docker-mailserver/.github/workflows/generic_build.yml@master

  publish-images:
    name: 'Publish AMD64 and ARM64 Image'
    needs: [build-image, run-tests]
    uses: docker-mailserver/docker-mailserver/.github/workflows/generic_publish.yml@master
    with:
      cache-key: ${{ needs.build-image.outputs.build-cache-key }}
    secrets: inherit

Generic Build

And here's the core parts for building and publishing (with some overlap in steps due to separate workflows):

# generic build example with cache optimizations omitted:

jobs:
  build-image:
    name: 'Build'
    runs-on: ubuntu-22.04
    steps:
      - name: 'Checkout'
        uses: actions/checkout@v4

      - name: 'Set up QEMU'
        uses: docker/setup-qemu-action@v3.0.0
        with:
          platforms: arm64

      - name: 'Set up Docker Buildx'
        uses: docker/setup-buildx-action@v3.3.0

      # Unless more platforms are requested, by default we build AMD64 only to run tests.
      # This is faster, as there's no point building other platforms for a PR if tests fail.
      - name: 'Build images'
        uses: docker/build-push-action@v5.3.0
        with:
          context: .
          # Build at least the AMD64 image (which runs against the test suite).
          platforms: ${{ inputs.platforms }}

Generic Publish

# generic publish example with cache optimizations omitted:

jobs:
  publish-images:
    name: 'Publish'
    runs-on: ubuntu-22.04
    steps:
      - name: 'Checkout'
        uses: actions/checkout@v4

      # Easy tagging for your images, especially if you publish to more than one registry (DockerHub + GHCR)
      # - If it's a tagged release, takes semver from our git tag and will later publish image with all 3 semver tags (this is useful, eg for watchtower)
      # - If this workflow was triggered by a push to the master branch instead of a tag, use our preview tag "edge" instead
      - name: 'Prepare tags'
        id: prep
        uses: docker/metadata-action@v5.5.1
        with:
          images: |
            ${{ secrets.DOCKER_REPOSITORY }}
            ${{ secrets.GHCR_REPOSITORY }}
          tags: |
            type=edge,branch=master
            type=semver,pattern={{major}}
            type=semver,pattern={{major}}.{{minor}}
            type=semver,pattern={{major}}.{{minor}}.{{patch}}

      - name: 'Set up QEMU'
        uses: docker/setup-qemu-action@v3.0.0
        with:
          platforms: arm64

      - name: 'Set up Docker Buildx'
        uses: docker/setup-buildx-action@v3.3.0

      #
      # Actual publishing begins here:
      #

      # Similar block for GHCR registry if you want to publish there too
      - name: 'Login to DockerHub'
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}

      # If the images weren't cached from a prior build, they'll be built here
      - name: 'Build and publish images'
        uses: docker/build-push-action@v5.3.0
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          push: true
          tags: ${{ steps.prep.outputs.tags }}

Additional info

Hopefully that's not too scary 😅 Presently DMS builds ARM64 separately, and we're only caching AMD64:

  • This is primarily due to a lack of time and priority to update the workflow to also cache ARM64.
  • Previously Docker did not support locally building (storing?) multi-platform images, so only building AMD64 was simpler for our test suite.
    • It was fine to build for multiple platforms when publishing, as can be seen at the end of the last snippet; there was no need for all the extra work with manifest merging then.
    • Recent releases of Docker support multi-platform images locally (at least with the containerd store AFAIK), although I haven't tried that with GH Actions yet.
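If I recall the tooling correctly, enabling the containerd image store is a daemon-config change, which in CI could look something like this sketch (action and input names to be verified; I haven't tested this in GH Actions):

```yaml
# Assumed approach: enable the containerd image store so a multi-platform
# image can be kept locally (e.g. loaded with --load).
- name: 'Set up Docker with containerd store'
  uses: docker/setup-docker-action@v4
  with:
    daemon-config: |
      {
        "features": {
          "containerd-snapshotter": true
        }
      }
```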

Our test workflow uses a matrix to speed that up. This way the generic build workflow can build an image and upload it to cache, and then multiple runners can pull that to run tests in parallel, which was a great improvement. Afterwards, publishing benefits in the same way via the cache.
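The fan-out described above looks roughly like this. The workflow path, the `build-cache-key` output, and the batch mechanism are hypothetical; only the shape (one build job feeding a matrix of test jobs) is the point:

```yaml
# Rough sketch: one build job feeds a matrix of parallel test jobs.
jobs:
  build-image:
    uses: ./.github/workflows/generic_build.yml

  run-tests:
    needs: build-image
    runs-on: ubuntu-22.04
    strategy:
      matrix:
        part: [1, 2, 3]   # split the test suite into batches
    steps:
      - name: 'Restore the built image from the shared cache'
        uses: actions/cache/restore@v4
        with:
          path: /tmp/image-cache
          key: ${{ needs.build-image.outputs.build-cache-key }}
      # ...then load the image and run test batch ${{ matrix.part }}
```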

I've omitted the caching from the DMS workflow snippets. There are better caching options now AFAIK, while the one I wrote for DMS derives a cache key and needs a workaround to minimize what is uploaded.
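One of those newer options is BuildKit's GitHub Actions cache backend, which avoids hand-rolled cache keys entirely. A minimal sketch (action version illustrative):

```yaml
# Cache build layers via the GitHub Actions cache backend,
# instead of deriving a cache key manually.
- name: 'Build with GHA cache'
  uses: docker/build-push-action@v5
  with:
    context: .
    cache-from: type=gha
    cache-to: type=gha,mode=max
```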

I hope this feedback is helpful! 😎

@squidfunk
Owner

Thanks @polarathene for the breakdown! I'm not feeling confident enough to decide whether we can/want to drop ARMv7 support. However, the workflows to build and publish the image are essentially the same we use, albeit we're using a consolidated workflow. IIRC, it was a deliberate decision to use a single workflow, but I wasn't the one working on it last.

Related:

I think, for now, we should consider this PR stale, because the upside of merging it is too low compared to the increased complexity and amount of code there is to maintain. If somebody wants to work on this, please cook something up on a dedicated repository for testing and demonstration, and let us know. We can consider merging it then.

@squidfunk squidfunk closed this May 3, 2024
@polarathene
Contributor

> I'm not feeling confident enough to decide whether we can/want to drop ARMv7 support.

If you don't know why you have it in the first place, you can very likely drop it. I seriously doubt there is value in keeping it going (as justified by the references in my previous response).

> However, the workflows to build and publish the image are essentially the same we use, albeit we're using a consolidated workflow. IIRC, it was a deliberate decision to use a single workflow, but I wasn't the one working on it last.

DMS previously had it consolidated IIRC. Splitting the workflows was easier to grok and maintain for us, and we could reuse them where we previously had duplicated YAML that drifted out of sync, which was not great for maintenance. If it doesn't benefit you to have separate workflows, that's fine 👍

The workflows DMS has differ from what this PR proposed with the bake/HCL content; the multi-platform manifest merging seems like unnecessary noise compared to how we build, especially when you want to keep maintenance simple.


Regarding the #2133 reference for ARMv7, you state:

> We shouldn't slow down the Docker build, compiling for architectures that are rarely or never used.
> The Dockerfile is part of the repo, so it can easily be compiled and/or extended, if additional architectures are necessary.

I agree with that. The only implied reason for ARMv7 was Raspberry Pi support, which you were not that interested in. My DMS repo references note that even the RPi 3 can run ARM64 builds. You really have no need for ARMv7 👍
