Skip to content
This repository has been archived by the owner on Nov 30, 2022. It is now read-only.

Refactor CI checks to run as a single job #92

Merged
merged 3 commits into from Nov 19, 2021
Merged

Conversation

NevilleS
Copy link
Contributor

Purpose

This allows the docker build cache to be shared for each check, which is the majority of the build time. This will likely reduce CI usage by a factor of 7, until we have a solution that allows running in parallel with a shared docker cache.

Changes

  • Run CI checks on push command (so that checks are run more than once per PR)
  • Combine all checks into a single job, with multiple steps
  • Rename external_integration to integration_external pytest mark for consistency with others (e.g. integration_erasure)

Ticket

Closes #90

This allows the docker build cache to be shared for each check, which is
the majority of the build time. This will likely reduce CI usage by a
factor of 7, until we have a solution that allows running in parallel
with a shared docker cache.
@NevilleS NevilleS added the run unsafe ci checks Triggers running of unsafe CI checks label Nov 19, 2021
@seanpreston
Copy link
Contributor

We split these out intentionally a while back so we could have granularity on which part of CI was failing from this view. I'm not against changing it back, but let's make sure we're going for exactly what we want here. As I understand it the motivation here is to save the wasted build time between jobs?

@NevilleS
Copy link
Contributor Author

@seanpreston: yes, see my notes in #90 but we're basically taking 30m of CI time on every commit, which adds up fast. With this refactor, it looks like it's closer to about 9-10m. For example see this action: https://github.com/ethyca/fidesops/runs/4267532454?check_suite_focus=true
image

The "Format" step there shows as taking over 3m30 but that's because it's the first one that does a docker build. Everything else is much faster. However, if you parallelize this, you "pay" that build on every individual step so it becomes very long.

I think it's still reasonably easy to see what step fails (from that "Details" view). We can parallelize the jobs again with some caching if y'all can figure out a good solution for that too.

@NevilleS NevilleS added run unsafe ci checks Triggers running of unsafe CI checks and removed run unsafe ci checks Triggers running of unsafe CI checks labels Nov 19, 2021
@NevilleS
Copy link
Contributor Author

OK, this is ready for review. I think this strikes an OK balance for now. The before/after is approximately:

workflow total time (before) total time (after)
safe checks 29m 8m
unsafe checks 4m 4m
total 33m 12m

Additionally, I think I've setup both safe/unsafe to run on every merge to main, which I like since that'll trigger the full integration suite post-merge for PRs where we don't manually add the run unsafe ci checks label.

@NevilleS NevilleS marked this pull request as ready for review November 19, 2021 19:44
@seanpreston seanpreston self-assigned this Nov 19, 2021
@seanpreston
Copy link
Contributor

I think it's still reasonably easy to see what step fails (from that "Details" view). We can parallelize the jobs again with some caching if y'all can figure out a good solution for that too.

🙌 and if a step does fail it auto-expands the relevant section for us in the logs too.

@seanpreston seanpreston merged commit 762767c into main Nov 19, 2021
@seanpreston seanpreston deleted the ns-combine-ci-jobs branch November 19, 2021 20:29
sanders41 pushed a commit that referenced this pull request Sep 22, 2022
* Refactor CI checks to run as a single job

This allows the docker build cache to be shared for each check, which is
the majority of the build time. This will likely reduce CI usage by a
factor of 7, until we have a solution that allows running in parallel
with a shared docker cache.

* Fixes

* Polish tweaks
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
run unsafe ci checks Triggers running of unsafe CI checks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize CI actions to use fewer minutes (/w less parallelization or more caching)
2 participants