CI: speed up CI time by improving yarn install and caches (iteration 1 / >30%) #16581

Merged
merged 18 commits into from
May 10, 2023
88 changes: 88 additions & 0 deletions .github/actions/yarn-nm-install/action.yml
@@ -0,0 +1,88 @@
########################################################################################
# "yarn install" composite action for yarn 3/4+ and "nodeLinker: node-modules" #
#--------------------------------------------------------------------------------------#
# Requirement: @setup/node should be run before #
# #
# Usage in workflows steps: #
# #
# - name: 📥 Monorepo install #
# uses: ./.github/actions/yarn-nm-install #
# with: #
# enable-corepack: false # (default) #
# cache-install-state: false # (default) #
# cache-node-modules: false # (default) #
# #
# Reference: #
# - latest: https://gist.github.com/belgattitude/042f9caf10d029badbde6cf9d43e400a #
########################################################################################

name: 'Monorepo install (yarn)'
description: 'Run yarn install with node_modules linker and cache enabled'
inputs:
  enable-corepack:
    description: 'Enable corepack'
    required: false
    default: 'false'
  cache-node-modules:
    description: 'Cache node_modules, might speed up link step (invalidated lock/os/node-version/branch)'
    required: false
    default: 'false'
  cache-install-state:
    description: 'Cache yarn install state, might speed up resolution step when node-modules cache is activated (invalidated lock/os/node-version/branch)'
    required: false
    default: 'false'

runs:
  using: 'composite'

  steps:
    - name: ⚙️ Enable Corepack
      if: inputs.enable-corepack == 'true'
      shell: bash
      run: corepack enable

    - name: ⚙️ Expose yarn config as "$GITHUB_OUTPUT"
      id: yarn-config
      shell: bash
      env:
        YARN_ENABLE_GLOBAL_CACHE: "false"
      run: |
        echo "CACHE_FOLDER=$(yarn config get cacheFolder)" >> $GITHUB_OUTPUT
        echo "CURRENT_NODE_VERSION=node-$(node --version)" >> $GITHUB_OUTPUT
        echo "CURRENT_BRANCH=$(echo ${GITHUB_REF#refs/heads/} | sed -r 's,/,-,g')" >> $GITHUB_OUTPUT

    - name: ♻️ Restore yarn cache
      uses: actions/cache@v3
      id: yarn-download-cache
      with:
        path: ${{ steps.yarn-config.outputs.CACHE_FOLDER }}
        key: yarn-download-cache-${{ hashFiles('yarn.lock', '.yarnrc.yml') }}
        restore-keys: |
          yarn-download-cache-

    - name: ♻️ Restore node_modules
      if: inputs.cache-node-modules == 'true'
      id: yarn-nm-cache
      uses: actions/cache@v3
      with:
        path: '**/node_modules'
        key: yarn-nm-cache-${{ runner.os }}-${{ steps.yarn-config.outputs.CURRENT_NODE_VERSION }}-${{ steps.yarn-config.outputs.CURRENT_BRANCH }}-${{ hashFiles('yarn.lock', '.yarnrc.yml') }}
Contributor

Just wondering if that is a bit over-optimized. Can we assume that CURRENT_NODE_VERSION and hashFiles are enough? Why do you think CURRENT_BRANCH should be part of the key?

Contributor Author

Note that, for the strapi repo, the node_modules folder isn't cached unless explicitly requested: cache-node-modules is false by default. In other words, it's not enabled. We only cache the yarn zip archives.

This is based on a previous benchmark made in this branch.

The node version is part of the key to be future-proof. Imagine someone adds a library that runs a node-gyp postinstall build to create a binary, for example sharp when vips-dev isn't installed. There are other examples in the wild; I've chosen sharp because you have it. It saves the compiled binary in the node_modules folder and... 💥

But the main reason to keep this is to stay in line with the original gist, which comes with documentation:

https://gist.github.com/belgattitude/042f9caf10d029badbde6cf9d43e400a

In a next iteration, I plan to add more speed-ups. I prefer to start from a setup I know well.

Collaborator

@gu-stav is right, @belgattitude. I think CURRENT_BRANCH is not needed, since the node version and packages.yml already solve this, which would mean fewer cold starts.

Contributor

What @Boegie19 said :) CURRENT_NODE_VERSION looks useful to me - just confused about CURRENT_BRANCH :)

Contributor Author

Ah yes, I haven't explained this...

The idea is that yarn/pnpm work in multiple phases:

  • Resolution - seems affected by the presence of install-state
  • Fetch - only affected by the presence of the zip cache
  • Link - seems affected by the presence of node_modules/install-state

So why is it there?

Over time in a repo, it's always good to reassess the cost/benefit in performance between the two possible strategies:

  • Just cache *.zip
  • Cache *.zip + install-state + node_modules

Which one wins depends mostly on the packages installed in the repo (e.g. you install esbuild, vite, vitest at some point and the balance shifts).

  • In the strapi repo, the strategy that makes sense is to just save *.zip
  • In one of my repos, it's better to save everything

See https://github.com/belgattitude/nextjs-monorepo-example/actions/runs/4888683373/jobs/8726574932 (brings a 1-minute install with strategy 1 down to a 25-second install). Not totally accurate, but that's the idea...

The drawback: node_modules/install-state must not leak between branches (long to explain). The best option for now is to add the branch to the key.

Yes, it's over-optimization for the current strapi repo, and it might look hard to read :) With another repo this optimization is valid (and it might make sense here in the future).

But the main idea is to not diverge too much from the reference gist: https://gist.github.com/belgattitude/042f9caf10d029badbde6cf9d43e400a

This is optimization after all; better to have a reference doc.
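
A minimal sketch of how such a cache key could be assembled, assuming the same ingredients the action uses (OS, node version, ref, lockfile hash). The helper names `sanitize_ref` and `nm_cache_key` and the hash `abc123` are hypothetical and not part of the action; the ref handling mirrors the CURRENT_BRANCH line.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, for illustration only.

# Strip "refs/heads/" (branch pushes) and replace every "/" with "-",
# like the CURRENT_BRANCH output of the action.
sanitize_ref() {
  local ref="${1#refs/heads/}"
  printf '%s\n' "${ref//\//-}"
}

# Args: os, node version, git ref, lockfile hash (placeholder value here).
nm_cache_key() {
  printf 'yarn-nm-cache-%s-node-%s-%s-%s\n' "$1" "$2" "$(sanitize_ref "$3")" "$4"
}

nm_cache_key Linux v18.16.0 refs/heads/feature/ci-speed abc123
# prints: yarn-nm-cache-Linux-node-v18.16.0-feature-ci-speed-abc123
```

Because the ref is part of the key, two branches with an identical yarn.lock still get separate node_modules caches.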

Contributor

I see your point. What do you think would be the danger in sharing a cache across branches? We don't bundle or inline any secrets, so I don't see a problem there. But maybe there are caches (thinking e.g. of eslint / prettier) that might be shared and shouldn't be?

Contributor Author

Not a security concern. It's about the link step and whatever happens there, plus tons of other things, but it's too long to explain today. As it's not enabled, let's keep it as close to the reference gist as possible, if you agree.

What do you think? Or do you want more details?

Contributor

Take your time - I'd like to know before we merge what the reason for that is, because otherwise we won't be able to get rid of it again.

Contributor Author

@Boegie19 @gu-stav

> Gu-stav is right @belgattitude I think that CURRENT_BRANCH is not needed since nodeversion and packages.yml already solves this what would mean less cold starts

I see... It seems to raise a lot of questions. Let's say it's there to ensure stability / correctness / reproducibility.

I've added it based on experience and my current understanding. Without it, you can end up in situations that are really hard to debug.

When saving node_modules + install state, Yarn (will/might/can) skip the link step entirely (the postinstall won't be run in all situations). I write (will/might/can) because it depends on other factors.

Some examples:

  1. By default, prisma generates your db types on postinstall from the schema definition. The lock file does not change, the cache is valid, you change the schema... and the postinstall is not run. The generated types are then out of sync...

  2. Some packages depend on deasync (e.g. xdm - mdx...): same thing.

  3. And yes, it's always fragile because some tools cache things in node_modules/.cache (e.g. older turbo versions, eslint, prettier...) or depend too much on it.

The idea is: if full cache is enabled at some point, avoid those edge cases from the start (because they will happen).

But I understand your points. So, based on the previous points, what do you think of proceeding in steps:

  1. Merge this first iteration (safe)
  2. I'll prepare iteration 2 (that will probably change the lock file and bring new light to the whole picture)
  3. I'll make a 3rd iteration: a cleanup to make this action as simple as possible based on the strapi usage

Would that answer your concerns?
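
The prisma case above can be reproduced in miniature. This is only a sketch under stated assumptions: the file names are hypothetical, `sha256sum` stands in for `hashFiles()`, and nothing here calls yarn or prisma. It shows that a key derived from the lock file alone does not change when the schema does.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch directory with a fake lock file and a fake schema (hypothetical names).
workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT
printf 'lockfile v1\n' > "$workdir/yarn.lock"
printf 'model User { id Int }\n' > "$workdir/schema.prisma"

# Stand-in for hashFiles('yarn.lock'): hash of the lock file only.
lock_key() { sha256sum "$workdir/yarn.lock" | cut -d' ' -f1; }

key_before="$(lock_key)"
# Edit the schema; the lock file is untouched.
printf 'model User { id Int name String }\n' > "$workdir/schema.prisma"
key_after="$(lock_key)"

if [ "$key_before" = "$key_after" ]; then
  echo "cache hit: node_modules restored as-is, postinstall may be skipped"
fi
```

Adding the branch to the key does not remove this case, but it keeps a stale node_modules from one branch out of another.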

Contributor

Thank you for that explanation. It starts to make sense to me. I will leave this thread open for the other engineers to read, because I'd like more opinions before we merge this :)


    - name: ♻️ Restore yarn install state
      if: inputs.cache-install-state == 'true' && inputs.cache-node-modules == 'true'
      id: yarn-install-state-cache
      uses: actions/cache@v3
      with:
        path: .yarn/ci-cache
        key: yarn-install-state-cache-${{ runner.os }}-${{ steps.yarn-config.outputs.CURRENT_NODE_VERSION }}-${{ steps.yarn-config.outputs.CURRENT_BRANCH }}-${{ hashFiles('yarn.lock', '.yarnrc.yml') }}

    - name: 📥 Install dependencies
      shell: bash
      run: yarn install --immutable --inline-builds
Contributor Author

Added --inline-builds on CI because otherwise some build logs are stored in a file (node-gyp...).

      env:
        # Overrides/align yarnrc.yml options (v3, v4) for a CI context
        YARN_ENABLE_GLOBAL_CACHE: "false" # Use local cache folder to keep downloaded archives
        YARN_NM_MODE: "hardlinks-local" # Reduce node_modules size
        YARN_INSTALL_STATE_PATH: ".yarn/ci-cache/install-state.gz" # Might speed up resolutions when node_modules present
        # Other environment variables
        HUSKY: '0' # By default do not run HUSKY install
4 changes: 4 additions & 0 deletions .github/filters.yaml
@@ -1,11 +1,15 @@
 backend:
+  - '.github/actions/yarn-nm-install/*.yml'
Contributor Author

This shouldn't strictly be part of this PR, but it was asked for in #16581 (comment).

I added the yarn-nm-install folder too. Why? I'll need it for the second iteration. But don't worry, it's part of a plan: #16581 (comment). TL;DR: the idea is to bring all of this into @setup/node by default. I'm trying to coordinate this in parallel.

+  - '.github/workflows/**'
   - 'packages/**/package.json'
   - 'packages/**/server/**/*.(js|ts)'
   - 'packages/**/strapi-server.js'
   - 'packages/{utils,generators,cli,providers}/**'
   - 'packages/core/*/{lib,bin,ee}/**'
   - 'api-tests/**'
 frontend:
+  - '.github/actions/yarn-nm-install/*.yml'
+  - '.github/workflows/**'
   - 'packages/**/package.json'
   - 'packages/**/admin/src/**'
   - 'packages/**/admin/ee/admin/**'
5 changes: 3 additions & 2 deletions .github/workflows/adminBundleSize.yml
@@ -27,8 +27,9 @@ jobs:
       - uses: actions/setup-node@v3
         with:
           node-version: 18
-          path: '**/node_modules'
-          key: ${{ runner.os }}-${{ hashFiles('**/yarn.lock') }}

+      - name: Monorepo install
+        uses: ./.github/actions/yarn-nm-install
+
       - uses: preactjs/compressed-size-action@v2
         with:
3 changes: 0 additions & 3 deletions .github/workflows/checks.yml
@@ -23,7 +23,4 @@ jobs:
     steps:
       - uses: actions/checkout@v3
       - uses: actions/setup-node@v3
-        with:
-          path: '**/node_modules'
-          key: ${{ runner.os }}-${{ hashFiles('**/yarn.lock') }}
       - uses: ./.github/actions/security/lockfile
36 changes: 36 additions & 0 deletions .github/workflows/clean-up-pr-caches.yml
@@ -0,0 +1,36 @@
# https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#force-deleting-cache-entries
name: Cleanup caches for closed branches

on:
  pull_request:
    types:
      - closed
  workflow_dispatch:

jobs:
  cleanup:
    runs-on: ubuntu-latest
    steps:
      - name: Check out code
        uses: actions/checkout@v3

      - name: 🧹 Cleanup
        run: |
          gh extension install actions/gh-actions-cache

          REPO=${{ github.repository }}
          BRANCH="refs/pull/${{ github.event.pull_request.number }}/merge"

          echo "Fetching list of cache key"
          cacheKeysForPR=$(gh actions-cache list -R $REPO -B $BRANCH | cut -f 1 )

          ## Setting this to not fail the workflow while deleting cache keys.
          set +e
          echo "Deleting caches..."
          for cacheKey in $cacheKeysForPR
          do
              gh actions-cache delete $cacheKey -R $REPO -B $BRANCH --confirm
          done
          echo "Done"
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
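
A rough offline sketch of the delete loop above, assuming a stub function (`list_cache_keys`, hypothetical, returning made-up keys) in place of `gh actions-cache list ... | cut -f 1`. It reads one key per line instead of relying on word splitting, and it only prints what the real workflow would delete.

```shell
#!/usr/bin/env bash

# Stub standing in for the real listing command; the keys are invented.
list_cache_keys() {
  printf '%s\n' \
    'yarn-download-cache-abc123' \
    'yarn-nm-cache-Linux-node-v18.16.0-refs-pull-1-merge-abc123'
}

deleted=0
while IFS= read -r cache_key; do
  [ -n "$cache_key" ] || continue      # skip blank lines
  # The real workflow would run the gh actions-cache delete command here.
  echo "would delete: $cache_key"
  deleted=$((deleted + 1))
done < <(list_cache_keys)

echo "Done ($deleted keys)"
```

Feeding the loop through process substitution keeps it in the current shell, so the counter is still available after the loop ends.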
2 changes: 0 additions & 2 deletions .github/workflows/contributor-doc.yml
@@ -33,8 +33,6 @@ jobs:
       - uses: actions/setup-node@v3
         with:
           node-version: 18
-          path: '**/node_modules'
-          key: ${{ runner.os }}-${{ hashFiles('**/yarn.lock') }}

       - name: Install dependencies
         run: yarn install --immutable
77 changes: 22 additions & 55 deletions .github/workflows/tests.yml
@@ -48,12 +48,9 @@ jobs:
       - uses: actions/setup-node@v3
         with:
           node-version: ${{ matrix.node }}
-      - uses: actions/cache@v3
-        with:
-          path: '**/node_modules'
-          key: ${{ runner.os }}-${{ matrix.node }}-${{ hashFiles('**/yarn.lock') }}
       - uses: nrwl/nx-set-shas@v3
-      - run: yarn install --immutable
+      - name: Monorepo install
+        uses: ./.github/actions/yarn-nm-install
       - name: Run build:ts
         run: yarn nx run-many --target=build:ts --nx-ignore-cycles --skip-nx-cache
       - name: Run lint
@@ -73,12 +70,9 @@ jobs:
       - uses: actions/setup-node@v3
         with:
           node-version: ${{ matrix.node }}
-      - uses: actions/cache@v3
-        with:
-          path: '**/node_modules'
-          key: ${{ runner.os }}-${{ matrix.node }}-${{ hashFiles('**/yarn.lock') }}
       - uses: nrwl/nx-set-shas@v3
-      - run: yarn install --immutable
+      - name: Monorepo install
+        uses: ./.github/actions/yarn-nm-install
       - name: Run build:ts
         run: yarn nx run-many --target=build:ts --nx-ignore-cycles --skip-nx-cache
       - name: Run tests
@@ -98,12 +92,9 @@ jobs:
       - uses: actions/setup-node@v3
         with:
           node-version: ${{ matrix.node }}
-      - uses: actions/cache@v3
-        with:
-          path: '**/node_modules'
-          key: ${{ runner.os }}-${{ matrix.node }}-${{ hashFiles('**/yarn.lock') }}
       - uses: nrwl/nx-set-shas@v3
-      - run: yarn install --immutable
+      - name: Monorepo install
+        uses: ./.github/actions/yarn-nm-install
       - name: Run build:ts for admin-test-utils
         run: yarn build --projects=@strapi/admin-test-utils,@strapi/helper-plugin --skip-nx-cache
       - name: Run test
@@ -121,11 +112,8 @@ jobs:
       - uses: actions/setup-node@v3
         with:
           node-version: ${{ matrix.node }}
-      - uses: actions/cache@v3
-        with:
-          path: '**/node_modules'
-          key: ${{ runner.os }}-${{ matrix.node }}-${{ hashFiles('**/yarn.lock') }}
-      - run: yarn install --immutable
+      - name: Monorepo install
+        uses: ./.github/actions/yarn-nm-install
       - name: Build
         run: yarn build --projects=@strapi/admin,@strapi/helper-plugin

@@ -161,11 +149,8 @@ jobs:
       - uses: actions/setup-node@v3
         with:
           node-version: ${{ matrix.node }}
-      - uses: actions/cache@v3
-        with:
-          path: '**/node_modules'
-          key: ${{ runner.os }}-${{ matrix.node }}-${{ hashFiles('**/yarn.lock') }}
-      - run: yarn install --immutable
+      - name: Monorepo install
+        uses: ./.github/actions/yarn-nm-install
       - uses: ./.github/actions/run-api-tests
         with:
           dbOptions: '--dbclient=postgres --dbhost=localhost --dbport=5432 --dbname=strapi_test --dbusername=strapi --dbpassword=strapi'
@@ -201,11 +186,8 @@ jobs:
       - uses: actions/setup-node@v3
         with:
           node-version: ${{ matrix.node }}
-      - uses: actions/cache@v3
-        with:
-          path: '**/node_modules'
-          key: ${{ runner.os }}-${{ matrix.node }}-${{ hashFiles('**/yarn.lock') }}
-      - run: yarn install --immutable
+      - name: Monorepo install
+        uses: ./.github/actions/yarn-nm-install
       - uses: ./.github/actions/run-api-tests
         with:
           dbOptions: '--dbclient=${{ matrix.db_client }} --dbhost=localhost --dbport=3306 --dbname=strapi_test --dbusername=strapi --dbpassword=strapi'
@@ -240,11 +222,8 @@ jobs:
       - uses: actions/setup-node@v3
         with:
           node-version: ${{ matrix.node }}
-      - uses: actions/cache@v3
-        with:
-          path: '**/node_modules'
-          key: ${{ runner.os }}-${{ matrix.node }}-${{ hashFiles('**/yarn.lock') }}
-      - run: yarn install --immutable
+      - name: Monorepo install
+        uses: ./.github/actions/yarn-nm-install
       - uses: ./.github/actions/run-api-tests
         with:
           dbOptions: '--dbclient=${{ matrix.db_client }} --dbhost=localhost --dbport=3306 --dbname=strapi_test --dbusername=strapi --dbpassword=strapi'
@@ -263,11 +242,8 @@ jobs:
       - uses: actions/setup-node@v3
         with:
           node-version: ${{ matrix.node }}
-      - uses: actions/cache@v3
-        with:
-          path: '**/node_modules'
-          key: ${{ runner.os }}-${{ matrix.node }}-${{ hashFiles('**/yarn.lock') }}
-      - run: yarn install --immutable
+      - name: Monorepo install
+        uses: ./.github/actions/yarn-nm-install
       - uses: ./.github/actions/run-api-tests
         env:
           SQLITE_PKG: ${{ matrix.sqlite_pkg }}
@@ -309,11 +285,8 @@ jobs:
       - uses: actions/setup-node@v3
         with:
           node-version: ${{ matrix.node }}
-      - uses: actions/cache@v3
-        with:
-          path: '**/node_modules'
-          key: ${{ runner.os }}-${{ matrix.node }}-${{ hashFiles('**/yarn.lock') }}
-      - run: yarn install --immutable
+      - name: Monorepo install
+        uses: ./.github/actions/yarn-nm-install
       - uses: ./.github/actions/run-api-tests
         with:
           dbOptions: '--dbclient=postgres --dbhost=localhost --dbport=5432 --dbname=strapi_test --dbusername=strapi --dbpassword=strapi'
@@ -352,11 +325,8 @@ jobs:
       - uses: actions/setup-node@v3
         with:
           node-version: ${{ matrix.node }}
-      - uses: actions/cache@v3
-        with:
-          path: '**/node_modules'
-          key: ${{ runner.os }}-${{ matrix.node }}-${{ hashFiles('**/yarn.lock') }}
-      - run: yarn install --immutable
+      - name: Monorepo install
+        uses: ./.github/actions/yarn-nm-install
       - uses: ./.github/actions/run-api-tests
         with:
           dbOptions: '--dbclient=${{ matrix.db_client }} --dbhost=localhost --dbport=3306 --dbname=strapi_test --dbusername=strapi --dbpassword=strapi'
@@ -378,11 +348,8 @@ jobs:
       - uses: actions/setup-node@v3
         with:
           node-version: ${{ matrix.node }}
-      - uses: actions/cache@v3
-        with:
-          path: '**/node_modules'
-          key: ${{ runner.os }}-${{ matrix.node }}-${{ hashFiles('**/yarn.lock') }}
-      - run: yarn install --immutable
+      - name: Monorepo install
+        uses: ./.github/actions/yarn-nm-install
       - uses: ./.github/actions/run-api-tests
         env:
           SQLITE_PKG: ${{ matrix.sqlite_pkg }}