Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CI: speed up CI time by improving yarn install and caches (iteration 1 / >30%) #16581
CI: speed up CI time by improving yarn install and caches (iteration 1 / >30%) #16581
Changes from all commits
980c6f1
8a55588
8c76e63
c01b01b
c3961d3
86d2ab4
e5d19a1
9bbf73c
e94e0a3
b9cc357
d940e40
2cae065
dc2e131
e94f5ee
096411f
7d38bca
3df8e70
f142554
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering if that is a bit over-optimized. Can we assume that
CURRENT_NODE_VERSION
andhashFiles
is enough? Why do you thinkCURRENT_BRANCH
should be part?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that based on the strapi repo. The node-modules folder isn't cached if not requested otherwise. False by default. In other words it's not enabled. We just cache the yarn zip archives
This is based on a previous benchmark made in this branch.
The reason node version is added is to be future proof. Imagine someone adds a library that will run a node-gyp postinstall build to create a binary.... for example sharp when vips-dev isn't installed.... there's other examples in the wild, I've chosen sharp cause you have it. It saves the compiled binary in the node modules folder and... 💥
But the most interesting reason to keep this... keep inline with the original gist that comes with doc.
https://gist.github.com/belgattitude/042f9caf10d029badbde6cf9d43e400a
In a next iteration, I'll plan to add more speed up. I prefer to restart from a situation I know well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gu-stav is right @belgattitude I think that CURRENT_BRANCH is not needed since nodeversion and packages.yml already solves this what would mean less cold starts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What @Boegie19 said :)
CURRENT_NODE_VERSION
looks useful to me - just confused aboutCURRENT_BRANCH
:)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, haven't explained this...
The idea is that yarn/pnpm works in multiple phases:
So why it's there ?
Over time in a repo... it's always nice to re-asses cost/benefit perf between the two possible strategies
And that depends mostly on the packages installed in the repo (ie you install esbuild, vite, vitest at some point and the balance is different)
In the strapi repo
See https://github.com/belgattitude/nextjs-monorepo-example/actions/runs/4888683373/jobs/8726574932 (brings a 1 minute with strategy 1 install to 25seconds install) - not totally correct but that's the idea...
The drawback node_modules/install-state cannot leak between branches (long to explain). The best for now is to add the branch to the key.
Yes it's over-optimization in the current strapi repo... and it might look hard-to-read :) With another repo this optim is valid. (mitgh make sense in the future)
But the main idea is to not diverge too much from the reference gist: https://gist.github.com/belgattitude/042f9caf10d029badbde6cf9d43e400a
This is optimization after-all, better to have a reference doc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you point. What do you think would be the danger in sharing a cache across branches? We don't bundle/ inline any secrets, so I don't see a problem there - but maybe for caches (thinking e.g. of eslint / prettier) that might be shared and shouldn't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a security concern. Link step and whatever will happen there + tons of things... but too long to explain today. As it's not enabled... let's keep it as close to reference gist if you agree
Wdyt ? Or you want more details ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Take your time - I'd like to know before we merge what the reason for that is, because otherwise we won't be able to get rid of it again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Boegie19 @gu-stav
I see... It seems to cause a lot of questionning. Let's say it's for ensuring stability / correctness / reproducibility.
I've added it based on experience and my current understanding. The thing is without that you can have really hard to debug situations.
When saving node_modules + instal state... Yarn (will/might/can) skip totally the link step (the postinstall won't be run - in all situations). I used (will/might/can) because it depends on other factors.
Some examples
By default: prisma will generate your db types on postinstall based on the defined schema definition. Lock file does not change, cache is valid, you change the schema... then the postinstall is not run. The generated types are not in sync...
Some of the packages depends on deasync (ie xdm -mdx...) same thing.
And yes it's always fragile because some tools starts to cache things in node_modules/.cache (ie older turbo version, eslint, prettier...) or depend too much on that
The idea is : if full cache is enabled at some point -> avoid those edge-cases from the start (cause they will happen).
But I understand your points. So based previous points. What do you think of proceeding by steps:
Is it something that would answer your concerns ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for that explanation. It starts to make sense to me. I will leave this thread open for the other engineers to read, because I'd like more opinions before we merge this :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added --inline-build on ci cause otherwise some logs are stored in a file... (node-gyp...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't exactly be part of this P/R... but asked in #16581 (comment).
Added the yarn-nm-install folder too. Why ? I'll need it for second iteration. But don't worry it's part of a plan: #16581 (comment). tldr: the idea is to bring all of this in @setup/node by default. I'm trying to coordinate this in parallel