CI GitHub workflows use #7614
Comments
Thanks for opening this issue!
If the workflow already uses caching by default, I guess the benefit is equivalent. It also depends on the number of dependencies the project has (details here). Since the workflow does not seem to re-use the cache between Node versions, there could be room for improvement here? (I'm not sure whether you have a use case where you explicitly need to NOT re-use the cache between Node versions in your CI.) Here's the ADR where it is explained in detail.
My understanding is that it re-uses the cache between versions:
source: https://github.com/actions/setup-node/blob/main/docs/advanced-usage.md#caching-packages-dependencies |
From the docs:
It seems that it doesn't matter whether we have separate cache entries per matrix permutation. The cache is available, whether it's 1 cache file or N cache files, so I would not expect any performance improvement. I am suspicious of having a single cache for all matrix permutations. Matrix caching is explicitly described in an example, using |
Nice find! I didn't know this 👍🏽. As you pointed out, if the cache expires after 7 days and has a capacity of 5 GB, there is probably not much performance improvement to gain. I guess it depends on the number of dependencies in the project and how frequently the CI runs.
Yep, from what I read, it seems that it is not caching per version as you do with
For my own understanding and learning: is there a particular use case I'm missing that requires a different cache folder per Node version? |
Maybe a different Node version produces a different cache? Again, I have no reference; one would probably have to research npm caching compatibility across versions. |
I found this: actions/setup-node#304 (comment)
The example I see is with Just posted a question in GitHub's Community to get some help: https://github.community/t/actions-setup-node-cache-with-matrix-of-node-versions/206940 |
Maxim's response in the issue you linked is pretty clear, I think. As far as I understand it, only the content of the pnpm global store is cached, which is not affected by the Node version and is safe to share between different Node versions. |
@oscard0m thanks for investigating. As far as I understand, the suggested change of removing the Node version won't bring any benefit. The benefit of keeping it, however, is that we have a clear cache separation between Node versions. Even if that is not needed now, it seems to be the more sustainable approach, because we cannot tell whether the Node.js team will keep cache compatibility across Node versions; at least I haven't found any reference. If we find an argument in favor of removing the separation by Node version, let's reevaluate. |
Thanks for appreciating it and also thanks for all the time and care you are taking in this discussion. I'm learning a lot 💪🏽
In my understanding there would be a benefit: not downloading dependencies that were already downloaded by other executions of that same step in the pipeline with another Node version. Taking this matrix as an example, the benefit would be for all the succeeding executions of the pipeline using I tried to check the execution times for the opened PR, but it requires an approval to get checks running, so... no real information to compare with.
I don't know how If you don't see any benefit and you think it is not safe to rely on the same npm-downloaded dependencies, we can close this issue and revisit it in the future if stronger arguments come up on this topic. |
The dependencies won't be downloaded; they are cached already. They are just not cached in 1 single cache but in N individual caches, one per Node version. The only difference would be that the different caches wouldn't have to be loaded, but I expect them to load fast, because that's the purpose of a cache, and they are loaded from GitHub's internal storage. My concern is that bugs due to caching can be difficult to analyze, so if Node requires caching per version in the future, we would face the effort of investigating and reverting this workflow change across multiple repos. For consistency we would not make this proposed change only in this repo but in a few other repos too, so there would be some effort involved. However, let me make a test run using a single cache and compare the execution time. If the time is significantly better, we would have a pro-argument. |
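For readers following along, the "N individual caches by node version" setup described above could look roughly like the workflow fragment below. This is only a minimal sketch, not this repository's actual workflow: the job name, the Node versions in the matrix, the action versions, the ~/.npm cache path, and the use of package-lock.json for the hash are all assumptions made for illustration.

```yaml
# Hypothetical workflow fragment: one cache entry per Node version in the matrix.
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [12, 14, 16]   # assumed versions, for illustration only
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v2
        with:
          node-version: ${{ matrix.node-version }}
      - uses: actions/cache@v2
        with:
          path: ~/.npm               # assumed npm cache directory on the runner
          # Including matrix.node-version in the key is what separates the caches.
          key: ${{ runner.os }}-node-${{ matrix.node-version }}-${{ hashFiles('**/package-lock.json') }}
          restore-keys: |
            ${{ runner.os }}-node-${{ matrix.node-version }}-
      - run: npm ci
```

Dropping matrix.node-version from key and restore-keys would give the single shared cache that the proposal is about; everything else would stay the same.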
Below is a performance comparison between the CI run times of this PR vs. the status quo. As we can see, there is no noticeable performance improvement. Test run times usually vary by about +/- 1 min, so the conclusion is that the two run at the same speed. I'm closing this PR, as the original issue was incorrect (we were already using a cache), and the suggestion to not separate the cache per Node version does not yield any improvement, but potentially opens up issues if Node caches become incompatible across Node versions in the future. @oscard0m Thanks for opening this issue and for your investigations. If you come across a pro-argument for using a single cache, please let us know and we will gladly reconsider. This PR: |
Thanks for the research and experimentation on this. I learned a lot from you, @mtrezza! |
New Feature / Enhancement Checklist
Current Limitation
GitHub CI workflows using setup-node are not benefiting from its cache property, which should improve execution times.
Feature / Enhancement Description
Start benefiting from this cache property.
Example Use Case
See PR
Alternatives / Workarounds
3rd Party References
https://github.blog/changelog/2021-07-02-github-actions-setup-node-now-supports-dependency-caching/
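As a rough illustration of the cache property referred to in this issue description, the steps below show how the built-in caching of setup-node is typically switched on. This is a hedged sketch rather than the change proposed in the linked PR: the action versions, the matrix variable name, and the choice of npm as package manager are assumptions.

```yaml
# Hypothetical steps: built-in dependency caching via the setup-node "cache" input.
steps:
  - uses: actions/checkout@v2
  - uses: actions/setup-node@v2
    with:
      node-version: ${{ matrix.node-version }}   # assumes a matrix defined on the job
      cache: npm                                  # assumed package manager
  - run: npm ci
```

With this input the action restores and saves the package manager's cache itself, so no separate actions/cache step is needed. Whether the resulting cache is shared across Node versions is exactly the question discussed in the comments above.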