-
Notifications
You must be signed in to change notification settings - Fork 17.4k
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I disagree with the need for threading in the first place. Theoretically, there is no significant difference between this and just running the installer child processes in async. Both approaches are limited by the installers themselves; the overhead of spawning them (either way) is insignificant.
And because I wanted to backup my claims with data, I implemented the async version (to be clear: the async refers to how it is spawned; the installers are executing in parallel just as much as this PR). The following bootstrap times are what I have gathered from the last couple of runs of both PRs:
(Order: W64, W86, Mac, Linux)
Async bootstrap (#21303):
- 9:38, 11:34, 16:03, 4:23
- 11:29, 11:21, 4:54, 4:37
Parallel Bootstrap (#21308):
- 9:54, 11:26, 5:56, 4:30
- 15:22, 12:42, 5:13, 4:33
Besides the anomaly with the mac bootstrap time on the first line, which cannot reasonably be attributed to the PR itself, there is no significant difference. In fact, given how variable the CI is, the consistency is almost impressive. I did see the faster time quoted for the atomcommunity
build, but I have not seen anything close to that in the CI results here.
Perhaps this approach could be shown to be appropriate for the build
function (I would need to see measurements of worker threads vs async child processes first), but I conclude this is unnecessary for bootstrapping and so not worth the complexity at this time.
Edit: Just to clear up some confusion I've noticed: the async bootstrap PR is not in a state I would merge, it was a quick sketch to demonstrate the alternative approach. Appealing to diff sizes is nonsensical.
Here the code runs outside of Node (chil_process), so you should not expect performance difference between multi-threading and async. The benefit of using threads is that you will have a cleaner code while having the top performance. Comparing this to your async PR:
Multi-threaded bootstrap code diff(by ignoring 70 lines of code which are for package-lock.json): +85
-22 Async bootstrap code diff:If in the future, if we use direct API from
That PR is awaiting merging this. This PR includes script-runner which is directly used in the build. P.S: I am not against using async and child_processes, but there are suitable places for that. For example, in atom-community#155 I use async functions to parallelize copy/pasting assets which is a I/O operation and runs by the operating system. |
1b50c61
to
117fdf9
Compare
117fdf9
to
373c084
Compare
373c084
to
1fa40ca
Compare
483f829
to
7a16aba
Compare
It is not used in script-runner
removes redundant config double require
use 'max' for JOBs to use all available cores nodejs/node-gyp#1770 don't set JOBS if already defined.
This reverts commit de47280.
7a16aba
to
f0d28a2
Compare
Thanks for the PR! That would definitely be a useful addition to our build process. However, it is not something that is within the scope that we have envisioned for Atom. We are not planning to make any major improvements to our build process for the time being. We are focused on fixing user-facing bugs. For that reason, I'm going to close this. Thanks again for contributing to Atom 👍 |
Description of the change
Run the bootstrap script on different threads in parallel:
- apm installation runs in parallel to installing script/ dependencies
- using JOBS env variable node-gyp is parallelized
Benefits
Running this on a computer with multiple cores gives a good speedup.
Even in the CI with a processor with only 2 cores.
For example, on Linux, the bootstrap time is reduced from 7:30 to 3:41, which is 203% faster (2X).
Verification
The CI passes. This is tested locally as well.
Release Notes
Closes #21315