Skip to content
This repository has been archived by the owner on Mar 3, 2023. It is now read-only.

Parallel Bootstrap #21308

Closed
wants to merge 13 commits into from

Conversation

aminya
Copy link
Contributor

@aminya aminya commented Sep 14, 2020

Description of the change

Run the bootstrap script on different threads in parallel:
- apm installation runs in parallel to installing script/ dependencies
- using JOBS env variable node-gyp is parallelized

Benefits

Running this on a computer with multiple cores gives a good speedup.

Even in the CI with a processor with only 2 cores.

For example, on Linux, the bootstrap time is reduced from 7:30 to 3:41, which is 203% faster (2X).

Verification

The CI passes. This is tested locally as well.

Release Notes

  • Faster bootstrapping by using parallelization

Closes #21315

Copy link
Contributor

@Aerijo Aerijo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I disagree with the need for threading in the first place. Theoretically, there is no significant difference between this and just running the installer child processes in async. Both approaches are limited by the installers themselves; the overhead of spawning them (either way) is insignificant.

And because I wanted to backup my claims with data, I implemented the async version (to be clear: the async refers to how it is spawned; the installers are executing in parallel just as much as this PR). The following bootstrap times are what I have gathered from the last couple of runs of both PRs:

(Order: W64, W86, Mac, Linux)
Async bootstrap (#21303):

  • 9:38, 11:34, 16:03, 4:23
  • 11:29, 11:21, 4:54, 4:37

Parallel Bootstrap (#21308):

  • 9:54, 11:26, 5:56, 4:30
  • 15:22, 12:42, 5:13, 4:33

Besides the anomaly with the mac bootstrap time on the first line, which cannot reasonably be attributed to the PR itself, there is no significant difference. In fact, given how variable the CI is, the consistency is almost impressive. I did see the faster time quoted for the atomcommunity build, but I have not seen anything close to that in the CI results here.

Perhaps this approach could be shown to be appropriate for the build function (I would need to see measurements of worker threads vs async child processes first), but I conclude this is unnecessary for bootstrapping and so not worth the complexity at this time.

Edit: Just to clear up some confusion I've noticed: the async bootstrap PR is not in a state I would merge, it was a quick sketch to demonstrate the alternative approach. Appealing to diff sizes is nonsensical.

script/config.js Outdated Show resolved Hide resolved
script/build Show resolved Hide resolved
script/lib/install-apm.js Outdated Show resolved Hide resolved
script/lib/install-script-runner-dependencies.js Outdated Show resolved Hide resolved
script/bootstrap Show resolved Hide resolved
script/script-runner/package.json Show resolved Hide resolved
@aminya
Copy link
Contributor Author

aminya commented Sep 24, 2020

Overall, I disagree with the need for threading in the first place. Theoretically, there is no significant difference between this and just running the installer child processes in async.

Here the code runs outside of Node (chil_process), so you should not expect performance difference between multi-threading and async. The benefit of using threads is that you will have a cleaner code while having the top performance.

Comparing this to your async PR:

  • If you look at the code diff in the following, this is totally visible. You are doubling the amount of code that needs to be maintained.

  • You have copy-pasted code twice just to support both Sync and Async. This doubles the amount of code that we need to maintain. That means we should constantly be careful to keep both codes in sync with each other. This issue does not exist in the multi-threaded code, since the actual functions are not changed at all.

Multi-threaded bootstrap code diff

(by ignoring 70 lines of code which are for package-lock.json):

+85 
-22
Details

image
image

Async bootstrap code diff:

image

If in the future, if we use direct API from npm or apm to install the dependencies or run things (instead of spawning external process), we can get more performance benefits from using threads. Not using multi-threading, in that case, means you have to run this on the main thread.

Perhaps this approach could be shown to be appropriate for the build function

That PR is awaiting merging this. This PR includes script-runner which is directly used in the build.

P.S: I am not against using async and child_processes, but there are suitable places for that. For example, in atom-community#155 I use async functions to parallelize copy/pasting assets which is a I/O operation and runs by the operating system.

@aminya aminya force-pushed the parallel-bootstrap-4upstream branch 2 times, most recently from 1b50c61 to 117fdf9 Compare September 24, 2020 07:57
@aminya aminya force-pushed the parallel-bootstrap-4upstream branch from 117fdf9 to 373c084 Compare October 14, 2020 08:46
@aminya aminya mentioned this pull request Oct 19, 2020
@aminya aminya force-pushed the parallel-bootstrap-4upstream branch from 373c084 to 1fa40ca Compare October 21, 2020 04:22
@aminya aminya force-pushed the parallel-bootstrap-4upstream branch from 483f829 to 7a16aba Compare January 29, 2021 20:23
@aminya aminya force-pushed the parallel-bootstrap-4upstream branch from 7a16aba to f0d28a2 Compare January 29, 2021 20:25
@sadick254
Copy link
Contributor

Thanks for the PR!

That would definitely be a useful addition to our build process. However, it is not something that is within the scope that we have envisioned for Atom. We are not planning to make any major improvements to our build process for the time being. We are focused on fixing user-facing bugs. For that reason, I'm going to close this.

Thanks again for contributing to Atom 👍

@sadick254 sadick254 closed this Jun 3, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants