
perf: only apply lazy cjs module transform on cli and core #10443

Merged (1 commit) on Sep 27, 2019

Conversation

@JLHwung (Contributor) commented on Sep 15, 2019

Q                     A
Fixed Issues?         Devtools tweak
Tests Added + Pass?   No tests should be added
License               MIT

This PR is a follow-up to #7588. In this PR we apply the lazy module transform only to @babel/core and @babel/cli, so that

  • both babel --version and require("@babel/core").loadPartialConfig still benefit from the lazy module transformation;
  • the compiled instruction size of popular execution paths such as traverse.node is slightly reduced, so they are marginally faster (see the sketch after this list);
  • the overall code size is slightly reduced.
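For context, the lazy CJS module transform from #7588 rewrites a top-level require into a memoized getter, so the dependency is not loaded until it is first accessed. The sketch below shows the rough shape of that rewrite (illustrative only; the exact generated code for traverse appears in the comparison section further down):

// Eager form, which this PR now keeps for every package except @babel/core and @babel/cli:
var t = _interopRequireWildcard(require("@babel/types"));

// Lazy form, which the transform emits: the require is deferred until the
// first call to t() and then memoized by overwriting t.
function t() {
  const data = _interopRequireWildcard(require("@babel/types"));
  t = function () {
    return data;
  };
  return data;
}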

Up-front loading time comparison

Here is a comparison of babel --version and loadPartialConfig loading times between the experiment and the control:

Experiment (this branch)

$ time node packages/babel-cli/bin/babel --version
node packages/babel-cli/bin/babel --version  0.07s user 0.29s system 96% cpu 0.368 total

$ time node -e "require('./packages/babel-core').loadPartialConfig({ filename: 'foo' })"
node -e 0.24s user 0.32s system 102% cpu 0.544 total

Control (master)

$ time node packages/babel-cli/bin/babel --version
node packages/babel-cli/bin/babel --version  0.06s user 0.32s system 96% cpu 0.393 total

$ time node -e "require('./packages/babel-core').loadPartialConfig({ filename: 'foo' })"
node -e   0.25s user 0.34s system 103% cpu 0.568 total

One can see there is no significant regression in up-front loading time.

Comparison of code

Here we take traverse.node from @babel/traverse to analyze the impact of this change. It is a popular execution path when transforming code.

Comparison of compiled JavaScript code

Experiment:

var t = _interopRequireWildcard(require("@babel/types"));
traverse.node = function (node, opts, scope, state, parentPath, skipKeys) {
  const keys = t.VISITOR_KEYS[node.type];
  /* same code */
};

Control:

function t() {
  const data = _interopRequireWildcard(require("@babel/types"));
  t = function () {
    return data;
  };
  return data;
}

traverse.node = function (node, opts, scope, state, parentPath, skipKeys) {
  const keys = t().VISITOR_KEYS[node.type];
  /* same code */
};

The only JavaScript-level change here is that
const keys = t().VISITOR_KEYS[node.type]; is replaced by
const keys = t.VISITOR_KEYS[node.type];

In C, we might see no difference, since the t() call could be inlined. However, it is not obvious that the two forms are equivalent in JavaScript executed by V8. The answer comes from the TurboFan-optimized machine code.

Comparison of the compiled x86_64 machine instructions of traverse.node

Thanks to the [print code] utility introduced in babel/parser_performance#8, we can show the compiled machine code of traverse.node. For brevity, only the difference is listed.
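If you want to reproduce a similar dump without that utility, one possible approach (my own sketch, not the script from parser_performance#8) is to warm up traverse.node, force an optimizing compile, and let V8's --print-opt-code flag print the generated code:

// repro.js (hypothetical reproduction script, not the actual [print code] utility)
// Run with: node --allow-natives-syntax --print-opt-code repro.js
const traverse = require("@babel/traverse").default;
const { parse } = require("@babel/parser");

const ast = parse("const answer = 42;");

// Warm up traverse so TurboFan gathers type feedback for traverse.node.
for (let i = 0; i < 10000; i++) {
  traverse(ast, { enter() {} });
}

// Force an optimizing compile of traverse.node, then call it once more so
// --print-opt-code dumps the generated machine instructions.
%OptimizeFunctionOnNextCall(traverse.node);
traverse(ast, { enter() {} });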

Experiment:

-- </Users/jh/code/babel/packages/babel-traverse/lib/index.js:74:18> --
xorl rax,rax
REX.W movq rdx,0x9a8fa48dd49    ;; object: 0x09a8fa48dd49 <Object map = 0x9a82b766ab9>
REX.W movq rcx,0x9a8c87b7799    ;; object: 0x09a8c87b7799 <String[#12]: VISITOR_KEYS>

Control:

-- </Users/jh/code/parser_performance/node_modules/@babel/traverse/lib/index.js:88:16> --
REX.W movq r8,0x9a8b4fd59a1    ;; object: 0x09a8b4fd59a1 <FunctionContext[17]>
REX.W movq rdi,[r8+0x4f]
REX.W movq r8,[r13-0x28] (root (undefined_value))
push r8
REX.W movq rsi,0x9a89edc0139    ;; object: 0x09a89edc0139 <NativeContext[249]>
xorl rax,rax
REX.W movq r9,rsi
REX.W movq r11,rax
-- Inlined Trampoline to Call_ReceiverIsNullOrUndefined --
REX.W movq r10,0x1006902e0  (Call_ReceiverIsNullOrUndefined)
call r10 
-- </Users/jh/code/parser_performance/node_modules/@babel/traverse/lib/index.js:88:19> --
movq rsi,[rbp-0x18]
REX.W movq rcx,0x9a8a8b34649    ;; object: 0x09a8a8b34649 <String[#12]: VISITOR_KEYS>
REX.W movq rdx,rax

The experiment branch is clearly better optimized than the control, since it completely eliminates a JavaScript function call:

In the experiment branch, the t namespace object is loaded directly into rdx, ready for the subsequent LoadICTrampoline. In the control branch, a function context is constructed instead, and after Call_ReceiverIsNullOrUndefined returns, the result in rax is moved to rdx.
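To sanity-check the effect outside the Babel codebase, here is a minimal microbenchmark sketch (hypothetical, not part of the PR) that isolates the two shapes: a direct property load on the namespace object versus a load behind a getter call:

// bench.js (illustrative only; the real memoized getter overwrites itself on first call)
const data = { VISITOR_KEYS: { Identifier: [] } };

// Experiment shape: t is the namespace object itself.
const tDirect = data;

// Control shape: t is a function returning the namespace object.
function tLazy() {
  return data;
}

function direct(node) {
  return tDirect.VISITOR_KEYS[node.type];
}

function lazy(node) {
  return tLazy().VISITOR_KEYS[node.type];
}

const node = { type: "Identifier" };

console.time("direct property load");
for (let i = 0; i < 1e7; i++) direct(node);
console.timeEnd("direct property load");

console.time("load behind getter call");
for (let i = 0; i < 1e7; i++) lazy(node);
console.timeEnd("load behind getter call");

Any difference measured this way will be small in absolute terms, which matches the "marginally faster" claim above: the win is a shorter instruction sequence on a hot path, not a dramatic speedup.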

@JLHwung added the PR: Internal 🏠 and PR: Performance 🏃‍♀️ changelog labels on Sep 15, 2019
@babel-bot (Collaborator) commented:

Build successful! You can test your changes in the REPL here: https://babeljs.io/repl/build/11576/

@buildsize bot commented on Sep 15, 2019

File name                 Previous Size   New Size   Change
babel-preset-env.js       2.43 MB         2.42 MB    -10.68 KB (0%)
babel-preset-env.min.js   1.4 MB          1.4 MB     -5.71 KB (0%)
babel.js                  2.95 MB         2.91 MB    -40.91 KB (1%)
babel.min.js              1.63 MB         1.61 MB    -21.42 KB (1%)

@loganfsmyth (Member) left a comment:

Great investigation, nice work!
