Faster tokenizer lookahead #13341

Merged: 8 commits merged into babel:main on May 26, 2021

Conversation

@JLHwung (Contributor) commented May 20, 2021:

Q                       A
License MIT

Currently we clone the whole parser state when doing lookahead(). This is expensive because the State is large, and in the past we have already replaced some lookahead() usage with lookaheadCharCode() (#10371).

In this PR we limit the structure returned by tokenizer#lookahead(), based on two observations: 1) in most cases we retrieve only type and value from the result of lookahead(); 2) the tokenizer depends on only a subset of the parser state.
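
As a rough sketch of the idea (assembled from the diff hunks quoted later in this thread, not the exact committed code), lookahead() now seeds the tokenizer with a small throwaway object instead of State#clone():

```js
// Hedged sketch: the field set is illustrative; the real PR copies only
// the fields the tokenizer reads during a lookahead, not the whole State.
function createLookaheadState(tokenizer, state) {
  return {
    pos: state.pos,
    value: null,
    type: state.type,
    start: state.start,
    end: state.end,
    lastTokEnd: state.end,
    // lookahead never updates contexts (see the review note below),
    // so only the current context needs to be carried over
    context: [tokenizer.curContext()],
    inType: state.inType,
  };
}
```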

We use the Flow arrow function type type A = (x) => void as a benchmark case because parsing such a type invokes two lookahead() calls. As we can see, parsing these types is roughly 200% faster (about 3x) than the baseline.

$ node --predictable ./many-arrow-function-flow-types/bench.mjs
baseline 256 arrow function types: 506 ops/sec ±1.06% (1.976ms)
baseline 512 arrow function types: 256 ops/sec ±0.72% (3.906ms)
baseline 1024 arrow function types: 121 ops/sec ±0.63% (8.266ms)
baseline 2048 arrow function types: 50.41 ops/sec ±1.4% (20ms)
current 256 arrow function types: 1537 ops/sec ±1.25% (0.651ms)
current 512 arrow function types: 733 ops/sec ±0.77% (1.364ms)
current 1024 arrow function types: 359 ops/sec ±0.3% (2.786ms)
current 2048 arrow function types: 171 ops/sec ±1% (5.859ms)

Surprisingly, if we replace lookahead() (shown as "baseline") with lookaheadCharCode() (shown as "current"), performance is almost halved. This suggests that the skipWhiteSpace regex may be slower than our handwritten skipSpace method when there is not much whitespace to skip.

baseline 256 arrow function types: 1533 ops/sec ±1.5% (0.652ms)
baseline 512 arrow function types: 721 ops/sec ±1.3% (1.386ms)
baseline 1024 arrow function types: 353 ops/sec ±1.21% (2.833ms)
baseline 2048 arrow function types: 179 ops/sec ±0.9% (5.588ms)
current 256 arrow function types: 861 ops/sec ±3.14% (1.162ms)
current 512 arrow function types: 445 ops/sec ±2.13% (2.245ms)
current 1024 arrow function types: 218 ops/sec ±2.33% (4.584ms)
current 2048 arrow function types: 101 ops/sec ±2.61% (9.877ms)
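
For context, a hedged sketch of the two whitespace-skipping strategies (simplified: the real methods also handle comments and line/column tracking; the regex here mirrors babel-parser's whitespace util, but treat it as illustrative):

```js
// Regex-based skip, in the spirit of what lookaheadCharCode() uses:
const skipWhiteSpace = /(?:\s|\/\/.*|\/\*[^]*?\*\/)*/g;
function nextTokenStart(input, pos) {
  skipWhiteSpace.lastIndex = pos;
  const match = skipWhiteSpace.exec(input); // always matches (possibly empty)
  return match.index + match[0].length;
}

// Handwritten charCode loop, in the spirit of Tokenizer#skipSpace:
function skipSpace(input, pos) {
  while (pos < input.length) {
    const ch = input.charCodeAt(pos);
    // 32 = space, 9 = tab, 10 = "\n", 13 = "\r"
    if (ch === 32 || ch === 9 || ch === 10 || ch === 13) {
      pos++;
    } else {
      break;
    }
  }
  return pos;
}
```

When there is at most one space to skip, the loop exits after a single charCodeAt call, whereas the regex engine pays its per-call setup cost regardless.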

@JLHwung added the pkg: parser and PR: Performance 🏃‍♀️ (a type of pull request used for our changelog categories) labels on May 20, 2021
start: number,
end: number,
/* Used only in readSlashToken */
exprAllowed: boolean,
JLHwung (Contributor, Author):

Can be 5% faster if we get rid of exprAllowed and inType, so that they don't have to be copied.

Member:

Does copying a couple of booleans influence perf that much?

JLHwung (Contributor, Author):

Yeah. Unlike in C, a boolean in V8 occupies 64 bits of memory: essentially a pointer to the true/false built-ins. A sequence of booleans is not compressed into a bit array, so two booleans cost 16 bytes of memory.

In Babel 7.7 we improved the traverser performance by compressing 3 booleans to a bit array: https://hackmd.io/UMdqwvVgQGaofjCZHfGKrA?view#Compress-boolean-flags

Note that the benchmark is specifically constructed to highlight the performance impact of lookahead(), so in the real world the gains are not that significant. However, performance improvement is a long-term task, and eventually every tiny improvement counts.
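
The linked note describes packing three booleans into a single integer; a minimal sketch of the technique (the flag names are illustrative, not the actual babel-traverse fields):

```js
// Several flags share one small integer instead of separate
// pointer-sized boolean slots.
const SHOULD_SKIP = 0b001;
const SHOULD_STOP = 0b010;
const REMOVED     = 0b100;

let flags = 0;
flags |= SHOULD_SKIP;                        // set a flag
const skipped = (flags & SHOULD_SKIP) !== 0; // test a flag
flags &= ~SHOULD_SKIP;                       // clear a flag
console.log(skipped, flags);                 // true 0
```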

@KFlash commented May 24, 2021:

@JLHwung Funny to see devs playing with perf :) Your suggested changes are only micro-optimizations and not scalable, but yeah, you are on the right track.
The entire tokenizer is a class that extends ParserErrors. Run a benchmark and check the cost of extends and you will be shocked by the perf loss.
Each token is its own class. Try flattening it into a finite state machine instead; I'm sure you would gain 20% perf. As it is right now, it consumes memory.

You really should reduce the amount of object access too. this.state.lastTokEnd vs this.lastTokEnd is expensive. Why do you need this extra state object inside the class?
A quick Google search will also tell you that this is expensive.

There are many other things you can do to optimize the Babel parser; I mentioned only a few.
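
A hedged micro-example of the object-access point above: caching the property chain in a local means one load of this.state instead of two (lastTokenWidth is a hypothetical helper, not parser code):

```js
// Hypothetical helper illustrating the suggestion above.
function lastTokenWidth(parser) {
  // Before: parser.state.lastTokEnd - parser.state.lastTokStart
  // reads parser.state twice.
  const state = parser.state; // single property load
  return state.lastTokEnd - state.lastTokStart;
}
```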

@babel-bot (Collaborator) commented May 20, 2021:

Build successful! You can test your changes in the REPL here: https://babeljs.io/repl/build/46417/

@@ -247,17 +283,16 @@ export default class Tokenizer extends ParserErrors {

   nextToken(): void {
     const curContext = this.curContext();
-    if (!curContext?.preserveSpace) this.skipSpace();
+    if (!curContext.preserveSpace) this.skipSpace();
JLHwung (Contributor, Author):

The default value of context is [ct.braceStatement], so curContext is always non-nullish.

    if (this.options.tokens) {
      this.pushToken(new Token(this.state));
    }
    this.checkKeywordEscapes();
@JLHwung (Contributor, Author) commented May 20, 2021:

The isLookahead condition is removed because we call nextToken in lookahead().

@codesandbox-ci (bot) commented May 20, 2021:

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

Latest deployment of this branch, based on commit 8dafa65:

Sandbox                      Source
babel-repl-custom-plugin     Configuration
babel-plugin-multi-config    Configuration

start: state.start,
end: state.end,
lastTokEnd: state.end,
context: [this.curContext()],
JLHwung (Contributor, Author):

We don't have to copy the whole context because lookahead never updates contexts.

KFlash:

Here you read state.end twice. It's better to store it in a local variable to avoid the duplicated code and the double object access on each run.

@nicolo-ribaudo (Member) left a comment:

Thanks!

@JLHwung force-pushed the faster-tokenizer-lookahead branch from ae5541e to d1c3ed0 on May 21, 2021 14:45
@nicolo-ribaudo (Member) commented:

Can we also add a non-test262 test for 6349459 (#13341)? Even if we have the test262 test, I'd like our test suite to cover every possible code path.

@JLHwung force-pushed the faster-tokenizer-lookahead branch from 6349459 to 8dafa65 on May 25, 2021 12:54
@JLHwung JLHwung merged commit acf2a10 into babel:main May 26, 2021
@JLHwung JLHwung deleted the faster-tokenizer-lookahead branch May 26, 2021 01:12
@KFlash commented May 26, 2021:

@JLHwung Is there any place this "merged" build is available so I can do a benchmark comparison vs my own parser?

@JLHwung (Contributor, Author) commented May 26, 2021:

@KFlash Nope, we only have the latest @babel/standalone build, so for now you will have to clone the repo and build it yourself. However, I don't expect it to improve average cases that much: lookahead() is only used in the Flow and TypeScript plugins, which also use state.clone() heavily in tryParse.

> Here you read state.end twice. It's better to store it in a local variable to avoid the duplicated code and the double object access on each run.

You are right, V8 generates two object accesses here.

                  -- <babel/packages/babel-parser/lib/index.js:8479:18> inlined at <babel/packages/babel-parser/lib/index.js:8489:23> --
0x1fafde11f291   111  4d8bb90f010000 REX.W movq r15,[r9+0x10f]
0x1fafde11f298   118  4d897b37       REX.W movq [r11+0x37],r15
                  -- <babel/packages/babel-parser/lib/index.js:8480:25> inlined at <babel/packages/babel-parser/lib/index.js:8489:23> --
0x1fafde11f29c   11c  4d8bb90f010000 REX.W movq r15,[r9+0x10f]
0x1fafde11f2a3   123  4d897b3f       REX.W movq [r11+0x3f],r15

For member expressions inside an object literal, e.g.

{
  pos: state.pos
}

V8 generates extra initialization machine code, which could be eliminated when the object literal contains only literals / identifiers.

0x1fafde11f238    b8  49c7432f00000000 REX.W movq [r11+0x2f],0x0
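
A hypothetical rewrite following that observation: hoisting the shared member expression into a local leaves a single load of state.end (sketch only, matching the shape of the lookahead-state builder above):

```js
// Hoist state.end so the object literal reads locals, avoiding the
// duplicated movq seen in the disassembly above.
function createLookaheadState(tokenizer, state) {
  const end = state.end; // one load instead of two
  return {
    pos: state.pos,
    value: null,
    type: state.type,
    start: state.start,
    end,
    lastTokEnd: end,
    context: [tokenizer.curContext()],
    inType: state.inType,
  };
}
```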

@KFlash commented May 26, 2021:

Regarding Flow and TypeScript: they are always calling into their parent class, which in turn kills perf. The TS plugin is the worst offender if you compare the two.

Why do you need that extra state.pos? Why not flatten it?

Did you check with V8 all the extra cost you incur by extending a class?

The ideal solution is to skip string comparisons and let everything become a series of numbers. See my Kataw parser as an example.
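
A minimal sketch of the "series of numbers" idea (constant names are made up for illustration, not taken from Kataw): token kinds become small integers, so hot-path checks compare SMIs instead of strings:

```js
// Illustrative numeric token kinds; real parsers often also pack
// precedence and flags into the same integer.
const TOKEN_EOF        = 0;
const TOKEN_IDENTIFIER = 1;
const TOKEN_ARROW      = 2;

function isArrow(kind) {
  return kind === TOKEN_ARROW; // integer comparison, no string work
}

console.log(isArrow(TOKEN_ARROW)); // true
```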

@github-actions bot added the outdated (a closed issue/PR that is archived due to age; recommended to make a new issue) label on Aug 25, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 25, 2021