Refactor private name tokenizing #13256

JLHwung · 2021-05-04T15:23:46Z

Q	A
Tests Added + Pass?	Yes
License	MIT

This PR overhauls how the Babel parser tokenizes the private identifier #name. Currently #name is tokenized as two tokens, hash and name. This PR merges them into a single new privateName token whose value holds the string value of the private identifier (without the hash). We observe a performance gain of up to 18%.

$ node --predictable ./benchmark/many-class-private-properties/1-length.bench.mjs
baseline 256 length-1 private properties: 2640 ops/sec ±35.22% (0.379ms)
baseline 512 length-1 private properties: 1700 ops/sec ±1.47% (0.588ms)
baseline 1024 length-1 private properties: 844 ops/sec ±1.24% (1.186ms)
baseline 2048 length-1 private properties: 424 ops/sec ±0.42% (2.359ms)
current 256 length-1 private properties: 3123 ops/sec ±33.93% (0.32ms)
current 512 length-1 private properties: 2010 ops/sec ±0.83% (0.497ms)
current 1024 length-1 private properties: 977 ops/sec ±0.77% (1.023ms)
current 2048 length-1 private properties: 455 ops/sec ±0.85% (2.198ms)

This PR is based on the observation that Babel always does a lookahead when tokenizing #, so we can determine early whether an identifier start follows the #, and avoid an extra read of the leading identifier character.
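To illustrate the idea, here is a minimal toy tokenizer, not Babel's implementation. Every name in it (tokenize, readWord, the mergePrivateName flag, the ASCII-only identifier checks) is a hypothetical stand-in. It shows how the character already inspected after # can be used to emit one privateName token and to start scanning the rest of the word past that character.

```javascript
// Toy sketch only: ASCII identifiers, no escapes, no keywords.
// All helper names are hypothetical and do not come from @babel/parser.
const isIdentifierStart = (ch) => ch !== undefined && /[A-Za-z_$]/.test(ch);
const isIdentifierChar = (ch) => ch !== undefined && /[A-Za-z0-9_$]/.test(ch);

// Returns the index just past the last identifier character at or after pos.
function readWord(input, pos) {
  let end = pos;
  while (isIdentifierChar(input[end])) end++;
  return end;
}

function tokenize(input, { mergePrivateName }) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    const ch = input[pos];
    if (ch === "#") {
      const next = input[pos + 1]; // the lookahead happens in both modes
      if (mergePrivateName && isIdentifierStart(next)) {
        // `next` is already known to start an identifier, so resume at pos + 2.
        const end = readWord(input, pos + 2);
        tokens.push({ type: "privateName", value: input.slice(pos + 1, end), start: pos, end });
        pos = end;
      } else {
        tokens.push({ type: "hash", value: "#", start: pos, end: pos + 1 });
        pos += 1;
      }
    } else if (isIdentifierStart(ch)) {
      const end = readWord(input, pos + 1);
      tokens.push({ type: "name", value: input.slice(pos, end), start: pos, end });
      pos = end;
    } else if (/\s/.test(ch)) {
      pos += 1; // skip whitespace
    } else {
      tokens.push({ type: "punct", value: ch, start: pos, end: pos + 1 });
      pos += 1;
    }
  }
  return tokens;
}

// Old behaviour: name, punct, hash, name. New behaviour: name, punct, privateName.
console.log(tokenize("this.#x", { mergePrivateName: false }).map((t) => t.type));
console.log(tokenize("this.#x", { mergePrivateName: true }).map((t) => t.type));
```

In the merged mode the token value is "x", without the hash, and the token spans the whole of #x.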

By merging # and name we also avoid the tokenizer context update hooks for name-type tokens. We don't need the checks for of, functions and class when the token is a private identifier anyway.

Since we expose tokens when options.tokens is true, we add a compat routine for tt.privateName which essentially undoes the merging. Hopefully we can remove it in Babel 8.
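As a rough sketch of what undoing the merge means for the exposed token list, the helper below splits each merged token back into a hash token and a name token with adjusted positions. The function name and the plain-object token shape are assumptions made for illustration; this is not the routine added in this PR.

```javascript
// Hypothetical compat helper operating on simplified token objects
// of the form { type, value, start, end }.
function splitPrivateNameTokens(tokens) {
  const result = [];
  for (const token of tokens) {
    if (token.type !== "privateName") {
      result.push(token);
      continue;
    }
    const { value, start, end } = token;
    // "#" occupies exactly one character; the name covers the rest.
    result.push({ type: "hash", value: "#", start, end: start + 1 });
    result.push({ type: "name", value, start: start + 1, end });
  }
  return result;
}

console.log(
  splitPrivateNameTokens([{ type: "privateName", value: "x", start: 5, end: 7 }]),
);
```

Consumers that read options.tokens output would then keep seeing the two-token shape they saw before the merge.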

Note that Acorn adopts the same approach, which means it is likely that @babel/eslint-parser will have to merge # and name for older Babel versions. By merging the tokens in @babel/parser we also do @babel/eslint-parser a favour.

codesandbox-ci · 2021-05-04T15:32:35Z

This pull request is automatically built and testable in CodeSandbox.

To see build info of the built libraries, click here or the icon next to each commit SHA.

Latest deployment of this branch, based on commit 52f533d:

Sandbox	Source
babel-repl-custom-plugin	Configuration
babel-plugin-multi-config	Configuration

babel-bot · 2021-05-04T15:33:06Z

Build successful! You can test your changes in the REPL here: https://babeljs.io/repl/build/45853/

JLHwung · 2021-05-04T16:04:27Z

eslint/babel-eslint-parser/src/convert/convertTokens.js

@@ -173,6 +173,8 @@ function convertToken(token, source) {
  } else if (type === tt.bigint) {
    token.type = "Numeric";
    token.value = `${token.value}n`;
+  } else if (type === tt.privateName) {
+    token.type = "PrivateIdentifier";


We convert tt.privateName to PrivateIdentifier, which aligns with eslint/espree#486.

Note that although this PR merges tt.hash and tt.name into tt.privateName, @babel/eslint-parser will not observe this behaviour because of the compat layer. However, if we run with BABEL_8_BREAKING=true, the ESLint parser will see tt.privateName. Instead of splitting tt.privateName back apart, we align it with the new espree behaviour.

nicolo-ribaudo

💯

packages/babel-parser/src/parser/expression.js

packages/babel-parser/src/tokenizer/index.js

jridgewell · 2021-05-06T01:20:06Z

packages/babel-parser/src/tokenizer/index.js

@@ -438,6 +438,9 @@ export default class Tokenizer extends ParserErrors {
        this.finishToken(tt.bracketHashL);
      }
      this.state.pos += 2;
+    } else if (isIdentifierStart(next) || next === charCodes.backslash) {


Can you explain the backslash? I'm confused why #\ would be valid.

It's for the escaped private names: #\u0061.
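For illustration, a small sketch of why the escape matters: the source text #\u0061 denotes the same private name as #a once the Unicode escape is resolved. The helper name privateNameValue is hypothetical, and the sketch does not validate that the decoded characters are legal identifier characters.

```javascript
// Hypothetical helper: compute the string value of a private name
// from its source text by resolving \uXXXX and \u{X...} escapes.
function privateNameValue(source) {
  if (source[0] !== "#") throw new SyntaxError("expected '#'");
  return source
    .slice(1)
    .replace(/\\u\{([0-9a-fA-F]+)\}|\\u([0-9a-fA-F]{4})/g, (_, braced, fixed) =>
      String.fromCodePoint(parseInt(braced ?? fixed, 16)),
    );
}

console.log(privateNameValue("#\\u0061")); // prints a
console.log(privateNameValue("#a")); // prints a
```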

lweathermon · 2021-05-06T01:23:39Z

Thank you

Co-authored-by: Justin Ridgewell <justin@ridgewell.name>

existentialism

💯

JLHwung added 3 commits May 4, 2021 11:01

add benchmark

cb7ab80

refactor: create tt.privateName token for private names

203d1ab

add backward compat privateName = hash + name to Babel 7

94d0ae6

JLHwung added the pkg: parser and PR: Performance 🏃‍♀️ labels May 4, 2021

perf: get private name SV from token value

6708d82

chore: tweak benchmark file

97a2f60

JLHwung force-pushed the refactor-private-name-tokenizing branch from 806d076 to 97a2f60 Compare May 4, 2021 15:38

JLHwung commented May 4, 2021


JLHwung added 2 commits May 4, 2021 12:15

chore: update test fixtures

97e08dc

convert tt.privateName to PrivateIdentifier

1f65f5e

JLHwung force-pushed the refactor-private-name-tokenizing branch from 3f161e3 to 1f65f5e Compare May 4, 2021 16:15

nicolo-ribaudo approved these changes May 4, 2021


perf: avoid most isPrivateName call

56d64f1

JLHwung requested a review from nicolo-ribaudo May 4, 2021 19:42

JLHwung mentioned this pull request May 5, 2021

Faster identifier tokenizing #13262

Merged

nicolo-ribaudo approved these changes May 5, 2021


jridgewell reviewed May 6, 2021


lweathermon approved these changes May 6, 2021


Update packages/babel-parser/src/parser/expression.js

0b49227

Co-authored-by: Justin Ridgewell <justin@ridgewell.name>

jridgewell approved these changes May 6, 2021


JLHwung added 2 commits May 5, 2021 21:45

perf: use inlinable codePointAtPos

7f3d99e

make prettier happy

52f533d

existentialism approved these changes May 6, 2021


JLHwung merged commit a387973 into babel:main May 6, 2021

JLHwung deleted the refactor-private-name-tokenizing branch May 6, 2021 13:46

fedeci mentioned this pull request May 17, 2021

[Bug]: Unexpected error thrown #13322

Closed

1 task

JLHwung mentioned this pull request May 17, 2021

fix: preserve tokensLength in tryParse #13326

Merged

This was referenced May 18, 2021

chore(deps-dev): bump @babel/core from 7.13.8 to 7.14.2 filecoin-project/slate#746

Closed

chore(deps-dev): bump @babel/eslint-parser from 7.13.8 to 7.14.2 filecoin-project/slate#745

Closed

github-actions bot added the outdated label Aug 6, 2021

github-actions bot locked as resolved and limited conversation to collaborators Aug 6, 2021
