Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge multiple regex transform plugin #10447

Merged

Conversation

JLHwung
Copy link
Contributor

@JLHwung JLHwung commented Sep 16, 2019

Q                       A
Fixed Issues? Fixes #8951, closes #10347, fixes #10367, fixes #9892, fixes #9199, fixes #10601
Patch: Bug Fix? Yes
Tests Added + Pass? Yes
Any Dependency Changes? Yes
License MIT

This PR is includes refactor on #10430. I will rebase once it gets merged.

In this PR we merge multiple regex transformers into a single meta plugin so that regex transform can be finished in one-pass. The current modular transform plugin is essentially turned into transformer switches. By doing this we provide support for multiple ESNext regex features
such as /(?<name>\p{Unified_Ideograph}{3}/u.

We also set lookbehind: true to regexpu-core so that our regex transformer can always parse the lookbehind assertions. I don't think we require babel-syntax-regex-lookahead here as we are not parsing every single regex and lookbehind: true is only meaningful when user used it with other transformable features.

Note that although it is a bug fix, I think we may target to a 7.7.0 release as 1) it relies on upstream minor bump. 2) it introduces new package.

Todo:

Edits: I realize that in practice there is no browser that supports property escape but not named capturing groups, as they are supported in same version, we do not rely on mathiasbynens/regexpu-core#30 anymore.

@babel-bot
Copy link
Collaborator

babel-bot commented Sep 16, 2019

Build successful! You can test your changes in the REPL here: https://babeljs.io/repl/build/11623/

@nicolo-ribaudo nicolo-ribaudo added the PR: Internal 🏠 A type of pull request used for our changelog categories label Sep 16, 2019
@buildsize
Copy link

buildsize bot commented Sep 17, 2019

File name Previous Size New Size Change
babel-preset-env.js 2.69 MB 2.74 MB 57.07 KB (2%)
babel-preset-env.min.js 1.63 MB 1.66 MB 26.54 KB (2%)
babel.js 2.9 MB 2.95 MB 51.57 KB (2%)
babel.min.js 1.6 MB 1.63 MB 24.39 KB (1%)

@JLHwung JLHwung force-pushed the helper-create-regexp-features-plugin branch from 71816cb to ba51deb Compare September 18, 2019 02:16
@JLHwung JLHwung marked this pull request as ready for review September 19, 2019 16:56
@nicolo-ribaudo nicolo-ribaudo added this to the v7.7.0 milestone Sep 19, 2019
const flagsIncludesU = flags.includes("u");

if (flagsIncludesU === true) {
if (hasFeature(features, FEATURES.unicodeFlag) === false) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Don't you like !? 😛
Also all the === trues are unnecessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I have no preference on ! or === false until I read the compiled instructions:

This is what happens under the hood when we write if(a) { // blablabla }

0x1fafde0d1bf2    32  4c8b4510       REX.W movq r8,[rbp+0x10]
0x1fafde0d1bf6    36  41f6c001       testb r8,0x1
0x1fafde0d1bfa    3a  0f84cf140000   jz 0x1fafde0d30cf  <+0x150f>
                  -- B4 start --
0x1fafde0d1c00    40  4d3945f8       REX.W cmpq [r13-0x8] (root (false_value)),r8
0x1fafde0d1c04    44  0f8438000000   jz 0x1fafde0d1c42  <+0x82>
                  -- B5 start --
0x1fafde0d1c0a    4a  4d394500       REX.W cmpq [r13+0x0] (root (empty_string)),r8
0x1fafde0d1c0e    4e  0f842e000000   jz 0x1fafde0d1c42  <+0x82>
                  -- B6 start --
0x1fafde0d1c14    54  4d8b48ff       REX.W movq r9,[r8-0x1]
0x1fafde0d1c18    58  41f6410d10     testb [r9+0xd],0x10
0x1fafde0d1c1d    5d  0f851f000000   jnz 0x1fafde0d1c42  <+0x82>
                  -- B7 start --
                  -- B8 start --
0x1fafde0d1c23    63  4d398d80000000 REX.W cmpq [r13+0x80] (root (heap_number_map)),r9
0x1fafde0d1c2a    6a  0f8486140000   jz 0x1fafde0d30b6  <+0x14f6>
                  -- B9 start --
0x1fafde0d1c30    70  4d398d40010000 REX.W cmpq [r13+0x140] (root (bigint_map)),r9
0x1fafde0d1c37    77  0f8466140000   jz 0x1fafde0d30a3  <+0x14e3>
0x1fafde0d1c3d    7d  e913000000     jmp 0x1fafde0d1c55  <+0x95>

Basically it compares a to these four objects: false, "", 0 and 0n. Since JavaScript is weak-typed, even if you know for sure a could only be a boolean, the compiler would not simplify if(a) to the following codes, which is simplified from if(a === true).

0x1fafde0d1bf2    32  4c8b4510       REX.W movq r8,[rbp+0x10]
0x1fafde0d1bf6    36  41f6c001       testb r8,0x1
0x1fafde0d1bfa    3a  0f84cf140000   jz 0x1fafde0d30cf  <+0x150f>
                  -- B4 start --
0x1fafde0d1c00    40  4d3945f8       REX.W cmpq [r13-0x8] (root (false_value)),r8
0x1fafde0d1c04    44  0f8438000000   jz 0x1fafde0d1c42  <+0x82>

However, these branch instructions would never be our performance bottleneck thanks to complex branch prediction in our modern CPU. I am happy to rewrite into more common code style as long as you find out. 😀

@JLHwung JLHwung force-pushed the helper-create-regexp-features-plugin branch 2 times, most recently from ffa7989 to db67055 Compare September 19, 2019 23:39
// as 70000100005. This method is easier than using a semver-parsing
// package, but it breaks if we release x.y.z where x, y or z are
// greater than 99_999.
const version = pkg.version.split(".").reduce((v, x) => v * 1e5 + +x, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I think it's perfectly fine using semver to parse (since we use it elsewhere)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's my fault 😛 I used this logic for class features

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is okay to leave it as is because all we need is a version sorter and we are not using semver elsewhere.

@@ -10,23 +10,9 @@ export default declare((api, options) => {
throw new Error(".useUnicodeFlag must be a boolean, or undefined");
}

return {
return createRegExpFeaturePlugin({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯, clean!

Copy link
Member

@existentialism existentialism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

@nicolo-ribaudo nicolo-ribaudo added the PR: Ready to be Merged A pull request with already two approvals, but waiting for the next minor release label Oct 3, 2019
@JLHwung JLHwung force-pushed the helper-create-regexp-features-plugin branch from e8b960c to 7fbd168 Compare October 24, 2019 15:10
@JLHwung
Copy link
Contributor Author

JLHwung commented Oct 24, 2019

@existentialism @nicolo-ribaudo A new fix 7fbd168 has been pushed since you reviewed. In this fix we apply dotAllFlag when flags includes u and we have to apply rewritePattern.

Note that this approach is not ideal. I will propose a useDotAllFlag to regexpu-core so we can preserve the s flag if babel is not instructed to transform them. And we can expect subsequent fixes coming in v7.7.x.

@nicolo-ribaudo nicolo-ribaudo merged commit 8ffca04 into babel:master Oct 29, 2019

> Compile ESNext Regular Expressions to ES5

See our website [@babel/helper-create-regexp-features-plugin](https://babeljs.io/docs/en/next/babel-helper-create-regexp-features-plugin.html) for more information.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This link is dead. Is there supposed to be anything here?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
outdated A closed issue/PR that is archived due to age. Recommended to make a new issue PR: Internal 🏠 A type of pull request used for our changelog categories PR: Ready to be Merged A pull request with already two approvals, but waiting for the next minor release
Projects
None yet
5 participants