Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix var/const/let variable use before declaration #4148

Merged
merged 15 commits into from Jul 7, 2021
Merged

Fix var/const/let variable use before declaration #4148

merged 15 commits into from Jul 7, 2021

Conversation

kzc
Copy link
Contributor

@kzc kzc commented Jun 25, 2021

  • Also retains TDZ violations present in input
  • Enabled by default when treeshake is active
  • Low overhead - uses existing treeshaking simulated execution to mark declarations as reached

This PR contains:

  • bugfix
  • feature
  • refactor
  • documentation
  • other

Are tests included?

  • yes (bugfixes and features will not be merged without tests)
  • no

Breaking Changes?

  • yes (breaking changes will not be merged unless absolutely necessary)
  • no

List any relevant issue numbers:

resolves #4101

Description

Retains code that reads const/let/var variables before their declarations to preserve the original behavior of the rollup input. This PR handles all known issues in #4101.

@github-actions
Copy link

github-actions bot commented Jun 25, 2021

Thank you for your contribution! ❤️

You can try out this pull request locally by installing Rollup via

npm install kzc/rollup#fix-var-use-before-decl-and-tdz

or load it into the REPL:
https://rollupjs.org/repl/?pr=4148

@codecov
Copy link

codecov bot commented Jun 25, 2021

Codecov Report

Merging #4148 (8069a36) into master (b2218cc) will increase coverage by 0.02%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4148      +/-   ##
==========================================
+ Coverage   98.29%   98.31%   +0.02%     
==========================================
  Files         201      201              
  Lines        7149     7194      +45     
  Branches     2093     2108      +15     
==========================================
+ Hits         7027     7073      +46     
  Misses         58       58              
+ Partials       64       63       -1     
Impacted Files Coverage Δ
src/ast/scopes/BlockScope.ts 100.00% <ø> (ø)
src/Graph.ts 100.00% <100.00%> (ø)
src/Module.ts 100.00% <100.00%> (ø)
src/ast/nodes/ArrayPattern.ts 90.00% <100.00%> (+1.76%) ⬆️
src/ast/nodes/AssignmentPattern.ts 100.00% <100.00%> (ø)
src/ast/nodes/CallExpression.ts 100.00% <100.00%> (ø)
src/ast/nodes/Identifier.ts 100.00% <100.00%> (ø)
src/ast/nodes/ObjectPattern.ts 88.23% <100.00%> (+1.56%) ⬆️
src/ast/nodes/Property.ts 100.00% <100.00%> (ø)
src/ast/nodes/RestElement.ts 100.00% <100.00%> (ø)
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b2218cc...8069a36. Read the comment docs.

@kzc
Copy link
Contributor Author

kzc commented Jun 25, 2021

Not much can be done about this...

  • codecov/project — 98.23% (-0.05%)

  • Tests / Node 12 (Windows):

npm WARN tarball tarball data for typescript@4.3.2 (sha512-zZ4hShnmnoVnAHpVHWpTcxdv7dWP60S2FsydQLV8V5PbS3FifjWFFRiHSWpDJahly88PRyV5teTSLoq4eG7mKw==) seems to be corrupted.

@kzc kzc closed this Jun 25, 2021
@kzc kzc reopened this Jun 25, 2021
@kzc
Copy link
Contributor Author

kzc commented Jun 25, 2021

I couldn't examine the Codecov report in any event:

Unauthorized

Github API rate limit error: Forbidden Check on Codecov's status or see our docs for common support.

Error 403

@kzc
Copy link
Contributor Author

kzc commented Jun 26, 2021

@lukastaegert This PR successfully runs the tests from #4149 without treeshake.correctVarValueBeforeDeclaration being set.

codecov/project — 98.23% (-0.05%) cannot be resolved without adding unrelated coverage tests outside the scope of this PR.

Copy link
Member

@lukastaegert lukastaegert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, this is really looking very promising. I added some notes as I think especially the pattern handling could be improved.
Also there is one thing bothering me, but maybe this is not actually an issue:
Technically, there can be many tree-shaking passes. The reason is that e.g. when a declaration is first encountered, it is not included as it is unused at that point. Then later we use the declaration, which will include the corresponding variable and trigger another pass, at which point the declaration is then included.
During the second pass, however, the initReached values will be wrong as they retain the values at the end of the first pass.
The reason this is likely not an issue is that the TDZ values are cached anyway the first time an identifier is encountered. But I am not sure I am overlooking something, will need to think a little more about this. But if it should turn out to be an issue anyway, it would be solvable e.g. by putting a Set on the HasEffectsContext and InclusionContext to track which variables are initialized and use a fresh Set for each pass. But as I said, maybe we really do not have an issue here.

src/ast/nodes/ArrayPattern.ts Outdated Show resolved Hide resolved
src/ast/nodes/Identifier.ts Show resolved Hide resolved
src/ast/nodes/Identifier.ts Outdated Show resolved Hide resolved
src/ast/nodes/Identifier.ts Show resolved Hide resolved
src/ast/nodes/Identifier.ts Outdated Show resolved Hide resolved
src/ast/nodes/Identifier.ts Outdated Show resolved Hide resolved
@kzc
Copy link
Contributor Author

kzc commented Jun 28, 2021

@lukastaegert Please feel free to modify this PR to your liking. Unfortunately I don't have the time to work on it further. I believe you have a good understanding of how it works and the remaining unresolved issues that are carried over from the previous rollup implementation. I would ask that whatever form this PR takes, that its functionality is the default treeshaking behavior and not be put behind an optional flag.

@kzc
Copy link
Contributor Author

kzc commented Jun 28, 2021

I think it's worth noting why the regression in test/form/samples/simplify-return-expression/main.js happened in its original form with this PR. If that source module were to be loaded by a JS engine it would not perform any work other than setting some top level variables (test, foo, bar, getBuild) to their respective arrow functions, and then setting BUILD to a constant literal - no functions are called. But because rollup include()d the exported function test in order of appearance in the source file, this PR believes that accessing BUILD before its declaration is a TDZ violation, which pessimizes and retains that variable. The reason that BUILD does not result in a TDZ in practise in a JS engine is due to the fact that it is a deferred operation, since the exported function in question, test, is not actually run. The distinction is subtle, but it makes a difference. You will notice that the IIFE version of the same code that calls test() at the end of IIFE does in fact drop BUILD. Also, had that original module been imported and test called then BUILD would also be dropped. However, it may be possible to tweak the include algorithm to defer the inclusion of the exported functions to the end of the module, thus not triggering the TDZ, and achieve the tree shaking of BUILD.

src/ast/nodes/Identifier.ts Outdated Show resolved Hide resolved
@lukastaegert
Copy link
Member

The reason that BUILD does not result in a TDZ in practise in a JS engine is due to the fact that it is a deferred operation, since the exported function in question, test, is not actually run. The distinction is subtle, but it makes a difference

This analysis is spot on. I added a fix that defers including entry exports after the first tree-shaking pass that indeed fixes this.

kzc and others added 7 commits July 2, 2021 06:18
* Also retains TDZ violations present in input
* Enabled by default when treeshake is active
* Low overhead - uses existing treeshaking simulated
  execution to mark declarations as reached
without treeshake.correctVarBeforeDeclaration
@kzc
Copy link
Contributor Author

kzc commented Jul 2, 2021

Good stuff - and such a simple fix.

What's left to do? This PR already fixes many longstanding tree shaking and ES correctness issues.

@lukastaegert
Copy link
Member

I want to have a look at this one #4148 (comment)
and also deprecate the correctVarValueBeforeDeclaration option properly.

@lukastaegert
Copy link
Member

This is mostly done, I am currently looking into some edge cases. So looking at the coverage misses, there is indeed an interesting deoptimization behind this: https://rollupjs.org/repl/?pr=4148&shareable=JTdCJTIybW9kdWxlcyUyMiUzQSU1QiU3QiUyMm5hbWUlMjIlM0ElMjJtYWluLmpzJTIyJTJDJTIyY29kZSUyMiUzQSUyMmZ1bmN0aW9uJTIwZm9vKCklMjAlN0IlNUNuJTVDdGNvbnN0JTIwcmVzdWx0JTIwJTNEJTIwZmFsc2UlM0IlNUNuJTVDdHJldHVybiUyMHJlc3VsdCUzQiU1Q24lN0QlM0IlNUNuJTVDbiUyRiUyRiUyMFRoZSUyMGxpbmUlMjBiZWxvdyUyMGlzJTIwZGVvcHRpbWl6aW5nJTIwZXZlcnl0aGluZyU1Q25jb25zb2xlLmxvZyhmb28oKSUyMCU3QyU3QyUyMHRydWUpJTNCJTVDbiU1Q25pZiUyMChmb28oKSUyMCU3QyU3QyUyMHRydWUpJTIwY29uc29sZS5sb2coJ3JldGFpbmVkJyklM0IlNUNuZWxzZSUyMGNvbnNvbGUubG9nKCdzaG91bGQlMjBiZSUyMHJlbW92ZWQlMjBidXQlMjBpcyUyMG5vdCcpJTNCJTVDbiU1Q25mdW5jdGlvbiUyMGJhcigpJTIwJTdCJTVDbiU1Q3Rjb25zdCUyMHJlc3VsdCUyMCUzRCUyMGZhbHNlJTNCJTVDbiU1Q3RyZXR1cm4lMjByZXN1bHQlM0IlNUNuJTdEJTNCJTVDbiU1Q25pZiUyMChiYXIoKSUyMCU3QyU3QyUyMHRydWUpJTIwY29uc29sZS5sb2coJ3JldGFpbmVkJyklM0IlNUNuZWxzZSUyMGNvbnNvbGUubG9nKCdyZW1vdmVkJyklM0IlMjIlMkMlMjJpc0VudHJ5JTIyJTNBdHJ1ZSU3RCU1RCUyQyUyMm9wdGlvbnMlMjIlM0ElN0IlMjJmb3JtYXQlMjIlM0ElMjJlcyUyMiUyQyUyMm5hbWUlMjIlM0ElMjJteUJ1bmRsZSUyMiUyQyUyMmFtZCUyMiUzQSU3QiUyMmlkJTIyJTNBJTIyJTIyJTdEJTJDJTIyZ2xvYmFscyUyMiUzQSU3QiU3RCU3RCUyQyUyMmV4YW1wbGUlMjIlM0FudWxsJTdE

Without the patch, the logical expression would have been simplified for both foo and bar. But with the patch, it first tries to deoptimize the arguments of the console.log call, and during this deoptimization it thinks that the variable result in the function call is in a TDZ position (because the function was not checked at this point; to be precise, this happens when the logical expression checks for the literal value of its left side). This value is then stored with the identifier.

@kzc
Copy link
Contributor Author

kzc commented Jul 3, 2021

At least the bug example above is not a regression. The same erroneous empty output is produced with rollup v2.52.7 as well. Something else appears to be amiss.

This PR may not handle every TDZ scenario correctly, but it's a big improvement in any case. Detecting all TDZ issues is not possible without actually executing the code.

I'll take a look at these examples tomorrow if you haven't already figured out a solution by that time.

@kzc
Copy link
Contributor Author

kzc commented Jul 3, 2021

The following patch applied to the latest PR commit will fix the bug in #4148 (comment) and pass npm test:

--- a/src/ast/nodes/Identifier.ts
+++ b/src/ast/nodes/Identifier.ts
@@ -239,6 +239,20 @@ export default class Identifier extends NodeBase implements PatternNode {
                        return (this.isTDZAccess = false);
                }
 
+               var decl_id;
+               if (
+                       this.variable.kind === 'let' &&
+                       this.variable.declarations &&
+                       this.variable.declarations.length === 1 &&
+                       (decl_id = this.variable.declarations[0] as any) &&
+                       this.start < decl_id.start &&
+                       closestParentFunctionOrProgram(this) === closestParentFunctionOrProgram(decl_id)
+               ) {
+                       // a `let` variable accessed before its declaration
+                       // in the same function or at top level of module
+                       return (this.isTDZAccess = true);
+               }
+
                if (!this.variable.initReached) {
                        // Either a const/let TDZ violation or
                        // var use before declaration was encountered.
@@ -248,3 +262,11 @@ export default class Identifier extends NodeBase implements PatternNode {
                return (this.isTDZAccess = false);
        }
 }
+
+function closestParentFunctionOrProgram(node: any): any {
+       while (node && !/^Program|Function/.test(node.type)) {
+               node = node.parent;
+       }
+       // One of: ArrowFunctionExpression, FunctionDeclaration, FunctionExpression or Program.
+       return node;
+}

I have not investigated the working but suboptimal test case in #4148 (comment). There are bound to be edge cases, but as long as they are infrequent and produce the correct result they can be tolerated until a future PR fix can be devised.

docs/999-big-list-of-options.md Outdated Show resolved Hide resolved
@kzc
Copy link
Contributor Author

kzc commented Jul 4, 2021

This patch fixes the deoptimization in #4148 (comment) without any npm test failures:

--- a/src/ast/nodes/CallExpression.ts
+++ b/src/ast/nodes/CallExpression.ts
@@ -178,15 +178,18 @@ export default class CallExpression extends NodeBase implements DeoptimizableEnt
        hasEffects(context: HasEffectsContext): boolean {
-               if (!this.deoptimized) this.applyDeoptimizations();
-               for (const argument of this.arguments) {
-                       if (argument.hasEffects(context)) return true;
+               try {
+                       for (const argument of this.arguments) {
+                               if (argument.hasEffects(context)) return true;
+                       }
+                       if (
+                               (this.context.options.treeshake as NormalizedTreeshakingOptions).annotations &&
+                               this.annotations
+                       )
+                               return false;
+                       return (
+                               this.callee.hasEffects(context) ||
+                               this.callee.hasEffectsWhenCalledAtPath(EMPTY_PATH, this.callOptions, context)
+                       );
+               } finally {
+                       if (!this.deoptimized) this.applyDeoptimizations();
                }
-               if (
-                       (this.context.options.treeshake as NormalizedTreeshakingOptions).annotations &&
-                       this.annotations
-               )
-                       return false;
-               return (
-                       this.callee.hasEffects(context) ||
-                       this.callee.hasEffectsWhenCalledAtPath(EMPTY_PATH, this.callOptions, context)
-               );
        }

I used try/finally because it was the easiest way to defer this.applyDeoptimizations() without thinking of the return statement code flow. It can be manually expanded of course with the risk of introducing mechanical errors - if you think it's more efficiently handled by the V8 engine.

Other AST classes should be examined for similar suboptimal code due to child node hasEffects* and applyDeoptimizations ordering. NewExpression would probably have the same issue.

@kzc
Copy link
Contributor Author

kzc commented Jul 4, 2021

This is the same patch as above but it's easier to read using git diff -w:

--- a/src/ast/nodes/CallExpression.ts
+++ b/src/ast/nodes/CallExpression.ts
@@ -176,7 +176,7 @@ export default class CallExpression extends NodeBase implements DeoptimizableEnt
        }
 
        hasEffects(context: HasEffectsContext): boolean {
-               if (!this.deoptimized) this.applyDeoptimizations();
+               try {
                        for (const argument of this.arguments) {
                                if (argument.hasEffects(context)) return true;
                        }
@@ -189,6 +189,9 @@ export default class CallExpression extends NodeBase implements DeoptimizableEnt
                                this.callee.hasEffects(context) ||
                                this.callee.hasEffectsWhenCalledAtPath(EMPTY_PATH, this.callOptions, context)
                        );
+               } finally {
+                       if (!this.deoptimized) this.applyDeoptimizations();
+               }
        }
 
        hasEffectsWhenAccessedAtPath(path: ObjectPath, context: HasEffectsContext): boolean {

Copy link
Member

@lukastaegert lukastaegert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added your fix for the CallExpression regression but did not add the other fix as I find it does not really fix the root cause but just my constructed example, and there could be quite a few variations that are not yet fixed. I would rather go for a separate approach to these ones. Which is why I undeprecated the correctVarValueBeforeDeclaration for now.
Still, I think this could be released for now as a definitive improvement.

@kzc
Copy link
Contributor Author

kzc commented Jul 6, 2021

but did not add the other fix as I find it does not really fix the root cause but just my constructed example, and there could be quite a few variations that are not yet fixed

That's unfortunate. It's a very simple and reliable approach to fix such bugs - any use of a let variable before its declaration within the exact same function (or both in top level) and not crossing function boundaries ought to always be a TDZ.

Can you post some examples of variations that are not fixed by #4148 (comment) ?

@kzc
Copy link
Contributor Author

kzc commented Jul 6, 2021

Which is why I undeprecated the correctVarValueBeforeDeclaration for now.

I suppose there's no harm in keeping this around as a non-default treeshake option, but as of this latest PR, I'm not aware of any rollup input producing a wrong result for var-use-before-declaration. The patch not applied is a let TDZ fix - not related to var.

@lukastaegert
Copy link
Member

lukastaegert commented Jul 7, 2021

Can you post some examples of variations that are not fixed?

I think we should fix these issues, but I also think we should reflect a little deeper on the how. As we already fixed a lot of issues, I will release this now.
My feeling is that a "proper" solution would not assign fixed flags to variables, but rather tracks on the context what variables are initialized and also "forgets" this at the end of a function call. But of course, such a solution would be much more involved. On the other hand, this might also allow us to remove the blanket deoptimization we currently have for var declarations in nested block scopes:
https://rollupjs.org/repl/?pr=4148&shareable=JTdCJTIybW9kdWxlcyUyMiUzQSU1QiU3QiUyMm5hbWUlMjIlM0ElMjJtYWluLmpzJTIyJTJDJTIyY29kZSUyMiUzQSUyMiU3QiU1Q24lMjAlMjB2YXIlMjBuZXN0ZWQlMjAlM0QlMjBmYWxzZSUzQiU1Q24lNUN0aWYlMjAobmVzdGVkKSUyMGNvbnNvbGUubG9nKCd0aGlzJTIwY291bGQlMjBiZSUyMHJlbW92ZWQnKSUzQiU1Q24lN0QlMjIlMkMlMjJpc0VudHJ5JTIyJTNBdHJ1ZSU3RCU1RCUyQyUyMm9wdGlvbnMlMjIlM0ElN0IlMjJmb3JtYXQlMjIlM0ElMjJlcyUyMiUyQyUyMm5hbWUlMjIlM0ElMjJteUJ1bmRsZSUyMiUyQyUyMmFtZCUyMiUzQSU3QiUyMmlkJTIyJTNBJTIyJTIyJTdEJTJDJTIyZ2xvYmFscyUyMiUzQSU3QiU3RCU3RCUyQyUyMmV4YW1wbGUlMjIlM0FudWxsJTdE
This exists because otherwise, we might fail situations like this:
https://rollupjs.org/repl/?pr=4148&shareable=JTdCJTIybW9kdWxlcyUyMiUzQSU1QiU3QiUyMm5hbWUlMjIlM0ElMjJtYWluLmpzJTIyJTJDJTIyY29kZSUyMiUzQSUyMnN3aXRjaCUyMCh1bmtub3duKSUyMCU3QiU1Q24lNUN0Y2FzZSUyMGZpcnN0JTNBJTVDbiU1Q3QlNUN0dmFyJTIweCUyMCUzRCUyMHRydWUlM0IlNUNuJTVDdGNhc2UlMjBzZWNvbmQlM0ElNUNuJTVDdCU1Q3RpZiUyMCh4KSUyMGNvbnNvbGUubG9nKCdvbmx5JTIwaWYlMjAlNUMlMjJmaXJzdCU1QyUyMiUyMHdhcyUyMHVzZWQlMkMlMjBibGFua2V0JTIwZGVvcHRpbWl6YXRpb24lMjBzYXZlZCUyMHVzJTIwaGVyZScpJTNCJTVDbiU3RCUyMiUyQyUyMmlzRW50cnklMjIlM0F0cnVlJTdEJTVEJTJDJTIyb3B0aW9ucyUyMiUzQSU3QiUyMmZvcm1hdCUyMiUzQSUyMmVzJTIyJTJDJTIybmFtZSUyMiUzQSUyMm15QnVuZGxlJTIyJTJDJTIyYW1kJTIyJTNBJTdCJTIyaWQlMjIlM0ElMjIlMjIlN0QlMkMlMjJnbG9iYWxzJTIyJTNBJTdCJTdEJTdEJTJDJTIyZXhhbXBsZSUyMiUzQW51bGwlN0Q=
And you can also constructor similar situations with if statements.
And, depending how it is done, such a solution could potentially be extended to do some limited value tracking for mutable variables (provided we can make sure we handle things like callbacks in a safe way).

@lukastaegert lukastaegert merged commit 752d4fe into rollup:master Jul 7, 2021
@kzc
Copy link
Contributor Author

kzc commented Jul 7, 2021

This PR is a net improvement over the previous release, which is good. But the thing is that the PR only exists because some of rollup's optimization assumptions regarding init expressions were incorrect. So future fixes would involve work from both sides of the issue - questioning whether some optimizations are tracking variable state changes correctly in the first place, and adding variable deoptimizations when it cannot be proven that an access can be safely elided.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

var use before defined incorrect, let/const TDZ errors masked by tree shaking
2 participants