Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Class method effects #4018

Merged
merged 56 commits into from May 23, 2021
Merged

Class method effects #4018

merged 56 commits into from May 23, 2021

Conversation

marijnh
Copy link
Contributor

@marijnh marijnh commented Mar 26, 2021

This PR contains:

  • bugfix
  • feature
  • refactor
  • documentation
  • other

Are tests included?

  • yes (bugfixes and features will not be merged without tests)
  • no

Breaking Changes?

  • yes (breaking changes will not be merged unless absolutely necessary)
  • no

List any relevant issue numbers:
Resolves #3989
Resolves #3250
Resolves #4016

Updated Description

To try out this PR before it is released, use npm install rollup@beta.

Core Changes

After being taken over, this PR has been very much extended beyond the original scope. Under the hood, two core changes have been introduced:

  • All AST nodes, variables and virtual entities now inherit from a basic ExpressionEntity that follows a "conservative by default" approach. I.e. a node that does not implement any methods of its own will behave identical to an "unknown expression" meaning any call, assignment or even access to this node is a side-effect and we treat it as an "unknown value" in conditionals.
  • There is now a shared ObjectEntity that implements object-like behaviour with regard to property access that is used by object literals, array literals, classes and virtual object and arrays generated by builtin methods. In the future, I hope to extend this to functions as well. This means that improving the behaviour of objects means just working on the ObjectEntity.
  • All deoptimization now happens lazily, e.g. assignments in dead code will no longer mess up the analysis.

New Features

  • Static-effect free methods and properties of classes can now be detected: Updated REPL, v2.48.0 REPL
  • This also extends to super classes. To that end, we included a concept of prototypes into the new ObjectEntity Updated REPL, v2.48.0 REPL
  • Side-effect free array elements can now be detected as well Updated REPL, v2.48.0 REPL
  • Getter and setter handling has been greatly expanded. The new "conservative by default" approach will even handle getters that are created within unknown functions Updated REPL, v2.48.0 REPL
  • Rollup is now able to precisely track mutations in methods, getters and setters via their "this" context Updated REPL, v2.48.0 REPL
  • Handling for builtin array methods has been extended to track mutations of the original array through methods that return the original array again Updated REPL, v2.48.0 REPL

Possible extensions for future PRs:

  • Also use ObjectEntity for functions and arrow functions. For now, those will deoptimize completely when properties are assigned.
  • Handle assignments to __proto__ and uses of Object.setPrototypeOf/Object.getPrototypeOf. In some situations, mutations may still not be detected here. This is a tricky one as technically, passing an object to an unknown method could mutate the object prototype. We will need to think what assumptions we keep in the system, e.g. "builtins always behave as expected". Maybe we add a flag here.
  • Track function parameter mutations the same way we track "this" mutations. This will be an important one.

Original Description (outdated)

Track static class fields and improve handling of class getters/setters

This aims to improve tree-shaking of code that uses static class properties (#3989) and to improve detection of side effects through class getters/setters (#4016).

The first part works by keeping a map of positively known static properties (methods and simple getters) in
ClassNode.staticPropertyMap, along with a flag (ClassNode.deoptimizedStatic) that indicates that something happened that removed our confidence that we know anything about the class object.

Access and calls to these known static properties are handled by routing the calls to getLiteralValueAtPath, getReturnExpressionWhenCalledAtPath, and hasEffectsWhenCalledAtPath to the known values in the properties. In contrast to ObjectExpression, this class does not try to keep track of multiple expressions associated with a property, since that doesn't come up a lot on classes.

The handling of side effect detection through getters and setters is done by, if the entire class object (or its prototype in case of access to the prototype) hasn't been deoptimized, scanning through the directly defined getters and setters to see if one exists (calling through to superclasses as appropriate). I believe that this is solid because any code that would be able to change the set of getters and setters on a class would cause the entire object to be deoptimized.

@codecov
Copy link

codecov bot commented Mar 26, 2021

Codecov Report

Merging #4018 (7490a87) into master (07b3a02) will increase coverage by 0.51%.
The diff coverage is 96.65%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4018      +/-   ##
==========================================
+ Coverage   97.49%   98.00%   +0.51%     
==========================================
  Files         193      201       +8     
  Lines        6824     6981     +157     
  Branches     2005     2049      +44     
==========================================
+ Hits         6653     6842     +189     
+ Misses         84       66      -18     
+ Partials       87       73      -14     
Impacted Files Coverage Δ
cli/run/watch-cli.ts 92.00% <ø> (-0.11%) ⬇️
src/ast/CallOptions.ts 100.00% <ø> (ø)
src/ast/nodes/ArrayPattern.ts 88.23% <ø> (ø)
src/ast/nodes/ArrowFunctionExpression.ts 100.00% <ø> (ø)
src/ast/nodes/BinaryExpression.ts 97.22% <ø> (ø)
src/ast/nodes/BlockStatement.ts 100.00% <ø> (ø)
src/ast/nodes/CatchClause.ts 100.00% <ø> (ø)
src/ast/nodes/IfStatement.ts 100.00% <ø> (ø)
src/ast/nodes/ImportDeclaration.ts 100.00% <ø> (ø)
src/ast/nodes/ReturnStatement.ts 100.00% <ø> (ø)
... and 75 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 07b3a02...7490a87. Read the comment docs.

@marijnh
Copy link
Contributor Author

marijnh commented Mar 26, 2021

#4011 was merged. I'll tweak tests to improve coverage and push again.

So that it sits in one place and is easier to extend.
This aims to improve tree-shaking of code that uses static class
properties (rollup#3989) and to improve detection of side effects through
class getters/setters (rollup#4016).

The first part works by keeping a map of positively known static
properties (methods and simple getters) in
`ClassNode.staticPropertyMap`, along with a flag
(`ClassNode.deoptimizedStatic`) that indicates that something happened
that removed our confidence that we know anything about the class
object.

Access and calls to these known static properties are handled by
routing the calls to `getLiteralValueAtPath`,
`getReturnExpressionWhenCalledAtPath`, and
`hasEffectsWhenCalledAtPath` to the known values in the properties. In
contrast to `ObjectExpression`, this class does not try to keep track
of multiple expressions associated with a property, since that doesn't
come up a lot on classes.

The handling of side effect detection through getters and setters is
done by, _if_ the entire class object (or its prototype in case of
access to the prototype) hasn't been deoptimized, scanning through the
directly defined getters and setters to see if one exists (calling
through to superclasses as appropriate). I believe that this is solid
because any code that would be able to change the set of getters and
setters on a class would cause the entire object to be deoptimized.
@marijnh
Copy link
Contributor Author

marijnh commented Mar 26, 2021

I was unable to create tests that run deoptimizeCache on a ClassNode. It also seems like, at least in this context, hasEffectsWhenAssignedAtPath is not called when hasEffectsWhenAccessedAtPath returns true for a given assignment. Thus there are a bunch of non-covered lines in the new code. Let me know if I can do something reasonable about them.

@marijnh marijnh changed the title Class method effects [do not merge] Class method effects Mar 26, 2021

const propertyMap = this.staticPropertyMap = Object.create(null);
const seen: {[name: string]: boolean} = Object.create(null);
for (const definition of this.body.body) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be done by building an up-front lookup table similar to how we do this for object expression? For classes with many definitions, this will scale badly as mayModifyThisWhenCalledAtPath is called during tree-shaking with potentially many passes (dozens in larger code bases) with potentially many calls per pass. This is a highly performance critical code path.
Maybe some code can be even extracted from/shared with object expression. The object expression logic is indeed very powerful as it will also pre-resolve computed properties if possible, handle quoted properties, handle multiple definitions of the same name and handle unknown computed properties that might overwrite previous properties.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided against sharing code with ObjectExpression because there's a lot of small differences between how this works and how objects work (in syntax tree shape, handling of builtin properties, static/prototype distinction, and so on) and the helper functions would become monsters with a list of boolean parameters, and I prefer some vague duplication over that most of the time.

Building up a table is probably a good idea. Looking into that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, the thing you're commenting on here is building up a table. The suggestions holds for mayHaveGetterSetterEffect, but, as far as I can see, only there.

kind: 'get' | 'set', isStatic: boolean, name: string,
context: HasEffectsContext
) {
for (const definition of body.body) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again this could benefit from a pre-built lookup table.

}

deoptimizeCache() {
this.deoptimizeAllStatics();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not believe this is necessary unless you rely on retrieving literal values for computed properties. That is also why you cannot test it.

So it works like this:
When you perform a getLiteralValueAtPath or a getReturnExpressionWhenCalledAtPath you pass a DeoptimizableEntity as the last parameter (which is usually the Node requesting the literal/return value). The reason is that it is possible a literal/return value might be deoptimized at a later time when binding/tree-shaking progresses. If that happens, then some Node down the chain that is responsible for the deoptimization will call deoptimizeCache on this entity. At this point in time, the entity is responsible for "forgetting" the cached value and also deoptimize the cache of all dependent entities that may rely on that value.

Admittedly it is kind of difficult to describe such deoptimization scenarios. Ideally I would just put a breakpoint into "deoptimizeCache" of ObjectExpression and see what test triggers it.

This will become even more important in the future as one goal is to move more and more of the initial deoptimizations to "on demand late deoptimization" i.e. only deoptimize when as reassignment is included.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not believe this is necessary unless you rely on retrieving literal values for computed properties.

You mean this is only necessary for nodes that themselves retrieve literal values? Do I understand correctly that the call-throughs to definition nodes in the class' own getLiteralValueAtPath and getReturnExpressionWhenCalledAtPath don't count because they just pass through their argument instead of passing themselves?

@marijnh
Copy link
Contributor Author

marijnh commented Mar 27, 2021

Attached patches drop the deoptimizeCache method and optimize getter/setter lookup with a table.

@marijnh
Copy link
Contributor Author

marijnh commented Mar 28, 2021

Rebased onto current master branch and added the new argument to mayModifyThisWhenCalledAtPath.

@lukastaegert
Copy link
Member

Sorry the review takes some time, but I want to test it thoroughly. It is very capable, here are some of my findings so far (The REPL is using your PR via its CircleCI build number):

Reassigning the prototype of a class throws in strict mode, which means it should likely be retained even if the class is not used (but I do not think this is critical):
https://rollupjs.org/repl/?circleci=14744&shareable=JTdCJTIybW9kdWxlcyUyMiUzQSU1QiU3QiUyMm5hbWUlMjIlM0ElMjJtYWluLmpzJTIyJTJDJTIyY29kZSUyMiUzQSUyMmNsYXNzJTIwRm9vJTIwJTdCJTdEJTVDbiUyRiUyRiUyMFRoaXMlMjB3b3VsZCUyMHRocm93JTIwaW4lMjBhJTIwbW9kdWxlJTVDbkZvby5wcm90b3R5cGUlMjAlM0QlMjBudWxsJTNCJTIyJTJDJTIyaXNFbnRyeSUyMiUzQXRydWUlN0QlNUQlMkMlMjJvcHRpb25zJTIyJTNBJTdCJTIyZm9ybWF0JTIyJTNBJTIyZXMlMjIlMkMlMjJuYW1lJTIyJTNBJTIybXlCdW5kbGUlMjIlMkMlMjJhbWQlMjIlM0ElN0IlMjJpZCUyMiUzQSUyMiUyMiU3RCUyQyUyMmdsb2JhbHMlMjIlM0ElN0IlN0QlN0QlMkMlMjJleGFtcGxlJTIyJTNBbnVsbCU3RA==

While the logic is able to trace values through class properties, it still thinks reading those values has a side-effect. I would think this can be fixed by relaying the check to the actual property method definition. Of course it would mean extending PropertyDefinition to support that.
https://rollupjs.org/repl/?circleci=14744&shareable=JTdCJTIybW9kdWxlcyUyMiUzQSU1QiU3QiUyMm5hbWUlMjIlM0ElMjJtYWluLmpzJTIyJTJDJTIyY29kZSUyMiUzQSUyMmNsYXNzJTIwRm9vJTIwJTdCJTVDbiUyMCUyMHN0YXRpYyUyMGZvbyUyMCUzRCUyMCU3QmJhciUzQSUyMHRydWUlN0QlM0IlNUNuJTdEJTVDbiU1Q24lMkYlMkYlMjBUaGVyZSUyMGlzJTIwbm8lMjByZWFzb24lMjB0byUyMHJldGFpbiUyMHRoZSUyMGNvbmRpdGlvbiUyMGhlcmUlNUNuaWYlMjAoRm9vLmZvby5iYXIpJTIwY29uc29sZS5sb2coJ3ZhbHVlJTIwaXMlMjB0cnVlJyklNUNuZWxzZSUyMGNvbnNvbGUubG9nKCd2YWx1ZSUyMGlzJTIwZmFsc2UnKSUzQiU1Q24lNUNuJTJGJTJGJTIwT3IlMjB0aGUlMjByaWdodC1oYW5kJTIwc2lkZSUyMGhlcmUlNUNuY29uc3QlMjB1bnVzZWQxJTIwJTNEJTIwRm9vLmZvby5wcm9wZXJ0eSUzQiU1Q24lNUNuY2xhc3MlMjBCYXIlMjAlN0IlNUNuJTIwJTIwc3RhdGljJTIwZm9vJTIwJTNEJTIwJTdCYmFyJTNBJTIwdHJ1ZSU3RCUzQiU1Q24lN0QlNUNuJTVDbkJhci5mb28lMjAlM0QlMjBudWxsJTNCJTVDbiU1Q24lMkYlMkYlMjBIZXJlJTIwdGhlJTIwcmlnaHQtaGFuZCUyMHNpZGUlMjBzaG91bGQlMjBiZSUyMHJldGFpbmVkJTVDbmNvbnN0JTIwdW51c2VkMiUyMCUzRCUyMEJhci5mb28ucHJvcGVydHklM0IlMjIlMkMlMjJpc0VudHJ5JTIyJTNBdHJ1ZSU3RCU1RCUyQyUyMm9wdGlvbnMlMjIlM0ElN0IlMjJmb3JtYXQlMjIlM0ElMjJlcyUyMiUyQyUyMm5hbWUlMjIlM0ElMjJteUJ1bmRsZSUyMiUyQyUyMmFtZCUyMiUzQSU3QiUyMmlkJTIyJTNBJTIyJTIyJTdEJTJDJTIyZ2xvYmFscyUyMiUzQSU3QiU3RCU3RCUyQyUyMmV4YW1wbGUlMjIlM0FudWxsJTdE

In general with regard to getter/setter handling, I wonder if we could not build a single table to handle everything just like ObjectExpression does it? I know it moves complexity a little, but it also makes it easier to compare the logic of the two, and it might even be faster as it only does a single iteration (if you think the logic in ObjectExpression could be made easier to understand e.g. by choosing better names or a better table layout, that would be welcome as well).

Will continue later.

@marijnh
Copy link
Contributor Author

marijnh commented Mar 31, 2021

In general with regard to getter/setter handling, I wonder if we could not build a single table to handle everything just like ObjectExpression does it?

I'm open to that, but there's so much going on there that I couldn't figure out what the system used there works like. If you could do a quick rundown of that logic, I'll take a look.

@github-actions
Copy link

github-actions bot commented Apr 2, 2021

Thank you for your contribution! ❤️

You can try out this pull request locally by installing Rollup via

npm install marijnh/rollup#class-method-effects

or load it into the REPL:
https://rollupjs.org/repl/?pr=4018

@lukastaegert
Copy link
Member

As I often find myself looking up the logic again, I pushed a comment explaining the logic to ObjectExpression on your branch. I hope this helps. If you have an idea for better names or a better algorithm, let me know.

As a matter of fact, I just discovered that the implementation has a bug because apparently

var obj = {
  foo: 'value',
  set foo(value) {}
}
console.log(obj.foo);

will not log value but undefined (but it does not throw! Even if the first property were missing). Only if the first property is a getter it actually logs value. Not yet sure how to adapt the algorithm to accommodate that. If there is a regular property followed by a getter it works correctly though. In that case, writing the property does not throw, but the value is lost anyway.

@marijnh
Copy link
Contributor Author

marijnh commented Apr 6, 2021

Thanks for those clarifications. Unfortunately, I still find the ObjectExpression property tracking very daunting, with lots of subtle corner cases, and I'm just not confident I could reproduce (or share) that logic in the class property tracking without introducing a bunch of new bugs.

And since I also found out that, even with the extensions proposed in this PR, Rollup still won't remove the calls that motivated this work (due to dereferencing an object parameter), I'm not really motivated to continue on this. Feel free to close it or to keep it around for future reference. Sorry for wasting your time.

(As a general—probably quite unhelpful—observation, I think the library would benefit from a more rigorous abstract interpretation framework, as the current approach, which I assume grew organically as new requirements come up, doesn't really seem to be up to the complicated uses that are being made of it. But of course that kind of overhaul would be a huge amount of work, and who has time for that.)

@lukastaegert
Copy link
Member

Yes I understand. Please leave this PR open, I will try to take it over in the next days.

As a general—probably quite unhelpful—observation, I think the library would benefit from a more rigorous abstract interpretation framework

Probably, and there will be larger refactorings in the future as there have been in the past. However for the sake of continuous improvement in smaller steps, one takeaway from this story would be that we would benefit from an abstract "JavaScript object" data type/class/... where nodes like ObjectExpression or ClassNode just build a common data structure with the necessary information while a shared logic handles side effect detection. This could then be reused for all object-like types there are, e.g. FunctionNode, ArrayExpression etc.

@lukastaegert
Copy link
Member

Breaking with the rule of "keeping a PR minimal and to the point", I have abused this PR now to also fix quite a few issues in Rollup around the use of getters and setters. I also released a pre-release version you can try via npm install rollup@beta.

This is now ready for release, but I will keep it open for a few days before doing so to let people give it a spin.

Core Changes

After being taken over, this PR has been very much extended beyond the original scope. Under the hood, two core changes have been introduced:

  • All AST nodes, variables and virtual entities now inherit from a basic ExpressionEntity that follows a "conservative by default" approach. I.e. a node that does not implement any methods of its own will behave identical to an "unknown expression" meaning any call, assignment or even access to this node is a side-effect and we treat it as an "unknown value" in conditionals.
  • There is now a shared ObjectEntity that implements object-like behaviour with regard to property access that is used by object literals, array literals, classes and virtual object and arrays generated by builtin methods. In the future, I hope to extend this to functions as well. This means that improving the behaviour of objects means just working on the ObjectEntity.

New Features

  • Static-effect free methods and properties of classes can now be detected: Updated REPL, v2.48.0 REPL
  • This also extends to super classes. To that end, we included a concept of prototypes into the new ObjectEntity Updated REPL, v2.48.0 REPL
  • Side-effect free array elements can now be detected as well Updated REPL, v2.48.0 REPL
  • Getter and setter handling has been greatly expanded. The new "conservative by default" approach will even handle getters that are created within unknown functions Updated REPL, v2.48.0 REPL
  • Rollup is now able to precisely track mutations in methods, getters and setters via their "this" context Updated REPL, v2.48.0 REPL
  • Handling for builtin array methods has been extended to track mutations of the original array through methods that return the original array again Updated REPL, v2.48.0 REPL

@marijnh
Copy link
Contributor Author

marijnh commented May 18, 2021

Thanks for continuing this after I gave up. The changes sound like they address a number of the pain points I ran into. I tried bundling a bunch of my projects with the beta and ran their tests without seeing any regressions.

@lukastaegert
Copy link
Member

lukastaegert commented May 19, 2021

One improvement I forgot to list is that now all deoptimization happens lazily, i.e. all assignments in dead code will no longer mess up the analysis. Before, these kinds of deoptimizations were happening during the "bind" phase, which was causing its own bunch of issues. Now, the "bind" phase is solely for binding variables to nodes as it was originally intended 😉

@dasa
Copy link

dasa commented May 19, 2021

I think I've found a regression in this PR where it's not tracking the items pushed to an array.

2.48 REPL

PR REPL

It's dropping the array push inside the every callback and breaking the logic.

@kzc
Copy link
Contributor

kzc commented May 20, 2021

Simplified test case:

let a = [];
a.push("pass");
console.log(a[0] || "fail");

PR REPL:

console.log("fail");

@lukastaegert
Copy link
Member

Thanks a lot, great catch! There was a bug in the array deoptimization tracking that did not consider mutations via builtins in one of the places. Fixed now. Also released another beta.

@lukastaegert lukastaegert merged commit d73b2ac into rollup:master May 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants