Multiple passes on tree shaking #3200
Comments
I think maybe this shouldn't be happening if the variable is in a dead code block, given the reassign would never happen. rollup/src/ast/variables/LocalVariable.ts Line 72 in 53fb6fe
Not sure how to walk up the tree from here though and determine if the code would be included.
The problem is indeed that these deoptimizations happen during bind and not during the actual tree-shaking, when it is clearer whether the variable is part of included code. This is mostly for performance reasons, to avoid repeated checks or deoptimizations during the multiple treeshaking passes. At the moment, we can rely on the fact that we know all that is necessary about variable values during the actual treeshaking. I am aware of the scenario you describe, but I am not sure how common it actually is.
But maybe it would be worthwhile to challenge this, especially if you are willing to do some digging. Worst case, you would learn a little about how the algorithm works 😉. So here is a little background on how treeshaking works:
@lukastaegert thanks! I'm interested in learning more about rollup so will take a look when I get time
Ok so adding
shouldBeIncluded(context: InclusionContext): boolean {
  this.left.deoptimizePath(EMPTY_PATH);
  this.right.deoptimizePath(UNKNOWN_PATH);
  return super.shouldBeIncluded(context);
}
and removing the Is this what you were thinking? I guess then the same kind of thing should be done in all the other
Yes, but it would be nice to also track via a flag if this has happened so that it only happens on the first pass as deoptimizations are not necessarily cheap (and depending on the code, there can be 20 passes or more). Now there are some points to consider:
let value = false;
try {
  value = true;
} catch {}
if (value) {
  console.log('must be included');
}
I hope there are no other cases that are forgotten. The problem with the current approach is that it creates some "hidden" dependencies between some parts of the code, but maybe this is manageable for the moment. The deoptimizations in other bind statements could also be investigated.
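The "only on the first pass" flag suggested above could look roughly like the following. This is a minimal sketch, not Rollup's code: `Deoptimizable`, `CountingTarget`, `AssignmentSketch` and the string path arguments are hypothetical stand-ins, and the inclusion result is a placeholder for whatever the real `super.shouldBeIncluded(context)` would return.

```typescript
// Hypothetical stand-ins for AST nodes; these are not Rollup's real classes.
interface Deoptimizable {
  deoptimizePath(path: string): void;
}

// Counts how often it is deoptimized so the effect of the flag is observable.
class CountingTarget implements Deoptimizable {
  calls = 0;
  deoptimizePath(_path: string): void {
    this.calls++;
  }
}

class AssignmentSketch {
  // Set once the (potentially expensive) deoptimization has been applied.
  private deoptimized = false;

  constructor(private left: Deoptimizable, private right: Deoptimizable) {}

  // Assumed to be called once per tree-shaking pass.
  shouldBeIncluded(): boolean {
    if (!this.deoptimized) {
      this.deoptimized = true;
      this.left.deoptimizePath("EMPTY_PATH");
      this.right.deoptimizePath("UNKNOWN_PATH");
    }
    return true; // placeholder for the real inclusion check
  }
}

const left = new CountingTarget();
const right = new CountingTarget();
const node = new AssignmentSketch(left, right);

// Simulate 20 tree-shaking passes over the same node.
for (let pass = 0; pass < 20; pass++) {
  node.shouldBeIncluded();
}
```

However many passes run, each side is deoptimized exactly once, which is the cost saving the flag is meant to provide.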
@lukastaegert I opened a draft PR #3212 which fails 1 test (deoptimize-member-expressions). I'm still pretty confused with how it works though and have a lot of questions. Don't suppose there's some docs anywhere on There's code like this for example in
hasEffectsWhenAccessedAtPath(path: ObjectPath, context: HasEffectsContext) {
  if (path.length === 0) return false;
  if (this.isReassigned || path.length > MAX_PATH_DEPTH) return true;
  const trackedExpressions = context.accessed.getEntities(path);
  if (trackedExpressions.has(this)) return false;
  trackedExpressions.add(this);
  return (this.init && this.init.hasEffectsWhenAccessedAtPath(path, context)) as boolean;
}
Why do we always return
I found https://github.com/estree/estree and https://astexplorer.net/ which helped with some things. I’m thinking about creating a dev getting started guide or something similar with links after all this.
Maybe another approach would be to register all the assignment nodes with LocalVariable during bind(). Then during the treeshaking run, getLiteralValueAtPath() on a LocalVariable can check hasEffects/shouldBeIncluded() on all the assignments. If this is all false then it knows it can return the init value. AssignmentPattern would then have the ability to say if it would be included by checking shouldBeIncluded up the parents.
(Still not really sure what the difference is between shouldBeIncluded and hasEffects, other than maybe the result of shouldBeIncluded is cached and hasEffects isn’t)
Unfortunately not. The reason is that these are internal implementation details that are often subject to change, and any kind of documentation could be outdated, if not outright wrong, in half a year. Therefore the goal is to make the names as descriptive as possible so that they tell you what they do. If you as an outsider have suggestions for better names or you find things misleading, I am very open to improvement suggestions here, even if it means changing a name in 50 files!
To figure out what a specific piece of code does, I recommend changing or removing the code in question and seeing which tests turn red. This may not REALLY tell you what the mechanics are, but it will tell you which problem was solved by the code. Thus, test coverage is CRUCIAL in this project, and I very much recommend you write tests for everything new you implement early on, maybe even test-first. At least for me, tests are also a great debugging tool as they allow you to easily run Rollup on a given piece of code while finding out what is going on. I am saying this because you already implemented something new without a test :) Also, tests are important for documenting what your PR is trying to achieve.
For some time now I have wanted to write a TESTS.md documentation for how to write tests but never got around to it. For your purpose, two kinds of tests are important:
Here is an example issue that is caught by this:
var foo = foo || {};
console.log(foo.x);
When checking if accessing
The reason is that if this was already checked for side-effects and we are STILL searching for side-effects, then the first call must have returned false and we can do an early return.
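The early return described here works as a recursion guard. The sketch below is a toy model rather than Rollup's implementation: `VariableSketch`, its `init` list and the `hasOwnEffect` flag are hypothetical, and the `Set` plays the role of the tracked expressions.

```typescript
// Hypothetical model: a variable whose initializer reads other variables.
class VariableSketch {
  // Variables read by this variable's initializer.
  init: VariableSketch[] = [];

  constructor(public label: string, public hasOwnEffect = false) {}

  hasEffectsWhenAccessed(tracked: Set<VariableSketch>): boolean {
    // This variable was already visited during the current search. Had that
    // visit found an effect, the search would have stopped with `true`, so
    // returning false is safe and also breaks self-referencing cycles.
    if (tracked.has(this)) return false;
    tracked.add(this);
    if (this.hasOwnEffect) return true;
    return this.init.some(dep => dep.hasEffectsWhenAccessed(tracked));
  }
}

// Models `var foo = foo || {};`: the initializer of foo reads foo itself.
const foo = new VariableSketch("foo");
foo.init.push(foo);
const fooHasEffects = foo.hasEffectsWhenAccessed(new Set());

// A self-referencing variable that also reads something with an effect.
const effectful = new VariableSketch("effectful", true);
const bar = new VariableSketch("bar");
bar.init.push(bar, effectful);
const barHasEffects = bar.hasEffectsWhenAccessed(new Set());
```

Without the guard, the check on `foo` would recurse into its own initializer forever. With it, the self-reference contributes nothing and only a real effect elsewhere (as with `bar`) makes the result `true`.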
That would be REALLY awesome! Another pro-tip: Each AST Node has a custom
Not sure this is a good idea for various reasons
Checking parents is a bad pattern (at least in that code base) that should be avoided as much as possible.
Thus the idea is to practice a "one-way-data-flow" pattern as much as possible, not for efficiency's but for complexity's sake. This is also why the
I will leave some more comments at your PR. By now I have also figured out why one test is failing: you actually uncovered a small bug.
Thanks for the super detailed explanation and the comments on the PR!
Took a while but yeh I get this now :D
I started drafting some docs here, which from what you said before might be a bit detailed, but it's helping me for now anyway.
One thing I'm trying to understand now is whether it's possible for Searching the code currently it seems like this might not be possible, but moving the deoptimize call into In an
Actually my explanation was somewhat incomplete. There are two cases:
Very nice! I have looked through it, some points I came across:
I think this is misleading. It creates the PluginDriver, which is basically a plugin manager. The plugins themselves are already instantiated and working when they are passed to Rollup by the user; the PluginDriver is just an adapter to unify and simplify how Rollup accesses the plugins.
This name choice is probably not ideal as Otherwise really nice so far, great work.
Probably a little, but we'll see.
Actually this can happen when the call is moved, but this should not be a problem. Consider this example:
let testValue = true;
let reassignTestValue = () => {};
reassignTestValue = () => testValue = false; // A
reassignTestValue(); // B
if (testValue) { // C
  console.log(1);
} else {
  console.log(2);
}
Here is what should happen:
Note that every statement that was previously included is included again for each new pass. The reason is that since new variables were included, expressions that previously did not have side-effects may now have side-effects, and for instance the body of an included function could receive additional included statements during the next pass. AssignmentExpressions are the most important example here, but of course this extends to calls to functions that contain assignments etc. Thus on the first pass |
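The repeated passes described above amount to a fixed-point loop: keep going until a pass includes nothing new. The following is a toy sketch under simplified assumptions, not Rollup's algorithm: `StatementSketch`, `treeshakeSketch` and the idea that a statement's effects depend only on a set of included variable names are all hypothetical.

```typescript
// Hypothetical model of a top-level statement.
interface StatementSketch {
  label: string;
  included: boolean;
  // Whether the statement has side effects, given the variables included so far.
  hasEffects(includedVariables: Set<string>): boolean;
  // Variables that become included once this statement is included.
  uses: string[];
}

// Runs passes until one of them includes no new variable; returns the pass count.
function treeshakeSketch(statements: StatementSketch[]): number {
  const includedVariables = new Set<string>();
  let passes = 0;
  let needsAnotherPass = true;
  while (needsAnotherPass) {
    needsAnotherPass = false;
    passes++;
    for (const statement of statements) {
      // Statements included in an earlier pass are included again on every pass.
      if (statement.included || statement.hasEffects(includedVariables)) {
        statement.included = true;
        for (const variable of statement.uses) {
          if (!includedVariables.has(variable)) {
            includedVariables.add(variable);
            needsAnotherPass = true; // something new was included
          }
        }
      }
    }
  }
  return passes;
}

// `x = 1` only matters once x is included; `console.log(x)` always matters.
const assign: StatementSketch = {
  label: "x = 1",
  included: false,
  hasEffects: vars => vars.has("x"),
  uses: ["x"],
};
const log: StatementSketch = {
  label: "console.log(x)",
  included: false,
  hasEffects: () => true,
  uses: ["x"],
};
const unused: StatementSketch = {
  label: "y = 2",
  included: false,
  hasEffects: vars => vars.has("y"),
  uses: ["y"],
};

const passCount = treeshakeSketch([assign, log, unused]);
```

In this toy input the assignment comes before the statement that makes `x` relevant, so it is skipped on the first pass and picked up on the second, while the assignment to `y` is never included.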
Closing this now given the PR is merged :) I haven’t had time to read/process all this yet so I might come back with more questions in the future
Thanks from 2022, finding ways to learn more about Rollup internals 😃, love that dev guide
Feature Use Case
Tree shaking works great, but sometimes more things could be removed if more passes were made.
E.g
repl
becomes
If there was another iteration then the if (!b) { statement would be removed. I understand currently it isn't removed because of the b = false, but this would be missing in the second pass.
Note this is a simplified example, but we do have some logic in our app that would benefit from this.
Feature Proposal
Run multiple passes until no more changes are made, or figure out a way of detecting if reassignments to a variable are in a branch that would be removed.
Happy to investigate this and open a PR!