New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
inlining of the function definitions that have conflicting arguments #908
base: master
Are you sure you want to change the base?
Conversation
f9d9838
to
208e584
Compare
lib/compress/index.js
Outdated
return false; | ||
} | ||
if (scope.conflicting_def(arg.name)) { | ||
// TODO arg should be renamed for safety |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need a function that renames this particular argument in the fn
function. Is there a method for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not just about the one argument in the function.
I've given multiple attempts at this before and I've always encountered a problem like this:
function x() {
const conflict = 4
function y() {
z()
}
y()
}
const conflict = "PASS"
function z() {
console.log(conflict)
}
x()
x()
In this situation, z() could be inlined into y but then it gets the wrong value for conflict
.
You can check out that terser doesn't inline z() by running the above through terser bug.js --toplevel -c defaults=false,inline=true,reduce_vars=true,unused=true
.
For now this is too tricky to fix in a maintainable way, because it involves renaming lots of things and being wary of lots of pitfalls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, we should disable inlining altogether if the arguments are not injectable. As I mentioned in #907, this has a worse runtime performance with not much size reduction.
This is worse:
// Bad (heap lambda definition is called instead of free function)
new class{
method(o){
!function(o){
console.log(o)
}(o)
}
};
than this in the runtime:
// Good (free function is called)
function myPureFunction (a) {
console.log(a)
}
new class {
method(a) {
myPureFunction(a)
}
}
I also think we should limit the optimization so only the pure functions get inlined. If a function is pure, there is no need to be worried about its surroundings. We just need to make the inner variables of that pure function unique (for example, by prepending the function name), and inline the code into the call site.
For example, something like this test:
rename_and_inline: {
options = {
toplevel: true,
defaults: true,
pure_getters: true
}
input: {
class MyClass {
method(a) {
myPureFunction(a)
}
}
function myPureFunction (a) {
console.log(a)
}
new MyClass();
}
expect: {
new class{method(a) {console.log(a)}};
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If by "pure" you mean that it doesn't use anything from the closure it should be a safe transform indeed.
Every function has a .clone(deep, toplevel) method, which clones the scope and calls figure_out_scope once more. This cloned function can then get a rename.
There's no function that renames symbols for safety, but there is a method create_symbol which can create a non-colliding symbol taking several scopes into consideration (you want the function scope and the scope you're inlining into).
For renaming you can use walk(node, node => ...) which does a recursive walk, and look for SymbolRef instances that have .definition() === to the definition of the argument you're renaming. You'll have to change the definition's name too, as well as the correct function argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To address your test, with avoiding inlining when the function is hard to inline, yes, this is very desirable behaviour. Sometimes further passes can make IIFE inlinable, but if this is not frequent (or is fairly predictable) then it's a great improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a commit. Will this work? It is quite hard to add features because this requires a knowledge of the internal implementation of Terser and a lack of TypeScript help for the available methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the lack of typings is not exactly a plus here :/
I've considered spending time trying to sort that out, but now I've joined an effort to rewrite terser in rust, which will fix that problem once we're done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Writing Terser in a more performant language like Go and Rust would be awesome!
247abce
to
9cd69cc
Compare
9cd69cc
to
79c1c43
Compare
lib/compress/index.js
Outdated
} else { | ||
return walk(node, cb); | ||
} | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In lib/scope.js is an easy example of how to rename a variable. The definition object (obtained through node.thedef
or node.definition()
) can help us iterate on references and definitions alike.
It also deal with shadowing (in case fn
here contains another scope which redefines a variable of the same name)
def.name = name
def.orig.forEach(function(sym) {
sym.name = name;
});
def.references.forEach(function(sym) {
sym.name = name;
});
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise this looks great, and the tests are sound.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I edited the renaming code. However, I don't know why the original function node doesn't seem to update. The last test should fail as I have written it. The inlined argument should be named argument_1
. Or maybe I am understanding this wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the cause is the fact that the "defaults" option is enabled, causing more transforms to occur.
If you remove defaults and add reduce_vars, inline and unused, it should showcase what you did a bit better. Even if the code in the "expect" section isn't truly optimally minified, which isn't so important for a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should have a deeper review of this PR later today. My cursory look says it's good to merge, but I'll check it out better after work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After spending more time on this, if I am correct, I actually think this is the expected behavior! I wanted to rename the variables of the pure function so they can be inlined by the compressor (like the second test). The compressor correctly substituted the new name we created with the argument passed in!
options = { | ||
toplevel: true, | ||
defaults: true, | ||
pure_getters: false // TODO This even works without pure getters which is not consistent with the previous test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This inconsistent behavior is surprising.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pure_getters option only assumes that a.b
is pure (not always safe due to proxies and impure getters). This test doesn't use any getters so it's not a surprise that the results don't change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this changes the results for the first two tests, where the only difference is the value of pure_getters
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I didn't notice this at first.
The code overlooks some corner cases. Even though I apparently rename the conflicting arguments of the inner function, the code that is responsible for inlining doesn't handle that properly. For example, in the following example, the renamed argument should be assigned and then inlined (similar to the test that I have added), but it seems that it is inlined directly which leads to a runtime error. ---INPUT---
{
const a = "32";
!function() {
var b = a + 1;
!function(a) {
console.log(b, a++);
}(0);
}();
}
---OPTIONS---
{
"mangle": false
}
---OUTPUT---
const a="32";console.log("321",a++);
---EXPECTED STDOUT---
321 0
---ACTUAL ERROR---
TypeError: Assignment to constant variable. |
I'm looking into how this can be improved. One of the problems it has is that it mutates the original function. This might not be acceptable, because the function might have multiple uses. The other is that There's 2 stores for variable information in Terser, the scope and also the reduce_vars graph. They're generated wholesale and the information is used throughout the compress step. They help with figuring out how variables are used and whether they can be moved around. The scope can be rebuilt with figure_out_scope, (individual symbols can be added with create_symbol) but the reduce_vars graph I don't really have an answer for right now. |
Based on my tests this results in a more performant code and 30% more size-reduction
Related to #907