Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inlining of the function definitions that have conflicting arguments #908

Draft
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

aminya
Copy link

@aminya aminya commented Jan 12, 2021

Based on my tests this results in a more performant code and 30% more size-reduction

Related to #907

  • Make it possible to inject args in pure getters by renaming their arguments
  • Adding an option to ignore inlining if the arguments cannot be injected (because only injecting the function definition can have runtime regression).

return false;
}
if (scope.conflicting_def(arg.name)) {
// TODO arg should be renamed for safety
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need a function that renames this particular argument in the fn function. Is there a method for that?

cc: @fabiosantoscode @alexlamsl

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not just about the one argument in the function.

I've given multiple attempts at this before and I've always encountered a problem like this:

function x() {
  const conflict = 4
  function y() {
    z()
  }
  y()
}

const conflict = "PASS"
function z() {
  console.log(conflict)
}
x()
x()

In this situation, z() could be inlined into y but then it gets the wrong value for conflict.

You can check out that terser doesn't inline z() by running the above through terser bug.js --toplevel -c defaults=false,inline=true,reduce_vars=true,unused=true.

For now this is too tricky to fix in a maintainable way, because it involves renaming lots of things and being wary of lots of pitfalls.

Copy link
Author

@aminya aminya Feb 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, we should disable inlining altogether if the arguments are not injectable. As I mentioned in #907, this has a worse runtime performance with not much size reduction.

This is worse:

// Bad (heap lambda definition is called instead of free function)
new class{
  method(o){
    !function(o){
     console.log(o)
     }(o)
  }
};

than this in the runtime:

// Good (free function is called)
function myPureFunction (a) {
 console.log(a) 
}
new class {
  method(a) {
   myPureFunction(a) 
  }
}

I also think we should limit the optimization so only the pure functions get inlined. If a function is pure, there is no need to be worried about its surroundings. We just need to make the inner variables of that pure function unique (for example, by prepending the function name), and inline the code into the call site.

For example, something like this test:

rename_and_inline: {
    options = {
        toplevel: true,
        defaults: true,
        pure_getters: true
    }
    input: {
        class MyClass {
          method(a) {
           myPureFunction(a)
          }
        }

        function myPureFunction (a) {
         console.log(a)
        }

        new MyClass();
    }
    expect: {
        new class{method(a) {console.log(a)}};
    }
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If by "pure" you mean that it doesn't use anything from the closure it should be a safe transform indeed.

Every function has a .clone(deep, toplevel) method, which clones the scope and calls figure_out_scope once more. This cloned function can then get a rename.

There's no function that renames symbols for safety, but there is a method create_symbol which can create a non-colliding symbol taking several scopes into consideration (you want the function scope and the scope you're inlining into).

For renaming you can use walk(node, node => ...) which does a recursive walk, and look for SymbolRef instances that have .definition() === to the definition of the argument you're renaming. You'll have to change the definition's name too, as well as the correct function argument.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To address your test, with avoiding inlining when the function is hard to inline, yes, this is very desirable behaviour. Sometimes further passes can make IIFE inlinable, but if this is not frequent (or is fairly predictable) then it's a great improvement.

Copy link
Author

@aminya aminya Apr 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a commit. Will this work? It is quite hard to add features because this requires a knowledge of the internal implementation of Terser and a lack of TypeScript help for the available methods.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the lack of typings is not exactly a plus here :/

I've considered spending time trying to sort that out, but now I've joined an effort to rewrite terser in rust, which will fix that problem once we're done.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writing Terser in a more performant language like Go and Rust would be awesome!

@aminya aminya force-pushed the inline-if-args-injectable branch 2 times, most recently from 247abce to 9cd69cc Compare April 26, 2021 07:00
} else {
return walk(node, cb);
}
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In lib/scope.js is an easy example of how to rename a variable. The definition object (obtained through node.thedef or node.definition()) can help us iterate on references and definitions alike.

It also deal with shadowing (in case fn here contains another scope which redefines a variable of the same name)

def.name = name
def.orig.forEach(function(sym) {      
    sym.name = name;                  
});                                   
def.references.forEach(function(sym) {
    sym.name = name;                  
});                                   

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise this looks great, and the tests are sound.

Copy link
Author

@aminya aminya May 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I edited the renaming code. However, I don't know why the original function node doesn't seem to update. The last test should fail as I have written it. The inlined argument should be named argument_1. Or maybe I am understanding this wrong.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the cause is the fact that the "defaults" option is enabled, causing more transforms to occur.

If you remove defaults and add reduce_vars, inline and unused, it should showcase what you did a bit better. Even if the code in the "expect" section isn't truly optimally minified, which isn't so important for a test.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have a deeper review of this PR later today. My cursory look says it's good to merge, but I'll check it out better after work.

Copy link
Author

@aminya aminya May 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After spending more time on this, if I am correct, I actually think this is the expected behavior! I wanted to rename the variables of the pure function so they can be inlined by the compressor (like the second test). The compressor correctly substituted the new name we created with the argument passed in!

options = {
toplevel: true,
defaults: true,
pure_getters: false // TODO This even works without pure getters which is not consistent with the previous test
Copy link
Author

@aminya aminya May 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This inconsistent behavior is surprising.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pure_getters option only assumes that a.b is pure (not always safe due to proxies and impure getters). This test doesn't use any getters so it's not a surprise that the results don't change.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this changes the results for the first two tests, where the only difference is the value of pure_getters.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't notice this at first.

@aminya aminya changed the title inlining of function definition inside a class method inlining of the function definitions that have conflicting arguments May 7, 2021
@aminya
Copy link
Author

aminya commented May 8, 2021

The code overlooks some corner cases. Even though I apparently rename the conflicting arguments of the inner function, the code that is responsible for inlining doesn't handle that properly. For example, in the following example, the renamed argument should be assigned and then inlined (similar to the test that I have added), but it seems that it is inlined directly which leads to a runtime error.

---INPUT---
{
    const a = "32";
    !function() {
        var b = a + 1;
        !function(a) {
            console.log(b, a++);
        }(0);
    }();
}
---OPTIONS---
{
    "mangle": false
}
---OUTPUT---
const a="32";console.log("321",a++);
---EXPECTED STDOUT---
321 0

---ACTUAL ERROR---
TypeError: Assignment to constant variable.

@fabiosantoscode
Copy link
Collaborator

I'm looking into how this can be improved. One of the problems it has is that it mutates the original function. This might not be acceptable, because the function might have multiple uses. The other is that fn.body[0] instead of just fn.

There's 2 stores for variable information in Terser, the scope and also the reduce_vars graph. They're generated wholesale and the information is used throughout the compress step. They help with figuring out how variables are used and whether they can be moved around.

The scope can be rebuilt with figure_out_scope, (individual symbols can be added with create_symbol) but the reduce_vars graph I don't really have an answer for right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants