Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pure annotation - called CallExpression #2331

Open
Andarist opened this issue Sep 23, 2017 · 25 comments
Open

pure annotation - called CallExpression #2331

Andarist opened this issue Sep 23, 2017 · 25 comments
Labels

Comments

@Andarist
Copy link
Contributor

Andarist commented Sep 23, 2017

Bug report
ES5 input

Uglify version (uglifyjs -V)
3.1.1

The uglifyjs CLI command executed or minify() options used.

uglifyjs -mc passes=3,unsafe --toplevel

JavaScript input/output

// input
/*#__PURE__*/fn1(fn2())
// output
fn2();
// input
/*#__PURE__*/fn1(fn2())()
// blank output

It seems like a bug that by calling a CallExpression the inner unannotated call get eliminated.

@kzc
Copy link
Contributor

kzc commented Sep 23, 2017

Yes, the second example above is definitely wrong. It apparently ignores the side effects of the outermost AST_Call.expression.

Are we in agreement that the first example above is fine, and the output of the second example should be the following?

fn1(fn2());

Edit: fixed expected result above.

@Andarist Just curious whether this bug was discovered in real life code. The reason being that uglify-js@2.x would also have this bug and it is more widely used. A fix would have to be back-ported to 2.x if this occurred in a popular npm package.

@kzc
Copy link
Contributor

kzc commented Sep 24, 2017

This shows the issue more clearly:

$ echo '/*#__PURE__*/fn1(fn2(a()), b())(c());' | bin/uglifyjs -c -b
c();

This call should be removed:

/*#__PURE__*/fn1(fn2(a()), b())(c());
                               ^   ^

It should output:

fn1(fn2(a()), b()), c();

@alexlamsl alexlamsl added the bug label Sep 24, 2017
@Andarist
Copy link
Contributor Author

Are we in agreement that the first example above is fine, and the output of the second example should be the following?
fn1(fn2());

Aint sure, how to annotate all of them as pure then?

@Andarist Just curious whether this bug was discovered in real life code.

Not rly 🧌 I've written babel-plugin-annotate-pure-calls and was playing around with test cases when Ive discovered this.

Would love yours (and @alexlamsl's too) review on that plugin. You guys are way more aware how uglifyjs works and how to 'properly' annotate the code.

$ echo '/#PURE/fn1(fn2(a()), b())(c());' | bin/uglifyjs -c -b

Same thins as above - how to annotate each of this calls to have them all removed? ofc the problematic here is a call to fn1

Also I was wondering what should happen for:

/*#__PURE__*/fn()()()()

Is a single annotation enough to DCE whole thing?

@kzc
Copy link
Contributor

kzc commented Sep 24, 2017

Not rly 🧌 I've written babel-plugin-annotate-pure-calls and was playing around with test cases when Ive discovered this.

Good to know that this wasn't discovered in real life code. I was concerned that this came up as result of pure annotations in core Babel polyfills or something.

I took a look at the sources and tests for babel-plugin-annotate-pure-calls and to be honest I don't understand the use case and I don't think it's safe in the general case. What heuristic does it use to safely deem certain top level calls in a program (or closure) to be pure? For example, this test case result looks to be incorrect. Had the original code been passed through uglifyjs --toplevel -mc passes=3,unsafe the result would be a=void call();, but if the expected result were passed through the same uglify command it would yield a=void 0;. Program flow analysis is a difficult problem.

$ echo '/*#__PURE__*/fn1(fn2(a()), b())(c());' | bin/uglifyjs -c -b

how to annotate each of this calls to have them all removed?

Multiple annotations and well placed parentheses could be used.

When the pure annotation feature was added no consideration had been given to that fact that the outer AST_Call.expression could have side effects - unlike the AST_Call.args which are considered. One possible interpretation of this issue could be that everything is working as designed as we are instructing uglify to deem the call to be side effect free and ignore the side effects of AST_Call.expression.

what should happen for:
/*#__PURE__*/fn()()()()

The intention was that the pure annotation would affect the outermost call only. AST_Call.expression in this case is fn()()(). Presently uglify ignores the side effects of AST_Call.expression and the original statement is dropped altogether. A case could be made for either outputting nothing (as uglify does now) or outputting fn()()();. This requires more thought.

For example the following output is logical and correct:

$ echo '( /*#__PURE__*/f() )()' | bin/uglifyjs -c
f()();

Although f() is deemed to be pure, its result is still required by the program so it is not dropped.

@kzc
Copy link
Contributor

kzc commented Sep 24, 2017

@alexlamsl What's your take on this issue?

Should a pure annotated call ignore the side effects of AST_Call.expression as it does now or should side effects be considered as is the case with AST_Call.args?

@alexlamsl
Copy link
Collaborator

@Andarist @kzc sorry for the radio silence – been running around town the whole day.

So I think what happens is the call has_pure_annotation() needs to be extracted out from is_expr_pure() as the former only guarantees the call being pure.

And if that is the case, then there are two locations to patch up, i.e. def(AST_Call) in has_side_effects() and drop_side_effect_free().

(I like how pure_funcs assumes both expression & call are pure, but that's topic for another day.)

(I typed this on the phone in a metro heading home, so apologies in advance for any mistakes.)

@kzc
Copy link
Contributor

kzc commented Sep 24, 2017

(I like how pure_funcs assumes both expression & call are pure, but that's topic for another day.)

Actually, that is the topic for today. :-)

@alexlamsl Implementation aside, do you think that side effects should be considered for AST_Call.expression for /*@__PURE__*/ calls?

I'm on the fence. I can see a case for either behavior.

@alexlamsl
Copy link
Collaborator

If we want consistent behaviour alongside pure_funcs, then the current behaviour is good as is.

I remotely recall in the distant past I tried to "fix" pure_funcs and got bitten by the onslaught of test failures. Since I was very green with uglify-js, I chose strategic retreat.

@alexlamsl
Copy link
Collaborator

Just making sure this actually works:

$ echo 'fn1(fn2(a()),b())(c());' | uglifyjs -c pure_funcs="fn1(fn2(a()),b())"
c();

@kzc
Copy link
Contributor

kzc commented Sep 24, 2017

If we want consistent behaviour alongside pure_funcs, then the current behaviour is good as is.

I'm leaning in that direction. When you think about it, the behavior is somewhat arbitrary and is ultimately what we decide it will be.

Let's just keep the behavior as is for now, but leave the ticket open while we consider the issues involved and see if it impacts any real world code.

@alexlamsl
Copy link
Collaborator

Another data point is that collapse_vars doesn't seem to interfere with /*@__PURE__*/:

$ echo 'function f(){ var a=fn(); a(b()); }' | node bin\uglifyjs -c
function f(){fn()(b())}

$ echo 'function f(){ var a=fn(); /*@__PURE__*/a(b()); }' | node bin\uglifyjs -c
function f(){fn();b()}

@alexlamsl
Copy link
Collaborator

alexlamsl commented Sep 24, 2017

Let's just keep the behavior as is for now, but leave the ticket open while we consider the issues involved and see if it impacts any real world code.

Indeed - let's collect more real-world code examples to sniff out potential corner cases. Especially since I can't think of any easy way to augment test/ufuzz.js for this feature.

@kzc
Copy link
Contributor

kzc commented Sep 24, 2017

Change the [bug] label to [question]?

@alexlamsl alexlamsl changed the title [bug] pure annotation - called CallExpression pure annotation - called CallExpression Sep 24, 2017
@alexlamsl alexlamsl added question and removed bug labels Sep 24, 2017
@kzc
Copy link
Contributor

kzc commented Sep 24, 2017

off topic: Some projects are using uglify /*@__PURE__*/ annotations to great effect in the wild:

Angular Build Optimizer has been reported to significantly reduce app bundle sizes - by as much as 80%:

angular/components#4137 (comment)
angular/components#4137 (comment)

"on angular.io, we saw a 52% size reduction in our production build."
https://blog.angular.io/the-past-present-and-future-of-the-angular-cli-13cf55e455f8

https://www.softwarearchitekt.at/post/2017/07/26/shrinking-angular-bundles-with-the-angular-build-optimizer.aspx

/*@__PURE__*/ annotations are most effective when combined with a scope hoisting bundler like Rollup or Webpack 3+.

@kzc
Copy link
Contributor

kzc commented Sep 24, 2017

And of course @Andarist added /*@__PURE__*/ annotation support to downleveled ES5 classes in the upcoming Babel 7 release:

babel/babel@c47258d

@Andarist
Copy link
Contributor Author

Andarist commented Sep 25, 2017

@kzc I believe what you have described here is correct and I still consider this thing a bug. In the same way as c() (argument of outermost call) doesn't get removed (it is not marked as pure) the arguments of the inner call (not marked as pure) shouldn't get removed as well. This is strange and completely unexpected.

Good to know that this wasn't discovered in real life code.

I agree this is quite a corner case and possibly not the highest priority.

I took a look at the sources and tests for babel-plugin-annotate-pure-calls and to be honest I don't understand the use case and I don't think it's safe in the general case.

Well - its not targeted for the general use case. I believe (once i smooth the rough edges) it might be useful for side-effects free libraries such as ramda, without #__PURE__ annotations it effectively cannot be dead code eliminated at all and annotating in the source code is:

  • ugly :P
  • error prone

What heuristic does it use to safely deem certain top level calls in a program (or closure) to be pure?

Non, but thats quite the point of the plugin. The user must want to use it and know its risks, I believe the risks (for well tested/battle tested plugin) are minimal though. The only heuristic I use is that a function call appears in an assignment context + I consider contents of the top level IIFEs as top level calls (this is a though a subject of taste and Im not strongly attached to the idea, so I might change it or provide an option for this).

For example, this test case result looks to be incorrect. Had the original code been passed through uglifyjs --toplevel -mc passes=3,unsafe the result would be a=void call();, but if the expected result were passed through the same uglify command it would yield a=void 0;. Program flow analysis is a difficult problem.

Certainly agree that program flow analysis is difficult and I do not intend to touch the subject within that plugin too much, its job is to assume calls being pure (technically even not pure per se, but side effect free - or that if the result is unused any possible side effects are unused too and the call can be eliminated).

I believe the test case is ok (i invite u ofc to further discussion about it), maybe the test case itself is not obvious, but lets consider slightly different input:

a = (function() {
  var b = (function() {
    var c = (function() {
      var d = call()
      someOtherCall(d)
    })()
  })()
})()

The test case shows that call() gets annotated and if d stays unused I want to produce a=void 0;. If it is used (as in the above example) it obviously wont be dropped even if annotated.

Multiple annotations and well placed parentheses could be used.

Multiple annotations sounds like a good solution, but well placed parentheses cannot be achieved (afaik) with babel transformations, because AST doesnt hold any information about that.

When the pure annotation feature was added no consideration had been given to that fact that the outer AST_Call.expression could have side effects - unlike the AST_Call.args which are considered.

It is a side effect in inner call expressions that gets ignored (but outer callee) not in the outer call expression. I use Babel's semantics here, maybe yours differs.

The intention was that the pure annotation would affect the outermost call only.
[...]
Although f() is deemed to be pure, its result is still required by the program so it is not dropped.

Agree 100%, but as mentioned - the whole statement gets dropped at the moment, which sounds to me like a bug.

The only problem I see with the approach is that its unobvious how to annotate all those calls as pure (in fn()()() example). If we annotate all CallExpressions as pure we end up with:

/*#__PURE__*//*#__PURE__*//*#__PURE__*/fn()()()

and when we go back to AST we end up with a single (outermost) CallExpression with 3 leading comments. Unfortunately such operations (when comments are involved) as going from ast to code and back to ast are not inverse operations (in mathematics terms).

@Andarist @kzc sorry for the radio silence – been running around town the whole day.

@alexlamsl No worries, your answers were extremely fast.

uglifyjs -c pure_funcs="fn1(fn2(a()),b())"

wow, thats one weird looking syntax :o

specially since I can't think of any easy way to augment test/ufuzz.js for this feature.

what does test/ufuzz.js do? Is it just generating outputs for each combination of uglifyJS options?

off topic: Some projects are using uglify /@PURE/ annotations to great effect in the wild:

Wow again, I didnt expect such great results, Im wondering how structured are the apps that have benefited from such reduction. I think those optimizations that this project (Angular Build Optimizer) performs (in a way that PURE annotations have desired effect) are great! In example - class fold. I plan to do the same for babel's outputs (as discussed in other threads).

@kzc
Copy link
Contributor

kzc commented Sep 25, 2017

@Andarist I appreciate that the pure annotation behavior with respect to ignoring side effects of AST_Call.expression seems strange and unexpected at first glance - it certainly was to me. But it is self-consistent and matches the behavior of pure_funcs which has been in the code base for years. In the end it's a documentation issue - the feature is whatever we deem it to be. We're assuming the function expression of a pure call is pure itself. If there's a compelling use case in real world code - or an outright bug in well-formed code - then we could be persuaded to change it. Then there's the issue of taking the side effects of the pure_func AST_Call.expression into account might break existing code. In theory we could create another compress option that could gate this AST_Call.expression side effect behavior.

As for babel-plugin-annotate-pure-calls perhaps you could add some more documentation to its README explaining the use cases where it should and should not be used. The case result I cited was quite unexpected to me. I'm not convinced your revised example would be problem-free either. Perhaps this Babel plugin can only be used on pure functional libraries were every single function (internal or external) is known to be pure beforehand? Rather than sidetrack this Uglify issue, feel free to ask my opinion in the babel-plugin-annotate-pure-calls project.

@kzc
Copy link
Contributor

kzc commented Sep 25, 2017

Be aware that I may not read the re-edits of past posts.

/*#__PURE__*//*#__PURE__*//*#__PURE__*/fn()()()

The lack of output reversibility with respect to pure annotation comments is a known issue that will not likely ever be addressed. Comments make for a poor code annotation system - it's a hack. Uglify is rarely used as a code formatter - there are better alternatives. The code emitter would not necessarily be able to handle outputting pure annotations correctly in pathological compress scenarios if all comments were preserved. The code outputter could be fixed with well-placed parentheses in lib/output.js but it's not worth it in my opinion - too many potential problems. 99.99% of the time uglify users do not emit comments other than license info. If a comment fix were to be implemented it would likely be just dropping all pure annotations in the code emitter if comments are preserved - just to be on the safe side.

Back to handling side effects of AST_Call.expression - consider this simple code:

/*#__PURE__*/g();
/*#__PURE__*/foo.bar();

Currently this results in both statements being entirely dropped. But if side effects of the call expression were considered it would emit:

g, foo.bar;

That's not expected, nor desirable. We don't want to introduce problems for code in use in the wild.

@Andarist
Copy link
Contributor Author

In the end it's a documentation issue - the feature is whatever we deem it to be.

Totally agree.

Rather than sidetrack this Uglify issue, feel free to ask my opinion in the babel-plugin-annotate-pure-calls project.

Sure thing, I've created an issue for the review there.

Currently this results in both statements being entirely dropped. But if side effects of the call expression were considered it would emit:

Im not sure if I see this. Do you mean that getters could be called here? Isnt there an option to assume pure getters?

@kzc
Copy link
Contributor

kzc commented Sep 26, 2017

Do you mean that getters could be called here? Isnt there an option to assume pure getters?

Yeah, getters or unset globals like debug or log statements that would result in an exception being thrown. Or just to simply avoid littering the output with code that would not have been there with the existing pure annotation behavior.

There is one change I'd like to put in - a new finer granularity compress option to control whether pure annotations are enabled:

--- a/lib/compress.js
+++ b/lib/compress.js
@@ -93,6 +93,7 @@ function Compressor(options, false_by_default) {
         unsafe_regexp : false,
         unused        : !false_by_default,
         warnings      : false,
+        __PURE__      : !false_by_default,
     }, true);
     var global_defs = this.options["global_defs"];
     if (typeof global_defs == "object") for (var key in global_defs) {
@@ -2083,6 +2084,7 @@ merge(Compressor.prototype, {
     });
 
     AST_Call.DEFMETHOD("has_pure_annotation", function(compressor) {
+        if (!compressor.option("__PURE__")) return false;
         if (!compressor.option("side_effects")) return false;
         if (this.pure !== undefined) return this.pure;
         var pure = false;

That way in the event of a problem or bug it can be disabled without needing to disable side_effects which has a much bigger impact.

Also, having a __PURE__ compress option would leave the door open to later implementing non-boolean option values to take side effects of AST_Call.expression in account and/or not taking AST_Call.args side effects into account should that use case arise.

@Andarist
Copy link
Contributor Author

There is one change I'd like to put in - a new finer granularity compress option to control whether pure annotations are enabled

Might be useful, even though the proposed option name indicates quite clearly what it does I would prefer some better ("human-friendly") name. __PURE__ is an implementation detail.

@kzc
Copy link
Contributor

kzc commented Sep 26, 2017

It was the first option name that came to mind. Alternatives: pure_annotate, pure_annotations.

@Andarist
Copy link
Contributor Author

pure_annotations 👍

@alexlamsl
Copy link
Collaborator

Another idea would be pure_funcs: true by default for /*@__PURE__*/, though it gets complicated when you have custom pure_funcs but would like to disable the annotations.

@kzc
Copy link
Contributor

kzc commented Sep 26, 2017

Or perhaps set a string value for side_effects similar to how pure_getters is overloaded. The side_effects default of true would retain current pure annotation behavior. I don't know what the value could be - "pure" and "no-pure"? And how to convey AST_Call.expression and/or AST_Call.args would or would not have to be considered for side_effects if we choose to do so in the future?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants