Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make comments output more robust #2633

Merged
merged 5 commits into from Dec 21, 2017
Merged

Conversation

alexlamsl
Copy link
Collaborator

@alexlamsl alexlamsl commented Dec 21, 2017

  • improve handling of comments right after return
  • retain comments after OutputStream
  • preserve trailing comments
  • fix handling of new line before comments
  • handle comments around parentheses

fixes #88
fixes #112
fixes #218
fixes #372
fixes #2629

@kzc
Copy link
Contributor

kzc commented Dec 21, 2017

I briefly reviewed it enough to see the pure annotation test working, but I won't get a chance to do a thorough review until later today if you don't mind waiting.

@alexlamsl
Copy link
Collaborator Author

@kzc you may have all the time you need 😉

}
input: {
/*@__PURE__*/ f1();
(/*@__PURE__*/ f2)();
Copy link
Contributor

@kzc kzc Dec 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to drop f2(). The pure annotation only presently works for calls. Here the symbol f2 is annotated as pure, and it is not dropped in v3.2.2 when a symbol:

$ echo 'a; /*@__PURE__*/b; c;' | bin/uglifyjs -c
a,b,c;

$ echo 'a(); /*@__PURE__*/b(); c();' | bin/uglifyjs -c
a(),c();

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no distinction between annotating on f2 or f2() in this case – they are one and the same.

See https://github.com/mishoo/UglifyJS2/pull/2633/files#diff-2cef7371337131fb81c6b3880549b93eR211

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate there's no distinction in this PR, but that wasn't the case in v3.2.2.

We could extend the notion of pure annotations to cover function expressions and function symbols, but originally it was only intended for calls specifically.

But we have to be logically consistent with respect to inlining values and functions.

With this PR:

$ cat testp.js 
!function(){
    /*@__PURE__*/function bar() {
        console.log("bar");
    }
    bar();

    ( /*@__PURE__*/ function foo() { console.log("foo"); } )();
}();
$ bin/uglifyjs testp.js -c
!function(){console.log("bar")}();

Why is foo() dropped but bar() kept?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how this will ever be confusing - (/*@__PURE__*/f2) still won't be dropped after this PR. You need it to be AST_Call.expression in order for the pure annotation to be effective.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not logically consistent. See test above. Why is one dropped and one retained?

If you can rig it such that functions can be annotated as pure in the function bar case above and then every single future bar() call will be dropped then it would logically consistent.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is one dropped and one retained?

Because one is an IIFE while the other one isn't - I don't see any confusion there whatsoever.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But once function bar is inlined in bar() why won't it carry the pure annotation comment with it and behave exactly like the IIFE where the AST_Call.expression is annotated? Once inlining is taken into account it becomes logically inconsistent.

/*@__PURE__*/ f1();
(/*@__PURE__*/ f2)();
/*@__PURE__*/ (f3());
(/*@__PURE__*/ f4());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other test cases should behave like v3.2.2 (greedy AST_Call.expression):

/*@__PURE__*/f5(1)(2)(3);
/*@__PURE__*/f6.x(1).y(2).z(3);

These should work same as above:

( /*@__PURE__*/f7(1)(2)(3) );
( /*@__PURE__*/f8.x(1).y(2).z(3) );

Also please add the sequence and array tests from #2629 (comment) to make sure no regressions from v3.2.2.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in fd267ad

(/*@__PURE__*/ f6(1)(2)(3));
/*@__PURE__*/ f7.x(1).y(2).z(3);
(/*@__PURE__*/ f8.x(1).y(2).z(3));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add arguments with side effects for each of these calls?

We want to show that the final call's arguments with side effects are retained if the pure annotated call is dropped. Likewise, we want to demonstrate that the AST_Call.expression will be dropped entirely even if it happens to have side effects.

Basically we want tests to cement the defacto v3.2.2 pure annotation behavior from future changes. This would serve as the pure annotation "spec".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea what the spec should be, since I don't use pure annotation - please list them out and I'll put those into the test cases 😉

This PR is an exercise in cleaning up the mess with comments - by laying a more solid foundations so at least others can make incremental improvements as there will still be cases where comments will be dropped.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately the current broken comment logic dictates much of the pure annotation logic.

In my opinion this:

/*@__PURE__*/(g() || h())( x(), y() );
( /*@__PURE__*/ (a() || b()) )( c(), d() ); 

should produce:

x(), y();
(a() || b())(c(), d());

Notice the pure annotation in the second line marks a non-call expression so it should do nothing. v3.2.2 does not support pure annotations other than calls.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the first line works as described, while the second line produces c(), d();

@kzc
Copy link
Contributor

kzc commented Dec 21, 2017

Could we move the pure annotation discussion here in the main comment area rather than in the code comments? It's easier to follow that way.

@alexlamsl
Copy link
Collaborator Author

So for the example of:

( /*@__PURE__*/ (a() || b()) )( c(), d() );

to me it just reads:

/*@__PURE__*/ (a() || b())(c(), d());

so the end result of c(), d(); seems reasonable.

I can entertain the idea that /*@__PURE__*/ is only marking a() - but the idea of parenthesis grouping scope of comments seem rather far-fetched.

@kzc
Copy link
Contributor

kzc commented Dec 21, 2017

In my opinion if we allow the following line to be dropped entirely:

( /*@__PURE__*/ g )();

Then we'd also have to drop all 3 lines below whether inlined or not:

/*@__PURE__*/ function bar() { console.log("bar"); }
bar();
bar();

And what would be done with this?

( /*@__PURE__*/ baz() )();

Only baz() is annotated as pure. Should baz()(); be dropped?

@alexlamsl
Copy link
Collaborator Author

As for tests for pure annotation - bear in mind we already have a battery of them under test/compress/issue-1261.js

@kzc
Copy link
Contributor

kzc commented Dec 21, 2017

but the idea of parenthesis grouping scope of comments seem rather far-fetched.

I agree it is, but unfortunately comments govern how pure annotations work and will change their behavior dramatically.

@alexlamsl
Copy link
Collaborator Author

Then we'd also have to drop all 3 lines below whether inlined or not:

No, because the annotation is not on a call site.

And what would be done with this?

That is syntactically identical to /*@__PURE__*/ baz()(); - as I said, I am open to having a tighter association for that annotation, i.e. bound to baz() as opposed to baz()()

@kzc
Copy link
Contributor

kzc commented Dec 21, 2017

As for tests for pure annotation - bear in mind we already have a battery of them under test/compress/issue-1261.js

Unfortunately we had neglected to consider all the recently discovered border cases.

@alexlamsl
Copy link
Collaborator Author

There is this possibility to mark pure annotations in lib/parser.js instead of lib/compress.js and that may give us finer granularity.

@kzc
Copy link
Contributor

kzc commented Dec 21, 2017

( /*@__PURE__*/ baz() )();

I am open to having a tighter association for that annotation, i.e. bound to baz() as opposed to baz()()

I'd prefer that. In that way the annotation on baz() in the outer AST_Call.expression would be a no-op since it is still being used. And then the code above would produce:

baz()();

@kzc
Copy link
Contributor

kzc commented Dec 21, 2017

There is this possibility to mark pure annotations in lib/parser.js instead of lib/compress.js and that may give us finer granularity.

Interesting idea. How would it work?

Side note... just realized that none of the pure annotation logic would work with mozilla AST input. That's fine as far as I am concerned.

@alexlamsl
Copy link
Collaborator Author

Interesting idea. How would it work?

We just mark AST_Call.pure as we make new instances of AST_Call - we have local information down to every parenthesis at that point, and comments have already been parsed.

@kzc
Copy link
Contributor

kzc commented Dec 21, 2017

More than a few have asked to annotate functions rather than calls as is the present behavior - I have to admit it would be more logical than what we have as of v3.2.2. The only problem is that we've locked in this IIFE pure annotation logic originally intended to drop ES6 classes once converted to ES5 IIFEs. Is there a way to remain backwards compatible but still allow functions to be annotated as pure?

@alexlamsl
Copy link
Collaborator Author

I'd prefer that. In that way the annotation on baz() in the outer AST_Call.expression would be a no-op since it is still being used.

This may already solve #2331

@kzc
Copy link
Contributor

kzc commented Dec 21, 2017

We just mark AST_Call.pure as we make new instances of AST_Call - we have local information down to every parenthesis at that point, and comments have already been parsed.

That would be ideal. No compressor cost and greater annotation fidelity. I can't think of any downside off-hand.

@alexlamsl
Copy link
Collaborator Author

I can't think of any downside off-hand.

More work for this poor human soul? 😅

@kzc
Copy link
Contributor

kzc commented Dec 21, 2017

This may already solve #2331

#2331 was a legacy pure annotation backwards compatibility decision not to retain side effects in AST_Call.expression, but do retain side effects for AST_Call.args. We were just making official what was already happening in practise.

@kzc
Copy link
Contributor

kzc commented Dec 21, 2017

More work for this poor human soul?

It's not something I'd take on. I'm just the ideas guy. :-)

It's completely up to you whether you want to implement it, but I speak on behalf of the Internet when I say that we are grateful for all your fine work.

@alexlamsl
Copy link
Collaborator Author

Before we go further - is this PR good for solving the issues at hand? Any more tests to add?

We should probably leave improvements of pure annotations to another PR.

@kzc
Copy link
Contributor

kzc commented Dec 21, 2017

When functions are inlined - the pure annotation comments_before travel with them do they not?

In your proposed scheme for the parser marking AST_Call.pure we could immediately remove the text of the pure annotation from the comment in parse so that it does not get incorrectly output after transformations. Presently I can create cases where -c passes=3 --comments all will produce incorrect code if fed to uglify again due to the fact that comments do not respect parens, and uglify removes (and does not AST encode) parentheses.

Before we go further - is this PR good for solving the issues at hand? Any more tests to add?

I think we've covered the main areas of inconsistency.

We should probably leave improvements of pure annotations to another PR.

Agreed. Pure functions can be done at another time should we decide to do tackle it.

@alexlamsl
Copy link
Collaborator Author

When functions are inlined - the pure annotation comments_before travel with them do they not?

They do - but we look at AST_Call.start.comments_before, so we don't get anything from the inlined nodes.

@alexlamsl alexlamsl merged commit edb4e3b into mishoo:master Dec 21, 2017
@alexlamsl alexlamsl deleted the issue-112-372 branch December 21, 2017 20:59
@kzc
Copy link
Contributor

kzc commented Dec 21, 2017

Regarding this PR being merged, I was under the impression that we agreed that the following:

$ echo 'a(); ( /*@__PURE__*/b )(); c();' | bin/uglifyjs -c
a(),c();

would produce:

a(),b(),c();

Did I misunderstand your view or did you plan a follow up PR?

@alexlamsl
Copy link
Collaborator Author

Did I misunderstand your view or did you plan a follow up PR?

The current implementation of pure annotations won't be able to make a distinction between /*@__PURE__*/(b)() and (/*@__PURE__*/b)() - hence my suggestion in #2633 (comment) which should be a standalone PR.

@kzc
Copy link
Contributor

kzc commented Dec 21, 2017

Cool. We don't want to have users create code that depends on this inadvertent unintended behavior.

@kzc
Copy link
Contributor

kzc commented Dec 21, 2017

In the followup PR could you please add the side-effects tests in #2633 (comment)?

/*@__PURE__*/(g() || h())( x(), y() );
( /*@__PURE__*/ (a() || b()) )( c(), d() ); 

@papandreou
Copy link
Contributor

Thank you so much, @alexlamsl! This allows us to get rid of a bunch of workarounds: https://github.com/assetgraph/assetgraph/pull/811/files

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants