-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Annotation for pure getters (discussion) #3334
Comments
Yes, please! I dont immediately see a value in PUREGETWITHIN though. Annotations are only important for code evaluated during initialization time so how this helps function declarations? Or is it for annotating declaration site instead of the call site, but still really affecting only the call sites? What would happen for such annotated function that is evaluated at init time but its result stay unused? We could take your second example with an IIFE as an example for what im asking. |
The reasoning behind puregetwithin is to annotate every property access expression inside the function or iife in one go. It would be useful for a human to specify "every property access in this iife is pure", as opposed to annotating each and every one. |
Right, I’ve guessed that this has to work smth like that - I’m wondering though how this would impact a file consisting of such an annotated IIFE which return value is not utilized, not assigned to a variable in the source code |
Annotations in comments are inelegant and problematic because they are not first class constructs in ECMAScript. As result, some tools drop necessary parentheses which can subvert the meaning of comment annotations resulting in incorrect code. I suggested in the original terser thread to support a new |
@kzc I thought of that as well, to create a module on NPM and expose compiler annotations there. That should indeed be more robust.
If some day we ever get function decorators in ES we could do
And more stuff could be enabled by passing things to this function, like:
|
Personally, I wouldn't put this proposed
Or just leave these calls in place and polyfill
Or if no minifier or bundler is used the function prefix can just be elided in the source code with a dumb regex string replacement leaving just the
Upstream tools that are oblivious to the |
I see the value and semantic clarity, but I do not like that it produces broken code if the polyfill is not added, and I do not think our users will like it either. A comment is less elegant and has problematic edge cases, but it does not change the control flow and can be easily ignored if it is not supported. |
I feel the same as @lukastaegert - those proposals have value, they are just IMHO a really hard sell for people. I hear pushback about existing annotations from time to time and those proposed calls would feel to people as even more intrusive. |
Makes sense. |
It's a catch 22. Babel by default will currently drop parentheses in complex
The vast majority of user ES code is run through a bundler or minifier anyway. The |
Isnt the given example highlighting non-real use case? I really cant imagine anyone annotating part of the expression and not the other other part - if this is needed for any reason the expression can always be split into two separate statements. |
The Babel pure annotation comment issue above was found in real life create-react-app code when trying to transpile async code to ES5. They had incorrectly chalked it up to being a minifier bug. Such bugs are very difficult to diagnose and beyond the ability of most JS programmers. With AST optimizations and passing the code through various tools it's more common than you think. |
I do not really see the difference between The problem with the latter is of course if code is inlined, then the inlined code would also be marked as having pure getters. I would appreciate such an annotation, but if there are transformations in place, a lot of funny stuff can indeed happen. On the other hand, having a function is problematic as the code no longer runs unmodified, which is something that more and more people are doing again in development thanks to ES module support. Thus this is nothing a library user can "just use" as it makes the library no longer universally useable. Not good for Rollup users. |
Looking at the upstream issue that spawned this discussion, it appears to have been easily resolved with the existing pure call comment annotation. I don't see a need for any new pure annotation functionality - if anything, just a little extra documentation with examples of its use in various scenarios could be helpful. This documentation should also mention enabling the |
The getter issue can only be solved currently by: -const foo = some.getter
+const foo = /*#__PURE__*/(() => some.getter)() which is doable, but it's a super hard sell. Only most hardcore people would do this. Others will just tell people wanting to improve libs tree-shakeability (like me) that it's not worth the gains.
This kinda works only for final consumers (mostly app-level), whereas I would really love to figure out some solution for library code providers. Ideally, consumers shouldn't have to know about an extra flag to enable somewhere so they can benefit. A global flag (on an app-level) is also a huge risk, comparing to local annotations which gives more fine-grained control. Q: isn't it that for example Babel could just introduce additional heuristics or output the code slightly differently to make those edge cases safer? If transpiled code is the main source of edge cases then I think it could only be a matter of tweaking its output to increase safety. |
That's a decision that can be made on an app by app basis by developers after they've tested with it. Advanced users ought to familiarize themselves with minifier and rollup options if they really want to achieve the smallest possible output size.
Babel and any other tool that generates ES from AST - plugins, loaders, the typescript AST API, https://github.com/Rich-Harris/code-red, or what have you.
Users were simply not aware of the pattern. This can be easily resolved with additional documentation. |
Well, just tools that opt into outputting pure annotations, which should limit the scope. If they want to emit those, they should do it in a safe manner - this seems reasonable to me. I understand that the AST might still be manipulated later by a tool not aware of those annotations, but transformations take on this stage should really rarely change the structure of the code dramatically. This is really hard to quantify though, so it probably would be better if we could gather some real-life examples and discuss this based on them.
I don't believe this is a whole story. I found people really hesitant to adjust code just to aid consuming tools, as it feels intrusive to them, whereas it's much easier to accept some extra comments. I thought about writing a babel transform to wrap getters in such an annotated IIFE, but I also lately was thinking that for this kind of thing ESLint plugin is better than a Babel plugin (less magic). And even then - would Terser be able to inline such undropped IIFEs? |
Rollup aside, no AST-based ES generation tool opts into outputting pure annotations - they just emit comments. Comments are just considered to be whitespace - not first class ES constructs. So unless these tools support retaining redundant parentheses with respect to comments by default, they are all subject to this problem.
Please take the time to review the real life bug report above and the babel issue that explains the problem in detail. To be perfectly honest in all the years I've been responding to issues related to pure annotations everyone has found the existing functionality to be adequate once they know how it works. That combined with the optional use of |
Because the annotation I'm talking about marks an entire property access chain it's immune to that specific bug. I don't think we can force people to use a library of sorts to mark their code. |
Even without this proposed functionality, Terser is also subject to these sorts of pure annotation AST code generation bugs since it recently added support for Given the input:
Expected output when run through Terser twice and executed with Node:
Actual output when run through Terser with
Note the incorrectly emitted annotations below. The terser project should consider reverting
fwiw, Rollup appears to handle these scenarios correctly:
This proposal would create additional edge cases in Terser when the code is optimized. Because the pure annotated IIFE pattern already handles the scenario that initiated this discussion, I don't see a compelling reason to extend the existing pure annotation functionality. |
This would be part of the AST, is usually not removed by tools other than minimizers, is ignored when executed, not badly rewritten during formating, has next to no runtime overhead when ignored and no runtime overhead when processed by not supporting minimizer (as it would be removed). It also allows to reuse In contrast to If needed multiple annotations could be separated by |
A sequence expression technique was mentioned in babel/babel#10306 (comment) to preserve pure annotations for calls in a parentheses hostile tool chain.
This proposed pure syntax would work fine for calls and most expression types, but be aware of this scenario:
Replacing
That corner case aside which optimizing compilers and minifiers would have to avoid, the proposed pure sequence syntax has some appealing characteristics. |
@alexlamsl Given that uglify-js is still very actively maintained, do you have any opinions on this matter? |
That's true. I think we could keep this as limitation. Here Possible workaround would be an annotation Unrelated from this I think a const fn = ("__PURE_FUNC__", arg => {
return magic(arg) + 12;
});
const x = fn(magic(42)); // x unused
// minimized to:
magic(42);
// function and call can be dropped
// bug arguments are not considered as pure I think that's easier to reason about compared to annotations at the callsite. Sadly it's not possible to put this annotation onto function declarations with the sequence-based approach. Another limitation. It might be possible to do it this way for statement annotations: "__PURE_FUNC__";
function fn(arg) {
return magic(arg) + 12;
} Maybe also a const x = ("__PURE_CALL__", some().magic().fn(magic(42)));
// minimized to:
magic(42); |
I figured that TypeScript dislikes these construct and produces: error TS2695: Left side of comma operator is unused and has no side effects. But I think we can make them support these constructs should we decide on this syntax. Prettier rewrites "__PURE_FUNC__";
function f() {} to ("__PURE_FUNC__");
function f() {} I think this actually looks better. Putting it in brackets also prevents it to be a directive. So I would propose to make this the actual syntax for statement-level annotations. Sorry if I push this too hard. A general purpose annotation system would be nice for more tools (e. g. webpack). If this finds consensus, I would volunteer to write up a little spec for that. |
I think I'm not a big fan of the pseudo-directive I think expression site annotation is generally more useful than function site annotation because the latter is not always deterministic in determining call sites, but more of a best-effort if functions are passed through variables. |
@kzc I would have to see specific use cases in order to form an opinion (or suggest an existing workaround). It is a fascinating read on all those corner cases though 👻 As for the latest proposal, if implemented I can certainly see myself losing hair staring at some |
Good point. Uglify and its forks heavily rely on the sequence construct for optimization. To support such a an annotation proposal, a fair bit of selective deoptimization would have to be introduced into the code base. |
Something I wanted to point out that may or may not be useful here: Webpack's codegen for modules currently constructs something very similar to what @kzc is proposing, and I think that precedent makes his suggestion all the more interesting. // current webpack output:
var _React = require("react");
Object(x.default.createElement)("span", null, "hello world"); It's relatively trivial to convert this to a hoisted "helper" that is really just assigned to Object. This would obviously change the context when applied to a callee, but I'd be curious to know how much the two needs intersect. // proposed
var __PURE_GET__ = Object;
var _React = require("react");
__PURE_GET__(x.default.createElement)("span", null, "hello world"); FWIW, few/none of these seem to address destructured assignment, which seems worth fixing: // how can this be marked as pure?
const { a: { b: { c } } } = obj;
// ...or this
let c;
({ a: { b: { c } } } = obj); |
// how can this be marked as pure?
("__PURE__")
const { a: { b: { c } } } = obj;
// ...or this
let c;
("__PURE__", { a: { b: { c } } } = obj); |
I propose the addition of annotations to specify which property access expressions are pure. These annotations will also be understood by Terser, but I want to discuss how they should look like, and this project's needs, instead of implementing them there and finding out they don't suit Rollup.
relevant twitter thread where this was brought up to my attention again and I reconsidered
Feature Use Case
/*#__PURE__*/
right now.Feature Proposal
The addition of two annotations:
/*#__PUREGET__*/
, which can be used at the start of a "property access chain", like this:/*#__PUREGET__*/foo.bar.baz
And a
/*#__PUREGETWITHIN__*/
annotation to be added just before a function or IIFE, like this:Maybe it can also be possible to annotate a whole module or whole package (by way of a package.json field) but Terser does not read package.json fields, so Rollup would have to annotate what it gives to Terser by itself.
Hope this post finds you well, and thank you for making the internet a better place :)
The text was updated successfully, but these errors were encountered: