New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend PURE annotations to work on getters as well. #513
Comments
I do not support this proposed feature request. Although slightly awkward, the existing pure annotation behavior can make any arbitrary expression "pure" using this idiom: var drop_if_unused = /*@__PURE__*/(() => expr)(); or if you prefer ES5: var drop_if_unused = /*@__PURE__*/(function () { return expr; })(); Efforts should be spent optimizing minification using this pattern rather than inventing a new set of rules governing pure annotations with potential backward compatibility issues. Angular uses this idiom quite extensively to great effect on a very large code base. The current behavior of Comments are not a first class construct in ASTs and are problematic to say the least. I suggest to leave the semantics of pure annotations as is and limited to calls and new expressions. If any other arbitrary expression is annotated then combinations of other optimizations could lead to shifted code and ambiguous interpretation. If I were to go back in time and reimplement pure annotations I'd probably have chosen a regular function call mechanism, such as |
Problem is that often it's hard to convince people (lib owners) to restructure their code "just" to make it for minifiers/tree-shakers/etc do their job more efficiently. Take a look at this situation: const defineProperty = Object.defineProperty;
const getOwnPropertyNames = Object.getOwnPropertyNames;
const getOwnPropertySymbols = Object.getOwnPropertySymbols;
const getOwnPropertyDescriptor = Object.getOwnPropertyDescriptor;
const getPrototypeOf = Object.getPrototypeOf;
const objectPrototype = Object.prototype; I highly doubt the author will be willing to change this to: const defineProperty = /* #__PURE__ */ (() => Object.defineProperty)();
const getOwnPropertyNames = /* #__PURE__ */ (() => Object.getOwnPropertyNames)();
const getOwnPropertySymbols = /* #__PURE__ */ (() => Object.getOwnPropertySymbols)();
const getOwnPropertyDescriptor = /* #__PURE__ */ (() => Object.getOwnPropertyDescriptor)();
const getPrototypeOf = /* #__PURE__ */ (() => Object.getPrototypeOf)();
const objectPrototype = /* #__PURE__ */ (() => Object.prototype)(); Adding a comment without changing the structure (to, as also admitted by you, awkward one) has a much higher chance to be merged in. |
Sorry, but the use case is not compelling. The vast majority of JS coders only use default minify settings and do not annotate code. If a library author is truly concerned about dead code elimination the existing pure annotation syntax would not be an impediment. As mentioned above, the existing mechanism has been used successfully in very large projects for years. This proposed feature would add unnecessary complexity to the minifier for something that is readily accomplishable. |
The example cited https://github.com/mridgway/hoist-non-react-statics/blob/695c9884d19133dc97b2b2d37118836c03abcd41/src/index.js#L58-L65 could be better optimized by simply moving the const definitions in to the default function exported, since all those definitions are required by the function anyway. The key to dead code elimination is keeping everything in functions and not having toplevel property accesses. @filipesilva wrote a linting plugin to detect such scenarios in the Angular project. |
I think it's a bit awkward that getters have to be treated this way by terser, but I think the "effect vs maintenance cost" ratio here isn't worth it sadly. Even for a different annotation (like |
For some extra context, at Angular we also think it's too onerous to write any But we also do a lot of testing to ensure tree shaking is working as expected in our codebases. Just making a PR for a lib to have the comment would not be enough, it would need a supporting tests to prevent regressions (e.g. ReactiveX/rxjs#4769). The linting plugin that @kzc mentioned is used in that PR along with https://github.com/filipesilva/check-side-effects. |
Maybe we need a package.json field to specify "hey, this package doesn't have any side effects outside of functions, so module init code is all game for you to drop babe". Maybe this is a better solution than spamming pure comments. Then module bundlers would drop a comment or directive saying "(in this here closure/in this here file) getters outside of functions are pure, and calls too." What do you fellow humans think? Reopening for discussion. |
Webpack (here) and Rollup (rollup/rollup#2593) support the I don't think Terser can do anything about that because Terser operates on individual JS files/modules. Checking package.json fields is something that bundlers can do because they actually load all the code according to some logic, but Terser does not do that. This is part of the reason why Rollup does so much better at tree-shaking than Webpack+Terser. Rollup has the module loading information accessible for use by tree shaking, whereas that information is lost in the transition from webpack to terser. |
Since we're having a meta discussion... Terser would be much more effective if it were to become an ES6+ code bundler in its own right. The projects I work on tend to produce equally small bundles with Rollup tree shaking disabled and subsequent minification with Uglify and Terser have to be more conservative by default because they don't know what sort of code they are dealing with. Historically Rollup has been a bit loose with rules governing property accesses in order to make some of its gains, and there are a few longstanding bugs related to that. I won't bother to link to them here, but I was commenting on one such issue in that project just yesterday. Rollup is also becoming more conservative by default as result as these issues are addressed. Micro-optimizations ultimately have much less of an impact than user code reorganization to facilitate dead code elimination. Users by in large do not understand why their apparently unreferenced code and external libraries are being retained. There is no silver bullet. Making upstream tooling tree shaking friendly would have the most bang for the buck. One such example is the yet-to-be-merged Babel PR @Andarist wrote concerning wrapping static class properties into the pure class IIFE would go a long way to dropping a lot of unreferenced code in the wild: babel/babel#6963. |
Wrapping static class props/methods into an IIFE is one of highest yield optimizations that Build Optimizer does. Whenever something goes wrong in that one, bundle sizes balloon up. The single namespace/module concatenation issue is not the full story though. Even if webpack were to do module concatenation perfectly, it would still not be optimal when dynamic imports are present. Terser is now able to follow the indirection provided by webpack module loader, whereas rollup performs it's tree shaking with that knowledge. I don't know if Terser would be able to follow dynamic ES6 imports though. That might be an interesting avenue. |
I've been thinking of creating a terser-integrated module bundler for a while! The problem is, the Terser AST is pretty memory hungry at the moment. However, we might be able to do some interesting stuff by pre-parsing the files and storing a disk representation of the AST (hopefully with something more stable than https://nodejs.org/api/v8.html#v8_serialization_api) in a cache directory. We can even parse them in different threads if we want. Once parsed, they are not re-parsed unless the underlying file changed. Then we find out:
This would allow for some really interesting avenues, where we can compress with knowledge of what exists in other files. And also, with the knowledge of where the output file is going to run (node, worker, browser or all three), allowing us to unwrap UMDs and mark And of course we can use propmangle much more efficiently using a shared cache, which is pretty tricky to set up at the moment. Besides optimizations, I think we can offer some things other bundlers already offer, such as:
It would of course be a major undertaking. |
Were Terser to become a bundler, it would have to replace existing bundlers on a users toolchain. This sounds like a unlikely proposition as bundlers ride on top of their ecosystem of configurations, which Terser would need to match, and exceed, in order to justify the change. Changing bundlers is a undertaking that grows more difficult as projects grow and depend on the bundler featureset. Module-aware is a cool proposition that doesn't exist elsewhere though, except in the Google Closure Compiler. i think Rollup does cross module inline though. I wonder if there'd be a better way for Terser to remain a module processor while having access to bundler module graph information... E.g. something that bundlers can feed to Terser. |
Regarding ES6 dynamic imports, maybe the user can tell us what they are. We can expose a module with a known package name
This hint function (which will be actually be exported from an actual NPM module) is simply an identity function. However, we can statically read these calls and do what needs to be done. |
The cool thing about the import function, is that you can import data-URIs :) we might be able to use that to do something cool. |
In rollup-like output it's probably enough to just be able to follow imports and dynamic imports. I think webpack is looking at ESM output as well. So if Terser has a way to understand those, it can do all the nice things Terser does without needing to bundle anything itself. |
At the very least we can start by taking what the bundlers already know and use it? But how, though? Are you aware of a way to get module bundlers to expose an import graph kind of thing? |
It would quickly run out of RAM. It's what happens when chunks grow too large. |
With that mindset no one would build any software ever. ;-) No LLVM because there's GCC, no V8 engine because there's Spidermonkey, no Webpack because there's Browserify, no Rollup because there's Webpack, no Angular because there's JQuery, no React because there's Angular - you get the idea. There's always room for improvement and innovation. Competition is good. Writing a vanilla ES bundler with the Terser code base is probably a few weeks effort. The first and most important step would be to import modules and rename symbols to coexist in a single namespace. It gets more challenging when multiple output formats and chunking are supported. These things could be supported incrementally over time. I wouldn't be overly concerned about the memory footprint. The other offerings are no less memory hungry. |
I think we can achieve a low memory footprint and acceptable speed as long as we cache the parsed modules on disk. That way we can achieve less cold starts. Every time I restart my create-react-app (it has webpack underneath) it takes very long, and my computer isn't slow. |
@fabiosantoscode Rollup has no such file caching and it has acceptable speed. In watch mode Rollup appears to retain in-memory SHAs for each input. I think it would be better to concentrate on the main features before looking at optimizations. Just import all the code into memory initially - just get ES bundling working. Sometime after the initial implementation if you think you still need caching it can be as simple as having a per-module SHA key of the original source and a minified module payload with |
Caching the AST to disk wouldn't result in any gains and would likely be slower than dealing with ES input. AST in text form can be an order of magnitude larger than minified source code. Binary AST size is only roughly on par with minified source. Nevermind that Terser/Uglify AST is not compatible with ESTree and would require translation that'd erase any gains. Also keep in mind that comments are not currently handled by ESTree. For these reasons it's better to deal with minified source - ES parse time is insignificant (~10%) compared to compress and mangle. I think Terser should strive to move up the tools food chain, which is why I suggested going the full-fledged bundler route. As a minifier library it's relegated to the bottom rung in terms of visibility and funding. Frankly, it's not worth the effort to maintain. For the time spent you'd make more money as an UberEats driver. It's not a sustainable model. |
@kzc when I said I wanted to store the AST on disk, I meant to store a straight on-disk representation of what's in memory, Terser scoping information, a dependency graph and all, not just estree. But nevermind all that, you've showed me that this is not necessary to complete a bundler :) |
Feature request
#__PURE__
works only for functions calls at the moment - it seems to me that it seems reasonable to extend its meaning to also support getters. Especially that there is alreadypure_getters
option but which has some drawbacks:If this is something that you would consider adding then I could potentially look into implementing it myself - especially that it seems that most pieces in the code are already there (pure_getters support and
#__PURE__
handling for function calls).We probably would have to agree if it should make all nested getters pure (as in
i.want.to.access.this
) - but it seems that it should. If we'd consider a single getter marked as pure in such a situation then annotating all of them within a single expression would just look weird and become unwieldy.The text was updated successfully, but these errors were encountered: