Allow preventing inlining by function call #350

Closed
dead-claudia opened this issue May 25, 2019 · 32 comments

Comments

@dead-claudia

Just a simple feature request: there should be an escape hatch for inline that forces a function not to be inlined. Of course, this is almost never a good idea, but in performance-critical scenarios, there might be a reason to explicitly not inline a function, mainly to guide engines to inline the caller more easily. Here are a few examples from Node's own code base: 1 2 3. I've also done this in Acorn many years back, speeding it up by 5% simply by not inlining stuff into a giant switch statement.

When a function would not normally be inlined, this of course would be a no-op and could generate a warning (but it doesn't have to).

@leeoniya

related discussion & impl in Closure:

google/closure-compiler#2751

@dead-claudia
Author

dead-claudia commented May 25, 2019

I do have a preference: the annotation should be on the caller's side, not the callee's side. Engines usually care about the caller's side when performing optimizations, so this is where it'd have the greatest impact. Plus, there's already /* @__PURE__ */ which operates on the expression side, so this would glide in naturally in a place similar to that, and this in general operates closer to how Terser implements inlining anyways IIUC.

Edit: By "caller's side" I mean like in /* @noinline */ foo() and by "callee's side" I mean something like /* @noinline */ function foo() { ... }.

@leeoniya

leeoniya commented May 26, 2019

i wonder how the "caller" side hint would work from a minifier's perspective.

it seems like it would rarely make sense to inline in the first place since you'd be saving just function a(b,c) {} worth of chars. but let's say this does make sense, then it would only happen in a single place since duplicating the body of the function for inlining in separate locations would be more wasteful for a minifier than not inlining at all. are there any minifiers that inline the function body in multiple locations?

for perf i imagine you'd want exactly the opposite on the caller side: /* @inline */ foo() while disabling auto-inlining globally (which to me only makes sense in a single location, so works well at the callee def). edit: this essentially means you're manually handling any inlining minification of non-hot functions.

@dead-claudia
Author

@leeoniya I've not had much of an issue with perf when it comes to automatic inlining, and I'd rather keep the option enabled for the 90% of code that could generally benefit from the reduced code size. It's just that 10% where I need a little more fine-grained control.

@dead-claudia
Author

And Closure Compiler IIRC does (or at least did) inline small functions across multiple locations in its advanced optimization mode, which optimizes for speed over size somewhat.

@fabiosantoscode
Collaborator

I like this feature. It doesn't look like a lot of lines of code, for quite a significant benefit. Thanks for this issue!

@gdh1995
Contributor

gdh1995 commented Jul 17, 2019

I also expect this, because currently I have to disable the reduce_funcs option, in order to avoid too many temporary functions in closures.

@dead-claudia
Author

Any news on this? It's a mild blocker for me. There's a few cases where I've explicitly split a closure into a separate function to force the engine to generate a smaller closure to save some memory (the closure is what's returned), and I can't ensure this remains the case without such an escape hatch.

@fabiosantoscode
Collaborator

I think it depends on your VM, but IIUC v8 and spidermonkey don't have unused variables sitting around in closure memory. Unless you use eval and the engine can't tell what you need in there?

To answer your question, I don't have plans to work on this right now, it depends on how much time I get and how serious are the issues coming in.

It might be done this year, but I have a day job to attend to :)

@fabiosantoscode
Collaborator

Regarding plans on how to implement this, I've been advised by people smarter than me to use a caller-side annotation. It might be a bother if you're writing these by hand and have multiple call sites, but then again these annotations are meant more for compilers and macros than anything else.

So something like /* __NOINLINE__ */ call() (and possibly /* __INLINE__ */ call() too).

@dead-claudia
Author

@fabiosantoscode That's my preference, too, a caller-side annotation.

> I think it depends on your VM, but IIUC v8 and spidermonkey don't have unused variables sitting around in closure memory. Unless you use eval and the engine can't tell what you need in there?

It's been a while since I've read and tested this, so I tested this just now and profiled the last two in an empty data:text/html, page inside a Chrome DevTools REPL, using the following code:

// Merged
function getSetA(init, a, b, c, d, e, f, g, h, onchange) {
	return onchange != null
		? function () {
			window.temp = [a, b, c, d, e, f, g, h]
			if (arguments.length) {
				var temp = onchange, prev = init
				init = arguments[0]
				onchange = undefined
				try {
					if (temp) temp(init, prev)
				} finally {
					onchange = temp
				}
			}
			return init
		}
		: function () {
			if (arguments.length) init = arguments[0]
			return init
		}
}

// Split
function observableGetSet(init, a, b, c, d, e, f, g, h, onchange) {
	return function () {
		if (arguments.length) {
			window.temp = [a, b, c, d, e, f, g, h]
			var temp = onchange, prev = init
			init = arguments[0]
			onchange = undefined
			try {
				if (temp) temp(init, prev)
			} finally {
				onchange = temp
			}
		}
		return init
	}
}

function getSetB(init, a, b, c, d, e, f, g, h, onchange) {
	return onchange != null
		? observableGetSet(init, a, b, c, d, e, f, g, h, onchange)
		: function () {
			if (arguments.length) init = arguments[0]
			return init
		}
}

// I ran the above, started the profiler, ran these four statements, and then stopped it.
window.a0 = getSetA("string", 1, 2, 3, 4, 5, 6, 7, 8, null)
window.a1 = getSetA("string", 1, 2, 3, 4, 5, 6, 7, 8, (...x) => window.temp2 = x)
window.b0 = getSetB("string", 1, 2, 3, 4, 5, 6, 7, 8, null)
window.b1 = getSetB("string", 1, 2, 3, 4, 5, 6, 7, 8, (...x) => window.temp2 = x)

And here are the sizes of the allocated contexts for each:

  • window.a0 - 128 bytes
  • window.a1 - 128 bytes
  • window.b0 - 56 bytes
  • window.b1 - 128 bytes

Notes:

  • This was tested on macOS High Sierra 10.13, 64-bit. Numbers may differ in a 32-bit OS and possibly on other 64-bit operating systems.
  • The closures themselves are 64 bytes each. You have to dig deeper to see the size of the allocated closure context.
  • The global assignments are to force the arguments to be assumed used and not optimized out. The global array is assigned only in the variant with the onchange callback.

The big differentiating factor here is that a0 and a1 build closures within the same function, while b0 and b1 build the two closures in two different functions. V8 appears to build the same size closure context for both, even if not all variables are used.

Variables not used in any closure are generally kept out, which is why window.b0 is substantially smaller than window.b1. But this doesn't carry over to variables used in one closure but not another in the same function, as evidenced by window.a0 and window.a1 being the same size.

So yeah, you're correct for variables not used in any closure, but not for those used in some closures.

> To answer your question, I don't have plans to work on this right now, it depends on how much time I get and how serious are the issues coming in.
>
> It might be done this year, but I have a day job to attend to :)

Okay, fair enough.

@fabiosantoscode
Collaborator

Thank you for your understanding :) I am passionate about making Terser the best it can be, but my job comes before issues, which come before features :)

@gdh1995
Contributor

gdh1995 commented Oct 9, 2019

Well, reduce_funcs has been deprecated, and I suddenly find that I cannot keep a function from being embedded.

I have some named functions in a closure. All of them are fairly complicated (they have arguments and local variables), and one of them is exported and will be called many times. I want to compress the code while keeping those inner functions from being embedded as anonymous functions into the exported one.

I used to set reduce_funcs to false to achieve this, but today I noticed that terser v4.2.1+ always embeds them, even when I use compress { inline: false }. Is there still any way to avoid it? If not, I would like a new option for this.

My intents

My intent is to put some logic not in anonymous IIFEs ({ ...; (()=>{...})(); ... }) but in long-lived functions in an outer closure, so that:

  • they will certainly be optimized by JavaScript runtimes like V8,
  • no temporary functions are created while the exported function executes.

If V8 has done some tricks to achieve these for anonymous IIFE functions, please help me and give some technology articles or links to C++ source code about it. Thanks!

@fabiosantoscode
Collaborator

@gdh1995 when this has been implemented you should get around your issue.

I removed reduce_funcs since it was causing undue technical debt in an already complicated codebase. I didn't foresee that being a problem, but indeed there are many reasons why one would not want a function to be inlined somewhere else.

reduce_funcs is now enabled by default when reduce_vars is turned on. If you want to see if that affects performance, you could run a benchmark comparing your code before and after reduce_vars.
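
A minimal sketch of the two configurations such a benchmark could compare. `compress.reduce_vars` is an existing Terser compress option; the variable names and the `describeBuild` helper are hypothetical, and how the objects are handed to Terser (directly or through a wrapper such as gulp-uglify) is left out because it depends on the build setup.

```javascript
// Baseline: reduce_vars on, which now also covers what reduce_funcs used to control.
const withReduceVars = {
  compress: { reduce_vars: true },
};

// Variant: reduce_vars off. This also turns off variable substitution,
// so the output is expected to be larger.
const withoutReduceVars = {
  compress: { reduce_vars: false },
};

// Hypothetical helper: label each build so benchmark results can be matched
// to the configuration that produced them.
function describeBuild(options) {
  return options.compress.reduce_vars ? "reduce_vars on" : "reduce_vars off";
}

console.log(describeBuild(withReduceVars));
console.log(describeBuild(withoutReduceVars));
```

Each object would be passed as the options for one build, and the two outputs benchmarked against each other.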

@gdh1995
Contributor

gdh1995 commented Oct 24, 2019

What about this: add an option that defines a limit X (a number). If a function belongs to a large closure that declares more than X variables and functions, Terser simply avoids embedding its definition when the call site is in one of the child scopes of the function's own scope.

The idea behind it: if I've hard-coded lots of functions (e.g. >= 20 items), it means I've already extracted enough code to avoid allocating temporary function instances frequently, so embedding should be skipped.

I've written a naive version of this idea in gdh1995@1365bbc and would welcome reviews.

Updated: I've filed #500 to discuss this.

@dead-claudia
Author

dead-claudia commented Oct 24, 2019 via email

@fabiosantoscode
Collaborator

There is now a /*#__NOINLINE__*/ (as well as an inline equivalent) annotation you can use on a call, which tells Terser to not inline a function into a call. Please give it a go, and if it fails in any way, open an issue!

Thanks!
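
As a rough illustration of the call-site annotation in the context of the closure-splitting use case from earlier in this thread, here is a small sketch. The function names `observable` and `getSet` are made up for the example; only the `/*#__NOINLINE__*/` comment itself comes from the thread. To a JavaScript engine the annotation is an ordinary comment, so the code runs unchanged with or without a minifier.

```javascript
// Helper that builds the observable closure. Keeping it as a separate
// function means the returned closure is created in its own scope.
function observable(init, onchange) {
  return function (...args) {
    if (args.length) {
      const prev = init;
      init = args[0];
      onchange(init, prev);
    }
    return init;
  };
}

function getSet(init, onchange) {
  if (onchange == null) {
    return function (...args) {
      if (args.length) init = args[0];
      return init;
    };
  }
  // Call-site annotation: asks Terser not to inline `observable` here.
  return /*#__NOINLINE__*/ observable(init, onchange);
}

const changes = [];
const value = getSet(1, (next, prev) => changes.push([prev, next]));
value(2);
console.log(value(), changes);
```

Because the hint sits on the call rather than on the definition, other call sites of the same helper remain free to be inlined.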

@gdh1995
Contributor

gdh1995 commented Nov 13, 2019

@fabiosantoscode Sorry but I think the /*#__NOINLINE__*/ is not what I want at all.

Expected

A function which is used only once can keep itself standalone.

The real result

A function is still embedded as an IIFE.

Here's my test:

// input.js
(function(){
  function foo(val) { return val; }
  function bar() {
    var pass = 1;
    pass = /*@__NOINLINE__*/ foo(pass);
    window.data = pass;
  }
  window.bar = bar;
  bar();
})();

And terser -c inline=true input.js prints:

!function(){function bar(){var pass=1;pass=function(val){return val}(pass),window.data=pass}window.bar=bar,bar()}();

I think a better result should be !function(){function foo(val){return val}function bar(){var pass=1;pass=foo(pass),window.data=pass}window.bar=bar,bar()}();

@fabiosantoscode
Collaborator

My bad!

Thanks for the test case!

@dead-claudia
Author

dead-claudia commented Nov 13, 2019 via email

@fabiosantoscode
Collaborator

Yeah @isiahmeadows my intention was for it to work this way, but I did something wrong because there's two ways that terser can inline a function and I didn't cover both :)

@gdh1995
Contributor

gdh1995 commented Dec 29, 2019

Um, it seems that this is not solved yet?

@fabiosantoscode
Collaborator

@gdh1995 I can't reproduce the issue anymore with the latest Terser.

@fabiosantoscode
Collaborator

Oh, actually I can :)

@fabiosantoscode
Collaborator

I've made some progress towards this.

@fabiosantoscode
Collaborator

I'm going to release 4.6.2 tomorrow with this. I've tried a few ways that noinline might fail to prevent an inline, but there might still be something missing.

@gdh1995
Contributor

gdh1995 commented Jan 9, 2020

Oops, I tested terser@4.6.2 just now, and all three of the /*#__NOINLINE__*/ func() annotations in my project Vimium C failed to protect their functions.

I'll test more and put a shorter example here in 1~2 days.

@fabiosantoscode
Collaborator

Thank you, I'll look into it once there's a repro. If there are three cases, please include all three. Each might fail for its own specific reason.

@gdh1995
Contributor

gdh1995 commented Jan 13, 2020

@fabiosantoscode Sorry, I made a mistake in my tests. The comments do work now, but I have a new, related problem.

I once wrote:

// getGulpUglify returns `gulp-uglify` configured to use `terser`
stream = stream.pipe(getGulpUglify()(config));
stream = stream.pipe(getGulpUglify()({...config, mangle: null}));

and the first pass removed all /*#__NOINLINE__*/ annotations, so the second pass lost my instructions and embedded all the target functions.

I then tried config.output.comments = /^[#@!]/, but it has no effect.

Therefore, I wonder:

  • should terser always remove /*#__NOINLINE__*/ in the compress phase?
  • maybe it's better to leave them for output.comments to filter out.
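
For reference, a quick sketch of what the /^[#@!]/ pattern above would select, assuming the pattern is tested against the comment text with the /* and */ delimiters stripped (that is an assumption about how the filter is applied, not something confirmed in this thread). The sample comment bodies are made up.

```javascript
const keepPattern = /^[#@!]/;

// Comment bodies as they would appear between /* and */.
const commentBodies = ["#__NOINLINE__", "@__PURE__", "! license", " plain note"];

// Keep only the bodies whose first character is #, @ or !.
const kept = commentBodies.filter((body) => keepPattern.test(body));
console.log(kept);
```

Under that assumption the pattern itself does match the annotation, which suggests the comments were being dropped before the output filter ever saw them.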

@fabiosantoscode
Collaborator

@gdh1995 that's been fixed in #550. I'm going to ship the new version now.

Also, be careful, since Terser modifies your config object. Unless gulp-uglify copies them beforehand for you.
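
One way to guard against that when running two passes from the same config is to copy the options before each pass. A minimal sketch, where `cloneOptions` is a hypothetical helper (not part of Terser or gulp-uglify) and the last lines merely simulate a minifier writing into the options it was given:

```javascript
// Hypothetical helper: copies plain objects and arrays recursively, and keeps
// everything else (RegExp, functions, primitives) by reference.
function cloneOptions(value) {
  if (Array.isArray(value)) return value.map(cloneOptions);
  if (
    value !== null &&
    typeof value === "object" &&
    Object.getPrototypeOf(value) === Object.prototype
  ) {
    const copy = {};
    for (const key of Object.keys(value)) copy[key] = cloneOptions(value[key]);
    return copy;
  }
  return value;
}

const config = { compress: { inline: true }, output: { comments: /^[#@!]/ } };
const firstPass = cloneOptions(config);
const secondPass = { ...cloneOptions(config), mangle: null };

// Simulate a minifier mutating the options object it received.
firstPass.compress.inline = 3;
console.log(config.compress.inline); // still true
```

A shallow copy such as `{...config}` alone would not be enough here, because nested objects like `compress` would still be shared between passes.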

@gdh1995
Contributor

gdh1995 commented Jan 15, 2020

Thanks. Now it works as I expect.

gulp-uglify has always shallow-copied the config object since v3.0.2.

@fabiosantoscode
Collaborator

Awesome. If you have any further issues regarding this feature please open a new issue!

Cheers!

developit added a commit to preactjs/preact that referenced this issue Jun 3, 2021
We are heavily reliant on `reduce_funcs`, which was removed in Terser 5, but [will be added back in 5.2.2 or 5.3](terser/terser#350).
In the meantime, we have to disable `reduce_vars` to prevent function inlining, although this has a large size cost since it also prevents variable inlining.

4 participants