[PROPOSAL] mangle property #3293

Open
lifaon74 opened this issue Nov 14, 2018 · 3 comments


@lifaon74

Feature request

Idea to mangle repetitive object properties.

Uglify version (uglifyjs -V)
3.4.9

Uglify script
Let's assume this baseline script to test compression:

var UglifyJS = require('uglify-js');
var code = ``; // see following examples
var result = UglifyJS.minify(code, {
  toplevel: true,
  mangle: {
    properties: false,
    toplevel: true
  }
});
console.log(result.code);

Repetitive object properties access example
Let's assume a script that inserts 8 text nodes into the DOM or, more generally, a script that reads the same property name many times from the same or different objects.

var code = `
(function() {
  var doc = new DocumentFragment();
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  document.body.appendChild(doc);
})();
`;

In this example, the property appendChild is called 9 times.
The output code is:

!function(){var e=new DocumentFragment;e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),e.appendChild(new Text("a")),document.body.appendChild(e)}();

The length is 303B
As we can see, appendChild is repeated many times.

If we use properties: true the output becomes:

!function(){var e=new DocumentFragment;e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),e.e(new Text("a")),document.n.e(e)}();

The length is 210B
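But the code becomes invalid: the built-in DOM names appendChild and body are renamed to e and n, which do not exist on those objects.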
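
For comparison, the existing option can be narrowed so that it no longer touches DOM names. This is a minimal sketch, assuming the project marks its own mangle-safe properties with a leading underscore (a hypothetical convention that the example above does not use). It relies on the regex setting of mangle.properties:

```javascript
// Options object for UglifyJS.minify(code, options).
// Assumption: only properties starting with "_" are safe to rename.
var options = {
  toplevel: true,
  mangle: {
    toplevel: true,
    properties: {
      regex: /^_/ // mangle only property names matching this pattern
    }
  }
};
```

With these options appendChild and body are left alone, so the output stays valid but the repeated appendChild is not shortened, which is the gap this proposal targets.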

Repetitive object properties access optimization
Let's rewrite the code so that an object property is accessed through a function instead of with a dot: doc.appendChild becomes appendChild(doc).

var code = `
function appendChild(obj) {
  return obj.appendChild.bind(obj);
}
(function() {
  var doc = new DocumentFragment();
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(document.body)(doc);
})();
`;

The output is:

function e(e){return e.appendChild.bind(e)}var n;e(n=new DocumentFragment)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(n)(new Text("a")),e(document.body)(n);

The length is 250B
More importantly, the code is still valid!

obj.property1.property2.property3.property4 could be written as property4(property3(property2(property1(obj)))), which could potentially compress to a(b(c(d(e)))).
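
A minimal sketch of that chained form for plain reads, using a made-up nested object (the property names and the value 42 are only illustrative):

```javascript
// One accessor function per property name.
function property1(obj) { return obj.property1; }
function property2(obj) { return obj.property2; }
function property3(obj) { return obj.property3; }
function property4(obj) { return obj.property4; }

// Hypothetical object, just to have something to read from.
var obj = { property1: { property2: { property3: { property4: 42 } } } };

// Same value as obj.property1.property2.property3.property4
var viaFunctions = property4(property3(property2(property1(obj))));
console.log(viaFunctions === obj.property1.property2.property3.property4); // true
```

Each property name then appears only once, inside its helper, and the helper names are ordinary functions that the mangler can shorten when toplevel is enabled.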

Introducing this method to uglify js
The parser could detect repeated property accesses and convert them to functions to enable stronger compression. I suggest something like: properties: true | false | 'none' | 'hard' | 'soft'.
Here false maps to 'none' and true maps to 'hard', both keeping the current behavior, plus a new 'soft' mode that tries to convert properties to functions (only if the resulting size is smaller).
This technique could save a lot of bytes in classes using long property names.
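
The "only if the resulting size is smaller" check could look roughly like the sketch below. It is not part of uglify-js: estimateAliasSaving is a hypothetical helper, it only counts bytes before gzip, and it assumes the generated helper and its parameter each mangle to a single character.

```javascript
// Rough byte estimate for aliasing one property name with a bind helper.
// Assumption: helper name and parameter are one character each after mangling.
function estimateAliasSaving(name, uses) {
  // Direct form: ".name" repeated once per use.
  var direct = uses * ('.' + name).length;
  // Aliased form: one helper definition, then "e(" + ")" around each object.
  var helper = ('function e(e){return e.' + name + '.bind(e)}').length;
  var aliased = helper + uses * 'e()'.length;
  return direct - aliased; // positive means the aliased form is smaller
}

console.log(estimateAliasSaving('appendChild', 9)); // 38
console.log(estimateAliasSaving('appendChild', 2)); // -25
```

The estimate covers only the property accesses themselves. The measured 303B and 250B outputs above also differ in other ways (for instance, the wrapper function is dropped in the second output), so the numbers will not line up exactly.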

Performance
This is a simple performance test on Chrome 70:

function appendChild(obj) {
  return obj.appendChild.bind(obj);
}
(function() {
  var doc = new DocumentFragment();
  console.time('perf');
  for (let i = 0; i < 1e6; i++) {
    // doc.appendChild(new Text(Math.random().toString())); // 1450ms
    appendChild(doc)(new Text(Math.random().toString())); // 1823ms~1500ms
  }
  console.timeEnd('perf');
  console.log(doc.firstChild.wholeText.length);
})();

As we can see, V8 maintains very good performance with this pattern.

PS: What I present here is a general idea of how to optimize object properties with functions for compression.
This method would probably require some adjustments depending on the access context: call, set, or get?

function appendChildCall(obj) {
  return obj.appendChild.bind(obj);
}
// document.body.appendChild(new Text('a'));
appendChildCall(document.body)(new Text('a'));

// --- OR
function appendChildCall(obj) {
  // forward every argument after obj to obj.appendChild
  return obj.appendChild.apply(obj, Array.prototype.slice.call(arguments, 1));
}
// document.body.appendChild(new Text('a'));
appendChildCall(document.body, new Text('a'));


// ----
function appendChildGet(obj) {
  return obj.appendChild;
}
// console.log(document.body.appendChild === document.documentElement.appendChild);
console.log(appendChildGet(document.body) === appendChildGet(document.documentElement));

// ----
function appendChildSet(obj, value) {
  obj.appendChild = value;
}
// document.body.appendChild = function() { console.log('appendChild '); };
appendChildSet(document.body, function() { console.log('appendChild '); });
@kzc
Contributor

kzc commented Nov 15, 2018

This sort of subexpression aliasing proposal comes up a lot. The gzip output is larger.

$ cat ex1.js
(function() {
  var doc = new DocumentFragment();
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  doc.appendChild(new Text('a'));
  document.body.appendChild(doc);
})();

$ cat ex1.js | terser --toplevel -mc | gzip | wc -c
      91
$ cat ex2.js
function appendChild(obj) {
  return obj.appendChild.bind(obj);
}
(function() {
  var doc = new DocumentFragment();
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(doc)(new Text('a'));
  appendChild(document.body)(doc);
})();

$ cat ex2.js | terser --toplevel -mc | gzip | wc -c
     121
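
The same kind of comparison can be sketched in Node with the built-in zlib module, here applied to the two uglify-js outputs quoted in the first post rather than to terser output. Exact gzip sizes depend on the zlib version and settings and include header overhead, so they will not match the figures above byte for byte.

```javascript
var zlib = require('zlib');

// Minified outputs quoted earlier in this thread (uglify-js 3.4.9).
var direct = '!function(){var e=new DocumentFragment;' +
  'e.appendChild(new Text("a")),'.repeat(8) +
  'document.body.appendChild(e)}();';
var aliased = 'function e(e){return e.appendChild.bind(e)}var n;' +
  'e(n=new DocumentFragment)(new Text("a")),' +
  'e(n)(new Text("a")),'.repeat(7) +
  'e(document.body)(n);';

// Report the size before and after gzip for one piece of code.
function sizes(code) {
  return { raw: Buffer.byteLength(code), gzip: zlib.gzipSync(code).length };
}

console.log('direct ', sizes(direct));
console.log('aliased', sizes(aliased));
```

The raw sizes are the 303B and 250B reported in the first post. The gzip column is the one to compare when deciding whether the aliasing pays off.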

@lifaon74
Author

lifaon74 commented Nov 15, 2018

Yes, I agree that gzip loves repetition for better compression. But it could be worth testing this method on big libraries (like Angular or React, which use a lot of DOM methods) to see what happens.

PS: that's why I proposed a 'soft' flag, which allows developers to test whether the code is smaller or not with this optimization.

@kzc
Contributor

kzc commented Nov 15, 2018

But it could be worth testing this method on big libraries (like angular or react which use a lot of DOM methods) and see

Tested a few aliasing variations in the past. Current optimizations seem to work best for most code post gzip. In addition to toplevel, see also: passes, pure_getters and unsafe.

But don't let me stop you from creating a PR and proving otherwise. Compare sizes with test/benchmark.js. The trick is creating a general purpose solution that works with all code. Easier said than done when side effects are considered.
