Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

esbuild minify fails some recent uglify-js tests #1305

Closed
kzc opened this issue May 22, 2021 · 18 comments
Closed

esbuild minify fails some recent uglify-js tests #1305

kzc opened this issue May 22, 2021 · 18 comments

Comments

@kzc
Copy link
Contributor

kzc commented May 22, 2021

Using node v16.1.0 with the patch below, esbuild e76b49f produces:

$ make clean-all uglify
...
29 failed out of 3163
make: *** [uglify] Error 1

Note that all the original unminified expect_stdout test cases will run successfully on node v16.1.0.

Some failures are due to esbuild being more strict than recent versions of NodeJS, but around half are genuine esbuild minify errors.

diff --git a/Makefile b/Makefile
index 19dfe06..cf97c76 100644
--- a/Makefile
+++ b/Makefile
@@ -372,3 +372,3 @@ github/uglify:
        cd github/uglify && git init && git remote add origin https://github.com/mishoo/uglifyjs.git
-       cd github/uglify && git fetch --depth 1 origin 83a3cbf1514e81292b749655f2f712e82a5a2ba8 && git checkout FETCH_HEAD
+       cd github/uglify && git fetch --depth 1 origin d2a45ba441ed6975021b3524215c01a011dfb46a && git checkout FETCH_HEAD
 
diff --git a/scripts/uglify-tests.js b/scripts/uglify-tests.js
index 4214959..e4cf715 100644
--- a/scripts/uglify-tests.js
+++ b/scripts/uglify-tests.js
@@ -102,2 +102,14 @@ async function test_case(esbuild, test) {
 
+  // Ignore tests without expect_stdout, as these tests cannot be verified when run
+  if (!("expect_stdout" in test)) return;
+
+  // Ignore tests that will likely result in a syntax error in recent NodeJS versions.
+  // uglify-js generally accepts whatever inputs that NodeJS allowed/allows
+  // as long as it is unambiguous with grammar lookahead -
+  // even if it deviates from the latest ECMAScript specification.
+  // `true` in this context means "the minified test program output matches the
+  // unminified test program output for this version of NodeJS - whether valid
+  // or an error."
+  if (test.expect_stdout === true) return;
+
   // Ignore tests that no longer pass in modern versions of node. These tests
@@ -117,17 +129,4 @@ async function test_case(esbuild, test) {
       minify: true,
-      target: 'es5',
     });
   } catch (e) {
-    // These two tests fail because they contain setters without arguments,
-    // which is a syntax error. These test failures do not indicate anything
-    // wrong with esbuild so the failures are ignored. Here is one of the
-    // tests:
-    //
-    //   function f(){var a={get b(){},set b(){}};return{a:a}}
-    //
-    if (test.name === 'unsafe_object_accessor' || test.name === 'keep_name_of_setter') {
-      console.error("*** skipping test with known syntax error:", test.name);
-      return;
-    }
-
     const formatError = ({ text, location }) => {
@evanw
Copy link
Owner

evanw commented May 22, 2021

Thanks for reporting this. I appreciate the heads up. Looking into these now...

@kzc
Copy link
Contributor Author

kzc commented May 22, 2021

29 failed out of 3163

Hmm... I think the skipped test cases have been incorrectly folded into passed.

The figures with the following patch look more reasonable:

$ make clean-all uglify
...
29 failed out of 2180, with 983 skipped
make: *** [uglify] Error 1
diff --git a/Makefile b/Makefile
index 19dfe06..cf97c76 100644
--- a/Makefile
+++ b/Makefile
@@ -372,3 +372,3 @@ github/uglify:
        cd github/uglify && git init && git remote add origin https://github.com/mishoo/uglifyjs.git
-       cd github/uglify && git fetch --depth 1 origin 83a3cbf1514e81292b749655f2f712e82a5a2ba8 && git checkout FETCH_HEAD
+       cd github/uglify && git fetch --depth 1 origin d2a45ba441ed6975021b3524215c01a011dfb46a && git checkout FETCH_HEAD
 
diff --git a/scripts/uglify-tests.js b/scripts/uglify-tests.js
index 4214959..61745a6 100644
--- a/scripts/uglify-tests.js
+++ b/scripts/uglify-tests.js
@@ -9,2 +9,3 @@ const testDir = path.join(repoDir, 'scripts', '.uglify-tests');
 const uglifyDir = path.join(repoDir, 'demo', 'uglify');
+const SKIP = new Error("SKIP");
 let U;
@@ -36,6 +37,8 @@ async function main() {
   let failedTotal = 0;
+  let skippedTotal = 0;
   const runTest = file => test_file(esbuild, path.join(compressDir, file))
-    .then(({ passed, failed }) => {
+    .then(({ passed, failed, skipped }) => {
       passedTotal += passed;
       failedTotal += failed;
+      skippedTotal += skipped;
     });
@@ -46,3 +49,3 @@ async function main() {
 
-  console.log(`${failedTotal} failed out of ${passedTotal + failedTotal}`);
+  console.log(`${failedTotal} failed out of ${passedTotal + failedTotal}, with ${skippedTotal} skipped`);
   if (failedTotal) {
@@ -55,2 +58,3 @@ async function test_file(esbuild, file) {
   let failed = 0;
+  let skipped = 0;
   const tests = parse_test(file);
@@ -59,8 +63,11 @@ async function test_file(esbuild, file) {
     .catch(e => {
-      failed++;
-      console.error(`<U+274C> ${file}: ${name}: ${(e && e.message || e).trim()}\n`);
-      pass = false;
+      if (e === SKIP) {
+        skipped++;
+      } else {
+        failed++;
+        console.error(`<U+274C> ${file}: ${name}: ${(e && e.message || e).trim()}\n`);
+      }
     });
   await Promise.all(Object.keys(tests).map(runTest));
-  return { passed, failed };
+  return { passed, failed, skipped };
 }
@@ -102,2 +109,14 @@ async function test_case(esbuild, test) {
 
+  // Ignore tests without expect_stdout, as these tests cannot be verified when run
+  if (!("expect_stdout" in test)) throw SKIP;
+
+  // Ignore tests that will likely result in a syntax error in recent NodeJS versions.
+  // uglify-js generally accepts whatever inputs that NodeJS allowed/allows
+  // as long as it is unambiguous with grammar lookahead -
+  // even if it deviates from the latest ECMAScript specification.
+  // `true` in this context means "the minified test output matches the
+  // unminified test output for this version of NodeJS - whether valid
+  // or an error."
+  if (test.expect_stdout === true) throw SKIP;
+
   // Ignore tests that no longer pass in modern versions of node. These tests
@@ -110,3 +129,3 @@ async function test_case(esbuild, test) {
     console.error("*** skipping test %j with node_version %j", test.name, test.node_version);
-    return;
+    throw SKIP;
   }
@@ -117,17 +136,4 @@ async function test_case(esbuild, test) {
       minify: true,
-      target: 'es5',
     });
   } catch (e) {
-    // These two tests fail because they contain setters without arguments,
-    // which is a syntax error. These test failures do not indicate anything
-    // wrong with esbuild so the failures are ignored. Here is one of the
-    // tests:
-    //
-    //   function f(){var a={get b(){},set b(){}};return{a:a}}
-    //
-    if (test.name === 'unsafe_object_accessor' || test.name === 'keep_name_of_setter') {
-      console.error("*** skipping test with known syntax error:", test.name);
-      return;
-    }
-
     const formatError = ({ text, location }) => {

@kzc
Copy link
Contributor Author

kzc commented May 22, 2021

If this line in the latest patch is commented out:

  if (test.expect_stdout === true) throw SKIP;

there will be false failures in some tests, but more tests will be run:

51 failed out of 2488, with 675 skipped

Additional code may be added to selectively skip certain tests.

@evanw
Copy link
Owner

evanw commented May 22, 2021

Some of the tests look to me like they are actually encoding a bug in V8 (specifically mishoo/UglifyJS#4269). I have filed the bug here: https://bugs.chromium.org/p/v8/issues/detail?id=11810. I could be wrong of course but I believe properties should be defined in source order.

If the behavior of the input code is buggy in a certain implementation, then I don't think esbuild necessarily has to avoid optimizations that might accidentally generate the correct output. Either that or esbuild should automatically apply a workaround for these bugs if the target environment includes a V8-based VM. Basically I think esbuild should either have a garbage-in-garbage-out philosophy or should always generate the correct output, but it shouldn't bake in implementation bugs into its optimization passes.

@kzc
Copy link
Contributor Author

kzc commented May 22, 2021

uglify-js employs a garbage-in/garbage-out philosophy for the most part.

When possible it will try to agree with NodeJS to prevent unwanted noise in its automated test case generating fuzzer. It has encountered and/or exposed a few V8 bugs.

https://github.com/mishoo/UglifyJS/actions

@evanw
Copy link
Owner

evanw commented May 22, 2021

Some other cases test TDZ handling. I'm not currently preserving TDZ checks because doing so inhibits a lot of optimizations, both at compile-time and run-time. I suspect TDZ checks are rarely relied upon in real-world code but that disabling optimizations which assume local variable references have no side effects would be a very unpopular change (it would likely disable most tree-shaking done by the bundler). Preserving top-level TDZ checks also has extremely negative performance consequences. So I'm inclined to skip over failures due to TDZ checks, although I could potentially be convinced otherwise.

@kzc
Copy link
Contributor Author

kzc commented May 22, 2021

It's up to you. TDZ is a part of the ES spec. One could make the argument that avoiding triggering V8/Chrome bugs is not much different than avoiding the Safari JS engine TDZ slowness with correct ES code. uglify-js does manage to preserve TDZ semantics while providing good minification.

I've also noticed failures related to https://tc39.es/proposal-template-literal-revision/.

@evanw
Copy link
Owner

evanw commented May 22, 2021

One could make the argument that avoiding triggering V8/Chrome bugs is not much different than avoiding the Safari JS engine TDZ slowness with correct ES code.

Yes, one could argue that. However, it seems different to me because of the extreme scale of the problem (the VM locks up for many seconds and it affects virtually every modern code base due to the use of class and let/const). Although it only affects top-level variables so you could potentially argue that the minifier shouldn't consider references side-effect free if they are either non-top-level or maybe inside a try/catch or something. I could see potentially making a change like that.

I've also noticed failures related to https://tc39.es/proposal-template-literal-revision/.

Yeah. I'm aware of that one but haven't fixed it yet because no one has asked for it. I should probably fix it though. It comes up in test262 as well.

By the way the failures due to await are because of esbuild's top-level await support. The parser does not make a distinction between the script and module goals so it always parses top-level await. This hasn't ever been an issue with real-world code and is basically only a problem for conformance tests. But I think it's an acceptable trade-off.

I have fixed the following issues so far, although there are still more issues to be fixed. Here is the text from the release notes:

  • The in operator is now surrounded parentheses inside arrow function expression bodies inside for loop initializers:

    // Original code
    for ((x => y in z); 0; ) ;
    
    // Old output
    for ((x) => y in z; 0; ) ;
    
    // New output
    for ((x) => (y in z); 0; ) ;

    Without this, the in operator would cause the for loop to be considered a for-in loop instead.

  • The statement return undefined; is no longer minified to return; inside async generator functions:

    // Original code
    return undefined;
    
    // Old output
    return;
    
    // New output
    return void 0;

    Using return undefined; inside an async generator function has the same effect as return await undefined; which schedules a task in the event loop and runs code in a different order than just return;, which doesn't hide an implicit await expression.

  • Property access expressions are no longer inlined in template tag position:

    // Original code
    (null, a.b)``, (null, a[b])``;
    
    // Old output
    a.b``, a[b]``;
    
    // New output
    (0, a.b)``, (0, a[b])``;

    The expression a.b`c` is different than the expression (0, a.b)`c`. The first calls the function a.b with a as the value for this but the second calls the function a.b with the default value for this (the global object in non-strict mode or undefined in strict mode).

  • Verbatim __proto__ properties inside object spread are no longer inlined when minifying:

    // Original code
    x = { ...{ __proto__: { y: true } } }.y;
    
    // Old output
    x = { __proto__: { y: !0 } }.y;
    
    // New output
    x = { ...{ __proto__: { y: !0 } } }.y;

    A verbatim (i.e. non-computed non-method) property called __proto__ inside an object literal actually sets the prototype of the surrounding object literal. It does not add an "own property" called __proto__ to that object literal, so inlining it into the parent object literal would be incorrect. The presence of a __proto__ property now stops esbuild from applying the object spread inlining optimization when minifying.

  • The value of this has now been fixed for lowered private class members that are used as template tags:

    // Original code
    x = (new (class {
      a = this.#c``;
      b = 1;
      #c() { return this }
    })).a.b;
    
    // Old output
    var _c, c_fn, _a;
    x = new (_a = class {
      constructor() {
        __privateAdd(this, _c);
        __publicField(this, "a", __privateMethod(this, _c, c_fn)``);
        __publicField(this, "b", 1);
      }
    }, _c = new WeakSet(), c_fn = function() {
      return this;
    }, _a)().a.b;
    
    // New output
    var _c, c_fn, _a;
    x = new (_a = class {
      constructor() {
        __privateAdd(this, _c);
        __publicField(this, "a", __privateMethod(this, _c, c_fn).bind(this)``);
        __publicField(this, "b", 1);
      }
    }, _c = new WeakSet(), c_fn = function() {
      return this;
    }, _a)().a.b;

    The value of this here should be an instance of the class because the template tag is a property access expression. However, it was previously the default value (the global object in non-strict mode or undefined in strict mode) instead due to the private member transformation, which is incorrect.

@kzc
Copy link
Contributor Author

kzc commented May 22, 2021

Some of the features in ECMAScript are just bonkers or seldom used. And TDZ is the bane of any fuzzer implementor trying to generate valid random programs that don't throw an error. But if a decision is made to have esbuild deviate from the ES specification or not support some of its features by design then I think it ought to be documented somewhere. At the very least it might start a discussion to deprecate some less than useful ES features.

@kzc
Copy link
Contributor Author

kzc commented May 22, 2021

However, it seems different to me because of the extreme scale of the problem (the VM locks up for many seconds and it affects virtually every modern code base due to the use of class and let/const).

let/const/class are not fringe features of ES6 - they are central to the language and ought to be implemented efficiently as Chrome/V8 and Mozilla have. In the case of Safari/JavaScriptCore the result was not incorrect - just slow for large bundles. The way I see it, when performance issues in a particular JS engine are papered over by minifiers and bundlers it provides less of an incentive for the vendor or project to push out fixes in a timely manner. Look at IE8 as an example of an entire developer ecosystem extending the life of an obsolete browser a decade longer than it ought to have lasted.

@evanw
Copy link
Owner

evanw commented May 23, 2021

Another issue is that I'm relying on lack of TDZ to implement lazily-initialized modules for correct ordering of dynamically-loaded modules (both require() and import()). For example:

// entry.mjs
console.log('before')
await import('./a.mjs')
console.log('after')
// a.mjs
import { b } from './b.mjs'
export let a = b
// b.mjs
import { a } from './a.mjs'
export let b = a

Evaluating that code should throw an error, and does when you run node entry.mjs:

before
b.mjs:2
export let b = a
               ^

ReferenceError: Cannot access 'a' before initialization
    at b.mjs:2:16
    at ModuleJob.run (node:internal/modules/esm/module_job:175:25)
    at async Loader.import (node:internal/modules/esm/loader:178:24)
    at async entry.mjs:2:1

However, it doesn't when bundled with esbuild (via esbuild --bundle entry.mjs --format=esm --outfile=out.mjs). The reason is that esbuild generates the following code:

var __defProp = Object.defineProperty;
var __esm = (fn, res) => function __init() {
  return fn && (res = (0, fn[Object.keys(fn)[0]])(fn = 0)), res;
};
var __export = (target, all) => {
  for (var name in all)
    __defProp(target, name, { get: all[name], enumerable: true });
};

// b.mjs
var b;
var init_b = __esm({
  "b.mjs"() {
    init_a();
    b = a;
  }
});

// a.mjs
var a_exports = {};
__export(a_exports, {
  a: () => a
});
var a;
var init_a = __esm({
  "a.mjs"() {
    init_b();
    a = b;
  }
});

// entry.mjs
console.log("before");
await Promise.resolve().then(() => (init_a(), a_exports));
console.log("after");

Even exchanging var with let doesn't fix this. If TDZ checks were important then I think I would have to actually generate code to manually track all of the initialized states of all of the variables and also code to manually check whether the variables are initialized before evaluating each reference. That would come at both a code size and run-time speed cost.

I assume enforcing TDZ checks like this would be an unpopular choice (I certainly wouldn't want this behavior for my code in production) but I can't think of another way to do it. Or, well, you could always refuse to bundle the code and generate individual files instead, but that sort of defeats the point of a bundler.

@kzc
Copy link
Contributor Author

kzc commented May 23, 2021

You don't have to convince me - I'm no fan of TDZ and appreciate why it's a hindrance to bundling. I'm just saying it should be documented, that's all. Perhaps you already have.

@kzc
Copy link
Contributor Author

kzc commented May 23, 2021

fwiw, if the multi-module TDZ example above is run through rollup it does preserve NodeJS's TDZ semantics:

$ rollup -v
rollup v2.49.0
$ rollup ./entry.mjs --inlineDynamicImports --silent
console.log('before');
await Promise.resolve().then(function () { return a$1; });
console.log('after');

let b = a;

let a = b;

var a$1 = /*#__PURE__*/Object.freeze({
	__proto__: null,
	a: a
});
$ rollup ./entry.mjs --inlineDynamicImports --silent | node --input-type=module
before
after
file:///[eval1]:5
let b = a;
        ^

ReferenceError: Cannot access 'a' before initialization

rollup can also generate multiple output chunks using:

$ rollup ./entry.mjs -d out --silent

$ ls -1 out/*
out/a-303bc69c.js
out/entry.js

But I have no idea how to get it to produce .mjs suffixed outputs to work with node with top-level await.

However, this non-top-level-await entry point will work:

// entry2.mjs
console.log('before');
(async function() {
    await import('./a.mjs');
})();
console.log('after');
$ rollup ./entry2.mjs -f cjs -d out --silent
$ node out/entry2.js 
before
after
out/a-3134a405.js:3
let b = a;
        ^

ReferenceError: Cannot access 'a' before initialization

@kzc
Copy link
Contributor Author

kzc commented May 23, 2021

But I have no idea how to get it to produce .mjs suffixed outputs to work with node with top-level await.

Finally cracked the code to get rollup to work with NodeJS TLA...

$ rollup ./entry.mjs --entryFileNames "[name].mjs" --chunkFileNames "[name]-[hash].mjs" -d out --silent
$ ls -1 out/*
out/a-a643a2b9.mjs
out/entry.mjs
$ cat out/entry.mjs 
console.log('before');
await import('./a-a643a2b9.mjs');
console.log('after');
$ cat out/a-a643a2b9.mjs 
let b = a;

let a = b;

export { a };

@kzc
Copy link
Contributor Author

kzc commented May 26, 2021

@evanw await as an identifier, TDZ and V8 bugs aside, most of the issues of consequence have been addressed. But I think the failing destructuring catch scope test case issue_4420 was missed in the noise. Here's a slightly more comprehensive catch parameter scope test that also fails in Rollup.

By the way, is Rollup's approach to TDZ semantics preservation as seen above something you would consider to support in esbuild? It doesn't appear to impose any significant code size nor run-time speed cost.

@evanw
Copy link
Owner

evanw commented May 26, 2021

But I have no idea how to get it to produce .mjs suffixed outputs to work with node with top-level await.

The --out-extension feature is intended to be used for this.

But I think the failing destructuring catch scope test case issue_4420 was missed in the noise.

It wasn't missed. That's part of why this issue is still open. I have limited time this week so I was only able to fix some of the things so far.

By the way, is Rollup's approach to TDZ semantics preservation as seen above something you would consider to support in esbuild? It doesn't appear to impose any significant code size nor run-time speed cost.

This should be what happens in esbuild too with --splitting enabled. But code splitting can also be disabled, which may not be a feature that Rollup has.

Also it doesn't look like Rollup treats TDZ checks as a side effect. For example bundling let a = a with Rollup generates an empty file even though that code will throw an exception, which is a side effect.

@kzc
Copy link
Contributor Author

kzc commented May 26, 2021

I have limited time this week so I was only able to fix some of the things so far.

My bad. I mistakenly assumed that the remaining test failures were there by design. No one truly cares about using await as an identifier, for example.

Also it doesn't look like Rollup treats TDZ checks as a side effect.

I'd consider any lack of TDZ checking in Rollup to be a bug. But just using its REPL now I see it's pretty common. It's progressing though - rollup has recently added support for side effects for setters and getters, among other checks now.

A workaround to support TDZ with Rollup would be to use --no-treeshake and employ a minifier like uglify-js or terser that respects TDZ side effects. There's a lot of overlap between Rollup optimizations and minifier optimizations. The only significant drawback with this approach is that webpack-style sideEffects: false code pruning would also be disabled, if the bundle happens to significantly rely upon it. The minifier wouldn't be able to drop the side-effect code for bundled sideEffects: false modules - unless pure annotations were used.

Edit: terser does not reliably respect TDZ side effects:

$ node_modules/.bin/terser -V
terser 5.7.0

$ echo 'w; let w, x = 1+x, y = console.log(123), z;' | node_modules/.bin/terser --toplevel -mc
console.log(123);

uglify-js does however:

$ uglify-js -V
uglify-js 3.13.9

$ echo 'w; let w, x = 1+x, y = console.log(123), z;' | uglify-js --toplevel -mc
l;let l,o=1+o;console.log(123);

@kzc
Copy link
Contributor Author

kzc commented Jul 7, 2021

I'm going to close this issue because it's a moving target with uglify-js adding a dozen or so new test cases per week. Feel free to advance the uglify-js commit hash and retest esbuild from time to time at your convenience.

On a related note, Rollup has just started to retain some of the more common TDZ violations in code, although not as comprehensively as uglify-js.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants