Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

node --loader treats entrypoint as ESM when should be loaded as CJS #33226

Closed
cspotcode opened this issue May 4, 2020 · 21 comments · May be fixed by #34177
Closed

node --loader treats entrypoint as ESM when should be loaded as CJS #33226

cspotcode opened this issue May 4, 2020 · 21 comments · May be fixed by #34177
Labels
confirmed-bug Issues with confirmed bugs. esm Issues and PRs related to the ECMAScript Modules implementation.

Comments

@cspotcode
Copy link

cspotcode commented May 4, 2020

  • Version: 14.1.0
  • Platform: Ubuntu 19
  • Subsystem: entrypoint handling

What steps will reproduce the bug?

mkdir empty-dir
cd empty-dir
echo '{}' > package.json
touch entrypoint
touch hooks.mjs
node ./entrypoint # <-- no error; no output; exit code 0
node --loader ./hooks.mjs ./entrypoint # <-- error that "" is not a recognized file extension by the default ESM loader

How often does it reproduce? Is there a required condition?

Always

What is the expected behavior?

./entrypoint is executed as a CommonJS module whether or not --loader is passed.

What do you see instead?

When --loader is passed, node always tries to load entrypoint scripts as ESM, ignoring that package.json wants the file to be treated as CJS. This does not happen when --loader is omitted, suggesting that this behavior is a bug.

"" is not a recognized file extension by the default ESM loader
(node:26067) ExperimentalWarning: --experimental-loader is an experimental feature. This feature could change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
internal/modules/run_main.js:54
    internalBinding('errors').triggerUncaughtException(
                              ^

TypeError [ERR_UNKNOWN_FILE_EXTENSION]: Unknown file extension "" for /d/Personal-dev/@TypeStrong/ts-node-repros/empty-dir/entrypoint
    at Loader.defaultGetFormat [as _getFormat] (internal/modules/esm/get_format.js:65:15)
    at Loader.getFormat (internal/modules/esm/loader.js:113:42)
    at Loader.getModuleJob (internal/modules/esm/loader.js:244:31)
    at async Loader.import (internal/modules/esm/loader.js:178:17) {
  code: 'ERR_UNKNOWN_FILE_EXTENSION'
}

Additional information

#33223

@ljharb
Copy link
Member

ljharb commented May 4, 2020

cc @nodejs/modules-active-members

@GeoffreyBooth GeoffreyBooth added the esm Issues and PRs related to the ECMAScript Modules implementation. label May 4, 2020
@guybedford
Copy link
Contributor

This sounds like a resolver bug to me in that extensionless files not inside of "type": "module" with no parent (is the Node.js main) should always be treated as CommonJS for resolver consistency.

@GeoffreyBooth this was likely disabled as a feature in your previous PR.

@guybedford guybedford added the confirmed-bug Issues with confirmed bugs. label May 4, 2020
@guybedford
Copy link
Contributor

Note the workaround is to write a loader that ensures the main is a CommonJS module.

The difficulty here is the getFormat separation.

I'd be open to getFormat having an isMain boolean in its context representing if it is the Node.js main entry point.

@GeoffreyBooth
Copy link
Member

It's any unknown extension, not just extensionless files. Continuing from the steps above:

➜  mv entrypoint entrypoint.foo
➜  node --loader ./hooks.mjs ./entrypoint.foo
(node:97802) ExperimentalWarning: --experimental-loader is an experimental feature. This feature could change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
internal/modules/run_main.js:54
    internalBinding('errors').triggerUncaughtException(
                              ^

TypeError [ERR_UNKNOWN_FILE_EXTENSION]: Unknown file extension ".foo" for /private/tmp/test/empty-dir/entrypoint.foo
    at Loader.defaultGetFormat [as _getFormat] (internal/modules/esm/get_format.js:65:15)
    at Loader.getFormat (internal/modules/esm/loader.js:113:42)
    at Loader.getModuleJob (internal/modules/esm/loader.js:244:31)
    at async Loader.import (internal/modules/esm/loader.js:178:17) {
  code: 'ERR_UNKNOWN_FILE_EXTENSION'
}

It seems that when loaders are enabled at all, the check for known/ESM-supported file extensions is run on the main entry point, regardless of format.

@cspotcode
Copy link
Author

How is the main entrypoint intended to be resolved: via the CommonJS resolver or via the ESM resolver?

If an ESM file imports another file, then I know the ESM resolver is invoked. For require() I know the old CJS resolver is used. After resolution, the target file is classified as either ESM and CJS.

But I don't know the rules governing which resolver is used for the entrypoint. How do I control the choice of resolver for node invocations?

Am I supposed to be able to do node --loader http-esm-loader https://example.com/script.js?

@guybedford
Copy link
Contributor

@cspotcode this function determines whether to use the ES module loader or CommonJS loader on startup - https://github.com/nodejs/node/blob/master/lib/internal/modules/run_main.js#L23.

Whenever you use the --loader flag it sets to use the ES module loader, which has exposed this bug since normally the ES module loader doesn't run extensionless files.

Making the main an exception in the CommonJS case (no type: module) seems like the appropriate fix for this bug.

@cspotcode
Copy link
Author

@guybedford thanks for explaining. It would be nice to get this added to the docs. Otherwise, I think the safe assumption is that registering a custom loader is independent of node's CJS / ESM resolver selection.

https://nodejs.org/dist/latest-v14.x/docs/api/esm.html#esm_experimental_loaders

When hooks are used they only apply to ES module loading and not to any CommonJS modules loaded.

@cspotcode
Copy link
Author

Is this on the roadmap to be fixed? I see that discussion on the linked pull request appears to have stopped last July. Is there anything that can be updated about the minimal reproduction above to help clarify the issue?

@guybedford
Copy link
Contributor

@cspotcode is the workaround desribed in #33226 (comment) not enough for your needs here? The problem is that if Node.js was all ESM and we deprecated CommonJS, node extensionless-file would always fail.

@cspotcode
Copy link
Author

That means it is still a bug? That's what seems unclear: is it still considered a bug in node's behavior, or is it a case of node's docs needing an update? I understand that loader hooks are still experimental and so bugs are to be expected.

@guybedford
Copy link
Contributor

@cspotcode it's actually not really a bug - rather it is an inconsistency in the design. So the solution to the problem entirely depends on the outcome of other discussions:

  1. If we permitted extensionless files for ESM normally, then this would be supported naturally (try adding "type": "module" without a loader in your example, and it also breaks - the loader is the same sort of problem in that it switches you into ESM mode)
  2. Exploring workarounds. Eg, a custom getFormat hook as described. Or an --input-type or a variation of --input-type for mains could be another viable workaround. With loader chaining, it could be possible to have a "extensionless loader" that could be chained in to work around this issue.
  3. Exploring a new loader hook that itself determines if the other loader hooks should be used.

If I were in your position I'd just write a getFormat function that makes it work as per (2).

If you care about the modules system design, then trying to make architectural progress on the problem would be great too.

Personally I would still prefer (1) as the ideal solution, with an isMain resolver argument. But we still potentially have pushback on that unfortunately per its own dedicated issue.

@guybedford
Copy link
Contributor

Actually, thinking about this some more - one approach might be to create an isMain argument to loaders resolve, and then for the default loader when applying --loader to properly return CommonJS when isMain is true and there is no "type": "module" for an unknown extension.

I think that could be a consistent "bug fix" here.

@cspotcode
Copy link
Author

Yes, I believe that is the appropriate bugfix.

Thanks for the information. Here's my situation:

Some node packages are written with a bin entrypoint that does not have a file extension. This means those packages mysteriously fail to launch in the presence of NODE_OPTIONS="--loader=this-is-an-empty-file.mjs ("mysteriously" meaning users are confused, though I know why it's happening)

I should advise the maintainers of these packages that their extensionless bin entrypoints are currently incompatible with the --loader flag, and they may become completely incompatible with future versions of node, if node decides to stop allowing extensionless entrypoints. These maintainers should add a file extension to their bin entrypoints.

If a loader hook were to implement a workaround that resolves extensionless entrypoints as CommonJS, this would potentially make that loader incompatible with future versions of node, or make it incompatible with other loaders it should compose with. Better to avoid any workarounds and continue deferring those decisions to node's default ESM resolver. Loader hooks are experimental so there's always possibility of future incompatiblity, but I would still like to minimize the chances.

@guybedford
Copy link
Contributor

Ok you have me in agreement this is a bug and that is the fix then :) If you feel like becoming a Node.js contributor to this that would be amazing. I will note this as well though for myself if I find any bandwidth.

@juergba
Copy link

juergba commented Jun 5, 2021

@guybedford any news on this one here?

I know --loader is still experimental and Node's docs says: Note: This API is currently being redesigned and will still change.
Is there any chance to get this one in? Thanks.

@vjpr
Copy link

vjpr commented Sep 17, 2021

I'm having the opposite issue. I use NODE_OPTIONS=--experimental-loader @my/loader/loader.mjs, and it loads the loader.mjs as CJS, unless I change @my/loader/package.json#type = 'module' on node@16.7.0. @my/loader is a workspace dependency. Wierd. I wish we had a flag to force something to load as a module.

@JakobJingleheimer
Copy link
Contributor

There is potentially a workaround:

// my-loader.mjs
import path from 'path';

export async function load(resolvedUrl, context, defaultLoad) {
  const url = new URL(resolvedUrl);
  const ext = path.extname(url.pathname);
  const parentDir = path
    .dirname(url.pathname)
    .split(path.sep)
    .at(-1);

  if (!ext && parentDir === 'bin') return defaultLoad(resolvedUrl, {
    ...context,
    format: 'commonjs',
  });

  return defaultLoad(resolvedUrl, context);
}

CommonJS can handle binaries. I just tried this with Mocha, and it worked. I'm not sure if commonjs vs esm matters for binaries.

crapStone pushed a commit to Calciumdibromid/CaBr2 that referenced this issue Mar 8, 2022
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [ts-node](https://typestrong.org/ts-node) ([source](https://github.com/TypeStrong/ts-node)) | devDependencies | minor | [`10.5.0` -> `10.6.0`](https://renovatebot.com/diffs/npm/ts-node/10.5.0/10.6.0) |

---

### Release Notes

<details>
<summary>TypeStrong/ts-node</summary>

### [`v10.6.0`](https://github.com/TypeStrong/ts-node/releases/v10.6.0)

[Compare Source](TypeStrong/ts-node@v10.5.0...v10.6.0)

Questions about this release? Ask in the official discussion thread: [#&#8203;1666](TypeStrong/ts-node#1666)

**Added**

-   Adds workaround for extensionless entrypoints with ESM loader ([#&#8203;1649](TypeStrong/ts-node#1649), [#&#8203;1654](TypeStrong/ts-node#1654))
    -   You can now combine tools such as `mocha` with `--loader ts-node/esm`, where previously node would throw `[ERR_UNKNOWN_FILE_EXTENSION]`
    -   node has a bug where combining `--loader` with an extensionless entrypoint causes this error [nodejs/node#&#8203;33226](nodejs/node#33226)
    -   Some tools, for example `mocha`, have an extensionless entrypoint. ([source](https://github.com/mochajs/mocha/blob/547ffd73535088322579d3d2026432112eae3d4b/package.json#L37), [source](https://github.com/mochajs/mocha/blob/547ffd73535088322579d3d2026432112eae3d4b/bin/mocha))
    -   Combining `NODE_OPTIONS=--loader ts-node/esm` with these tools causes this error.  [mochajs/mocha#&#8203;4645](mochajs/mocha#4645)
    -   node intends to fix this bug in a future release: [nodejs/node#&#8203;41711](nodejs/node#41711)
    -   In the interim, we have implemented a workaround in ts-node.
-   Adds support for target "ES2022" in `moduleTypes` overrides ([#&#8203;1650](TypeStrong/ts-node#1650))

**Fixed**

-   Fixed bug where `--swc` and other third-party transpilers did not respect `moduleTypes` overrides ([#&#8203;1651](TypeStrong/ts-node#1651), [#&#8203;1652](TypeStrong/ts-node#1652), [#&#8203;1660](TypeStrong/ts-node#1660))
-   Fixed bug where node flags were not preserved correctly in `process.execArgv` ([#&#8203;1657](TypeStrong/ts-node#1657), [#&#8203;1658](TypeStrong/ts-node#1658))
    -   This affected `child_process.fork()`, since it uses `process.execArgv` to create a similar child runtime.
    -   With this fix, `child_process.fork()` will preserve both node flags and `ts-node` hooks.
-   Fixed compatibility TypeScript 4.7's API changes ([#&#8203;1647](TypeStrong/ts-node#1647), [#&#8203;1648](TypeStrong/ts-node#1648))

https://github.com/TypeStrong/ts-node/milestone/9

</details>

---

### Configuration

📅 **Schedule**: At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, click this checkbox.

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Co-authored-by: cabr2-bot <cabr2.help@gmail.com>
Reviewed-on: https://codeberg.org/Calciumdibromid/CaBr2/pulls/1195
Reviewed-by: Epsilon_02 <epsilon_02@noreply.codeberg.org>
Co-authored-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org>
Co-committed-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org>
@aduh95
Copy link
Contributor

aduh95 commented Jan 19, 2023

The way it's currently working:

  1. The string passed as entrypoint goes through the CJS loader1 (that was done so the entrypoint can be messed around with in --require scripts).
  2. Then, we check if the user has passed some --loader/--import CLI flags, if so it's passed to the ESM loader2.
  3. Otherwise, if the resolved entrypoint has a .mjs extension, it's passed to the ESM loader2.
  4. Otherwise, if the resolved entrypoint has a .cjs extension, it's passed to the CJS loader.
  5. Otherwise, we check the local package.json to pass the entrypoint to either the ESM loader2 if there's `"type": "module", or the CJS one.

By default, the ESM loader would reject any unknown extension, hence the error message when loading an extensionless file.

There are a few issues with the current approach:

  • we can only pass paths as entrypoints, URLs would not work at all on Windows, and on other OSs it would be a very silly guessing game (but it's still doable by hooking into the CJS loader with --require).
  • as reported here, passing an empty loader changes the behavior (but it's "easy" to workaround tbf).

I think we'd need to at least improve the docs around those issues, and if there's a way to fix them without breaking the ecosystem, that's even better.

Footnotes

  1. More precisely, the entrypoint goes through path.resolve then is passed to Module._findPath. This is to ensure that we never resolve to a node_modules package, so anything that's not an absolute path is interpreted as a relative path.

  2. Before passing to the actual ESM loader, the path returned by the CJS loader in step 1 is converted to a file: URL. 2 3

@ljharb
Copy link
Member

ljharb commented Jan 19, 2023

What if the resolved entrypoint has a .json or .node extension?

@merceyz
Copy link
Member

merceyz commented Oct 2, 2023

This seems to have been fixed by #49869, the provided reproduction doesn't throw anymore.

@GeoffreyBooth
Copy link
Member

We also might be removing --loader, so it’s unlikely to see many bugfixes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
confirmed-bug Issues with confirmed bugs. esm Issues and PRs related to the ECMAScript Modules implementation.
Projects
None yet
9 participants