Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ESM node processes being unable to fork into other scripts #1814

Merged
merged 11 commits into from Jul 13, 2022
131 changes: 97 additions & 34 deletions src/bin.ts
Expand Up @@ -48,7 +48,8 @@ export function main(
const state: BootstrapState = {
shouldUseChildProcess: false,
isInChildProcess: false,
entrypoint: __filename,
isCli: true,
tsNodeScript: __filename,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this renamed to indicate that this is the path to ts-node's bootstrapper, but not the user's desired entrypoint?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I renamed that mostly to avoid the ambiguity here (at least ambiguity to me). Happy to revert if you would want it the old way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renaming is good, makes sense to me.

parseArgvResult: args,
};
return bootstrap(state);
Expand All @@ -62,7 +63,12 @@ export function main(
export interface BootstrapState {
isInChildProcess: boolean;
shouldUseChildProcess: boolean;
entrypoint: string;
/**
* True if bootstrapping the ts-node CLI process or the direct child necessitated by `--esm`.
* false if bootstrapping a subsequently `fork()`ed child.
*/
isCli: boolean;
tsNodeScript: string;
parseArgvResult: ReturnType<typeof parseArgv>;
phase2Result?: ReturnType<typeof phase2>;
phase3Result?: ReturnType<typeof phase3>;
Expand All @@ -73,12 +79,16 @@ export function bootstrap(state: BootstrapState) {
if (!state.phase2Result) {
state.phase2Result = phase2(state);
if (state.shouldUseChildProcess && !state.isInChildProcess) {
// Note: When transitioning into the child-process after `phase2`,
// the updated working directory needs to be preserved.
return callInChild(state);
}
}
if (!state.phase3Result) {
state.phase3Result = phase3(state);
if (state.shouldUseChildProcess && !state.isInChildProcess) {
// Note: When transitioning into the child-process after `phase2`,
// the updated working directory needs to be preserved.
return callInChild(state);
}
}
Expand Down Expand Up @@ -264,8 +274,7 @@ function parseArgv(argv: string[], entrypointArgs: Record<string, any>) {
}

function phase2(payload: BootstrapState) {
const { help, version, code, interactive, cwdArg, restArgs, esm } =
payload.parseArgvResult;
const { help, version, cwdArg, esm } = payload.parseArgvResult;

if (help) {
console.log(`
Expand Down Expand Up @@ -319,28 +328,14 @@ Options:
process.exit(0);
}

// Figure out which we are executing: piped stdin, --eval, REPL, and/or entrypoint
// This is complicated because node's behavior is complicated
// `node -e code -i ./script.js` ignores -e
const executeEval = code != null && !(interactive && restArgs.length);
const executeEntrypoint = !executeEval && restArgs.length > 0;
const executeRepl =
!executeEntrypoint &&
(interactive || (process.stdin.isTTY && !executeEval));
const executeStdin = !executeEval && !executeRepl && !executeEntrypoint;

const cwd = cwdArg || process.cwd();
/** Unresolved. May point to a symlink, not realpath. May be missing file extension */
const scriptPath = executeEntrypoint ? resolve(cwd, restArgs[0]) : undefined;
const cwd = cwdArg ? resolve(cwdArg) : process.cwd();

// If ESM is explicitly enabled through the flag, stage3 should be run in a child process
// with the ESM loaders configured.
if (esm) payload.shouldUseChildProcess = true;

return {
executeEval,
executeEntrypoint,
executeRepl,
executeStdin,
cwd,
scriptPath,
};
}

Expand Down Expand Up @@ -372,7 +367,15 @@ function phase3(payload: BootstrapState) {
esm,
experimentalSpecifierResolution,
} = payload.parseArgvResult;
const { cwd, scriptPath } = payload.phase2Result!;
const { cwd } = payload.phase2Result!;

// NOTE: When we transition to a child process for ESM, the entry-point script determined
// here might not be the one used later in `phase4`. This can happen when we execute the
// original entry-point but then the process forks itself using e.g. `child_process.fork`.
// We will always use the original TS project in forked processes anyway, so it is
// expected and acceptable to retrieve the entry-point information here in `phase2`.
// See: https://github.com/TypeStrong/ts-node/issues/1812.
const { entryPointPath } = getEntryPointInfo(payload);

const preloadedConfig = findAndReadConfig({
cwd,
Expand All @@ -387,7 +390,12 @@ function phase3(payload: BootstrapState) {
compilerHost,
ignore,
logError,
projectSearchDir: getProjectSearchDir(cwd, scriptMode, cwdMode, scriptPath),
projectSearchDir: getProjectSearchDir(
cwd,
scriptMode,
cwdMode,
entryPointPath
),
project,
skipProject,
skipIgnore,
Expand All @@ -403,23 +411,77 @@ function phase3(payload: BootstrapState) {
experimentalSpecifierResolution as ExperimentalSpecifierResolution,
});

// If ESM is enabled through the parsed tsconfig, stage4 should be run in a child
// process with the ESM loaders configured.
if (preloadedConfig.options.esm) payload.shouldUseChildProcess = true;

return { preloadedConfig };
}

/**
* Determines the entry-point information from the argv and phase2 result. This
* method will be invoked in two places:
*
* 1. In phase 3 to be able to find a project from the potential entry-point script.
* 2. In phase 4 to determine the actual entry-point script.
*
* Note that we need to explicitly re-resolve the entry-point information in the final
* stage because the previous stage information could be modified when the bootstrap
* invocation transitioned into a child process for ESM.
*
* Stages before (phase 4) can and will be cached by the child process through the Brotli
* configuration and entry-point information is only reliable in the final phase. More
* details can be found in here: https://github.com/TypeStrong/ts-node/issues/1812.
*/
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding all these comments.

It makes sense to me that we must repeat resolution in phase 4. Are there any situations where we know, without any risk, that we can skip this duplicate resolution? It would be a performance optimization.

If we cannot safely do this, then we shouldn't. But if there's a safe way to do this, then I might try my hand at adding it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could re-use this when the argvResult has not changed. We only need the rerun this function due to the patches in the child loader entry-point. We could indicate this change through a flag and request re-computation.

This feels like a micro optimization through, not worth it personally as this is just some simple logical expressions.

function getEntryPointInfo(state: BootstrapState) {
const { code, interactive, restArgs } = state.parseArgvResult!;
const { cwd } = state.phase2Result!;
const { isCli } = state;

// Figure out which we are executing: piped stdin, --eval, REPL, and/or entrypoint
// This is complicated because node's behavior is complicated
// `node -e code -i ./script.js` ignores -e
const executeEval = code != null && !(interactive && restArgs.length);
const executeEntrypoint = !executeEval && restArgs.length > 0;
const executeRepl =
!executeEntrypoint &&
(interactive || (process.stdin.isTTY && !executeEval));
const executeStdin = !executeEval && !executeRepl && !executeEntrypoint;

/**
* Unresolved. May point to a symlink, not realpath. May be missing file extension
* NOTE: resolution relative to cwd option (not `process.cwd()`) is legacy backwards-compat; should be changed in next major: https://github.com/TypeStrong/ts-node/issues/1834
*/
const entryPointPath = executeEntrypoint
? isCli
? resolve(cwd, restArgs[0])
: resolve(restArgs[0])
: undefined;

return {
executeEval,
executeEntrypoint,
executeRepl,
executeStdin,
entryPointPath,
};
}

function phase4(payload: BootstrapState) {
const { isInChildProcess, entrypoint } = payload;
const { isInChildProcess, tsNodeScript } = payload;
const { version, showConfig, restArgs, code, print, argv } =
payload.parseArgvResult;
const { cwd } = payload.phase2Result!;
const { preloadedConfig } = payload.phase3Result!;

const {
entryPointPath,
executeEntrypoint,
executeEval,
cwd,
executeStdin,
executeRepl,
executeEntrypoint,
scriptPath,
} = payload.phase2Result!;
const { preloadedConfig } = payload.phase3Result!;
executeStdin,
} = getEntryPointInfo(payload);

/**
* <repl>, [stdin], and [eval] are all essentially virtual files that do not exist on disc and are backed by a REPL
* service to handle eval-ing of code.
Expand Down Expand Up @@ -566,12 +628,13 @@ function phase4(payload: BootstrapState) {

// Prepend `ts-node` arguments to CLI for child processes.
process.execArgv.push(
entrypoint,
tsNodeScript,
...argv.slice(2, argv.length - restArgs.length)
);
// TODO this comes from BoostrapState

// TODO this comes from BootstrapState
process.argv = [process.argv[1]]
.concat(executeEntrypoint ? ([scriptPath] as string[]) : [])
.concat(executeEntrypoint ? ([entryPointPath] as string[]) : [])
.concat(restArgs.slice(executeEntrypoint ? 1 : 0));
devversion marked this conversation as resolved.
Show resolved Hide resolved

// Execute the main contents (either eval, script or piped).
Expand Down
18 changes: 18 additions & 0 deletions src/child/argv-payload.ts
@@ -0,0 +1,18 @@
import { brotliCompressSync, brotliDecompressSync, constants } from 'zlib';

/** @internal */
export const argPrefix = '--brotli-base64-config=';

/** @internal */
export function compress(object: any) {
return brotliCompressSync(Buffer.from(JSON.stringify(object), 'utf8'), {
[constants.BROTLI_PARAM_QUALITY]: constants.BROTLI_MIN_QUALITY,
}).toString('base64');
}

/** @internal */
export function decompress(str: string) {
return JSON.parse(
brotliDecompressSync(Buffer.from(str, 'base64')).toString()
);
}
28 changes: 18 additions & 10 deletions src/child/child-entrypoint.ts
@@ -1,16 +1,24 @@
import { BootstrapState, bootstrap } from '../bin';
import { brotliDecompressSync } from 'zlib';
import { argPrefix, compress, decompress } from './argv-payload';

const base64ConfigArg = process.argv[2];
const argPrefix = '--brotli-base64-config=';
if (!base64ConfigArg.startsWith(argPrefix)) throw new Error('unexpected argv');
const base64Payload = base64ConfigArg.slice(argPrefix.length);
const payload = JSON.parse(
brotliDecompressSync(Buffer.from(base64Payload, 'base64')).toString()
) as BootstrapState;
payload.isInChildProcess = true;
payload.entrypoint = __filename;
payload.parseArgvResult.argv = process.argv;
payload.parseArgvResult.restArgs = process.argv.slice(3);
const state = decompress(base64Payload) as BootstrapState;

bootstrap(payload);
state.isInChildProcess = true;
state.tsNodeScript = __filename;
state.parseArgvResult.argv = process.argv;
state.parseArgvResult.restArgs = process.argv.slice(3);

// Modify and re-compress the payload delivered to subsequent child processes.
// This logic may be refactored into bin.ts by https://github.com/TypeStrong/ts-node/issues/1831
if (state.isCli) {
const stateForChildren: BootstrapState = {
...state,
isCli: false,
};
state.parseArgvResult.argv[2] = `${argPrefix}${compress(stateForChildren)}`;
}

bootstrap(state);
15 changes: 8 additions & 7 deletions src/child/spawn-child.ts
@@ -1,12 +1,15 @@
import type { BootstrapState } from '../bin';
import { spawn } from 'child_process';
import { brotliCompressSync } from 'zlib';
import { pathToFileURL } from 'url';
import { versionGteLt } from '../util';
import { argPrefix, compress } from './argv-payload';

const argPrefix = '--brotli-base64-config=';

/** @internal */
/**
* @internal
* @param state Bootstrap state to be transferred into the child process.
* @param targetCwd Working directory to be preserved when transitioning to
* the child process.
*/
export function callInChild(state: BootstrapState) {
if (!versionGteLt(process.versions.node, '12.17.0')) {
throw new Error(
Expand All @@ -22,9 +25,7 @@ export function callInChild(state: BootstrapState) {
// Node on Windows doesn't like `c:\` absolute paths here; must be `file:///c:/`
pathToFileURL(require.resolve('../../child-loader.mjs')).toString(),
require.resolve('./child-entrypoint.js'),
`${argPrefix}${brotliCompressSync(
Buffer.from(JSON.stringify(state), 'utf8')
).toString('base64')}`,
`${argPrefix}${compress(state)}`,
...state.parseArgvResult.restArgs,
],
{
Expand Down
48 changes: 48 additions & 0 deletions src/test/esm-loader.spec.ts
Expand Up @@ -22,6 +22,7 @@ import {
TEST_DIR,
tsSupportsImportAssertions,
tsSupportsResolveJsonModule,
tsSupportsStableNodeNextNode16,
} from './helpers';
import { createExec, createSpawn, ExecReturn } from './exec-helpers';
import { join, resolve } from 'path';
Expand Down Expand Up @@ -358,6 +359,53 @@ test.suite('esm', (test) => {
});
}

test.suite('esm child process working directory', (test) => {
test('should have the correct working directory in the user entry-point', async () => {
const { err, stdout, stderr } = await exec(
`${BIN_PATH} --esm --cwd ./esm/ index.ts`,
{
cwd: resolve(TEST_DIR, 'working-dir'),
}
);

expect(err).toBe(null);
expect(stdout.trim()).toBe('Passing');
expect(stderr).toBe('');
});
});

test.suite('esm child process and forking', (test) => {
test('should be able to fork vanilla NodeJS script', async () => {
const { err, stdout, stderr } = await exec(
`${BIN_PATH} --esm --cwd ./esm-child-process/ ./process-forking-js/index.ts`
);

expect(err).toBe(null);
expect(stdout.trim()).toBe('Passing: from main');
expect(stderr).toBe('');
});

test('should be able to fork TypeScript script', async () => {
const { err, stdout, stderr } = await exec(
`${BIN_PATH} --esm --cwd ./esm-child-process/ ./process-forking-ts/index.ts`
);

expect(err).toBe(null);
expect(stdout.trim()).toBe('Passing: from main');
expect(stderr).toBe('');
});

test('should be able to fork TypeScript script by absolute path', async () => {
const { err, stdout, stderr } = await exec(
`${BIN_PATH} --esm --cwd ./esm-child-process/ ./process-forking-ts-abs/index.ts`
);

expect(err).toBe(null);
expect(stdout.trim()).toBe('Passing: from main');
expect(stderr).toBe('');
});
});

test.suite('parent passes signals to child', (test) => {
test.runSerially();

Expand Down
27 changes: 27 additions & 0 deletions src/test/index.spec.ts
Expand Up @@ -617,6 +617,33 @@ test.suite('ts-node', (test) => {
}
});

test('should have the correct working directory in the user entry-point', async () => {
const { err, stdout, stderr } = await exec(
`${BIN_PATH} --cwd ./cjs index.ts`,
{
cwd: resolve(TEST_DIR, 'working-dir'),
}
);

expect(err).toBe(null);
expect(stdout.trim()).toBe('Passing');
expect(stderr).toBe('');
});

// Disabled due to bug:
// --cwd is passed to forked children when not using --esm, erroneously affects their entrypoint resolution.
// tracked/fixed by either https://github.com/TypeStrong/ts-node/issues/1834
// or https://github.com/TypeStrong/ts-node/issues/1831
test.skip('should be able to fork into a nested TypeScript script with a modified working directory', async () => {
const { err, stdout, stderr } = await exec(
`${BIN_PATH} --cwd ./working-dir/forking/ index.ts`
);

expect(err).toBe(null);
expect(stdout.trim()).toBe('Passing: from main');
expect(stderr).toBe('');
});

test.suite('should read ts-node options from tsconfig.json', (test) => {
const BIN_EXEC = `"${BIN_PATH}" --project tsconfig-options/tsconfig.json`;

Expand Down