add support for running tests in parallel mode
> (this PR depends on most other PRs linked to #4198, so they should be merged first; documentation will be in another PR)

This PR adds support for running test files in parallel via `--parallel`.  For many cases, this should "just work."

When the `--parallel` flag is supplied, Mocha will swap out the default `Runner` (`lib/runner.js`) for `BufferedRunner` (`lib/buffered-runner.js`).

`BufferedRunner` _extends_ `Runner`.  `BufferedRunner#run()` is the main point of extension.  Instead of executing the tests in serial, it will create a pool of worker processes (not worker _threads_) based on the maximum job count (`--jobs`; defaults to `<number of CPU cores> - 1`).  Both `BufferedRunner` and the `worker` module consume the abstraction layer, [workerpool](https://npm.im/workerpool).

`BufferedRunner#run()` does _not_ load the test files, unlike `Runner#run()`.  Instead, it has a list of test files, and puts these into an async queue.  The `EVENT_RUN_BEGIN` event is then emitted.  As files enter the queue, `BufferedRunner#run()` tells `workerpool` to execute the `run()` function of the pool.  `workerpool` then launches as many worker processes as are needed--up to the maximum--and executes the `run()` function with a single filepath and any options for a `Mocha` instance.
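
The scheduling idea above can be illustrated with a small, dependency-free sketch. It does not use `workerpool` or real processes: `runFile` is a hypothetical stand-in for the worker's `run()` function and `runInPool` is an invented name. It only shows that at most `jobs` files are in flight at once, and that each "worker" takes the next file as soon as it finishes the previous one.

```js
// Hypothetical stand-in for the worker's run(): resolves with a fake result.
const runFile = file =>
  new Promise(resolve => setTimeout(() => resolve({file, events: []}), 10));

// Run `files` with at most `jobs` in flight. Each "worker" loops, pulling the
// next file off the shared queue until none remain.
const runInPool = async (files, jobs, run = runFile) => {
  const queue = files.slice();
  const results = [];
  let active = 0;
  let maxActive = 0; // tracked only to demonstrate the concurrency limit

  const worker = async () => {
    while (queue.length) {
      const file = queue.shift();
      active++;
      maxActive = Math.max(maxActive, active);
      results.push(await run(file));
      active--;
    }
  };

  const workerCount = Math.min(jobs, files.length);
  await Promise.all(Array.from({length: workerCount}, () => worker()));
  return {results, maxActive};
};
```

Results arrive in completion order, not in the order the files were queued, which is why the order of test files is nondeterministic in parallel mode.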

The first time `lib/worker.js` is invoked, it will "bootstrap" itself, by handling `--require`'d modules and validating the UI.  Note that _reporter validation_ does not occur.  Once bootstrapped, it instantiates `Mocha`, adds the single file, swaps any reporter out for the `Buffered` reporter (`lib/reporters/buffered.js`), then executes `Mocha#run()`, which invokes `Runner#run()`.

The `Buffered` reporter listens for events emitted by the `Runner` instance, like a reporter usually does.  But instead of outputting to the console, it buffers the events in a queue.  Once the file has completed running, the queue is drained: the events collected are (trivially) serialized for transmission back to the main process.  `BufferedRunner#run()` receives the list of events, trivially _deserializes_ them, and re-emits the events to whatever the chosen reporter is (e.g., the `spec` reporter).  In this way, the reporters don't know that the tests were run in parallel.  Practically, the user will see reporter output in "chunks" instead of the "stream" of results they usually expect.  This method ensures that while the test files run in a nondeterministic order, the reporter output will be deterministic for any given test file.
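
A minimal sketch of the buffer-then-replay idea, using plain `EventEmitter`s rather than Mocha's `Runner`. The helper names (`bufferEvents`, `replay`) and the toy reporter are assumptions made for illustration, and a JSON round-trip stands in for transmission over IPC. This is not the code in `lib/reporters/buffered.js`.

```js
const {EventEmitter} = require('events');

// Worker side: record every event instead of printing it.
const bufferEvents = (runner, eventNames) => {
  const buffered = [];
  eventNames.forEach(eventName => {
    runner.on(eventName, data => {
      buffered.push({eventName, data});
    });
  });
  // "drain" returns the collected events and empties the buffer
  return () => buffered.splice(0, buffered.length);
};

// Main-process side: re-emit the events on the emitter that the real
// reporter is listening to.
const replay = (mainRunner, events) => {
  events.forEach(({eventName, data}) => mainRunner.emit(eventName, data));
};

// Simulate one test file running in a "worker"
const workerRunner = new EventEmitter();
const drain = bufferEvents(workerRunner, ['suite', 'pass', 'fail']);
workerRunner.emit('suite', {title: 'math'});
workerRunner.emit('pass', {title: 'adds'});
workerRunner.emit('fail', {title: 'divides'});

// JSON round-trip stands in for sending the queue over IPC
const received = JSON.parse(JSON.stringify(drain()));

// A toy "reporter" in the main process sees the events in their original order
const mainRunner = new EventEmitter();
const seen = [];
['suite', 'pass', 'fail'].forEach(eventName =>
  mainRunner.on(eventName, data => seen.push(`${eventName}: ${data.title}`))
);
replay(mainRunner, received);
```

Because a whole file's events are replayed at once, the toy reporter receives them in one burst, which corresponds to the "chunked" output described above.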

Once the result (the queue of events) has been returned to the main process, the worker process stays open, but waits for further instruction.  If there are more files in `BufferedRunner#run()`'s queue, `workerpool` will instruct the worker to take the next file from the list, and so on, and so forth.  When all files have executed, the pool terminates, the `EVENT_RUN_END` event is emitted, and the reporter handles it.

Note that exclusive tests ("only") cannot work in parallel mode, because we do not load all files up-front.

> (this section is pasted from the documentation with minimal edits)

Due to the nature of the following reporters, they cannot work when running tests in parallel:

- `markdown`
- `progress`
- `json-stream`

These reporters expect Mocha to know _how many tests it plans to run_ before execution. This information is unavailable in parallel mode, as test files are loaded only when they are about to be run.

In serial mode, test results will "stream" as they occur. In parallel mode, reporter output is _buffered_; reporting will occur after each file is completed. In practice, the reporter output will appear in "chunks" (but will otherwise be identical).

In parallel mode, we have no guarantees about the order in which test files will be run--or what process runs them--as it depends on the execution times of the test files.

Because of this, the following options _cannot be used_ in parallel mode:

- `--file`
- `--sort`
- `--delay`

Because running tests in parallel mode uses more system resources at once, the OS may take extra time to schedule and complete some operations. For this reason, test timeouts may need to be increased either globally or otherwise.

When used with `--bail` (or `this.bail()`) to exit after the first failure, it's likely other tests will be running at the same time. Mocha must shut down its worker processes before exiting.

Likewise, subprocesses may throw uncaught exceptions. When used with `--allow-uncaught`, Mocha will "bubble" this exception to the main process, but still must shut down its processes.

> _NOTE: This only applies to test files run in parallel mode_.

A root-level hook is a hook in a test file which is _not defined_ within a suite. An example using the `bdd` interface:

```js
// test/setup.js
beforeEach(function() {
  doMySetup();
});

afterEach(function() {
  doMyTeardown();
});
```

When run (in the default "serial" mode) via `mocha --file "./test/setup.js" "./test/**/*.spec.js"`, `setup.js` will be executed _first_, and install the two hooks shown above for every test found in `./test/**/*.spec.js`.

**When Mocha runs in parallel mode, test files do not share the same process.** Consequently, a root-level hook defined in test file _A_ won't be present in test file _B_.

There are (at least) two workarounds for this:

1. `require('./setup.js')` or `import './setup.js'` at the top of every test file. Best avoided for those averse to boilerplate.
1. _Recommended_: Define root-level hooks in a required file, using the new (also as of VERSION) Root Hook Plugin system.

Parallel mode is only available in Node.js.

If you find your tests don't work properly when run with `--parallel`, either shrug and move on, or use this handy-dandy checklist to get things working:

- ✅ Ensure you are using a supported reporter.
- ✅ Ensure you are not using other unsupported flags.
- ✅ Double-check your config file; options set in config files will be merged with any command-line option.
- ✅ Look for root-level hooks in your tests. Move them into a root hook plugin.
- ✅ Do any assertion, mock, or other test libraries you're consuming use root hooks? They may need to be migrated for compatibility with parallel mode.
- ✅ If tests are unexpectedly timing out, you may need to increase the default test timeout (via `--timeout`)
- ✅ Ensure your tests do not depend on being run in a specific order.
- ✅ Ensure your tests clean up after themselves; remove temp files, handles, sockets, etc. Don't try to share state or resources between test files.
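
For the last item, one possible pattern is to give each test file its own scratch directory and remove it afterwards, so that files running in different processes never touch the same paths. The helper names and the `my-tests-` prefix below are hypothetical; `fs.rmSync` requires Node.js 14.14 or newer.

```js
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a directory unique to the caller; mkdtempSync appends random
// characters to the prefix, so concurrent processes get different paths.
const makeTempDir = () => fs.mkdtempSync(path.join(os.tmpdir(), 'my-tests-'));

// Remove the directory and everything in it.
const removeTempDir = dir => fs.rmSync(dir, {recursive: true, force: true});

// Intended use inside a test file (not executed here):
//   let dir;
//   beforeEach(function() { dir = makeTempDir(); });
//   afterEach(function() { removeTempDir(dir); });
```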

Some types of tests are _not_ so well-suited to run in parallel. For example, extremely timing-sensitive tests, or tests which make I/O requests to a limited pool of resources (such as opening ports, or automating browser windows, hitting a test DB, or remote server, etc.).

Free-tier cloud CI services may not provide a suitable multi-core container or VM for their build agents. Regarding expected performance gains in CI: your mileage may vary. It may help to use a conditional in a `.mocharc.js` to check for `process.env.CI`, and adjust the job count as appropriate.

It's unlikely (but not impossible) to see a performance gain from a job count _greater than_ the number of available CPU cores. That said, _play around with the job count_--there's no one-size-fits-all, and the unique characteristics of your tests will determine the optimal number of jobs; it may even be that fewer is faster!

- updated signal handling in `bin/mocha` to a) better work with Windows, and b) work properly with `--parallel` to avoid leaving zombie workers
- docstrings in `lib/cli/collect-files.js`
- refactors in `lib/cli/run-helpers.js` and `lib/cli/watch-run.js`.  We now have four methods:
    - `watchRun()` - serial + watch
    - `singleRun()` - serial
    - `watchParallelRun()` - parallel + watch
    - `parallelRun()` - parallel
- `lib/cli/run.js` and `lib/cli/run-option-metadata.js`: additions for new options and checks for incompatibility
- add `lib/reporters/buffered.js` (`Buffered`); this reporter is _not_ re-exported in `Mocha.reporters`, since it should only be invoked internally.
- tweak `landing` reporter to avoid referencing `Runner#total`, which is incompatible with parallel mode.  It didn't need to do so in the first place!
- the `tap` reporter now outputs the plan at the _end_ instead of at the beginning (avoiding a call to `Runner#grepTotal()`, which is incompatible with parallel mode).  This is within spec, so should not be a breaking change.
- add `lib/buffered-runner.js` (`BufferedRunner`); subclass of `Runner` which overrides the `run()` method.
    - There's a little custom finite state machine in here.  didn't want to pull in a third-party module, but we should consider doing so if we use FSM's elsewhere.
    - when `DEBUG=mocha:parallel*` is in the env, this module will output statistics about the worker pool every 5s
    - the `run()` method looks a little weird because I wanted to use `async/await`, but the method it is overriding (`Runner#run`) is _not_ `async`
    - traps `SIGINT` to gracefully terminate the pool
    - pulls in [promise.allsettled](https://npm.im/promise.allsettled) polyfill to handle workers that may have rejected with uncaught exceptions
    - "bail" support is best-effort.
    - the `ABORTING` state is only for interruption via `SIGINT` or if `allowUncaught` is true and we get an uncaught exception
- `Hook`, `Suite`, `Test`: add a `serialize()` method.  This pulls out the most relevant information about the object for transmission over IPC.  It's called by worker processes for each event received by its `Runner`; event arguments (e.g., `test` or `suite`) are serialized in this manner.  Note that this _limits what reporters have access to_, which may break compatibility with third-party reporters that may use information that is missing from the serialized object.  As those cases arise, we can add more information to the serialized objects (in some cases).  The `$$` convention tells the _deserializer_ to turn the property into a function which returns the passed value, e.g., `test.fullTitle()`.
- `lib/mocha.js`:
    - refactor `Mocha#reporter` for nicer parameter & variable names
    - rename `loadAsync` to `lazyLoadFiles`, which is more descriptive, IMO.  It's a private property, so should not be a breaking change.
    - Constructor will dynamically choose the appropriate `Runner`
- `lib/runner.js`: `BufferedRunner` needs the options from `Mocha#options`, so I updated the parent method to define the parameter.  It is unused here.
- add `lib/serializer.js`: on the worker process side, manages event queue serialization; manages deserialization of the event queue in the main process.
    - I spent a long time trying to get this working.  We need to account for things like `Error` instances, with their stack traces, since those can be event arguments (e.g., `EVENT_TEST_FAIL` sends both a `Test` and the `Error`).  It's impossible to serialize circular (self-referential) objects, so we need to account for those as well.
    - Not super happy with the deserialization algorithm, since it's recursive, but it shouldn't be too much of an issue because the serializer won't output circular structures.
    - Attempted to avoid prototype pollution issues
    - Much of this works by mutating objects, mostly because it can be more performant.  The code can be changed to be "more immutable", as that's likely to be easier to understand, if it doesn't impact performance too much.  We're serializing potentially very large arrays of stuff.
    - The `__type` prop is a hint for the deserializer.  This convention allows us to re-expand plain objects back into `Error` instances, for example.  You can't send an `Error` instance over IPC!
- add `lib/worker.js`:
    - registers its `run()` function with `workerpool` to be called by main process
    - if `DEBUG=mocha:parallel*` is set, will output information (on an interval) about long-running test files
    - afaik the only way `run()` can reject is if `allowUncaught` is true or serialization fails
    - any user-supplied `reporter` value is replaced with the `Buffered` reporter.  thus, reporters are not validated.
    - the worker uses `Runner`, like usual.
- tests:
    - see `test/integration/options/parallel.spec.js` for the interesting stuff
    - upgrade `unexpected` for "to have readonly property" assertion
    - upgrade `unexpected-eventemitter` for async function support
    - integration test helpers allow Mocha's developers to use `--bail` and `--parallel`, but will default to `--no-bail` and `--no-parallel`.
    - split some node-specific tests out of `test/unit/mocha.spec.js` into `test/node-unit/mocha.spec.js`
- etc:
    - update `.eslintrc.yml` for new Node-only files
    - increase default timeout to `1000` (also seen in another PR) and use `parallel` mode by default in `.mocharc.yml`
    - run node unit tests _in serial_ as sort of a smoke test, as otherwise all our tests would be run in parallel
    - karma, browserify: ignore files for parallel support
    - force color output in CI. this is nice on travis, but ugly on appveyor.  either way, it's easier to read than having no color
    - move non-CLI-related node-specific files into `lib/nodejs/`
    - correct some issues with APIs not marked `@private`
    - add some istanbul directives to ignore some debug statements
    - add `utils.isBrowser()` for easier mocking of a `process.browser === true` situation
    - add `createForbiddenExclusivityError()`
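
To make the `$$` and `__type` conventions from the serialization notes above concrete, here is a simplified sketch. It is not the code in `lib/serializer.js`: the function names are hypothetical, it handles only a flat `Test`-like object and an `Error`, and it ignores circular references and prototype-pollution concerns. A JSON round-trip stands in for IPC.

```js
// Worker side: flatten objects for transmission. A `$$`-prefixed key holds
// the precomputed return value of a method; `__type` records the original type.
const serializeTest = test => ({
  __type: 'Test',
  title: test.title,
  $$fullTitle: test.fullTitle()
});

const serializeError = err => ({
  __type: 'Error',
  message: err.message,
  stack: err.stack
});

// Main-process side: rebuild something a reporter can use.
const deserialize = obj => {
  if (obj.__type === 'Error') {
    // re-expand the plain object into a real Error instance
    const err = new Error(obj.message);
    err.stack = obj.stack;
    return err;
  }
  const result = {};
  Object.keys(obj).forEach(key => {
    if (key.startsWith('$$')) {
      const value = obj[key];
      result[key.slice(2)] = () => value; // `$$fullTitle` becomes `fullTitle()`
    } else if (key !== '__type') {
      result[key] = obj[key];
    }
  });
  return result;
};

// Round-trip a test and an error, as an EVENT_TEST_FAIL payload might carry
const wire = JSON.parse(
  JSON.stringify([
    serializeTest({title: 'adds', fullTitle: () => 'math adds'}),
    serializeError(new Error('boom'))
  ])
);
const [revivedTest, revivedError] = wire.map(deserialize);
```

After the round-trip, `revivedTest.fullTitle()` returns the string computed in the "worker", and `revivedError` is an `Error` instance again. Any property not copied by the serializer is simply absent, which is the reporter-compatibility limitation noted above.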

Ref: #4198
boneskull committed May 27, 2020
1 parent 6d60eb0 commit 155841a
Showing 56 changed files with 4,671 additions and 440 deletions.
26 changes: 13 additions & 13 deletions .eslintrc.yml
@@ -21,22 +21,22 @@ rules:
property: 'assign'
overrides:
- files:
- docs/js/**/*.js
- 'docs/js/**/*.js'
env:
node: false
- files:
- scripts/**/*.js
- package-scripts.js
- karma.conf.js
- .wallaby.js
- .eleventy.js
- bin/*
- lib/cli/**/*.js
- test/node-unit/**/*.js
- test/integration/options/watch.spec.js
- test/integration/helpers.js
- lib/growl.js
- docs/_data/**/*.js
- '.eleventy.js'
- '.wallaby.js'
- 'package-scripts.js'
- 'karma.conf.js'
- 'bin/*'
- 'docs/_data/**/*.js'
- 'lib/cli/**/*.js'
- 'lib/nodejs/**/*.js'
- 'scripts/**/*.js'
- 'test/integration/helpers.js'
- 'test/integration/options/watch.spec.js'
- 'test/node-unit/**/*.js'
parserOptions:
ecmaVersion: 2018
env:
1 change: 1 addition & 0 deletions .mocharc.yml
@@ -5,6 +5,7 @@ global:
- 'okGlobalC'
- 'callback*'
timeout: 1000
parallel: true
watch-ignore:
- '.*'
- 'docs/_dist/**'
6 changes: 5 additions & 1 deletion .travis.yml
@@ -39,7 +39,8 @@ jobs:
- script: COVERAGE=1 npm start test.node
after_success: npm start coveralls
name: 'Latest Node.js (with coverage)'

- script: MOCHA_PARALLEL=0 npm start test.node.unit
name: 'Latest Node.js (unit tests in serial mode)'
- &node
script: npm start test.node
node_js: '13'
@@ -95,6 +96,9 @@ jobs:
script: true
name: 'Prime cache'

env:
- 'NODE_OPTIONS="--trace-warnings"'

notifications:
email: false
webhooks:
19 changes: 17 additions & 2 deletions bin/mocha
@@ -130,8 +130,23 @@ if (Object.keys(nodeArgs).length) {

// terminate children.
process.on('SIGINT', () => {
proc.kill('SIGINT'); // calls runner.abort()
proc.kill('SIGTERM'); // if that didn't work, we're probably in an infinite loop, so make it die.
// XXX: a previous comment said this would abort the runner, but I can't see that it does
// anything with the default runner.
debug('main process caught SIGINT');
proc.kill('SIGINT');
// if running in parallel mode, we will have a proper SIGINT handler, so the below won't
// be needed.
if (!args.parallel || args.jobs < 2) {
// win32 does not support SIGTERM, so use next best thing.
if (require('os').platform() === 'win32') {
proc.kill('SIGKILL');
} else {
// using SIGKILL won't cleanly close the output streams, which can result
// in cut-off text or a befouled terminal.
debug('sending SIGTERM to child process');
proc.kill('SIGTERM');
}
}
});
} else {
debug('running Mocha in-process');
11 changes: 8 additions & 3 deletions karma.conf.js
@@ -30,13 +30,18 @@ module.exports = config => {
browserify: {
debug: true,
configure: function configure(b) {
b.ignore('./lib/cli/*.js')
.ignore('chokidar')
b.ignore('chokidar')
.ignore('fs')
.ignore('glob')
.ignore('./lib/esm-utils.js')
.ignore('path')
.ignore('supports-color')
.ignore('./lib/esm-utils.js')
.ignore('./lib/cli/*.js')
.ignore('./lib/nodejs/serializer.js')
.ignore('./lib/nodejs/worker.js')
.ignore('./lib/nodejs/buffered-worker-pool.js')
.ignore('./lib/nodejs/buffered-runner.js')
.ignore('./lib/nodejs/reporters/buffered.js')
.on('bundled', (err, content) => {
if (err) {
throw err;
3 changes: 2 additions & 1 deletion lib/browser/growl.js
@@ -11,6 +11,7 @@
var Date = global.Date;
var setTimeout = global.setTimeout;
var EVENT_RUN_END = require('../runner').constants.EVENT_RUN_END;
var isBrowser = require('../utils').isBrowser;

/**
* Checks if browser notification support exists.
@@ -25,7 +26,7 @@ var EVENT_RUN_END = require('../runner').constants.EVENT_RUN_END;
exports.isCapable = function() {
var hasNotificationSupport = 'Notification' in window;
var hasPromiseSupport = typeof Promise === 'function';
return process.browser && hasNotificationSupport && hasPromiseSupport;
return isBrowser() && hasNotificationSupport && hasPromiseSupport;
};

/**
19 changes: 12 additions & 7 deletions lib/cli/collect-files.js
@@ -17,13 +17,7 @@ const {NO_FILES_MATCH_PATTERN} = require('../errors').constants;

/**
* Smash together an array of test files in the correct order
* @param {Object} opts - Options
* @param {string[]} opts.extension - File extensions to use
* @param {string[]} opts.spec - Files, dirs, globs to run
* @param {string[]} opts.ignore - Files, dirs, globs to ignore
* @param {string[]} opts.file - List of additional files to include
* @param {boolean} opts.recursive - Find files recursively
* @param {boolean} opts.sort - Sort test files
* @param {FileCollectionOptions} [opts] - Options
* @returns {string[]} List of files to test
* @private
*/
@@ -84,3 +78,14 @@ module.exports = ({ignore, extension, file, recursive, sort, spec} = {}) => {

return files;
};

/**
* An object to configure how Mocha gathers test files
* @typedef {Object} FileCollectionOptions
* @property {string[]} extension - File extensions to use
* @property {string[]} spec - Files, dirs, globs to run
* @property {string[]} ignore - Files, dirs, globs to ignore
* @property {string[]} file - List of additional files to include
* @property {boolean} recursive - Find files recursively
* @property {boolean} sort - Sort test files
*/
51 changes: 41 additions & 10 deletions lib/cli/run-helpers.js
@@ -10,7 +10,7 @@
const fs = require('fs');
const path = require('path');
const debug = require('debug')('mocha:cli:run:helpers');
const watchRun = require('./watch-run');
const {watchRun, watchParallelRun} = require('./watch-run');
const collectFiles = require('./collect-files');
const {type} = require('../utils');
const {format} = require('util');
@@ -151,24 +151,52 @@ const singleRun = async (mocha, {exit}, fileCollectParams) => {
};

/**
* Actually run tests
* Collect files and run tests (using `BufferedRunner`).
*
* This is `async` for consistency.
*
* @param {Mocha} mocha - Mocha instance
* @param {Object} opts - Command line options
* @param {Options} options - Command line options
* @param {Object} fileCollectParams - Parameters that control test
* file collection. See `lib/cli/collect-files.js`.
* @returns {Promise<BufferedRunner>}
* @ignore
* @private
* @returns {Promise}
*/
const parallelRun = async (mocha, options, fileCollectParams) => {
const files = collectFiles(fileCollectParams);
debug(
'executing %d test file(s) across %d concurrent jobs',
files.length,
options.jobs
);
mocha.files = files;

// note that we DO NOT load any files here; this is handled by the worker
return mocha.run(options.exit ? exitMocha : exitMochaLater);
};

/**
* Actually run tests. Delegates to one of four different functions:
* - `singleRun`: run tests in serial & exit
* - `watchRun`: run tests in serial, rerunning as files change
* - `parallelRun`: run tests in parallel & exit
* - `watchParallelRun`: run tests in parallel, rerunning as files change
* @param {Mocha} mocha - Mocha instance
* @param {Options} opts - Command line options
* @private
* @returns {Promise<Runner>}
*/
exports.runMocha = async (mocha, options) => {
const {
watch = false,
extension = [],
exit = false,
ignore = [],
file = [],
parallel = false,
recursive = false,
sort = false,
spec = [],
watchFiles,
watchIgnore
spec = []
} = options;

const fileCollectParams = {
@@ -180,11 +208,14 @@ exports.runMocha = async (mocha, options) => {
spec
};

let run;
if (watch) {
watchRun(mocha, {watchFiles, watchIgnore}, fileCollectParams);
run = parallel ? watchParallelRun : watchRun;
} else {
await singleRun(mocha, {exit}, fileCollectParams);
run = parallel ? parallelRun : singleRun;
}

return run(mocha, options, fileCollectParams);
};

/**
5 changes: 4 additions & 1 deletion lib/cli/run-option-metadata.js
@@ -42,11 +42,12 @@ exports.types = {
'list-interfaces',
'list-reporters',
'no-colors',
'parallel',
'recursive',
'sort',
'watch'
],
number: ['retries'],
number: ['retries', 'jobs'],
string: [
'config',
'fgrep',
@@ -75,7 +76,9 @@ exports.aliases = {
growl: ['G'],
ignore: ['exclude'],
invert: ['i'],
jobs: ['j'],
'no-colors': ['C'],
parallel: ['p'],
reporter: ['R'],
'reporter-option': ['reporter-options', 'O'],
require: ['r'],
47 changes: 47 additions & 0 deletions lib/cli/run.js
@@ -25,6 +25,7 @@ const {ONE_AND_DONES, ONE_AND_DONE_ARGS} = require('./one-and-dones');
const debug = require('debug')('mocha:cli:run');
const defaults = require('../mocharc');
const {types, aliases} = require('./run-option-metadata');
const coreCount = require('os').cpus().length;

/**
* Logical option groups
@@ -151,6 +152,14 @@ exports.builder = yargs =>
description: 'Inverts --grep and --fgrep matches',
group: GROUPS.FILTERS
},
jobs: {
description:
'Number of concurrent jobs for --parallel; use 1 to run in serial',
defaultDescription: '(number of CPU cores - 1)',
requiresArg: true,
group: GROUPS.RULES,
default: Math.max(2, coreCount - 1)
},
'list-interfaces': {
conflicts: Array.from(ONE_AND_DONE_ARGS),
description: 'List built-in user interfaces & exit'
@@ -170,6 +179,10 @@ exports.builder = yargs =>
normalize: true,
requiresArg: true
},
parallel: {
description: 'Run tests in parallel',
group: GROUPS.RULES
},
recursive: {
description: 'Look for tests in subdirectories',
group: GROUPS.FILES
@@ -272,6 +285,40 @@ exports.builder = yargs =>
);
}

if (argv.parallel) {
// yargs.conflicts() can't deal with `--file foo.js --no-parallel`, either
if (argv.file) {
throw createUnsupportedError(
'--parallel runs test files in a non-deterministic order, and is mutually exclusive with --file'
);
}

// or this
if (argv.sort) {
throw createUnsupportedError(
'--parallel runs test files in a non-deterministic order, and is mutually exclusive with --sort'
);
}

if (argv.reporter === 'progress') {
throw createUnsupportedError(
'--reporter=progress is mutually exclusive with --parallel'
);
}

if (argv.reporter === 'markdown') {
throw createUnsupportedError(
'--reporter=markdown is mutually exclusive with --parallel'
);
}

if (argv.reporter === 'json-stream') {
throw createUnsupportedError(
'--reporter=json-stream is mutually exclusive with --parallel'
);
}
}

if (argv.compilers) {
throw createUnsupportedError(
`--compilers is DEPRECATED and no longer supported.