webpack Plugin Info

Jeffrey Posnick edited this page Dec 19, 2017 · 4 revisions

This is an overview of how the workbox-webpack-plugin interacts with a webpack compilation's output.

(It's also a good opportunity to put down my (@jeffposnick's) understanding of how everything fits together, and potentially correct any misconceptions that might lead to incorrect behavior.)

chunks

What's a chunk?

For big web apps it’s not efficient to put all code into a single file, especially if some blocks of code are only required under some circumstances. Webpack has a feature to split your codebase into “chunks” which are loaded on demand. Some other bundlers call them “layers”, “rollups”, or “fragments”. This feature is called “code splitting”.

webpack includes metadata about "chunks" as part of its compilation output. This metadata includes a chunk's name (if one was configured) as well as an array of output file names that belong to that chunk. It represents a logical grouping of related files.

Each entry in a given webpack configuration will lead to at least one chunk. It's possible for some chunks to be created outside of the context of an entry, though.

A chunk might be anonymous, or it might have a name, depending on the way the entry was declared.

Anonymous chunk definition:

const webpackConfig = {
  entry: './path/to/my/entry/file.js',
  // etc.
}

Named chunk definition:

const webpackConfig = {
  entry: {
    chunk1: './path/to/my/entry/file1.js',
    chunk2: './path/to/my/entry/file2.js'
  },
  // etc.
}

How does the workbox-webpack-plugin deal with chunks?

There are a few different ways that the plugin interacts with chunks:

Whitelisting/blacklisting

If either chunks or excludeChunks is passed in as a parameter when constructing the plugin, then those values are used as a whitelist/blacklist when constructing the precache manifest. (This parameter naming seems to be a convention for webpack plugins.)

  • If chunks is provided, then only assets that are associated with those named chunks will be included in the precache manifest.
  • If excludeChunks is provided, then assets associated with those named chunks will be excluded from the precache manifest.
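As a rough illustration of the two rules above, here is a minimal sketch of whitelist/blacklist filtering. The filterByChunks helper and the shape of exampleMapping (file name => {chunkName}) are assumptions made for this example, not the plugin's actual code.

```javascript
// Hypothetical helper, for illustration only.
// `mapping` is assumed to look like {fileName: {chunkName}}.
function filterByChunks(mapping, {chunks, excludeChunks = []} = {}) {
  const filtered = {};
  for (const [file, metadata] of Object.entries(mapping)) {
    const {chunkName} = metadata;
    // Whitelist: when `chunks` is set, keep only files from those chunks.
    if (chunks && !chunks.includes(chunkName)) continue;
    // Blacklist: drop files that belong to an excluded chunk.
    if (excludeChunks.includes(chunkName)) continue;
    filtered[file] = metadata;
  }
  return filtered;
}

const exampleMapping = {
  'chunk1.js': {chunkName: 'chunk1'},
  'chunk2.js': {chunkName: 'chunk2'},
};

const onlyChunk1 = filterByChunks(exampleMapping, {chunks: ['chunk1']});
const withoutChunk1 = filterByChunks(exampleMapping, {excludeChunks: ['chunk1']});
```

Here onlyChunk1 keeps just chunk1.js, while withoutChunk1 keeps just chunk2.js.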

Custom bundle of Workbox runtime code

If importWorkboxFrom is set to the name of a chunk, the plugin assumes that the chunk is a custom bundle of the Workbox runtime code. The assets associated with that chunk are automatically added to importScripts, and the chunk's name is added to excludeChunks (since files loaded by a service worker via importScripts should not also be included in the precache manifest).
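A minimal sketch of that behavior might look like the following. The applyImportWorkboxFrom helper, the flat config object, and the example chunk names are all assumptions for illustration; the plugin's real implementation may differ.

```javascript
// Hypothetical helper, for illustration only.
// `compilationChunks` is assumed to be an array of {name, files} objects.
function applyImportWorkboxFrom(config, compilationChunks) {
  const workboxChunk = compilationChunks.find(
    (chunk) => chunk.name === config.importWorkboxFrom);
  if (!workboxChunk) {
    // The value doesn't name a chunk, so leave the config untouched.
    return config;
  }
  return Object.assign({}, config, {
    // Load the chunk's files in the service worker via importScripts...
    importScripts: [...(config.importScripts || []), ...workboxChunk.files],
    // ...and keep those same files out of the precache manifest.
    excludeChunks: [...(config.excludeChunks || []), workboxChunk.name],
  });
}

const exampleChunks = [
  {name: 'app', files: ['app.js']},
  {name: 'workbox', files: ['workbox.js']},
];
const resolved = applyImportWorkboxFrom(
  {importWorkboxFrom: 'workbox'}, exampleChunks);
```

With the example input, resolved.importScripts contains workbox.js and resolved.excludeChunks contains workbox.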

assets

What's an asset?

An asset is a mapping of a file path to some metadata about the file, like its contents (exposed via the .source() function) and its length (exposed via the .size() function).

Each file that ends up in the output directory after a webpack build completes has a corresponding entry in the compilation's assets.

An asset may or may not be associated with a chunk. For instance, plugins can create their own assets and inject them into the build pipeline without creating a corresponding chunk. (The workbox-webpack-plugin does this, in fact, by creating two assets: one for the precache manifest file, and the other for the generated service worker file.)
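To make the shape of an asset concrete, here is a small sketch that uses a stand-in for the compilation object. The createAsset helper, the fakeCompilation object, and the file name are made up for this example, and reporting the size as a byte length is an assumption.

```javascript
// Hypothetical helper that builds an asset-like object from a string.
// It exposes the same .source()/.size() pair described above.
function createAsset(contents) {
  return {
    source: () => contents,
    // Assumption: the size is reported as a byte length.
    size: () => Buffer.byteLength(contents),
  };
}

// Stand-in for a real compilation; only the `assets` map is modeled.
const fakeCompilation = {assets: {}};

// A plugin can add a file to the build output without creating a chunk
// by assigning an asset to a file path.
fakeCompilation.assets['example-manifest.js'] = createAsset('hello');
```

The new file appears in the assets map, but no chunk lists it among its files.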

How does the workbox-webpack-plugin deal with assets?

In general, the plugin will create an entry in the precache manifest for each asset that belongs to the current compilation, regardless of whether the asset is associated with a chunk or not.

The exception to this rule is that when chunks or excludeChunks is set, the plugin will use those values as a whitelist/blacklist, and include/exclude assets based on their association with the named chunks.
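The general rule (one manifest entry per asset) can be sketched roughly as follows. The getManifestEntries helper and the example assets are hypothetical, and deriving the revision from an MD5 of the asset's contents is an assumption made for this example rather than a description of what the plugin does.

```javascript
const crypto = require('crypto');

// Hypothetical helper, for illustration only.
// Returns one {url, revision} entry for every asset in the map.
function getManifestEntries(assets) {
  return Object.keys(assets).map((file) => ({
    url: file,
    // Assumption: the revision is an MD5 hash of the asset's contents.
    revision: crypto.createHash('md5')
      .update(assets[file].source())
      .digest('hex'),
  }));
}

const exampleAssets = {
  'app.js': {source: () => 'console.log("app");'},
  'styles.css': {source: () => 'body { margin: 0; }'},
};
const entries = getManifestEntries(exampleAssets);
```

Neither example asset is tied to a chunk here, and both still end up with an entry.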

There is currently no support for filtering assets based on their file names, but that is functionality we should think about adding. (See https://github.com/GoogleChrome/workbox/issues/935.)

Open questions/areas for improvement

asset => chunk mapping

As far as I can tell, the asset metadata exposed in the compilation object doesn't include a mapping of which chunk (if any) was responsible for creating an asset. Because we need that information to whitelist/blacklist based on chunk names, we attempt to derive that information by looping through a compilation's chunks and building up a list of the files associated with each:

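// Maps each output file name to the chunk that produced it.
const mapping = {};
for (const chunk of compilation.chunks) {
  // chunk.files lists the output file names that belong to this chunk.
  for (const file of chunk.files) {
    mapping[file] = {
      chunkName: chunk.name,
      hash: chunk.renderedHash,
    };
  }
}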

Is there a preferred/more reliable way of getting this information?

Identifying assets that include hashes in their filenames

For efficiency's sake, we want to be able to identify assets that already have hashes in their filenames so that we can exclude them from the cache-busting that workbox-precaching performs by default when populating the cache.

I am not aware of any way to determine whether an asset has a filename that includes a hash, so we try to guess at it by doing some matching against the hashes we "know" about:

// Collects the per-chunk hashes recorded in the asset => chunk mapping.
function getKnownHashesFromAssets(assetMetadata) {
  const knownHashes = new Set();
  for (const metadata of Object.values(assetMetadata)) {
    knownHashes.add(metadata.hash);
  }
  return knownHashes;
}

// Every hash we "know" about, with missing (falsy) values filtered out.
const knownHashes = [
  compilation.hash,
  compilation.fullHash,
  ...getKnownHashesFromAssets(filteredAssetMetadata),
].filter((hash) => !!hash);

This is unlikely to lead to a false positive, where we think an asset includes a hash but it actually doesn't. It is likely, though, to lead to false negatives, where we end up cache-busting an asset that already includes a hash in its URL. (See https://github.com/GoogleChrome/workbox/issues/1102.)

This is captured at https://stackoverflow.com/questions/47658134 if anyone has ideas.
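For reference, the kind of matching described in this section could be sketched like this. The includesKnownHash helper and the example hash are hypothetical, and plain substring matching is an assumption about how the guess is made.

```javascript
// Hypothetical helper, for illustration only.
// Returns true if the file name contains any of the known hashes.
function includesKnownHash(file, knownHashes) {
  return knownHashes.some((hash) => file.includes(hash));
}

const exampleHashes = ['0123456789abcdef0123'];

// The full hash appears in the file name, so this is detected.
const detected = includesKnownHash('app.0123456789abcdef0123.js', exampleHashes);

// Only a truncated hash appears (e.g. from a pattern like [chunkhash:8]),
// so substring matching against the full hash misses it.
const missed = includesKnownHash('app.01234567.js', exampleHashes);
```

In this sketch detected is true and missed is false, which shows one way a false negative could arise when a file name carries only part of a hash.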