Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: deduplicate assets with buffer source #4712

Merged
2 changes: 1 addition & 1 deletion docs/05-plugin-development.md
Expand Up @@ -732,7 +732,7 @@ If there are no dynamic imports, this will create exactly three chunks where the

Note that even though any module id can be used in `implicitlyLoadedAfterOneOf`, Rollup will throw an error if such an id cannot be uniquely associated with a chunk, e.g. because the `id` cannot be reached implicitly or explicitly from the existing static entry points, or because the file is completely tree-shaken. Using only entry points, either defined by the user or of previously emitted chunks, will always work, though.

If the `type` is _`asset`_, then this emits an arbitrary new file with the given `source` as content. It is possible to defer setting the `source` via [`this.setAssetSource(referenceId, source)`](guide/en/#thissetassetsource) to a later time to be able to reference a file during the build phase while setting the source separately for each output during the generate phase. Assets with a specified `fileName` will always generate separate files while other emitted assets may be deduplicated with existing assets if they have the same `string` source even if the `name` does not match. If the source is not a string but a typed array or `Buffer`, no deduplication will occur for performance reasons. If an asset without a `fileName` is not deduplicated, the [`output.assetFileNames`](guide/en/#outputassetfilenames) name pattern will be used.
If the `type` is _`asset`_, then this emits an arbitrary new file with the given `source` as content. It is possible to defer setting the `source` via [`this.setAssetSource(referenceId, source)`](guide/en/#thissetassetsource) to a later time to be able to reference a file during the build phase while setting the source separately for each output during the generate phase. Assets with a specified `fileName` will always generate separate files while other emitted assets may be deduplicated with existing assets if they have the same source even if the `name` does not match. If an asset without a `fileName` is not deduplicated, the [`output.assetFileNames`](guide/en/#outputassetfilenames) name pattern will be used.

#### `this.error`

Expand Down
71 changes: 38 additions & 33 deletions src/utils/FileEmitter.ts
Expand Up @@ -28,33 +28,34 @@ import { extname } from './path';
import { isPathFragment } from './relativeId';
import { makeUnique, renderNamePattern } from './renderNamePattern';

function generateAssetFileName(
function getSourceHash(source: string | Uint8Array, size?: number): string {
return createHash()
.update(source)
.digest('hex')
.slice(0, Math.max(0, size || defaultHashSize));
}

function pregenerateAssetFileName(
name: string | undefined,
source: string | Uint8Array,
outputOptions: NormalizedOutputOptions,
bundle: OutputBundleWithPlaceholders
): string {
outputOptions: NormalizedOutputOptions
): { fileName: string; sourceHash: string | undefined } {
let sourceHash: string | undefined;
const emittedName = outputOptions.sanitizeFileName(name || 'asset');
return makeUnique(
renderNamePattern(
typeof outputOptions.assetFileNames === 'function'
? outputOptions.assetFileNames({ name, source, type: 'asset' })
: outputOptions.assetFileNames,
'output.assetFileNames',
{
ext: () => extname(emittedName).slice(1),
extname: () => extname(emittedName),
hash: size =>
createHash()
.update(source)
.digest('hex')
.slice(0, Math.max(0, size || defaultHashSize)),
name: () =>
emittedName.slice(0, Math.max(0, emittedName.length - extname(emittedName).length))
}
),
bundle
const fileName = renderNamePattern(
typeof outputOptions.assetFileNames === 'function'
? outputOptions.assetFileNames({ name, source, type: 'asset' })
: outputOptions.assetFileNames,
'output.assetFileNames',
{
ext: () => extname(emittedName).slice(1),
extname: () => extname(emittedName),
hash: size => (sourceHash = getSourceHash(source, size)),
name: () =>
emittedName.slice(0, Math.max(0, emittedName.length - extname(emittedName).length))
}
);
return { fileName, sourceHash };
}

function reserveFileNameInBundle(
Expand Down Expand Up @@ -246,9 +247,6 @@ export class FileEmitter {
for (const emittedFile of this.filesByReferenceId.values()) {
if (emittedFile.fileName) {
reserveFileNameInBundle(emittedFile.fileName, output, this.options.onwarn);
if (emittedFile.type === 'asset' && typeof emittedFile.source === 'string') {
fileNamesBySource.set(emittedFile.source, emittedFile.fileName);
}
}
}
for (const [referenceId, consumedFile] of this.filesByReferenceId) {
Expand Down Expand Up @@ -332,17 +330,24 @@ export class FileEmitter {
referenceId: string,
{ bundle, fileNamesBySource, outputOptions }: FileEmitterOutput
): void {
const fileName =
consumedFile.fileName ||
(typeof source === 'string' && fileNamesBySource.get(source)) ||
generateAssetFileName(consumedFile.name, source, outputOptions, bundle);
let fileName = consumedFile.fileName;

// Deduplicate assets if an explicit fileName is not provided
if (!fileName) {
const pregenerated = pregenerateAssetFileName(consumedFile.name, source, outputOptions);
// Reuse source hash if it was computed as part of the asset file name
const sourceHash = pregenerated.sourceHash ?? getSourceHash(source);
patak-dev marked this conversation as resolved.
Show resolved Hide resolved
fileName = fileNamesBySource.get(sourceHash);
if (!fileName) {
fileName = makeUnique(pregenerated.fileName, bundle);
fileNamesBySource.set(sourceHash, fileName);
}
}

// We must not modify the original assets to avoid interaction between outputs
const assetWithFileName = { ...consumedFile, fileName, source };
this.filesByReferenceId.set(referenceId, assetWithFileName);
if (typeof source === 'string') {
fileNamesBySource.set(source, fileName);
}
patak-dev marked this conversation as resolved.
Show resolved Hide resolved

bundle[fileName] = {
fileName,
name: consumedFile.name,
Expand Down
1 change: 1 addition & 0 deletions test/cli/samples/watch/bundle-error/main.js
@@ -0,0 +1 @@
export default 42;
@@ -0,0 +1,18 @@
const source = Buffer.from('abc');

module.exports = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I see it, this test does not really have any assertions and a function test may not be the best fit for asset deduplication as it does not verify _expected. Why not just extend the existing chunking-form test as it was before Rollup 3: https://github.com/rollup/rollup/blob/69ff4181e701a0fe0026d0ba147f31bc86beffa8/test/chunking-form/samples/emit-file/deduplicate-assets/_config.js
We had a test for buffer deduplication, so it should just be green again

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also include cross Buffer/string deduplication as well in the test by emitting each asset a third time using the other format.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, nice! I missed the deduplicate-assets test before. Extended it in rollup/rollup@9dbffe1 (#4712) and also added a different buffer source for completeness.

I also didn't know that Rollup 2 was already deduplicating binary sources. So then there is even less of a reason for the option if this was already how things worked before. Is my understanding correct that cross-buffer/string deduplication didn't happen in Rollup 2 though?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is my understanding correct that cross-buffer/string deduplication didn't happen in Rollup 2 though?

Yes, that is indeed a new feature!

description: 'warns if multiple files with the same name are emitted',
options: {
input: 'main.js',
plugins: [
{
buildStart() {
this.emitFile({ type: 'asset', name: 'buildStart', source });
},
generateBundle() {
this.emitFile({ type: 'asset', name: 'generateBundle', source });
}
}
]
}
};
@@ -0,0 +1,9 @@
define(['exports'], function (exports) { 'use strict';

function hi() { return 2 }

exports.hi = hi;

Object.defineProperty(exports, '__esModule', { value: true });

});
@@ -0,0 +1 @@
abc
@@ -0,0 +1,7 @@
'use strict';

Object.defineProperty(exports, '__esModule', { value: true });

function hi() { return 2 }

exports.hi = hi;
@@ -0,0 +1 @@
abc
@@ -0,0 +1,3 @@
function hi() { return 2 }

export { hi };
@@ -0,0 +1 @@
abc
@@ -0,0 +1,12 @@
System.register([], function (exports) {
'use strict';
return {
execute: function () {

exports('hi', hi);

function hi() { return 2 }

}
};
});
@@ -0,0 +1 @@
abc
@@ -0,0 +1 @@
export function hi() { return 2 }