Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to GenMapping for sourcemap generation #1190

Merged
merged 7 commits into from May 10, 2022

Conversation

jridgewell
Copy link
Collaborator

gen-mapping is faster, smaller, and lighter sourcemap generation library. This doesn't remove WASM (because SourceMapGenerator doesn't require it), but it's still nice to get off an unmaintained dependency.

This also exposes a new decodedMap property on the result object. Decoded maps are free to create (it's a shallow clone of the GenMapping instance), and passing them to @jridgewell/trace-mapping is copy-free. With Babel recently adding a decodedMap field, a dev could pass from the Babel transpilation to Terser without any added memory use for sourcemaps.

And if there's a multi stage build process, a dev could use @ampproject/remapping to remap Babel, Terser, and (eg) a bundler's outputs without having to feed input maps into each stage.

[`gen-mapping`](https://github.com/jridgewell/gen-mapping) is faster, smaller, and lighter  sourcemap generation library.

This also exposes a new `decodedMap` property on the result object. Decoded maps are free to create (it's a shallow clone of the `GenMapping` instance), and passing them to `@jridgewell/trace-mapping` is copy-free. With Babel  [recently](babel/babel#14497) adding a `decodedMap` field, a dev could pass from the Babel transpilation to Terser without any added memory use for sourcemaps.

And if there's a multi stage build process, a dev could use `@ampproject/remapping` to remap Babel, Terser, and (eg) a bundler's outputs without having to feed input maps into each stage.
lib/sourcemap.js Outdated
generator.addMapping({
generated : { line: gen_line + options.dest_line_diff, column: gen_col },
original : { line: orig_line + options.orig_line_diff, column: orig_col },
maybeAddMapping(generator, {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybeAddMapping produces a cleaner sourcemap by discarding mappings that add no value. Eg, if two consecutive segments point to the same source location, that provides no new information.

This is a change from the source-map's addMapping. If you'd like, I can switch to gen-mapping's addMapping, which gives us the same keep-everything behavior.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like less RAM usage to me, so this is a nice change!

lib/minify.js Outdated
});
}
});
result.decodedMap = options.format.source_map.getDecoded();
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decoded maps are free to generate from GenMapping (because mappings are stored in decoded form), it's just a shallow clone.

enumerable: true,
get() {
const map = options.format.source_map.getEncoded();
return (result.map = options.sourceMap.asObject ? map : JSON.stringify(map));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Encoded maps require serialization from decoded form into an encoded VLQ string. I've placed this behind a getter so that it's not performed unless required by the dev.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated suggestion:

I looked at the implementation underneath getEncoded here and I wonder if a JSON string rendition of it could be optimized.

VLQ strings are guaranteed to not contain \ or " so for large maps it may be more memory efficient to generate the JSON manually.

'{(other sourcemap properties...), "mappings": "' + encode(decoded.mappings) + '"}'

I'm making the assumption that V8 would make this more memory efficient by not copying the encoded string into the JSON, but instead making the output JSON into a rope containing 3 strings.

});
if (options.sourceMap.includeSources) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section was moved into the SourceMap. Before, calling setSourceContent for the original source contents and the input contents would mean there are 2 contents stored because they have different file names. SourceMapGenerator would traverse all mappings and determine if a particular file isn't referenced in the mappings, and would exclude that file's contents from the output.

GenMapping doesn't do that traversal. If you tell it that file.js has contents, then file.js will be in the output map even if it's not actually referenced by any mappings. By moving this into SourceMap, we can do a property lookup for source contents as they're needed.

Copy link
Collaborator

@fabiosantoscode fabiosantoscode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requested a couple of changes.

Additionally, can you update the typescript definition? It still refers to source-map and doesn't have the new decoded_map result property.

https://github.com/terser/terser/blob/master/tools/terser.d.ts

lib/minify.js Show resolved Hide resolved
lib/minify.js Outdated Show resolved Hide resolved
"@jridgewell/trace-mapping": "^0.3.5",
"acorn": "^8.5.0",
"commander": "^2.20.0",
"source-map": "~0.8.0-beta.0",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests still use this (namely test/mocha/sourcemaps.js) so it should go into devDependencies

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I didn't purge my node_modules. Fixed.

@fabiosantoscode
Copy link
Collaborator

I'm thinking both this PR and your previous #1180 are breaking changes, as they change how Terser can be loaded in the browser without a package manager (see https://github.com/terser/terser#api-reference).

So the next release will be Terser 6.0.0, and I'll likely take away this browser-loading capability.

jridgewell and others added 2 commits May 1, 2022 10:47
Co-authored-by: Fábio Santos <fabiosantosart@gmail.com>
Copy link
Collaborator Author

@jridgewell jridgewell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as they change how Terser can be loaded in the browser without a package manager

  1. Oh, we forgot to update the rollup.config.js file
  2. I wrote a small wrapper package that will provides the same SourceMapConsumer and SourceMapGenerator interface as source-map. That way, if loaded in node, it'll require the new libraries. And if a loaded in a browser, the dev could load either source-map or @jridgewell/source-map and it'll work.

This preserves API compatibility, so that loading source-map in the browser is still supported.
@fabiosantoscode
Copy link
Collaborator

I've noticed a small uptick in memory usage, but better speed.

I'm comparing with this branch merged to master, not just this branch.

## Before
--before creating OutputStream and printing 
heapTotal 790.9 MB                          
heapUsed 571.83 MB                          
--after OutputStream prints                 
heapTotal 967.02 MB                         
heapUsed 698.39 MB                          
--after assigning decoded_map
heapTotal 950.77 MB                         
heapUsed 707.11 MB                          
--after result.map getter
heapTotal 935.77 MB                         
heapUsed 707.11 MB                          
## After
--before creating OutputStream and printing
heapTotal 791.15 MB                        
heapUsed 571.63 MB                         
--after OutputStream prints                
heapTotal 1072.77 MB                       
heapUsed 713.45 MB                         
--after assigning decoded_map              
heapTotal 1024.52 MB                       
heapUsed 712.96 MB                         
--after result.map getter                  
heapTotal 990.77 MB                        
heapUsed 730.88 MB                         

This isn't concerning at all, just wanted to point it out.

@fabiosantoscode
Copy link
Collaborator

Since you're switching to a library called source-map this might just be compatible with the current browser usage!

@fabiosantoscode
Copy link
Collaborator

I'll check later. Sadly I'm back from holidays tomorrow so I'll be as non-responsive as usual.

@jridgewell
Copy link
Collaborator Author

I've noticed a small uptick in memory usage, but better speed.

This is a little surprising. I added memory usage benchmarks which are favorable. I can massage the addMapping call a bit to avoid temporary object allocations. The generated and original objects, and the mapping object itself are all temporary in both APIs. We could allocate once, assign properties, and call.

Since you're switching to a library called source-map this might just be compatible with the current browser usage!

Hopefully. I reused the same window.sourceMap name in UMD, and provide the same class and prototype methods.

// We support both @jridgewell/source-map (which has a sync
// SourceMapConsumer) and source-map (which has an async
// SourceMapConsumer).
orig_map = await new SourceMapConsumer(options.orig);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@fabiosantoscode
Copy link
Collaborator

This looks good to merge!

@fabiosantoscode
Copy link
Collaborator

Awesome work, thanks @jridgewell!

@jridgewell jridgewell deleted the gen-mapping branch May 10, 2022 01:59
CvX added a commit to discourse/discourse that referenced this pull request Nov 19, 2022
The `decodedMap` prop comes from terser/terser#1190

> This also exposes a new `decodedMap` property on the result object. Decoded maps are free to create (it's a shallow clone of the `GenMapping` instance), and passing them to `@jridgewell/trace-mapping` is copy-free. With Babel [recently](babel/babel#14497) adding a `decodedMap` field, a dev could pass from the Babel transpilation to Terser without any added memory use for sourcemaps.
CvX added a commit to discourse/discourse that referenced this pull request Nov 24, 2022
The `decodedMap` prop comes from terser/terser#1190

> This also exposes a new `decodedMap` property on the result object. Decoded maps are free to create (it's a shallow clone of the `GenMapping` instance), and passing them to `@jridgewell/trace-mapping` is copy-free. With Babel [recently](babel/babel#14497) adding a `decodedMap` field, a dev could pass from the Babel transpilation to Terser without any added memory use for sourcemaps.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants