Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose Graph information for pluginContext calls #2565

Merged
merged 4 commits into from Dec 16, 2018

Conversation

samccone
Copy link
Contributor

@samccone samccone commented Nov 21, 2018

Closes #2578

This adds the ability for plugins to look at the module / chunk graph
and to perform further introspection as to what modules pull in what
files and what the resolvedIds of all sources are.

Prior to this change the only way to get this information was by a
recursive call to resolveIds which seemed much uglier.

Now plugins can be authored like the following in the buildEnd phase
of the hooks.

buildEnd() {
	const deps = [];
	if (this.modules) {
		for (const module of this.modules().allModules()) {
			const m = this.modules().getModuleInfo(module);
			if (m != null && !m.isExternal) {
				for (const s of m.sourceIds) {
					const resolvedId = m.sourceIdToSourceFile(s);
					if (resolvedId != null) {
						deps.push({target: m.id, source: resolvedId})
					}
				}
			}
			
		fs.writeFileSync(path.join(__dirname, 'graph.json'), JSON.stringify(deps, null, 2));
	}
}

@@ -1,5 +1,6 @@
import * as ESTree from 'estree';
import { EventEmitter } from 'events';
import Graph from '../Graph';
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gets file compiled into the test folder which has it's own tsconfig which no longer has Graph.ts thus this import fails, I am wondering if this was internal?

Is the solution y'all use in your codebase to re-type the types in this file and then do something like this in your Graph.ts file?

In types.d.ts
export interface GraphInterface {
}
in Graph.ts
import {GraphInterface} from './types';

export class Graph implements GraphInterface {
}

Let me know!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that all types that are exposed to the user are strictly controlled by us and originate from types.d.ts. We simply cannot expose the Graph because it exposes everything: Internal AST data structures, internal module data structures etc. Isn't there a way to create a well-defined JSON data structure that contains the information you desire?
If we expose Graph, then from then on, virtually ALL Rollup releases will be breaking changes. We should really try to find a better way.

@@ -1,5 +1,6 @@
import * as ESTree from 'estree';
import { EventEmitter } from 'events';
import Graph from '../Graph';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that all types that are exposed to the user are strictly controlled by us and originate from types.d.ts. We simply cannot expose the Graph because it exposes everything: Internal AST data structures, internal module data structures etc. Isn't there a way to create a well-defined JSON data structure that contains the information you desire?
If we expose Graph, then from then on, virtually ALL Rollup releases will be breaking changes. We should really try to find a better way.

@lukastaegert
Copy link
Member

Why don't we extend the modules property of chunks in the bundle provided to the generateBundle hook to contain more information about the individual modules? At the moment, each module lists the variable names of its removed and rendered exports + some size information. What else do you need?
Also if you want information about the modules that never ended up in a chunk due to tree-shaking, we could put them in a separate place.

My feeling is, this should go into the 1.0 branch because there we changed the structure of the bundle object to make it easier to add additional information. We could add graph information there as well. This could also be a getter so that this information is only compiled on demand.

The plugin context is probably not a good place to put such information as it will only be available in very late hooks. If you need this information earlier than generateBundle we can definitely look into this.

@tivac
Copy link
Contributor

tivac commented Nov 21, 2018

Also if you want information about the modules that never ended up in a chunk due to tree-shaking, we could put them in a separate place.

I desperately want this for @modular-css/rollup, fwiw.

@lukastaegert
Copy link
Member

So what is the minimal information that you need? Based on that, we can sketch out an API.

@tivac
Copy link
Contributor

tivac commented Nov 21, 2018

@lukastaegert Right now I have to do a ton of work to support bundle splitting in my rollup plugin because the CSS dependency graph is ignored by rollup. Even if I inject import statements to reflect the CSS dependencies they'd still get treeshaken away since nothing uses that code. I then have to do a whole bunch of bundle-walking to recreate the chunks that rollup already created but in the CSS.

https://github.com/tivac/modular-css/blob/2d4af90239d03a0e53a9da83921daba76780eb7f/packages/rollup/rollup.js#L147-L270

It's a ton of code. I'm not totally sure what a better solution would be, but being able to see the original full dependency graph before all my "fake" CSS modules get thrown away would be a huge help.

@guybedford
Copy link
Contributor

How about a getter API that computes the graph only when asked outputting the dependency map for each module Id in the graph:

bundle.getInputGraph: () => ({
  [moduleId: string]: Record<string, string> // dependency map
});

Alternatively we could expose this on the plugin context for plugins similarly.

@lukastaegert
Copy link
Member

Sounds good to me. Or make it a getter. Putting it on the plugin context would mean that we need to carefully select which hooks can get this property. But of course just the basic graph would be available just after the modules have been loaded.

@samccone
Copy link
Contributor Author

Thanks all for the feedback, and I apologise for the lag as I was taking some time off. I will follow up with the recommendation around implementing the getter to expose inner module graph information.

@samccone
Copy link
Contributor Author

samccone commented Dec 1, 2018

@lukastaegert I made some changes to the approach here, please let me know if this is roughly what you are thinking, happy to make whatever additional changes.

Copy link
Member

@lukastaegert lukastaegert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice. Apart from the comments, this new feature would require an adequate test. Usually I would recommend to add one test to the hooks folder but those tests all work with "virtual" files provided themselves via plugins which may to be what we are interested in.
To write an actual file system based test, you might create a new test in the function category and provide it with a custom plugin that checks the new context API. Note that those tests actually execute the files so avoid putting logs in your tested code to not spam the console :)
To check if your plugin has actually been called, you can add a bundle (){} key to the _config which will be called after bundling with the bundle object and allows you to make assertions as well.

};
}
};
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, yes, that is indeed very maintainable now!
Some notes that might stream-line this even more:

  • I really like the idea of exposing the modules as an iterator!
  • My feeling is we could just move this API to the top level as the way you designed it, I think there will not be much need to extend it except in what is returned by getModuleInfo. I.e. I would just put getModuleInfo directly on the context.
  • allModules could be come top-level as well but I am not 100% sold on the name as technically, this is providing ids and I think the all might be obvious and the name could reflect better that it is a getter. Maybe getModuleIds?

}

return {
id: foundModule.id,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note: At the moment, this will always be moduleId

return {
id: foundModule.id,
isExternal: foundModule.isExternal,
sourceIds: foundModule.isExternal ? [] : (foundModule as Module).sources,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just what is written to the right side of the from tag of the import. Are you actually interested in that? In that case, I would not call this ids to avoid confusion with the actual ids but maybe something like importSources. Otherwise, why not directly provide the mapped ids as an array as people calling this function will probably be interested in that, e.g.

importedIds: foundModule.isExternal ? [] : (foundModule as Module).sources.map(source => (foundModule as Module).resolvedIds[source])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is unless you want to keep the function below as well for other purposes.

id: foundModule.id,
isExternal: foundModule.isExternal,
sourceIds: foundModule.isExternal ? [] : (foundModule as Module).sources,
sourceIdToSourceFile: (key: string) => (foundModule as Module).resolvedIds[key]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep this, from the naming perspective I would prefer functions to start with a verb i.e. something like getIdForImportSource (depending on what we call the previous)

@lukastaegert lukastaegert changed the title Expose Graph instance for pluginContext calls. Expose Graph information for pluginContext calls Dec 3, 2018
@lukastaegert lukastaegert added this to In progress in Release 1.0.0 via automation Dec 10, 2018
@lukastaegert lukastaegert moved this from In progress to In Review in Release 1.0.0 Dec 10, 2018
@lukastaegert
Copy link
Member

@samccone I permitted myself to add some changes to your branch including a test. Please have a look if this would work for you and contains enough information. If so, I would release this in the next days.

@lukastaegert lukastaegert dismissed their stale review December 12, 2018 07:34

Implemented new draft

isExternal: !!foundModule.isExternal,
importedIds: foundModule.isExternal
? []
: (foundModule as Module).sources.map(id => (foundModule as Module).resolvedIds[id])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if this is called before the module has been resolved? Is that possible?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, moduleIds will be empty and getModuleInfo will throw a "module not found" error. Added a test to make sure this is the case. This may or may not be what users want but if we e.g. do not provide this information on certain hooks at all, I fear this is something that is hard to maintain properly when e.g. new hooks are added. Thus here you get a snapshot of what the current state of the graph is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying. It might be nice to separate the graph into two states then - resolved and unresolved, and for this function to always throw an early error when called when the graph is partially unresolved so that we aren't exposing the partial state.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point but is this really an issue or more a way to avoid people relying on unreliable information? But having an internal flag marking the graph as incomplete would definitely be a future-proof way of handling this. Created #2598 to track. There may be more opinions on this issue.

@samccone
Copy link
Contributor Author

Thanks fo much for picking this up @lukastaegert your changes LGTM!

@lukastaegert lukastaegert merged commit 1ef9f6b into rollup:master Dec 16, 2018
Release 1.0.0 automation moved this from In Review to Done Dec 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Release 1.0.0
  
Done
Development

Successfully merging this pull request may close these issues.

Plugin API: where are the dependencies?
4 participants