Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ngcc: implement a new program-based entry-point finder #37075

Closed

Conversation

petebacondarwin
Copy link
Member

@petebacondarwin petebacondarwin commented May 12, 2020

Add a new EntryPointFinder that can be seeded from the imports in the program specified by a tsconfig.json file. This should be much faster than the DirectoryWalkerEntryPointFinder when the active program only imports a small proportion of the installed entry-points.

On the AIO project only half the entry-points would be processed by ngcc if the CLI async integration with ngcc made use of the new flag.

@petebacondarwin petebacondarwin added comp: ngcc state: WIP feature Issue that requests a new feature labels May 12, 2020
@ngbot ngbot bot modified the milestone: needsTriage May 12, 2020
@petebacondarwin
Copy link
Member Author

The approach here, which involves parsing each source file, could be slow for a large app.
We should investigate using the TS scanner directly or even a simple regular expression, since we only care about capturing the import paths and can be quite tolerant of false positives.

@petebacondarwin petebacondarwin force-pushed the ngcc-target-program branch 4 times, most recently from 562fc7e to 87c7485 Compare June 2, 2020 18:35
@petebacondarwin petebacondarwin changed the title feat(ngcc): WIP - implement a new program-based entry-point finder feat(ngcc): implement a new program-based entry-point finder Jun 2, 2020
@petebacondarwin petebacondarwin force-pushed the ngcc-target-program branch 3 times, most recently from 423b945 to 2754a92 Compare June 3, 2020 15:13
@petebacondarwin petebacondarwin added action: review The PR is still awaiting reviews from at least one requested reviewer area: performance and removed state: WIP feature Issue that requests a new feature labels Jun 3, 2020
@petebacondarwin petebacondarwin marked this pull request as ready for review June 3, 2020 15:13
@pullapprove pullapprove bot requested review from alxhub and gkalpak June 3, 2020 15:13
Copy link
Member

@JoostK JoostK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in "capture dynamic import expressions as well as declarations" commit:

  • "and creates lesss" -> "and creates less"

@petebacondarwin petebacondarwin changed the title feat(ngcc): implement a new program-based entry-point finder ngcc: implement a new program-based entry-point finder Jun 3, 2020
atscott pushed a commit that referenced this pull request Jun 4, 2020
The various dependency hosts had a lot of duplicated code.
This commit refactors them to move this into the base class.

PR Close #37075
atscott pushed a commit that referenced this pull request Jun 4, 2020
…#37075)

Previously we only checked for static import declaration statements.
This commit also finds import paths from dynamic import expressions.

Also this commit should speed up processing: Previously we were parsing
the source code contents into a `ts.SourceFile` and then walking the parsed
AST to find import paths.
Generating an AST is unnecessary work and it is faster and creates less
memory pressure to just scan the source code contents with the TypeScript
scanner, identifying import paths from the tokens.

PR Close #37075
atscott pushed a commit that referenced this pull request Jun 4, 2020
This finder is designed to only process entry-points that are reachable
by the program defined by a tsconfig.json file.

It is triggered by calling `mainNgcc()` with the `findEntryPointsFromTsConfigProgram`
option set to true. It is ignored if a `targetEntryPointPath` has been
provided as well.

It is triggered from the command line by adding the `--use-program-dependencies`
option, which is also ignored if the `--target` option has been provided.

Using this option can speed up processing in cases where there is a large
number of dependencies installed but only a small proportion of the
entry-points are actually imported into the application.

PR Close #37075
@atscott atscott closed this in 7f98b87 Jun 4, 2020
atscott pushed a commit that referenced this pull request Jun 4, 2020
The various dependency hosts had a lot of duplicated code.
This commit refactors them to move this into the base class.

PR Close #37075
atscott pushed a commit that referenced this pull request Jun 4, 2020
…#37075)

Previously we only checked for static import declaration statements.
This commit also finds import paths from dynamic import expressions.

Also this commit should speed up processing: Previously we were parsing
the source code contents into a `ts.SourceFile` and then walking the parsed
AST to find import paths.
Generating an AST is unnecessary work and it is faster and creates less
memory pressure to just scan the source code contents with the TypeScript
scanner, identifying import paths from the tokens.

PR Close #37075
atscott pushed a commit that referenced this pull request Jun 4, 2020
This finder is designed to only process entry-points that are reachable
by the program defined by a tsconfig.json file.

It is triggered by calling `mainNgcc()` with the `findEntryPointsFromTsConfigProgram`
option set to true. It is ignored if a `targetEntryPointPath` has been
provided as well.

It is triggered from the command line by adding the `--use-program-dependencies`
option, which is also ignored if the `--target` option has been provided.

Using this option can speed up processing in cases where there is a large
number of dependencies installed but only a small proportion of the
entry-points are actually imported into the application.

PR Close #37075
@petebacondarwin petebacondarwin deleted the ngcc-target-program branch June 4, 2020 18:40
Copy link
Member

@gkalpak gkalpak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Post-merge review 😁:
tl;dr LGTM 👍

imports.add(reexportPath);
}
break;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block would benefit from some comments explaining what each case does and why. It is quite hard (for me) to understand without diving into the Scanner.

lastToken = currentToken;
}

// Clear the text from the scanner.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? 😁

@@ -93,7 +225,7 @@ export class EsmDependencyHost extends DependencyHostBase {
* in this file, true otherwise.
*/
export function hasImportOrReexportStatements(source: string): boolean {
return /(import|export)\s.+from/.test(source);
return /(?:import|export)[\s\S]+?(["'])(?:(?:\\\1|.)*?)\1/.test(source);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 👍

There is one unneeded set of parenthesis (afaict), but not really important:

Suggested change
return /(?:import|export)[\s\S]+?(["'])(?:(?:\\\1|.)*?)\1/.test(source);
return /(?:import|export)[\s\S]+?(["'])(?:\\\1|.)+?\1/.test(source);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit frightened to touch that again :-)

* This is faster than searching the entire file-system for all the entry-points,
* and is used primarily by the CLI integration.
*
* There are two concrete implementation of this class.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* There are two concrete implementation of this class.
* There are two concrete implementations of this class.

@petebacondarwin
Copy link
Member Author

Thanks for the extra review @gkalpak - here is a fixup #37461

gkalpak added a commit to gkalpak/angular that referenced this pull request Jun 5, 2020
This is a follow-up to angular#37075, because I didn't manage to finish my
review before the PR got merged.
gkalpak added a commit to gkalpak/angular that referenced this pull request Jun 8, 2020
This is a follow-up to angular#37075, because I didn't manage to finish my
review before the PR got merged.
gkalpak added a commit to gkalpak/angular that referenced this pull request Jun 8, 2020
This is a follow-up to angular#37075, because I didn't manage to finish my
review before the PR got merged.
gkalpak added a commit to gkalpak/angular that referenced this pull request Jun 10, 2020
This is a follow-up to angular#37075, because I didn't manage to finish my
review before the PR got merged.
mhevery pushed a commit that referenced this pull request Jun 12, 2020
…pos (#37040)

This is a follow-up to #37075, because I didn't manage to finish my
review before the PR got merged.

PR Close #37040
mhevery pushed a commit that referenced this pull request Jun 12, 2020
…pos (#37040)

This is a follow-up to #37075, because I didn't manage to finish my
review before the PR got merged.

PR Close #37040
ngwattcos pushed a commit to ngwattcos/angular that referenced this pull request Jun 25, 2020
…ost` (angular#37075)

Previously this host was skipping files if they had imports that spanned
multiple lines, or if the import was a dynamic import expression.

PR Close angular#37075
ngwattcos pushed a commit to ngwattcos/angular that referenced this pull request Jun 25, 2020
…7075)

The various dependency hosts had a lot of duplicated code.
This commit refactors them to move this into the base class.

PR Close angular#37075
ngwattcos pushed a commit to ngwattcos/angular that referenced this pull request Jun 25, 2020
…angular#37075)

Previously we only checked for static import declaration statements.
This commit also finds import paths from dynamic import expressions.

Also this commit should speed up processing: Previously we were parsing
the source code contents into a `ts.SourceFile` and then walking the parsed
AST to find import paths.
Generating an AST is unnecessary work and it is faster and creates less
memory pressure to just scan the source code contents with the TypeScript
scanner, identifying import paths from the tokens.

PR Close angular#37075
ngwattcos pushed a commit to ngwattcos/angular that referenced this pull request Jun 25, 2020
This finder is designed to only process entry-points that are reachable
by the program defined by a tsconfig.json file.

It is triggered by calling `mainNgcc()` with the `findEntryPointsFromTsConfigProgram`
option set to true. It is ignored if a `targetEntryPointPath` has been
provided as well.

It is triggered from the command line by adding the `--use-program-dependencies`
option, which is also ignored if the `--target` option has been provided.

Using this option can speed up processing in cases where there is a large
number of dependencies installed but only a small proportion of the
entry-points are actually imported into the application.

PR Close angular#37075
ngwattcos pushed a commit to ngwattcos/angular that referenced this pull request Jun 25, 2020
…pos (angular#37040)

This is a follow-up to angular#37075, because I didn't manage to finish my
review before the PR got merged.

PR Close angular#37040
mgechev pushed a commit to angular/angular-cli that referenced this pull request Jul 1, 2020
With this change we consume the change in angular/angular#37075 where the NGCC now exposes a new option `--use-program-dependencies`, to only process packages which are part of the TypeScript program.

From the initial benchmarking the time taken for NGCC to process that dependencies needed for an `ng new` application went down by 36% from 10526ms to 6708ms.
villelahdenvuo pushed a commit to villelahdenvuo/angular-cli that referenced this pull request Jul 6, 2020
With this change we consume the change in angular/angular#37075 where the NGCC now exposes a new option `--use-program-dependencies`, to only process packages which are part of the TypeScript program.

From the initial benchmarking the time taken for NGCC to process that dependencies needed for an `ng new` application went down by 36% from 10526ms to 6708ms.
@angular-automatic-lock-bot
Copy link

This issue has been automatically locked due to inactivity.
Please file a new issue if you are encountering a similar or related problem.

Read more about our automatic conversation locking policy.

This action has been performed automatically by a bot.

@angular-automatic-lock-bot angular-automatic-lock-bot bot locked and limited conversation to collaborators Jul 6, 2020
profanis pushed a commit to profanis/angular that referenced this pull request Sep 5, 2020
…ost` (angular#37075)

Previously this host was skipping files if they had imports that spanned
multiple lines, or if the import was a dynamic import expression.

PR Close angular#37075
profanis pushed a commit to profanis/angular that referenced this pull request Sep 5, 2020
…7075)

The various dependency hosts had a lot of duplicated code.
This commit refactors them to move this into the base class.

PR Close angular#37075
profanis pushed a commit to profanis/angular that referenced this pull request Sep 5, 2020
…angular#37075)

Previously we only checked for static import declaration statements.
This commit also finds import paths from dynamic import expressions.

Also this commit should speed up processing: Previously we were parsing
the source code contents into a `ts.SourceFile` and then walking the parsed
AST to find import paths.
Generating an AST is unnecessary work and it is faster and creates less
memory pressure to just scan the source code contents with the TypeScript
scanner, identifying import paths from the tokens.

PR Close angular#37075
profanis pushed a commit to profanis/angular that referenced this pull request Sep 5, 2020
This finder is designed to only process entry-points that are reachable
by the program defined by a tsconfig.json file.

It is triggered by calling `mainNgcc()` with the `findEntryPointsFromTsConfigProgram`
option set to true. It is ignored if a `targetEntryPointPath` has been
provided as well.

It is triggered from the command line by adding the `--use-program-dependencies`
option, which is also ignored if the `--target` option has been provided.

Using this option can speed up processing in cases where there is a large
number of dependencies installed but only a small proportion of the
entry-points are actually imported into the application.

PR Close angular#37075
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
action: merge The PR is ready for merge by the caretaker area: performance cla: yes target: patch This PR is targeted for the next patch release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants