fix(ngcc): update `package.json` deterministically #34870

gkalpak · 2020-01-20T16:47:25Z

Ngcc adds properties to the package.json files of the entry-points it processes to mark them as processed for a format and point to the created Ivy entry-points (in case of --create-ivy-entry-points). When running ngcc in parallel mode (which is the default for the standalone ngcc command), multiple formats can be processed simultaneously for the same entry-point and the order of completion is not deterministic.

Previously, ngcc would append new properties at the end of the target object in package.json as soon as the format processing was completed. As a result, the order of properties in the resulting package.json (when processing multiple formats for an entry-point in parallel) was not deterministic. For tools that use file hashes for caching purposes (such as Bazel), this lead to a high probability of cache misses.

This commit fixes the problem by ensuring that the position of properties added to package.json files is deterministic and independent of the order in which each format is processed.

Jira issue: FW-1801

Fixes #34635.

mary-poppins · 2020-01-20T16:54:27Z

You can preview 9b1d9df at https://pr34870-9b1d9df.ngbuilds.io/.

mary-poppins · 2020-01-20T17:06:39Z

You can preview 9ccfaec at https://pr34870-9ccfaec.ngbuilds.io/.

petebacondarwin

Nice work @gkalpak - it was not a trivial fix after all.

I would be much happier (unless you can convince me otherwise) if there was no switch statement.

petebacondarwin · 2020-01-20T17:19:22Z

packages/compiler-cli/ngcc/src/writing/package_json_updater.ts

@@ -157,4 +174,42 @@ export function applyChange(ctx: JsonObject, propPath: string[], value: JsonValu
  }

  ctx[lastProp] = value;
+  positionProperty(ctx, lastProp, positioning);


Functional programming: for the win!

Instead of using "enumeration" and a switch statement you could do this by creating instances of isNextProp function instead of positioning. I would envisage something like:

const unimportant = (propToInsert: string, currentProp: string) => boolean = false; const alphabetic = (propToInsert: string, currentProp: string) => boolean = currentProp > propToInsert; const createBefore = (beforeProp: string) => (_propToInsert: string, currentProp: string) => currentProp === beforeProp;

This would also avoid the awkward union of string and object.

I am not sure I 100% understand what you are suggesting 😁

The problem (if I understand correctly) is that for unimportant I want to leave existing properties at their old positions (to avoid unnecessary changes to files). This can't be coded as an isNextProp() function. Also, in the future there might be different positionings, such as {after: <prop>}, which again cannot be coded as an isNextProp() function.

We could come up with ways to work around these issues, but I don't think there will be much benefit in doing it (as it will rather complicate the code than simplify it).

In general, I think it is just a coincidence that two (out of three) positioning options (alphabetic and {before: ...}) can be implemented with similar logic (movePropBefore()).
_{Fun fact:

Original, I specified positioning values as unimportant, alphabetic and {after: <prop>}, but switched to {before: <prop>} (instead of after), because it allowed the implementations to share more code and I didn't really care as long as the new prop was next to the old prop 😁}

There is always the case that I might be missing something, so feel free to suggest an implementation.

This is the basic idea: 6891dde

I missed a few places that would also need updating but you get the general gist :-)

Oh, I see what you mean now. Well, the thing is that in parallel mode the recorded changes are collected on the worker side, serialized to JSON and send over to the cluster master which applies them to the actual file. Therefore, they need to be JSON-serializable (and functions are not (in a reasonable way 😉)).

I see. That is a shame. Leave it as it is then.

Although, I could envisage changing the messaging between worker and master so that the message is abstracted more to the update task rather than needing to know about the JSON structure. I.E. where the property is placed could be owned by the cluster master and the messages could be more domain oriented.

I am not sure I agree, but that's a discussion for another day 😃

petebacondarwin · 2020-01-20T17:29:15Z

packages/compiler-cli/ngcc/test/execution/cluster/package_json_updater_spec.ts

@@ -88,21 +104,31 @@ runInEachFileSystem(() => {
        it('should send an `update-package-json` message to the master process', () => {


Should there be an equivalent check to

afterEach(() => expect(processSendSpy).not.toHaveBeenCalled());

which is in the master version of this code. But in this case we should check that the writeChangesSpy was not called, right?

Good idea 👍
Done

petebacondarwin · 2020-01-20T17:43:45Z

packages/compiler-cli/ngcc/test/integration/ngcc_spec.ts

+           // Helpers
+           function expectNotToHaveProp(obj: object, prop: string) {
+             expect(obj.hasOwnProperty(prop))
+                 .toBe(false, `Expected object not to have property '${prop}'.`);


Is it worth stringifying the object in the error message too?

petebacondarwin · 2020-01-20T17:48:26Z

packages/compiler-cli/ngcc/test/integration/ngcc_spec.ts

+           expect(pkgKeys).toContain('|fesm2015_ivy_ngcc|fesm2015|');
+           expect(pkgKeys).toContain('|fesm5_ivy_ngcc|fesm5|');
+           expect(stringifyKeys(pkg.__processed_by_ivy_ngcc__ !))
+               .toMatch(/\|esm5\|(?:.+\|)?fesm2015\|(?:.+\|)?fesm5\|/);


Why is it not possible to simply check the array of keys directly here? What pre-existing properties might exist on this?
This regex matching seems overkill, no?

Because processing one format will also marked all other formats pointing to the same entry-point as processed. For example, the fesm5 property may point to the same file as the module property and thus both properties will be marked as processed once either of them gets processed. Same for fesm2015 and es2015. Also, typings gets processed the first time we process a property.

So, after having explicitly processed esm5, fesm2015 and fesm5, the __processed_by_ivy_ngcc__ object will also contain es2015, module and typings.

We do know exactly what will be in the array for this test right?
I guess it is a trade off between complexity of the test vs brittleness of the test.

In this case the test is harder to read and reason about.

In the other case the test is more prone to failing if something about the other properties that are added changes.

But I would argue that we need to ensure that "all" properties added to the __processed_by_ivy_ngcc__ are deterministic and so I don't think it would be brittle to check all fo them.

Perhaps I am missing something...

Yes, we know which properties will be there and we can check with a string match.

In this case the test is harder to read and reason about.

Well...that is subjective 😛
It is a little unexpected why there are other props there, so we would have to have a comment explaining what is going on and some people claim that reading a simple RegExp would be more straight forward 😛

I don't think it makes a big difference anyway, so happy to switch to exact string match 👍

packages/compiler-cli/ngcc/test/writing/package_json_updater_spec.ts

mary-poppins · 2020-01-21T13:35:58Z

You can preview e04a8fd at https://pr34870-e04a8fd.ngbuilds.io/.

…from worker processes

Ngcc adds properties to the `package.json` files of the entry-points it processes to mark them as processed for a format and point to the created Ivy entry-points (in case of `--create-ivy-entry-points`). When running ngcc in parallel mode (which is the default for the standalone ngcc command), multiple formats can be processed simultaneously for the same entry-point and the order of completion is not deterministic. Previously, ngcc would append new properties at the end of the target object in `package.json` as soon as the format processing was completed. As a result, the order of properties in the resulting `package.json` (when processing multiple formats for an entry-point in parallel) was not deterministic. For tools that use file hashes for caching purposes (such as Bazel), this lead to a high probability of cache misses. This commit fixes the problem by ensuring that the position of properties added to `package.json` files is deterministic and independent of the order in which each format is processed. Jira issue: [FW-1801](https://angular-team.atlassian.net/browse/FW-1801) Fixes angular#34635

mary-poppins · 2020-01-21T20:57:06Z

You can preview d779c13 at https://pr34870-d779c13.ngbuilds.io/.

…from worker processes (#34870) PR Close #34870

Ngcc adds properties to the `package.json` files of the entry-points it processes to mark them as processed for a format and point to the created Ivy entry-points (in case of `--create-ivy-entry-points`). When running ngcc in parallel mode (which is the default for the standalone ngcc command), multiple formats can be processed simultaneously for the same entry-point and the order of completion is not deterministic. Previously, ngcc would append new properties at the end of the target object in `package.json` as soon as the format processing was completed. As a result, the order of properties in the resulting `package.json` (when processing multiple formats for an entry-point in parallel) was not deterministic. For tools that use file hashes for caching purposes (such as Bazel), this lead to a high probability of cache misses. This commit fixes the problem by ensuring that the position of properties added to `package.json` files is deterministic and independent of the order in which each format is processed. Jira issue: [FW-1801](https://angular-team.atlassian.net/browse/FW-1801) Fixes #34635 PR Close #34870

…from worker processes (angular#34870) PR Close angular#34870

Ngcc adds properties to the `package.json` files of the entry-points it processes to mark them as processed for a format and point to the created Ivy entry-points (in case of `--create-ivy-entry-points`). When running ngcc in parallel mode (which is the default for the standalone ngcc command), multiple formats can be processed simultaneously for the same entry-point and the order of completion is not deterministic. Previously, ngcc would append new properties at the end of the target object in `package.json` as soon as the format processing was completed. As a result, the order of properties in the resulting `package.json` (when processing multiple formats for an entry-point in parallel) was not deterministic. For tools that use file hashes for caching purposes (such as Bazel), this lead to a high probability of cache misses. This commit fixes the problem by ensuring that the position of properties added to `package.json` files is deterministic and independent of the order in which each format is processed. Jira issue: [FW-1801](https://angular-team.atlassian.net/browse/FW-1801) Fixes angular#34635 PR Close angular#34870

angular-automatic-lock-bot · 2020-02-23T16:08:44Z

This issue has been automatically locked due to inactivity.
Please file a new issue if you are encountering a similar or related problem.

Read more about our automatic conversation locking policy.

_{This action has been performed automatically by a bot.}

gkalpak added type: bug/fix action: merge The PR is ready for merge by the caretaker target: patch This PR is targeted for the next patch release comp: ngcc labels Jan 20, 2020

ngbot bot added this to the needsTriage milestone Jan 20, 2020

googlebot added the cla: yes label Jan 20, 2020

gkalpak mentioned this pull request Jan 20, 2020

inconsistent __processed_by_ivy_ngcc__ order breaks caching #34635

Closed

gkalpak force-pushed the fix-ngcc-deterministic-packagejson branch from 9b1d9df to 9ccfaec Compare January 20, 2020 16:59

gkalpak marked this pull request as ready for review January 20, 2020 16:59

gkalpak requested a review from a team as a code owner January 20, 2020 16:59

petebacondarwin requested changes Jan 20, 2020

View reviewed changes

gkalpak force-pushed the fix-ngcc-deterministic-packagejson branch 2 times, most recently from 91ad167 to e04a8fd Compare January 21, 2020 13:27

petebacondarwin self-requested a review January 21, 2020 14:00

gkalpak added 4 commits January 21, 2020 22:45

test(ngcc): verify that PackageJsonUpdater does not write to files …

bbb58ad

…from worker processes

fixup! fix(ngcc): update package.json deterministically

5bfd4f1

fixup! fix(ngcc): update package.json deterministically

d779c13

gkalpak force-pushed the fix-ngcc-deterministic-packagejson branch from 226d71d to d779c13 Compare January 21, 2020 20:45

petebacondarwin approved these changes Jan 23, 2020

View reviewed changes

AndrewKushnir pushed a commit that referenced this pull request Jan 23, 2020

test(ngcc): verify that PackageJsonUpdater does not write to files …

8215b34

…from worker processes (#34870) PR Close #34870

AndrewKushnir closed this in 43db4ff Jan 23, 2020

gkalpak deleted the fix-ngcc-deterministic-packagejson branch January 23, 2020 20:10

sonukapoor pushed a commit to sonukapoor/angular that referenced this pull request Feb 13, 2020

test(ngcc): verify that PackageJsonUpdater does not write to files …

40b8df1

…from worker processes (angular#34870) PR Close angular#34870

angular-automatic-lock-bot bot locked and limited conversation to collaborators Feb 23, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(ngcc): update `package.json` deterministically #34870

fix(ngcc): update `package.json` deterministically #34870

gkalpak commented Jan 20, 2020

mary-poppins commented Jan 20, 2020

mary-poppins commented Jan 20, 2020

petebacondarwin left a comment

petebacondarwin Jan 20, 2020

gkalpak Jan 21, 2020

petebacondarwin Jan 21, 2020

petebacondarwin Jan 21, 2020

gkalpak Jan 21, 2020

petebacondarwin Jan 21, 2020

gkalpak Jan 22, 2020

petebacondarwin Jan 20, 2020

gkalpak Jan 21, 2020 •

edited

petebacondarwin Jan 20, 2020

gkalpak Jan 21, 2020

petebacondarwin Jan 20, 2020

gkalpak Jan 21, 2020

petebacondarwin Jan 21, 2020

gkalpak Jan 21, 2020

gkalpak Jan 22, 2020

mary-poppins commented Jan 21, 2020

mary-poppins commented Jan 21, 2020

angular-automatic-lock-bot bot commented Feb 23, 2020

		@@ -88,21 +104,31 @@ runInEachFileSystem(() => {
		it('should send an `update-package-json` message to the master process', () => {

fix(ngcc): update package.json deterministically #34870

fix(ngcc): update package.json deterministically #34870

Conversation

gkalpak commented Jan 20, 2020

mary-poppins commented Jan 20, 2020

mary-poppins commented Jan 20, 2020

petebacondarwin left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gkalpak Jan 21, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mary-poppins commented Jan 21, 2020

mary-poppins commented Jan 21, 2020

angular-automatic-lock-bot bot commented Feb 23, 2020

fix(ngcc): update `package.json` deterministically #34870

fix(ngcc): update `package.json` deterministically #34870

gkalpak Jan 21, 2020 •

edited