Quick post-mortem of recently implemented RAR optimizations #8432

ladipro · 2023-02-09T13:44:20Z

Motivation

#8422 strives to improve the performance of ResolveAssemblyReferences, which is a recurring theme. We should get a sense of how the work that's recently been done in this area moved the needle. We're specifically interested in retrospectively measuring RAR as used when building a modern .NET app.

Expected outputs

A list of RAR optimizations implemented (or attempted to be implemented) in the last ~3 years.
A quantitative assessment of complexity and perf impact for each of the optimizations.
A description of the methodology used for the above.

ladipro · 2023-02-13T11:29:58Z

Measurements have been done with a micro-benchmark, directly executing the RAR task on a single thread in a tight loop. For each iteration the task was given parameters identical to those given when building a simple AspNetCore 7.0 project with dotnet build (297 assemblies passed as references + many more params). The benchmark is focused on the hot scenario, i.e. we assume dev inner loop in VS or on command-line with MSBuild server. The benchmark was made with the .NET build of MSBuild (as opposed to MSBuild.exe).

The following table has a list of commits under src/Tasks/AssemblyDependency made from Jan 2020 to Jan 2023, that look like they could have any perf relevance.

The results are rounded to the nearest half-of-percent which is about the precision at which I was able to produce the numbers in my environment.

Commit	Title	Improvement (-) or regression (+)	Complexity	Notes
`a53a018`	Ensure trailing slash for string used as key in AssemblyTableInfo's cache (#8266)	+0.5%	S
`b8d493a`	CA1838 Avoid 'StringBuilder' parameters for P/Invokes (#7186)	0.0%	S
`56ac537`	Improve string builder caches (#7093)	+0.5%	M
`ba28ab4`	Optimized immutable files up to date checks (#6974)	-0.5%	M	Surprisingly small improvement in this scenario.
`761f06e`	Minor allocation optimizations / cleanup (#6983)	0.0%	S
`b3a99d5`	[RAR] Stop removing from file cache just because an assembly is cached in process (#6891)	+5.5%	S	It is possible that the per-project disk cache is detrimental to perf in hot scenarios.
`49d582f`	Optimize logging by moving message importance checks earlier (#6381)	-5.0%	L	Logging is costly.
`3e71818`	Normalize RAR output paths (#6533)	+39.0%	S	This was a major regression and one of top candidates for future optimization.
`4920537`	Miscellaneous logging improvements (#6326)	-5.0%	L	Logging is costly.
`8c7ccdf`	Pre cache (#6107)	0.5%	L
`6819f7a`	On disk cache serialization (#6094)	-1.5%	M	Custom serializer is faster than BinaryFormatter.
`8fb627e`	Optimize RAR's GetReferenceItems method (#5929)	-6.5%	M	Nice optimization.

Overall, from Jan 2020 to Jan 2023 RAR got about 20% slower (11.7 ms -> 14.3 ms per HOT invocation).

ladipro · 2023-02-13T11:55:33Z

@Forgind, I know that pre-cache is targeted at the cold / first build scenario so it's not expected to show here. I am curious why it's not being used, though. I see that we ship SDKPrecomputedAssemblyReferences.cache with the SDK but it doesn't seem to be passed to RAR.

ladipro · 2023-02-13T12:28:24Z

Here's the profile of a hot ResolveAssemblyReference.Execute. It's still very heavy on file timestamp checks.

ladipro · 2023-02-13T12:29:09Z

The work will continue in #8441.

stan-sz · 2023-02-15T07:54:49Z

Related #2015

Forgind · 2023-02-16T04:46:50Z

@Forgind, I know that pre-cache is targeted at the cold / first build scenario so it's not expected to show here. I am curious why it's not being used, though. I see that we ship SDKPrecomputedAssemblyReferences.cache with the SDK but it doesn't seem to be passed to RAR.

As I recall, we saw some promising early numbers, so I pushed through to finish the feature, but then later testing didn't find much of a benefit, so although we can use it, I'm not sure if it's actually valuable. I'm not the best at testing perf, so if you find very different results, I'd love to be wrong!

If you find similar results, though, I wouldn't have a problem with removing it. I wouldn't expect there to be even 0.5% performance regression in dev loop scenarios; I feel like that loss is worse than the not-win from having a feature that doesn't really help.

(We'd probably have to maintain public API, but that's fine.)

Also, I'm surprised #6533 was bad. That's bad enough that I'd consider reverting it immediately and trying to figure out what went wrong afterwards. What do you think?

ladipro · 2023-02-16T08:28:28Z

Also, I'm surprised #6533 was bad. That's bad enough that I'd consider reverting it immediately and trying to figure out what went wrong afterwards. What do you think?

The problem is on this line:

msbuild/src/Tasks/AssemblyDependency/ReferenceTable.cs

Line 1294 in dfd8f41

reference.FullPath = FileUtilities.NormalizePath(resolvedPath);

Assigning a different path to the FullPath property triggers an expensive _isWinMDFile() call. I believe that calling NormalizeFullPath instead of assigning the property would fix this. I was going to make this and many more changes in a series of PRs a bit later but if you have time now I won't protest! 😀

danmoseley · 2023-02-16T17:17:51Z

Is it surprising that #6533 didn't trigger some perf gate/test at the time?

Forgind · 2023-02-16T19:51:27Z

Also, I'm surprised #6533 was bad. That's bad enough that I'd consider reverting it immediately and trying to figure out what went wrong afterwards. What do you think?

The problem is on this line:

msbuild/src/Tasks/AssemblyDependency/ReferenceTable.cs

Line 1294 in dfd8f41

reference.FullPath = FileUtilities.NormalizePath(resolvedPath);

Assigning a different path to the FullPath property triggers an expensive _isWinMDFile() call. I believe that calling NormalizeFullPath instead of assigning the property would fix this. I was going to make this and many more changes in a series of PRs a bit later but if you have time now I won't protest! 😀

I'm a little confused—looking at the diff for #6533, it didn't introduce that assignment; that assignment has existed since MSBuild moved to GitHub, at least. All 6533 did was add one new call to NormalizeFullPath and change a call from Path.GetFullPath to another NormalizePath call. Am I missing something?

Forgind · 2023-02-16T20:55:50Z

Is it surprising that #6533 didn't trigger some perf gate/test at the time?

Surprising to me, at least. RAR is a substantial fraction of total build time, so it'd be a major oversight to have it not at all represented in our tests. My guess is that it regressed 39% in some scenarios—perhaps scenarios in which _isWinMDFile() is unusually expensive—but much less in another scenario, perhaps if it's already cached. (I think that's one of the things we cache in our on-disk and in-process caches.)

Part of what's confusing to me is that my explanation suggests it would be very fast in exactly the scenario ladipro tested and slow in a scenario I'd expect our tests to catch...so perhaps I have something flipped?

In any event, the real impact is probably less than 39%, or we'd have customers ripping us to shreds by now, but I can easily imagine it's still a substantial perf regression on average—one we should try to fix.

ladipro · 2023-02-17T07:39:53Z

We don't currently have good automated perf tests that would catch this. VS tests (PerfDDRITs, Speedometer) are too noisy for this to be noticed; they are focused on end-to-end VS scenarios of which RAR is a tiny piece. The .NET perf lab runs some basic command line builds where this could make a measurable difference, but I'm not sure if anyone is looking at the results anymore.

We should start perf testing MSBuild more like a library. In this particular case, we know RAR is actually used as a direct library call by VS (a bit of a hack, alas) and plays an important role in design-time builds so I'd say introducing a microbenchmark measuring RAR only is justified. Similar for evaluation and a few more. Then of course builds (incremental, full) of solutions with varying size, with/without binlog, etc. And most importantly with good automated reporting. We'll get there!

ladipro · 2023-02-17T07:55:52Z

I'm a little confused—looking at the diff for #6533, it didn't introduce that assignment; that assignment has existed since MSBuild moved to GitHub, at least. All 6533 did was add one new call to NormalizeFullPath and change a call from Path.GetFullPath to another NormalizePath call. Am I missing something?

Without normalizing, the path assigned to FullPath was identical to what had already been probed and _isWinMDFile() could be satisfied from instanceLocalLastModifiedCache.

With normalizing we introduced a new path, unknown to the cache, which makes _isWinMDFile() hit the disk.

Forgind · 2023-02-17T22:47:58Z

The DT RAR calls do something a bit different, right? Like still call the RAR task but with different inputs?

Without normalizing, the path assigned to FullPath was identical to what had already been probed and _isWinMDFile() could be satisfied from instanceLocalLastModifiedCache.

With normalizing we introduced a new path, unknown to the cache, which makes _isWinMDFile() hit the disk.

Ok, but you said this was a dev inner loop scenario, right? Why didn't the normalized path get added to the cache at any point? I would've imagined it might force one more expensive operation once, but by the time we're talking about resolving the same references for the second time, it should be well aware of the normalized path and be just as fast for that part than the not-normalized path?

ladipro · 2023-02-20T07:33:39Z

Ok, but you said this was a dev inner loop scenario, right? Why didn't the normalized path get added to the cache at any point?

This cache is currently per RAR invocation (instanceLocalLastModifiedCache) so it starts empty with every build of every project, even in inner loop. One of the things I want to change. Files coming from the SDK and from NuGet cache should really be considered pre-resolved and immutable and have no I/O done on them in the inner loop.

ladipro added size:2 Iteration:2023February 8.0 labels Feb 9, 2023

ladipro self-assigned this Feb 9, 2023

ladipro mentioned this issue Feb 9, 2023

Developers get faster inner loop via optimized RAR in MS BUILD #8422

Closed

8 tasks

ladipro closed this as completed Feb 13, 2023

AR-May added the triaged label Feb 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Quick post-mortem of recently implemented RAR optimizations #8432

Quick post-mortem of recently implemented RAR optimizations #8432

ladipro commented Feb 9, 2023

ladipro commented Feb 13, 2023 •

edited

ladipro commented Feb 13, 2023

ladipro commented Feb 13, 2023

ladipro commented Feb 13, 2023

stan-sz commented Feb 15, 2023

Forgind commented Feb 16, 2023

ladipro commented Feb 16, 2023

danmoseley commented Feb 16, 2023

Forgind commented Feb 16, 2023

Forgind commented Feb 16, 2023

ladipro commented Feb 17, 2023

ladipro commented Feb 17, 2023

Forgind commented Feb 17, 2023

ladipro commented Feb 20, 2023

Quick post-mortem of recently implemented RAR optimizations #8432

Quick post-mortem of recently implemented RAR optimizations #8432

Comments

ladipro commented Feb 9, 2023

Motivation

Expected outputs

ladipro commented Feb 13, 2023 • edited

ladipro commented Feb 13, 2023

ladipro commented Feb 13, 2023

ladipro commented Feb 13, 2023

stan-sz commented Feb 15, 2023

Forgind commented Feb 16, 2023

ladipro commented Feb 16, 2023

danmoseley commented Feb 16, 2023

Forgind commented Feb 16, 2023

Forgind commented Feb 16, 2023

ladipro commented Feb 17, 2023

ladipro commented Feb 17, 2023

Forgind commented Feb 17, 2023

ladipro commented Feb 20, 2023

ladipro commented Feb 13, 2023 •

edited