Performance improvement during run of instrumented code #172

pjanotti · 2018-08-13T16:08:06Z

This PR has a set of small set of low risk optimizations. Coverlet can do much better than this PR by using indexes when calling the tracker code instead of relying on strings and dictionaries, however, that is a much more risky change and needs to be carefully tested and validated. I'm putting this PR up so we can ensure that on next release at least these performance gains are in place, hopefully, me or somebody has a chance to go for this more complete optimization.

The performance effects were measured against the test project System.Text.Encodings.Web.Tests in dotnet/corefx repo. This project was selected for optimization because it takes around 6 seconds to run it without instrumentation but was taking almost 300 seconds using coverlet 2.1.1. Recent changes to fix BadImageFormatException brought this down already to ~170 seconds in release 2.2.1. This PR brings it down a bit further to down to ~115 seconds.

Version	RunTestsForProject (ms)	% reduction from previous	From 2.1.1
2.1.1	295,168	N/A	N/A
2.2.1	168,524	42.91%	42.91%
This PR	113,427	32.69%	61.57%

The main performance gain comes from a custom comparer for the strings used in the code to track execution. Since these strings typically differ only on the end (e.g.: prefix for temp folder of hits file is always the same), using a comparer that generates hashes based only on the final chars of the strings saves considerable amount of work.

Some of the changes affect InstrumentationTask and CoverageResultTask. However, for the targeted project the times of InstrumentationTask (~1,250 msec) and CoverageResultTask (less than 200 msec) were too small in relation to the test execution time and as such I didn't focus on improving them, just ensuring no regression on them. The changes regarding these two tasks were expected to have effect only if a large number of source or hit files were being used by the targeted project.

Renames and restore perf test

pjanotti · 2018-08-13T16:11:02Z

/cc @tarekgh

tonerdo · 2018-08-13T16:26:42Z

src/coverlet.core/Instrumentation/Instrumenter.cs

            if (!_result.Documents.TryGetValue(sequencePoint.Document.Url, out var document))
            {
                document = new Document { Path = sequencePoint.Document.Url };
-                documentIndex = _result.Documents.Count;
+                document.Index = _result.Documents.Count;


Oh shit, why didn't I think of this 😄

tonerdo · 2018-08-13T23:44:27Z

src/coverlet.tracker/CoverageTracker.cs

@@ -25,7 +26,7 @@ public static void MarkExecuted(string file, string evt)
        {
            if (t_events == null)
            {
-                t_events = new Dictionary<string, Dictionary<string, int>>();
+                t_events = new Dictionary<string, Dictionary<string, int>>(_stringHashSuffixComparer);


You know, now that I think of it there's really no reason to store the path to the hits file as a key in a dictionary. By design, the hits file for an assembly remains the same for an entire test session. Also, the coverage.tracker assembly is usually loaded by the runtime in different contexts for each assembly that depends on it so there's no chance of multiple instrumented assemblies supplying conflicting information.

I believe this approach was added when we had multiple separate gzipped hits files because some users were experiencing considerable disk usage due to large hits files. That's no longer the case.

In this case we can simply make the hits file a field and assign to it once (if it evaluates to null) and have a standalone Dictionary<string, int> to store the marker information.

Let me know what you think

It sounds correct to me. I'll give a try during the day.

This requires some risky changes: the same thread can receive events from different modules. In other words while the hits file is the same for a given assembly the same thread may receive markers from different assemblies. I'm going to try some minimal changes to optimize that but perhaps we should go for the safe and simple optimization in the short run, while we work on a more profound one.

I don't think it's currently possible for the coverlet.tracker assembly to receive markers from different assemblies. Here's an illustration of how I believe things are setup in a multiple assembly scenario.

SomeAssembly.Tests.dll depends on SomeAssembly1.dll and SomeAssembly2.dll which are both instrument-able

SomeAssembly1.dll and SomeAssembly2.dll are instrumented by coverlet and the coverlet.tracker.dll dependency is copied to the same folder the two assemblies occupy.

The test project SomeAssembly.Tests is run and during that process, methods in SomeAssembly1.dll are called which contain instrumentation instructions that call out to coverlet.tracker.dll

During that same process, instrumented methods from SomeAssembly2.dll also call out to coverlet.tracker.dll

What happens in this case is that both SomeAssembly1.dll and SomeAssembly2.dll have coverlet.tracker.dll loaded in different contexts. Whatever changes SomeAssembly1.dll makes to static variables in its own copy of coverlet.tracker.dll doesn't interfere with how SomeAssembly2.dll interacts with its own equivalent copy. SomeAssembly1.dll and SomeAssembly2.dll depend independently on coverlet.tracker.dll the same way two different projects can depend on the same version of Newtonsoft.Json and still not get in each other's way even if their binaries all end up in the same folder. Hence, it's perfectly safe for coverlet.tracker.dll to assume that markers will always come from the same assembly.

See https://github.com/tonerdo/coverlet/blob/1d92ee66a1bb68629bf31022a300e57c36ce535f/src/coverlet.core/CoverageTracker.cs for the original implementation that shipped. It worked perfectly in a multiple assembly scenario, it just wasn't performant or thread safe

I did a test to be sure: check https://github.com/tonerdo/coverlet/compare/master...pjanotti:test.multithread?expand=1 It does hit the exception when running the performance test. The thread static is the same in a single app domain (that even doesn't really exist on core) for a given thread, but, the same is true for any static of CoverageTracker in the same app domain.

The loader is doing the right thing: let's say SomeAssembly1.dll is the first to call the marker method, it causes the runtime to load the type CoverageTracker (and perform its static initialization), from that point on any other call, no matter from which module is going to use the already loaded CoverageTracker, unless there are multiple app domains (then you get independent types/statics for each AppDomain, even if they are [ThreadStatic]).

Does this happen only in a multi-threaded scenario?

No, single thread too, you just need different modules on the same thread.

I did an experiment to generate a "tracker type" per module and get rid of the dictionaries, it seems to be going well. It needs more testing to ensure correctness and that it doesn't break some scenarios that are working now. If you want to take a peak before I do some clean-up/polishing/fine tunning take a look here. There is a bit of trade-off and it probably it will use more memory in projects with lower coverage numbers but for the project that I tested here it consistently runs the whole test in ~19 seconds (about 6 times faster than the optimization proposed on this PR).

Took a quick peek. Seems like a good crop of changes. Will merge this in now since I'm fine with it, thanks for the change and explanations

Thanks @tonerdo

tarekgh · 2018-08-13T23:48:15Z

@tonerdo I thought the same when I worked previously on the perf issue. I didn't spend time during then to investigate more but what you have said align with what I was thinking too.

tarekgh · 2018-08-22T16:27:47Z

@pjanotti I have added a comment but I am not sure if you answered it my question. why we introduced StringHashSuffixComparer and didn't use StringComparer.Ordinal?

pjanotti · 2018-08-22T17:30:22Z

Because the prefix of the strings is very similar and typically long (think paths for sources and modules) and it is only the end of the strings that actually differentiate the hashes. The next optimization is to get rid of them entirely (see the WIP mentioned above).

tarekgh · 2018-08-22T18:01:41Z

The hashing method can produce deceiving results and we may be lucky so far.
strings like:

"a" and "aaaaaaaaa" will produce the same hash code. if we are planning to change this anyway in the near future, then that is ok. otherwise, this may cause unexpected issues and will be hard to investigate.

pjanotti · 2018-08-22T18:46:55Z

"a" and "aaaaaaaaa" will produce the same hash code.

Hashes in this case will be different. The "known" case of hash collision introduced by this PR is if the strings have the same length and the last 8 chars are the same, of course there could be the normal "statistical collisions". That said if there are a very high number of the collisions the performance will suffer but correctness won't be affected.

…changes Performance improvement during run of instrumented code

Assorted perf improvements

b9d5e98

Renames and restore perf test

tonerdo reviewed Aug 13, 2018

View reviewed changes

tonerdo merged commit 30c000b into coverlet-coverage:master Aug 22, 2018

pjanotti mentioned this pull request Aug 22, 2018

Perf: create a tracker per module and use indexes to improve performance #181

Merged

NorekZ pushed a commit to NorekZ/coverlet that referenced this pull request Nov 8, 2018

Merge pull request coverlet-coverage#172 from pjanotti/assorted.perf.…

304b2c8

…changes Performance improvement during run of instrumented code

pjanotti deleted the assorted.perf.changes branch October 23, 2019 17:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance improvement during run of instrumented code #172

Performance improvement during run of instrumented code #172

pjanotti commented Aug 13, 2018

pjanotti commented Aug 13, 2018

tonerdo Aug 13, 2018 •

edited

tonerdo Aug 13, 2018 •

edited

pjanotti Aug 14, 2018

pjanotti Aug 14, 2018

tonerdo Aug 15, 2018 •

edited

pjanotti Aug 15, 2018 •

edited

tonerdo Aug 15, 2018

pjanotti Aug 15, 2018

pjanotti Aug 21, 2018 •

edited

tonerdo Aug 22, 2018

pjanotti Aug 22, 2018

tarekgh commented Aug 13, 2018 •

edited

tarekgh commented Aug 22, 2018

pjanotti commented Aug 22, 2018

tarekgh commented Aug 22, 2018

pjanotti commented Aug 22, 2018

Performance improvement during run of instrumented code #172

Performance improvement during run of instrumented code #172

Conversation

pjanotti commented Aug 13, 2018

pjanotti commented Aug 13, 2018

tonerdo Aug 13, 2018 • edited

Choose a reason for hiding this comment

tonerdo Aug 13, 2018 • edited

Choose a reason for hiding this comment

pjanotti Aug 14, 2018

Choose a reason for hiding this comment

pjanotti Aug 14, 2018

Choose a reason for hiding this comment

tonerdo Aug 15, 2018 • edited

Choose a reason for hiding this comment

pjanotti Aug 15, 2018 • edited

Choose a reason for hiding this comment

tonerdo Aug 15, 2018

Choose a reason for hiding this comment

pjanotti Aug 15, 2018

Choose a reason for hiding this comment

pjanotti Aug 21, 2018 • edited

Choose a reason for hiding this comment

tonerdo Aug 22, 2018

Choose a reason for hiding this comment

pjanotti Aug 22, 2018

Choose a reason for hiding this comment

tarekgh commented Aug 13, 2018 • edited

tarekgh commented Aug 22, 2018

pjanotti commented Aug 22, 2018

tarekgh commented Aug 22, 2018

pjanotti commented Aug 22, 2018

tonerdo Aug 13, 2018 •

edited

tonerdo Aug 13, 2018 •

edited

tonerdo Aug 15, 2018 •

edited

pjanotti Aug 15, 2018 •

edited

pjanotti Aug 21, 2018 •

edited

tarekgh commented Aug 13, 2018 •

edited