Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance improvements #134

Merged
merged 2 commits into from Jul 5, 2018
Merged

Conversation

petli
Copy link
Collaborator

@petli petli commented Jun 26, 2018

Fixes #120 see comments in that issue on the performance improvements here.

One possibly breaking change here is that it no longer periodically writes files (as there's no real need for it) but that also means that if the process exit callbacks aren't called for some reason, no results at all will be available. A periodic flush could be added based on number of events, but since such a partial result is not very useful, I think it would be better to look for ways to ensure that the results always are written correctly.

To test the performance change, just checkout commit b3c080c that just contains the new performance test. The changes are in the second commit.

This is done by aggregating hits in the CoverateTracker directly, rather than just recording events in files, and by avoiding the former quadratic complexity when translating these event files into hits on lines and branches.
@latop2604
Copy link
Contributor

partial result is not very useful

It can be useful to see in the IDE where the test failed. (dotnet watch + Coverage Gutter)
Visual Studio live unit test as this feature and it helps a lot

@tonerdo
Copy link
Collaborator

tonerdo commented Jun 26, 2018

Oh wow! This improvement is much welcome. Will take a deeper look tomorrow

@tonerdo
Copy link
Collaborator

tonerdo commented Jun 27, 2018

Also, if you've got time can you just give me a brief overview of the major bottle necks you found?

int end = int.Parse(info[3]);
for (int j = start; j <= end; j++)
{
var line = document.Lines.First(l => l.Number == j);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line was the core of the bottleneck (together with the similar line doing document.Branches.First above. Having the lines in a list means that it finding the correct line for an event will have to search through n/2 items of the n items in the list on average. The number of events in a test can be described as kn, where k is the number of times the average line is executed during the test.

Putting that together means that processing all events in the test will take (k/2) * n^2 operations. In complexity theory the factor is not interesting since when n gets big, the value of k is irrelevant, so the time complexity of this operation is expressed as O(n^2) in the big-O notation, i.e. quadratic time complexity.

@petli
Copy link
Collaborator Author

petli commented Jun 27, 2018

@tonerdo I've added a comment to the diff highlighting the lines causing the main bottleneck.

The PR addresses that by using a dictionary to tally the event counts. According to
https://msdn.microsoft.com/en-us/library/xfhwa508.aspx#Remarks the lookup time complexity of a dictionary lookup is close to O(1), i.e. constant time, but adding an element to the dictionary is worst-case O(n) if the dictionary must be extended. An minor optimisation here might be to initilialize the dictionary to be large enough to fit all events, but I reckon there's already a lot of optimisation going on under the hood in Dictionary to make that a not very huge win.

The second part of this PR moves this tallying of events from the post-processing step into CoverageTracker directly. This doesn't change anything in terms of time complexity, but saves time since a lot of events doesn't have to be written to disk and later processed.

Further optimisation possibilities worth investigating could be:

  • Instead of using strings to identify the event, increment a counter on each instrumentation and use that number to identify the event. This saves on hashing and string comparison, reduces memory consumption and should speed up test execution another bit. The post-processing step would translate these event IDs back into the lines or branches hit. If the number of events can be made known to CoverageTracker, it could also allocate an int[] of sufficient size and avoid the dictionary lookups completely.

  • Continuing on that and tying into the comment by @latop2604, instead of using a .NET array it might be possible to use a binary memory-mapped file. This means that hit counts are updated directly on disk without having to write them to a file at the end of it, and would ensure that regardless of how a test ends the hit counts are stored in a file. It would even be possible to produce reports continuously while the test runs, if that would be interesting.

</ItemGroup>

<ItemGroup>
<ProjectReference Include="..\coverlet.testsubject\coverlet.testsubject.csproj" />
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can move BigClass.cs directly into this project.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might have misunderstood something then, as I thought only the dependencies of the unit tests got instrumented, not the unit test DLL itself. Is that not the case? The purpose of having this test subject DLL was to get BigClass instrumented when running the performance test.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake @petli. Thanks for the clarification

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class do need a comment block explaining why it is where it is, I'll fix that.

{
fileEvents.Add(evt, 1);
}
else if (count < int.MaxValue)
Copy link
Collaborator

@tonerdo tonerdo Jun 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if count is less than int.MaxValue? Shouldn't we try a long?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I grabbed this code (but I see I changed the condition slightly) from the former site of the hit counting:
https://github.com/tonerdo/coverlet/pull/134/files/1f4da61d7aaa444ff1857f65ddddaec6f01e1f95#diff-9eb1afa9bd8141e48d456c120df924caL189

I doubt that anyone who gets 2G hits on a line cares much if the counter stops then. :)

@tonerdo tonerdo merged commit 65fa6af into coverlet-coverage:master Jul 5, 2018
@petli petli deleted the performance branch July 5, 2018 11:38
NorekZ pushed a commit to NorekZ/coverlet that referenced this pull request Nov 8, 2018
mburumaxwell pushed a commit to faluapp/falu-dotnet that referenced this pull request Jun 12, 2021
Bumps [coverlet.collector](https://github.com/coverlet-coverage/coverlet) from 1.3.0 to 3.0.1.

#Release notes

*Sourced from [coverlet.collector's releases](https://github.com/coverlet-coverage/coverlet/releases).*

> ## v3.0.0
> * [#131](coverlet-coverage/coverlet#131) makes a slight change to the Coverlet JSON format
> * 807f7b1bd5bea8158ffff343d5511cd16e0da9a0 uses a separate `coverlet.tracker` assembly to hold tracking code
> * [#128](coverlet-coverage/coverlet#128) adds support for assemblies with `.exe` extension
> * a1f18b4156374f3398d704e898ec58c7c6c64bf8 improves identifying compiler generated types
> * [#134](coverlet-coverage/coverlet#134) adds considerable coverage tracking performance improvements
>
> ## v2.0.1
> * [#102](coverlet-coverage/coverlet#102) fixes issues with NUNIT3 Test adapter ([#101](coverlet-coverage/coverlet#101))
> * [#104](coverlet-coverage/coverlet#104) shows overall averages as part of final console output
> * [#112](coverlet-coverage/coverlet#112) adds support for standard `ExcludeFromCodeCoverage` attribute to specify types and methods to exclude from code coverage. Deprecates `ExcludeFromCoverage` attribute
> * coverlet-coverage/coverlet@7f190e4 prevents Opencover and Cobertura output generated at the same time from overwriting each other ([#111](coverlet-coverage/coverlet#111))
> * [#116](coverlet-coverage/coverlet#116) strongly signs the Coverlet assembly and aims to fix [#40](coverlet-coverage/coverlet#40)
>
> ## v2.0.0
> * [#78](coverlet-coverage/coverlet#78) adds support for generating multiple report formats in a single run
> * [#73](coverlet-coverage/coverlet#73) improves branch coverage support and output formats*
> * coverlet-coverage/coverlet@d2effb3 shows method coverage in summary output
> * [#88](coverlet-coverage/coverlet#88) improves disk usage by using gzip compression
> * [#93](coverlet-coverage/coverlet#93) adds `ThresholdType` property that allows you to specify the coverage type to apply the `Threshold` property to
> * coverlet-coverage/coverlet@ebedd70 renames `Exclude` property to `ExcludeByFile`*
> * coverlet-coverage/coverlet@9ed0864 supports using filter expressions to exclude assemblies, namespaces or types. Uses the `Exclude` property*
> * [#99](coverlet-coverage/coverlet#99) adds improvements to evaluation of filter expressions
>
> `*` - Backwards incompatible change

#Commits

- See f...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants