Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partial performance improvement for #114 #134

Merged
merged 2 commits into from Jun 19, 2017
Merged

Partial performance improvement for #114 #134

merged 2 commits into from Jun 19, 2017

Conversation

robmen
Copy link
Contributor

@robmen robmen commented Jun 11, 2017

I was starting to work towards #94 but hit some performance issues and found open issue #114. I took a couple hours to understand the code then took a shot at minimizing the number of git calls (I think I've reduced the 5 or 6 calls to 1 or 2).

I didn't delete as much code from GitExtensions.cs as I'd like to since there were a lot of unit tests using the code. If this PR looks good, I can also reduce the code in GitExtensions.cs (and GitExtensionsTests.cs) to clean up the bit of duplication between VersionOracle.cs and GitExtensions.cs remaining.

Note, there are a couple code clean up commits in the PR. The performance improvement commit is in the middle. I tried to be pretty detailed in the commit message on that commit.

@AArnott
Copy link
Collaborator

AArnott commented Jun 11, 2017

Thanks, @robmen! I look forward to reviewing this... probably tomorrow.
Did you happen to measure the perf impact of the change? Was the perf on your repo bad before? If so, do you attribute it to the number of projects building or the number of commits that are counted when calculating the git version height?

@robmen
Copy link
Contributor Author

robmen commented Jun 11, 2017

I didn't measure exactly (it was late) but each "call" to git to find the version file would end up walking the git tree to find the matching version. It isn't just calculating the height. There was lots of duplicate work when I was stepping through debugger. A fake repository I created with a hundred commits was brutal to walk all the way down and back up several times.

I found it because the VersionOptions were recreated and compared with the working tree in many places. I was experimenting with a feature to override the BuildNumberOffset passed in via MSBuild property (conversation for a different pull request). I stopped counting after finding 5 different instances VersionOptions (all read from the same version.json) where I had to overwrite the offset to make it "stick". That's how I ended up walking all the way down the tree in the debugger and why I created a repo with a bunch of commits to see if I was seeing things correctly and walking down even deeper loops in the debugger and then finding performance issue #114 on GitHub and finally saying to myself, "Oh, this is doing a lot more work than necessary" (and this is a crazy run on sentence).

I believe this change also has the possibility to reduce a lot of code. I left it all in because there were so many unit tests using the code. I didn't know if you wanted the code factored that way (for example, if the NB.GV assembly was being used for other things). Basically everything is public so hard to know the desired surface area. 😄

@AArnott
Copy link
Collaborator

AArnott commented Jun 12, 2017

Basically everything is public so hard to know the desired surface area.

I had this idea that the library that the MSBuild task calls into would be itself useful. So ya, a lot is public.
But what really needs to be public is whatever the MSBuild task, the two ps1 scripts, and the NPM module needs to function.

@robmen
Copy link
Contributor Author

robmen commented Jun 12, 2017

That's exactly what it looked like. And the "API" provided by that NB.GV assembly is pretty friendly for one-off requests but if you need to do several operations (like the MSBuild task does) you end up with lots of repeat (and potentially costly) git work. As you'll see my change reduces calls to the NB.GV assembly.

If you don't want the public surface going forward, I can reduce the overall code count quite a bit. But that's up to you. So far, it sounds like I guessed correctly. 😉

@AArnott
Copy link
Collaborator

AArnott commented Jun 17, 2017

This PR has test failures that should be dealt with. One of them is #125 and can be ignored. But there is another test failure that I haven't seen before on this PR build.

Copy link
Collaborator

@AArnott AArnott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The biggest concern is the test failure. But I also wonder if we can shrink the size of the PR by reverting the method renames. And I'm curious about adequate test coverage after removal of these tests.

@@ -123,22 +123,6 @@ public void GetVersionHeight_VersionJsonHasParsingErrorsInHistory()
}

[Fact]
public void GetTruncatedCommitIdAsInteger_Roundtrip()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we losing coverage by deleting these tests? Or is it made up for elsewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No loss of coverage. These test were covering code that was deleted.

@@ -34,15 +32,14 @@ public static class VersionFile
/// <param name="commit">The commit to read the version file from.</param>
/// <param name="repoRelativeProjectDirectory">The directory to consider when searching for the version.txt file.</param>
/// <returns>The version information read from the file.</returns>
public static VersionOptions GetVersion(LibGit2Sharp.Commit commit, string repoRelativeProjectDirectory = null)
public static VersionOptions GetVersionFromCommit(LibGit2Sharp.Commit commit, string repoRelativeProjectDirectory = null)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this rename necessary? It doesn't look like it, and I personally prefer using overloads when they all do the same thing, just with different inputs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. The code was harder to understand when trying to tease out number of times the repository was being accessed. Now that I've straightened out the code, I'll try reverting these changes back to the overloads.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm putting the code back but in doing so, I remember why I had such a hard time with the overloads. The behavior based on the input parameters is very different. Getting the version information from a commit is very different behavior from getting a version by searching the directory tree. Thus, the different name for the methods was extremely helpful for me.

But I'm reverting that change now and several others to simplify the salient part of the fix. Sorry, I shouldn't have included all the extra change originally (but the changes helped me learn the code).

@robmen
Copy link
Contributor Author

robmen commented Jun 18, 2017

I do not understand why these failures are happening on the build machine but not locally. I'll make the changes above and try to reproduce the issue.

@robmen
Copy link
Contributor Author

robmen commented Jun 18, 2017

Actually, I can reproduce with all the latest changes all integrated. Investigating now.

The VersionOracle constructor was using several helper methods from
GitExtensions to calculate the VersionHeight and Version for the
oracle. Unfortunately, most of the helper methods recalculate the
VersionOptions by querying git and reserializing the version.json
file. This is particularly costly when the git history gets deep.

With this commit, the VersionOracle calculates the commit and working
tree VersionOptions each once and goes about its processing from there.
This should greatly improve performance on deep git history's.

Addresses part of #114

Note: caching VersionOracle across projects in a single MSBuild
invocation still makes a lot of sense.
@robmen
Copy link
Contributor Author

robmen commented Jun 18, 2017

Okay, I simplified this PR down to just the optimization (which is the most important part). It should be much easier to see changes now. Good call.

Also, I now understand how GitExtensions.cs has an API surface that is larger than what NB.GV needs and that explains a lot of the "dead code" I was tripping across.

Copy link
Collaborator

@AArnott AArnott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if changes are warranted or not. But I'm at least curious for your response to one comment.

return !EqualityComparer<VersionOptions>.Default.Equals(workingVersion, committedVersion);
}

return false; // a missing working version is allowed and not a change.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that if I delete my version.json file, I'll still get the same version and height calculation? Why is that? In the version of this method that is in GitExtensions.cs, we consider a missing version file in the working copy to be a change. We only return false from that method if it's a "bare" repo, which doesn't seem to be considered here. Was that intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. I was trying to match the logic you describe. But none of the tests covered the case and this ended up giving the right answer within the set of things that I understood. For example, I removed the bare repo check because it failed tests (but everything passed without it). However, I will quickly admit that I may have missed a use case that doesn't have a test. 😄

Let me play with the code with this new information and see if I get the code to match your explanation and pass the tests.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be missing a test for bare repos. I can't think of anyone that actually needs this and I can't remember why I supported it to begin with. So I'll accept this PR as-is, or whenever you're satisfied. Just let me know.

Copy link
Contributor Author

@robmen robmen Jun 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, at this point, you know better than I. This is working for all my test cases right now and what is in the unit tests. If an issue comes up in the future, a test could be created for the failure and everything fixed... then we'd know for sure.

Really your call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I had some time and just stared at the code a little longer. I think I understand the deleted work tree case better and I think the bare repo case devolves to it. I'll have one last change right now.

@AArnott
Copy link
Collaborator

AArnott commented Jun 18, 2017

Thanks, @robmen. This iteration looks much easier to assess. 🙂

@AArnott AArnott merged commit af75d73 into dotnet:master Jun 19, 2017
@AArnott
Copy link
Collaborator

AArnott commented Jun 19, 2017

Thanks again.

@robmen robmen deleted the 114-partialperf branch June 19, 2017 04:59
AArnott pushed a commit that referenced this pull request Sep 5, 2022
Bumps [coverlet.msbuild](https://github.com/coverlet-coverage/coverlet) from 3.1.0 to 3.1.1.
- [Release notes](https://github.com/coverlet-coverage/coverlet/releases)
- [Commits](https://github.com/coverlet-coverage/coverlet/commits)

---
updated-dependencies:
- dependency-name: coverlet.msbuild
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants