Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for embedding arbitrary files into binlog #6339

Merged
merged 3 commits into from Apr 21, 2021

Conversation

KirillOsenkov
Copy link
Member

It would be very useful to embed more than just the project files into the binlog. For example, project.assets.json is a common request: #3529. global.json or nuget.config would be other examples. Some users may want to embed all *.cs files as well.

Add a special EmbedInBinlog item type recognized by BuildEventArgsWriter. If after evaluation the items contain EmbedInBinlog, all such items will be logged if the evaluated include points to a full file path that exists on disk. A given file can be mentioned more than once, but will only be embedded once. If a file doesn't exist or is empty, it will not be embedded. If the item is added from a target (after evaluation is done), it will not be embedded.

Checking for EmbedInBinlog is done in BuildEventArgsWriter to reuse the complicated and performance-optimized logic of iterating over all items.

This PR is a prerequisite for #3529

It would be very useful to embed more than just the project files into the binlog. For example, project.assets.json is a common request: #3529. global.json or nuget.config would be other examples. Some users may want to embed all *.cs files as well.

Add a special EmbedInBinlog item type recognized by BuildEventArgsWriter. If after evaluation the items contain EmbedInBinlog, all such items will be logged if the evaluated include points to a full file path that exists on disk. A given file can be mentioned more than once, but will only be embedded once. If a file doesn't exist or is empty, it will not be embedded. If the item is added from a target (after evaluation is done), it will not be embedded.

Checking for EmbedInBinlog is done in BuildEventArgsWriter to reuse the complicated and performance-optimized logic of iterating over all items.
KirillOsenkov added a commit to KirillOsenkov/sdk that referenced this pull request Apr 11, 2021
We're thinking of adding a way to embed arbitrary files in the binlog.
See dotnet/msbuild#6339

project.assets.json is commonly requested to be embedded. This would fix dotnet/msbuild#3529

Binlog size does increase from 3.5 MB -> 5 MB, so we'll need to think perhaps about making it off by default.
@KirillOsenkov
Copy link
Member Author

I've tested this with embedding project.assets.json:
dotnet/sdk#16840

Surprisingly, there's no perceptible slowdown of builds:

image

I'm guessing all the extra json file compressing cost is hidden on a background thread:

// enqueue the task to add a file and return quickly
// to avoid holding up the current thread
_currentTask = _currentTask.ContinueWith(t =>

Binlog size does increase however from 3.5 MB to 5 MB (the size of the files blob increases from 400 K to 1.9 MB). I suspect that for larger builds the size increase will grow slower than linearly:

  1. number of project.assets.json files grows with the number of projects
  2. the contents of all files are similar and probably compress better together

We'll need to think about whether to have it off by default.

@KirillOsenkov
Copy link
Member Author

KirillOsenkov commented Apr 11, 2021

I did more testing with project.assets.json on Roslyn.sln incremental and the build times are:

Roslyn no embed Roslyn embed
39.882 40.026
41.874 40.937
43.024 42.573
42.261 40.806
41.49 41.73
Average: 41.706 s Average: 41.214 s

Binlog size increased from 15.8 MB -> 21.1 MB. The inner files archive increased from 1.3 MB -> 6.7 MB.
A bit sad to lose a few MBs in size after all these savings, maybe project.assets.json should be off by default.

Note that this PR by itself has no adverse effects on either build speed or binlog size. The effects only show with dotnet/sdk#16840

Copy link
Member

@Forgind Forgind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢

private const string EmbedInBinlogItemType = "EmbedInBinlog";

private void CheckForFilesToEmbed(string itemType, object itemList)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to check the EmbedFile event for null here and bail right away?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory, but in our scenario we know it’s not null, so probably doesn’t matter either way. I was just using the ?.Invoke pattern out of habit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for scrutinizing and hope I'm not misreading the code. EmbedFile is subscribed to always, yes, but the handler EventArgsWriter_EmbedFile is a conditional no-op. And that condition looks real, i.e. projectImportsCollector does not always get created. Are you sure we can't hoist the check and avoid running the loop here in some scenarios?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see what you're saying. Sounds good.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How's this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thank you!

private const string EmbedInBinlogItemType = "EmbedInBinlog";

private void CheckForFilesToEmbed(string itemType, object itemList)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thank you!

@Forgind Forgind added the merge-when-branch-open PRs that are approved, except that there is a problem that means we are not merging stuff right now. label Apr 16, 2021
@benvillalobos benvillalobos merged commit 23f5b88 into main Apr 21, 2021
@benvillalobos benvillalobos deleted the dev/kirillo/embedInBinlog branch April 21, 2021 16:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge-when-branch-open PRs that are approved, except that there is a problem that means we are not merging stuff right now.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants