Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize RAR's GetReferenceItems method #5929

Merged
merged 23 commits into from Feb 6, 2021
Merged
Show file tree
Hide file tree
Changes from 16 commits
Commits
Show all changes
23 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
12 changes: 12 additions & 0 deletions src/Shared/CopyOnWriteDictionary.cs
Expand Up @@ -227,6 +227,18 @@ public void Add(string key, V value)
_backing = _backing.SetItem(key, value);
}

/// <summary>
/// Adds several value to the dictionary.
/// </summary>
public void SetItems(IEnumerable<KeyValuePair<string, V>> items)
{
_backing = _backing.SetItems(items);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SetItems and Where will also need to be implemented on the .NET 3.5 version of ImmutableDictionary in C:\src\msbuild\src\MSBuildTaskHost\Immutable\ImmutableDictionary.cs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Why isn't Where implemented by LINQ as it is for most IEnumerables?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course! I stand corrected, only SetItems is missing.

}

public IEnumerable<KeyValuePair<string, V>> Where(Func<KeyValuePair<string, V>, bool> predicate)
{
return _backing.Where(predicate);
}
/// <summary>
/// Returns true if the dictionary contains the specified key.
/// </summary>
Expand Down
210 changes: 108 additions & 102 deletions src/Tasks/AssemblyDependency/ReferenceTable.cs
Expand Up @@ -2506,14 +2506,6 @@ internal void GetReferenceItems
out ITaskItem[] copyLocalFiles
)
{
primaryFiles = Array.Empty<ITaskItem>();
dependencyFiles = Array.Empty<ITaskItem>();
relatedFiles = Array.Empty<ITaskItem>();
satelliteFiles = Array.Empty<ITaskItem>();
serializationAssemblyFiles = Array.Empty<ITaskItem>();
scatterFiles = Array.Empty<ITaskItem>();
copyLocalFiles = Array.Empty<ITaskItem>();

var primaryItems = new List<ITaskItem>();
var dependencyItems = new List<ITaskItem>();
var relatedItems = new List<ITaskItem>();
Expand All @@ -2522,10 +2514,10 @@ internal void GetReferenceItems
var scatterItems = new List<ITaskItem>();
var copyLocalItems = new List<ITaskItem>();

foreach (AssemblyNameExtension assemblyName in References.Keys)
foreach (KeyValuePair<AssemblyNameExtension, Reference> kvp in References)
{
string fusionName = assemblyName.FullName;
Reference reference = GetReference(assemblyName);
AssemblyNameExtension assemblyName = kvp.Key;
Reference reference = kvp.Value;

// Conflict victims and badimages are filtered out.
if (!reference.IsBadImage)
Expand Down Expand Up @@ -2554,7 +2546,7 @@ internal void GetReferenceItems

if (reference.IsResolved)
{
ITaskItem referenceItem = SetItemMetadata(relatedItems, satelliteItems, serializationAssemblyItems, scatterItems, fusionName, reference, assemblyName);
ITaskItem referenceItem = SetItemMetadata(relatedItems, satelliteItems, serializationAssemblyItems, scatterItems, assemblyName.FullName, reference, assemblyName);

if (reference.IsPrimary)
{
Expand All @@ -2573,9 +2565,7 @@ internal void GetReferenceItems
}
}

primaryFiles = new ITaskItem[primaryItems.Count];
primaryItems.CopyTo(primaryFiles, 0);

primaryFiles = primaryItems.ToArray();
dependencyFiles = dependencyItems.ToArray();
relatedFiles = relatedItems.ToArray();
satelliteFiles = satelliteItems.ToArray();
Expand All @@ -2601,22 +2591,12 @@ internal void GetReferenceItems
private ITaskItem SetItemMetadata(List<ITaskItem> relatedItems, List<ITaskItem> satelliteItems, List<ITaskItem> serializationAssemblyItems, List<ITaskItem> scatterItems, string fusionName, Reference reference, AssemblyNameExtension assemblyName)
{
// Set up the main item.
ITaskItem referenceItem = new TaskItem();
TaskItem referenceItem = new TaskItem();
referenceItem.ItemSpec = reference.FullPath;
referenceItem.SetMetadata(ItemMetadataNames.resolvedFrom, reference.ResolvedSearchPath);

// Set the CopyLocal metadata.
if (reference.IsCopyLocal)
{
referenceItem.SetMetadata(ItemMetadataNames.copyLocal, "true");
}
else
{
referenceItem.SetMetadata(ItemMetadataNames.copyLocal, "false");
}

// Set the FusionName metadata.
referenceItem.SetMetadata(ItemMetadataNames.fusionName, fusionName);
referenceItem.SetMetadata(ItemMetadataNames.copyLocal, reference.IsCopyLocal ? "true" : "false");

// Set the Redist name metadata.
if (!String.IsNullOrEmpty(reference.RedistName))
Expand All @@ -2637,57 +2617,11 @@ private ITaskItem SetItemMetadata(List<ITaskItem> relatedItems, List<ITaskItem>
referenceItem.SetMetadata(ItemMetadataNames.imageRuntime, reference.ImageRuntime);
}

if (reference.IsWinMDFile)
// The redist root is "null" when there was no IsRedistRoot flag in the Redist XML
// (or there was no redist XML at all for this item).
if (reference.IsRedistRoot != null)
{
referenceItem.SetMetadata(ItemMetadataNames.winMDFile, "true");

// The ImplementationAssembly is only set if the implementation file exits on disk
if (reference.ImplementationAssembly != null)
{
if (VerifyArchitectureOfImplementationDll(reference.ImplementationAssembly, reference.FullPath))
{
if (string.IsNullOrEmpty(referenceItem.GetMetadata(ItemMetadataNames.winmdImplmentationFile)))
{
referenceItem.SetMetadata(ItemMetadataNames.winmdImplmentationFile, Path.GetFileName(reference.ImplementationAssembly));
}

// Add the implementation item as a related file
ITaskItem item = new TaskItem(reference.ImplementationAssembly);
// Clone metadata.
referenceItem.CopyMetadataTo(item);
// Related files don't have a fusion name.
item.SetMetadata(ItemMetadataNames.fusionName, "");
RemoveNonForwardableMetadata(item);

// Add the related item.
relatedItems.Add(item);
}
}

if (reference.IsManagedWinMDFile)
{
referenceItem.SetMetadata(ItemMetadataNames.winMDFileType, "Managed");
}
else
{
referenceItem.SetMetadata(ItemMetadataNames.winMDFileType, "Native");
}
}

// Set the IsRedistRoot metadata
if (reference.IsRedistRoot == true)
{
referenceItem.SetMetadata(ItemMetadataNames.isRedistRoot, "true");
}
else if (reference.IsRedistRoot == false)
{
referenceItem.SetMetadata(ItemMetadataNames.isRedistRoot, "false");
}
else
{
// This happens when the redist root is "null". This means there
// was no IsRedistRoot flag in the Redist XML (or there was no
// redist XML at all for this item).
referenceItem.SetMetadata(ItemMetadataNames.isRedistRoot, (bool)reference.IsRedistRoot ? "true" : "false");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cast stands out. Should it be

Suggested change
referenceItem.SetMetadata(ItemMetadataNames.isRedistRoot, (bool)reference.IsRedistRoot ? "true" : "false");
referenceItem.SetMetadata(ItemMetadataNames.isRedistRoot, reference.IsRedistRoot == true ? "true" : "false");

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both ways work. We did a null check just before this, so it's either true or false.

}

// If there was a primary source item, then forward metadata from it.
Expand Down Expand Up @@ -2716,14 +2650,14 @@ private ITaskItem SetItemMetadata(List<ITaskItem> relatedItems, List<ITaskItem>
}

// If the item originally did not have the implementation file metadata then we do not want to get it from the set of primary source items
// since the implementation file is something specific to the source item and not supposed to be propigated.
// since the implementation file is something specific to the source item and not supposed to be propagated.
if (!hasImplementationFile)
{
referenceItem.RemoveMetadata(ItemMetadataNames.winmdImplmentationFile);
}

// If the item originally did not have the ImageRuntime metadata then we do not want to get it from the set of primary source items
// since the ImageRuntime is something specific to the source item and not supposed to be propigated.
// since the ImageRuntime is something specific to the source item and not supposed to be propagated.
if (!hasImageRuntime)
{
referenceItem.RemoveMetadata(ItemMetadataNames.imageRuntime);
Expand All @@ -2737,68 +2671,63 @@ private ITaskItem SetItemMetadata(List<ITaskItem> relatedItems, List<ITaskItem>
}
}

if (reference.ReferenceVersion != null)
{
referenceItem.SetMetadata(ItemMetadataNames.version, reference.ReferenceVersion.ToString());
}
else
referenceItem.SetMetadata(ItemMetadataNames.version, reference.ReferenceVersion == null ? string.Empty : reference.ReferenceVersion.ToString());

// Unset fusionName so we don't have to unset it later.
referenceItem.RemoveMetadata(ItemMetadataNames.fusionName);

List<string> relatedFileExtensions = reference.GetRelatedFileExtensions();
List<string> satellites = reference.GetSatelliteFiles();
List<string> serializationAssemblyFiles = reference.GetSerializationAssemblyFiles();
string[] scatterFiles = reference.GetScatterFiles();
Comment on lines +2789 to +2792
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My gut feeling is that these will all be small enough that keeping them live all at once instead of in sequence won't be a memory problem. But do you have data on that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. How would I get such data? Build OrchardCore and...somehow check whether something is being paged out?

Dictionary<string, string> nonForwardableMetadata = null;
if (relatedFileExtensions.Count > 0 || satellites.Count > 0 || serializationAssemblyFiles.Count > 0 || scatterFiles.Length > 0)
{
referenceItem.SetMetadata(ItemMetadataNames.version, String.Empty);
// Unset non-forwardable metadata now so we don't have to do it for individual items.
nonForwardableMetadata = RemoveNonForwardableMetadata(referenceItem);
}

// Now clone all properties onto the related files.
foreach (string relatedFileExtension in reference.GetRelatedFileExtensions())
foreach (string relatedFileExtension in relatedFileExtensions)
{
ITaskItem item = new TaskItem(reference.FullPathWithoutExtension + relatedFileExtension);
// Clone metadata.
referenceItem.CopyMetadataTo(item);
// Related files don't have a fusion name.
item.SetMetadata(ItemMetadataNames.fusionName, "");
RemoveNonForwardableMetadata(item);

// Add the related item.
relatedItems.Add(item);
}

// Set up the satellites.
foreach (string satelliteFile in reference.GetSatelliteFiles())
foreach (string satelliteFile in satellites)
{
ITaskItem item = new TaskItem(Path.Combine(reference.DirectoryName, satelliteFile));
// Clone metadata.
referenceItem.CopyMetadataTo(item);
// Set the destination directory.
item.SetMetadata(ItemMetadataNames.destinationSubDirectory, FileUtilities.EnsureTrailingSlash(Path.GetDirectoryName(satelliteFile)));
// Satellite files don't have a fusion name.
item.SetMetadata(ItemMetadataNames.fusionName, "");
RemoveNonForwardableMetadata(item);

// Add the satellite item.
satelliteItems.Add(item);
}

// Set up the serialization assemblies
foreach (string serializationAssemblyFile in reference.GetSerializationAssemblyFiles())
foreach (string serializationAssemblyFile in serializationAssemblyFiles)
{
ITaskItem item = new TaskItem(Path.Combine(reference.DirectoryName, serializationAssemblyFile));
// Clone metadata.
referenceItem.CopyMetadataTo(item);
// serialization assemblies files don't have a fusion name.
item.SetMetadata(ItemMetadataNames.fusionName, "");
RemoveNonForwardableMetadata(item);

// Add the serialization assembly item.
serializationAssemblyItems.Add(item);
}

// Set up the scatter files.
foreach (string scatterFile in reference.GetScatterFiles())
foreach (string scatterFile in scatterFiles)
{
ITaskItem item = new TaskItem(Path.Combine(reference.DirectoryName, scatterFile));
// Clone metadata.
referenceItem.CopyMetadataTo(item);
// We don't have a fusion name for scatter files.
item.SetMetadata(ItemMetadataNames.fusionName, "");
RemoveNonForwardableMetadata(item);

// Add the satellite item.
scatterItems.Add(item);
Expand All @@ -2818,6 +2747,61 @@ private ITaskItem SetItemMetadata(List<ITaskItem> relatedItems, List<ITaskItem>
}
}

if (reference.IsWinMDFile)
{
// The ImplementationAssembly is only set if the implementation file exits on disk
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Typo exits.

if (reference.ImplementationAssembly != null)
{
if (VerifyArchitectureOfImplementationDll(reference.ImplementationAssembly, reference.FullPath))
{
// Add the implementation item as a related file
ITaskItem item = new TaskItem(reference.ImplementationAssembly);
// Clone metadata.
referenceItem.CopyMetadataTo(item);

// Add the related item.
relatedItems.Add(item);

referenceItem.SetMetadata(ItemMetadataNames.winmdImplmentationFile, Path.GetFileName(reference.ImplementationAssembly));
// This may have been set previously (before it was removed so we could more efficiently set metadata on the various related files).
// This version should take priority, so we remove it from nonForwardableMetadata if it's there to prevent the correct value from
// being overwritten.
nonForwardableMetadata?.Remove(ItemMetadataNames.winmdImplmentationFile);
}
}

// This may have been set previously (before it was removed so we could more efficiently set metadata on the various related files).
// This version should take priority, so we remove it from nonForwardableMetadata if it's there to prevent the correct value from
// being overwritten.
nonForwardableMetadata?.Remove(ItemMetadataNames.winMDFileType);
if (reference.IsManagedWinMDFile)
{
referenceItem.SetMetadata(ItemMetadataNames.winMDFileType, "Managed");
}
else
{
referenceItem.SetMetadata(ItemMetadataNames.winMDFileType, "Native");
}

// This may have been set previously (before it was removed so we could more efficiently set metadata on the various related files).
// This version should take priority, so we remove it from nonForwardableMetadata if it's there to prevent the correct value from
// being overwritten.
nonForwardableMetadata?.Remove(ItemMetadataNames.winMDFile);
referenceItem.SetMetadata(ItemMetadataNames.winMDFile, "true");
}

// Set the FusionName late, so we don't copy it to the derived items, but it's still available on referenceItem.
referenceItem.SetMetadata(ItemMetadataNames.fusionName, fusionName);

// nonForwardableMetadata should be null here if relatedFileExtensions, satellites, serializationAssemblyFiles, and scatterFiles were all empty.
if (nonForwardableMetadata != null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment please

{
foreach (KeyValuePair<string, string> kvp in nonForwardableMetadata)
{
referenceItem.SetMetadata(kvp.Key, kvp.Value);
}
}

return referenceItem;
}

Expand Down Expand Up @@ -2951,15 +2935,37 @@ internal static UInt16 ReadMachineTypeFromPEHeader(string dllPath)
/// <summary>
/// Some metadata should not be forwarded between the parent and child items.
/// </summary>
private static void RemoveNonForwardableMetadata(ITaskItem item)
private static Dictionary<string, string> RemoveNonForwardableMetadata(ITaskItem item)
ladipro marked this conversation as resolved.
Show resolved Hide resolved
{
Dictionary<string, string> removedMetadata = new Dictionary<string, string>();
string meta = item.GetMetadata(ItemMetadataNames.winmdImplmentationFile);
if (!String.IsNullOrEmpty(meta))
{
removedMetadata.Add(ItemMetadataNames.winmdImplmentationFile, meta);
}
item.RemoveMetadata(ItemMetadataNames.winmdImplmentationFile);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Refactor these statements into a local function to be called 4 times?

meta = item.GetMetadata(ItemMetadataNames.imageRuntime);
if (!String.IsNullOrEmpty(meta))
{
removedMetadata.Add(ItemMetadataNames.imageRuntime, meta);
}
item.RemoveMetadata(ItemMetadataNames.imageRuntime);
meta = item.GetMetadata(ItemMetadataNames.winMDFile);
if (!String.IsNullOrEmpty(meta))
{
removedMetadata.Add(ItemMetadataNames.winMDFile, meta);
}
item.RemoveMetadata(ItemMetadataNames.winMDFile);
if (!Traits.Instance.EscapeHatches.TargetPathForRelatedFiles)
{
meta = item.GetMetadata(ItemMetadataNames.targetPath);
if (!String.IsNullOrEmpty(meta))
{
removedMetadata.Add(ItemMetadataNames.targetPath, meta);
}
item.RemoveMetadata(ItemMetadataNames.targetPath);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this one has to be next to the add to the new list, can you move the others to be next to their equivalents too?

}
return removedMetadata;
}

/// <summary>
Expand Down
23 changes: 19 additions & 4 deletions src/Utilities/TaskItem.cs
Expand Up @@ -30,7 +30,7 @@ public sealed class TaskItem :
#if FEATURE_APPDOMAIN
MarshalByRefObject,
#endif
ITaskItem, ITaskItem2
ITaskItem2
{
#region Member Data

Expand Down Expand Up @@ -297,10 +297,25 @@ public void CopyMetadataTo(ITaskItem destinationItem)

if (_metadata != null)
{
// Avoid a copy if we can
if (destinationItem is TaskItem destinationAsTaskItem && destinationAsTaskItem.Metadata == null)
if (destinationItem is TaskItem destinationAsTaskItem)
{
destinationAsTaskItem.Metadata = _metadata.Clone(); // Copy on write!
CopyOnWriteDictionary<string> copiedMetadata;
// Avoid a copy if we can, and if not, minimize the number of items we have to set.
if (destinationAsTaskItem.Metadata == null)
{
copiedMetadata = _metadata.Clone(); // Copy on write!
}
else if (destinationAsTaskItem.Metadata.Count < _metadata.Count)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How likely are we to hit the case where destinationAsTaskItem.Metadata.Count < _metadata.Count and both metadata count are very large? I don't think I've seen items with a standout number of metadata, but MSBuild has surprised me before.

Do we happen to have any idea of a "realistic average" number of metadata per TaskItem that comes through here? Something more detailed than just assuming it's the number of well-known metadata?

{
copiedMetadata = _metadata.Clone(); // Copy on write!
copiedMetadata.SetItems(destinationAsTaskItem.Metadata);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is doing the same thing as the else right below. I would expect the !IsNullOrEmpty check here as well.

The else below:

  1. Take all of destination item's metadata.
  2. Overwrite with this item's metadata that are missing or empty on the destination item.

This block:

  1. Take all of this item's metadata.
  2. Overwrite with all of destination item's metadata.

For a missing destination metadatum this behaves the same, but for an empty one the SetItem call in this block will erroneously overwrite it with the empty value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right. I'm wondering if we're going at this the wrong way, though, since that would mean we'd need another Where statement to filter out non-empty metadata, which would make this optimization noticeably less of an optimization. Empty metadata should be equivalent (in all cases) to missing metadata, I think. If that's true, it would probably be more efficient to maintain an invariant that TaskItems don't have empty metadata. The COW dictionary storing the metadata is private, so that isn't a problem, but it's theoretically a breaking change unless we keep a separate list for "metadata added as empty" for the unfortunately public MetadataNames (and MetadataCount). What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't go as far as to introduce a separate list for empty metadata unless it proves impactful. Can you measure how much this optimization is gaining us with and without handling empty?

Copy link
Member Author

@Forgind Forgind Dec 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I measured 3 scenarios (building from a clean state (I appended /restore to the command line for this one), building with caches but not processes to connect to, and building with both) for 3 versions of TaskItem (as it is in master, with this change plus a Where statement on the else if branch, and with all the empty metadata stored in a HashSet once each for building OrchardCore. Note that I kept the other optimizations in this PR for all three versions of TaskItem. I used my computer instead of a VM, mostly because my VM seems to be slow...but I also tried to avoid doing anything too intense in the background. Note that I forgot to use -m for the first several tests, so I continued to not use it for the last few. If you want new results with having that enabled for all of them, I can do that.

I have full results saved away if you want to look. My first takeaway was that caches really matter. It took more than twice as long to build in every case if I deleted them and restored them. In contrast, presumably because OrchardCore is big, and I neglected to enable -m, having processes already running only saved about 170ms on average.

As far as differences between the versions of TaskItem, here are some times for the entire build in ms:

MSBuild execution times Clean build Process not running Process running
Master 108475 49657 49704
Where statements 108253 48140 47949
Empty HashSet 105464 49105 48708

Since it occurred to me that this has impacts outside RAR, I don't think just looking at RAR time makes sense. Are there other data you'd like to see? I'm not sure why the Where statements was so much worse than keeping an empty list for the clean state build—it's possible that was an outlier, since I only ran each of these once.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. I was actually expecting you to measure this particular aspect with a microbenchmark. Basically, see if the SetItems call is helping (even with the Where) given the expected metadata count. I second Ben's question below about what the realistic number of metadata is. Maybe you could add a logging statement here while building OrchardCore and see where the breaking point is relative to the distribution of metadata count seen during the build. (Assuming there is a breaking point and the new code is actually a bit slower if the number of metadata is small.)

Measuring the entire build end-to-end would have to be repeated several times to get trustworthy results. As you wrote, it could have been an outlier and, honestly, this optimization is likely going to be well within noise when running the entire thing.

Copy link
Member Author

@Forgind Forgind Jan 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Merging with @benvillalobos's comment) I'm currently a bit confused by the results I got when I tried to test this. I put in a simple check of how many custom metadata each taskitem has when its metadata is copied over, as well as the amount of metadata on the destination. The answer seemed to be that the destination never had any custom metadata for either MSBuild.sln or OrchardCore.sln, in which case we would never hit the case I optimized, and the time would be almost identical between them unless the other changes I made (which could have a minor effect, but I didn't expect to be really relevant) had much more of a positive effect than I'd expected. The same test suggested that the source TaskItems had 20-50 pieces of metadata, of which 15 were not custom metadata, so these wouldn't be huge numbers, but this operation also got called a lot, which might mean that a small optimization could have somewhat noticeable effects. What do you make of that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a good argument against pursuing this optimization further. If you really wanted to be sure, you could run the build under a profiler to see how much time we're spending in this function - which would be the upper limit of what any micro-optimization here could give us.

}
else
{
copiedMetadata = destinationAsTaskItem.Metadata.Clone();
copiedMetadata.SetItems(Metadata.Where(entry => !destinationAsTaskItem.Metadata.ContainsKey(entry.Key)));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was unsure between using Where or setting them one-by-one in this case. I did some benchmarking, and this is what I got:
image
Note that for each test case, I added NumStrings / 10 items to an ImmutableDictionary with NumStrings items in it previously.

You can see that using Where had fewer allocations for all three sizes. (Base is creating but not combining the dictionaries, so Where actually allocated very little.) It lost on time, though, unless the dataset was large, and I don't have data as to whether that's expected sometimes, always, or never. Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you instead do copiedMetadata = Metadata.Clone().SetItems(destinationAsTaskItem.Metadata)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and honestly, there are probably a lot of cases in which that would be better than what I currently have. What I'm trying to optimize for is adding metadata from a small dictionary to a large dictionary, and in that case, it would be better to add a few items individually rather than using SetItems to set a lot of items. I can run a test for that, though, and see if it's optimized nicely, in which case I'll switch to that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like doing it that way was 50-75% slower if the adding dictionary had 10x as many items. Next, trying it with a smaller ratio...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried comparing adding the larger dictionary to the smaller dictionary for ratios of 1, 2, 5, and 8, and it looks like setting with Where was faster for all but 1x size. I currently think that's the best option.
image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original non-optimized code down in the else branch uses string.IsNullOrEmpty(value) to decide if the metadatum should be overwritten. It doesn't seem to match your new logic here - you only check if the key exists in the metadata dictionary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks!

}
destinationAsTaskItem.Metadata = copiedMetadata;
}
else
{
Expand Down