Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GB18030] Fix "Unable to translate Unicode character" #10063

Merged
merged 5 commits into from Apr 25, 2024
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
12 changes: 12 additions & 0 deletions src/Build.UnitTests/Evaluation/Expander_Tests.cs
Expand Up @@ -4380,6 +4380,18 @@ public void PropertyFunctionCheckFeatureAvailability(string featureName, string
Assert.Equal(availability, result);
}

[Theory]
[InlineData("\u3407\ud840\udc60\ud86a\ude30\ud86e\udc0a\ud86e\udda0\ud879\udeae\u2fd5\u0023", 0, 3, "\u3407\ud840\udc60\ud86a\ude30")]
[InlineData("\u3407\ud840\udc60\ud86a\ude30\ud86e\udc0a\ud86e\udda0\ud879\udeae\u2fd5\u0023", 2, 5, "\ud86a\ude30\ud86e\udc0a\ud86e\udda0\ud879\udeae\u2fd5")]
public void SubstringByTextElements(string featureName, int start, int length, string expected)
{
var expander = new Expander<ProjectPropertyInstance, ProjectItemInstance>(new PropertyDictionary<ProjectPropertyInstance>(), FileSystems.Default);

var result = expander.ExpandIntoStringLeaveEscaped($"$([MSBuild]::SubstringByTextElements({featureName}, {start}, {length}))", ExpanderOptions.ExpandProperties, MockElementLocation.Instance);

Assert.Equal(expected, result);
}

[Fact]
public void PropertyFunctionIntrinsicFunctionGetCurrentToolsDirectory()
{
Expand Down
7 changes: 7 additions & 0 deletions src/Build/Evaluation/IntrinsicFunctions.cs
Expand Up @@ -9,6 +9,7 @@
using System.Runtime.Versioning;
using System.Text;
using System.Text.RegularExpressions;
using System.Globalization;

using Microsoft.Build.Framework;
using Microsoft.Build.Internal;
Expand Down Expand Up @@ -627,6 +628,12 @@ internal static bool AreFeaturesEnabled(Version wave)
return ChangeWaves.AreFeaturesEnabled(wave);
}

internal static string SubstringByTextElements(string input, int start, int length)
{
StringInfo stringInfo = new StringInfo(input);
return stringInfo.SubstringByTextElements(start, length);
}
ilonatommy marked this conversation as resolved.
Show resolved Hide resolved

internal static string CheckFeatureAvailability(string featureName)
{
return Features.CheckFeatureAvailability(featureName).ToString();
Expand Down
4 changes: 2 additions & 2 deletions src/Tasks/Microsoft.Common.CurrentVersion.targets
Expand Up @@ -388,8 +388,8 @@ Copyright (C) Microsoft Corporation. All rights reserved.

<PropertyGroup Condition="'$(MSBuildCopyMarkerName)' == ''">
<MSBuildCopyMarkerName>$(MSBuildProjectFile)</MSBuildCopyMarkerName>
<!-- For a long MSBuildProjectFile let's shorten this to 17 chars - using the first 8 chars of the filename and a filename hash. -->
<MSBuildCopyMarkerName Condition="'$(MSBuildCopyMarkerName.Length)' &gt; '17'">$(MSBuildProjectFile.Substring(0,8)).$([MSBuild]::StableStringHash($(MSBuildProjectFile)).ToString("X8"))</MSBuildCopyMarkerName>
<!-- For a long MSBuildProjectFile let's shorten this to 17 chars - using the first 8 codepoints of the filename and a filename hash. -->
<MSBuildCopyMarkerName Condition="'$(MSBuildCopyMarkerName.Length)' &gt; '17'">$([MSBuild]::SubstringByTextElements($(MSBuildProjectFile), 0, 8)).$([MSBuild]::StableStringHash($(MSBuildProjectFile)).ToString("X8"))</MSBuildCopyMarkerName>
Comment on lines -391 to +392

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are now text elements rather than codepoints. Which has consequences:

  • $(MSBuildCopyMarkerName.Length) can still be greater than 17 after this shortening.
  • If $(MSBuildProjectFile) is 👨‍👨‍👦‍👦👨‍👨‍👦‍👦.proj, then StringInfo.SubstringByTextElements(0, 8) will throw ArgumentOutOfRangeException. That string has 27 UTF-16 code units but only 7 text elements.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're going to have an [MSBuild]::SubstringByTextElements intrinsic function, then either it should check for out-of-range arguments itself, or there should be an [MSBuild]::CountTextElements function as well. Otherwise it's too difficult to use reliably in projects.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I think what this really needs is a property function that returns all the text elements that entirely fit within the specified substring, with arguments given in UTF-16 code units. So that the resulting string never has more UTF-16 code units than requested. Because those are what matters for file name lengths in NTFS.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are excellent points. We may get away with this functionality to be just a best-effort "make the file name reasonably long" but we definitely must fix the possible ArgumentOutOfRangeException and it's worth thinking about a more useful property function too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider moving the entire operation to a new, dedicated property function so we could just do

<MSBuildCopyMarkerName>$([MSBuild]::ShortUniqueProjectName())</MSBuildCopyMarkerName>

or similar?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still keep it general for other strings as well - there seems to be other usages in org on path args that might need revision: https://github.com/search?q=repo%3Adotnet%2Fsdk+path%3A**%2F*.targets+substring&type=code

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After re-checking the list of places where we use substring, none of them require a PR.

<MSBuildCopyMarkerName>$(MSBuildCopyMarkerName).Up2Date</MSBuildCopyMarkerName>
</PropertyGroup>

Expand Down