Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GB18030] Fix "Unable to translate Unicode character" #10063
[GB18030] Fix "Unable to translate Unicode character" #10063
Changes from all commits
d9ad357
e0fea3a
8fc3eed
259728c
a7fca5a
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are now text elements rather than codepoints. Which has consequences:
$(MSBuildCopyMarkerName.Length)
can still be greater than 17 after this shortening.$(MSBuildProjectFile)
is👨👨👦👦👨👨👦👦.proj
, then StringInfo.SubstringByTextElements(0, 8) will throw ArgumentOutOfRangeException. That string has 27 UTF-16 code units but only 7 text elements.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're going to have an [MSBuild]::SubstringByTextElements intrinsic function, then either it should check for out-of-range arguments itself, or there should be an [MSBuild]::CountTextElements function as well. Otherwise it's too difficult to use reliably in projects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I think what this really needs is a property function that returns all the text elements that entirely fit within the specified substring, with arguments given in UTF-16 code units. So that the resulting string never has more UTF-16 code units than requested. Because those are what matters for file name lengths in NTFS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are excellent points. We may get away with this functionality to be just a best-effort "make the file name reasonably long" but we definitely must fix the possible ArgumentOutOfRangeException and it's worth thinking about a more useful property function too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we consider moving the entire operation to a new, dedicated property function so we could just do
or similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd still keep it general for other strings as well - there seems to be other usages in org on path args that might need revision: https://github.com/search?q=repo%3Adotnet%2Fsdk+path%3A**%2F*.targets+substring&type=code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After re-checking the list of places where we use
substring
, none of them require a PR._NormalizedPublishDirHash = 13cbcd0c3203541f1c707c5ea0b45e9f20f4d3cf71f0e2b68e723cab7e9427c3
- no surrogates.Identity
starts with a 7-char string:wwwroot/
, so theSubstring(8)
will never cut a surrogate in half.