Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.NET 8 developers can verify more APIs for correct usage to speed up their development #78442

Closed
buyaa-n opened this issue Nov 16, 2022 · 16 comments
Assignees
Labels
area-Meta User Story A single user-facing feature. Can be grouped under an epic.
Milestone

Comments

@buyaa-n
Copy link
Member

buyaa-n commented Nov 16, 2022

With the Roslyn Analyzers we built in .NET 5, 6 and 7, we've found that developers benefit from analyzers that verify correct usage of .NET Libraries APIs. We plan to put more effort on adding valuable analyzers in .NET 8, we will continue working with the community to foster contributions into our collection of analyzers and code fixers.

Community contributors: If you'd like to work on one of the issues marked as "up for grabs", please add a comment directly on that issue asking to get it assigned to you. Please check the getting started guide for reference about how to start and what's need to be done.

Analyzers Planned for .NET 8

Approved analyzers

No Status Issue Estimate
1 ✔️ by @CollinAlpert Prefer Dictionary<K, V>.TryAddValue(key) over guarded Add(key) Medium
2 ✔️ by @CollinAlpert Prefer .Length/Count/IsEmpty over Any() Medium
3 ✔️ by @fowl2 Do not use OfType() with impossible types Small
4 ✔️ by @steveberdy Extract array of const to static readonly field Medium
5 in PR by @psxvoid Constructor parameters should match property names Medium
6 in PR by @NewellClark Prevent wrong usage of string Split Medium
7 in PR by @NewellClark Prefer static ReadOnlySpan properties over static readonly byte[] fields Medium
8 in PR by @NewellClark Do not pass Utf8JsonReader by value Medium
9 ✔️ @geeknoid Recommend use of concrete types to maximize devirtualization potential Medium
10 ✔️ @stephentoub Convert argument null checks to ArgumentNullException.ThrowIfNull Medium
11 ✔️ @Youssef1313 Using StartsWith instead of IndexOf == 0 Medium
12 @NewellClark Avoid implicitly allocating params arrays in loops Medium
13 ✔️ @carlossanlop Recommend string.ToLower{Invariant}().Contains/IndexOf/StartsWith("lowercasestring") be replaced by case-insensitive comparison Medium
14 up for grabs Usage of ToString in logging source generator method call sites Medium
15 ✔️ @carlossanlop string.ToLower() == otherString.ToLower() Small
16 ✔️ @mrahhal Recommend string.StartsWith('c') instead of string.StartsWith("c") Medium
17 up for grabs Replacing direct LoggerExtensions.Log* usage with source-generated logging methods Medium
18 up for grabs Prefer ValueTuple over Tuple Medium
19 ✔️ @mpidash Report inefficient use of sets Small
20 up for grabs Introduce an analyzer to recommend the more efficient string splitting features in .NET 8 Medium
21 partially implemented @stephentoub Recommend upgrading to CompositeFormat for enhanced performance Large
22 up for grabs Prefer Dictionary.Remove(key, out value) over Dictionary.this[key], followed by Dictionary.Remove(key) Medium
23 up for grabs Hide Thread.VolatileRead and Thread.VolatileWrite Medium
24 in PR by @mpidash Simplify calls to string.Substring and Memory/Span.Slice Small
25 @mpidash Detect the problem of using a System.Type overload instead of the generic overload Medium
26 up for grabs Treat calls to ROS op_Equality as warning Small
27 up for grabs Detect non-cancelable Task.Delay passed to Task.WhenAny Medium
28 @tannergooding Upgrade from platform specific to cross platform intrinsics Small
29 @tannergooding Suggest more optimal SIMD patterns where applicable Medium
30 up for grabs Adjust/simplify code for numeric IntPtr Medium
31 up for grabs Analyzer/fixer for converting Stream.Read calls to ReadAtLeast and ReadExactly Medium
32 up for grabs Exceptions thrown inside async methods should be wrapped by Task.FromException Medium
33 @CollinAlpert Make types declared in an executable internal Small
34 up for grabs Unawaited / returned tasks created with a disposable in using block for that disposable Large
35 @pedrodeandrade Async methods that don't take a CancellationToken Small
36 up for grabs Use IndexOfAnyValues instead of inlined or cached array Small
37 ✔️ @Jozkee Add an analyzer that warns on single-use JsonSerializerOptions instances

Other analyzer proposals that need to be prepared for API review

  1. Add [CallerMustBeUnsafe] attribute to denote APIs which should be called in an unsafe block #31354
  2. Warn about ArrayPool.Return without clearArray specified when T is not a value type or has references #71698
  3. Roslyn analyzer/fixer: Simplify invocations that receive a start/offset and a count/length #35981
  4. Analyzer proposal: EventSource log argument guarding #45215
  5. NonCopyable structs attribute and analyzer #50389
  6. Analyzer: Validate literal arguments to StringSyntaxAttribute parameters/members #64009
  7. Add analyzer for StringBuilder[int] indexer #64545
  8. [Analyzer]: Regex analyzers #68962
  9. Warnings for in/readonly ref #77625
  10. [Analyzer] Lift arrays of literals to static fields #78398

CC @stephentoub @jeffhandley @ericstj @carlossanlop

@buyaa-n buyaa-n added area-Meta User Story A single user-facing feature. Can be grouped under an epic. labels Nov 16, 2022
@buyaa-n buyaa-n added this to the 8.0.0 milestone Nov 16, 2022
@ghost
Copy link

ghost commented Nov 16, 2022

Tagging subscribers to this area: @dotnet/area-meta
See info in area-owners.md if you want to be subscribed.

Issue Details

With the Roslyn Analyzers we built in .NET 5, 6 and 7, we've found that developers benefit from analyzers that verify correct usage of .NET Libraries APIs. We plan to put more effort on adding valuable analyzers in .NET 8, we will continue working with the community to foster contributions into our collection of analyzers and code fixers.

Community contributors: If you'd like to work on one of the issues marked as "up for grabs", please add a comment directly on that issue asking to get it assigned to you. Please check the getting started guide for reference about how to start and what's need to be done.

Analyzers Planned for .NET 8

Approved analyzers

Status Issue Estimate
in PR by @CollinAlpert Prefer Dictionary<K, V>.TryAddValue(key) over guarded Add(key) Medium
up for grabs Hide Thread.VolatileRead and Thread.VolatileWrite Medium
in PR by @fowl2 Do not use OfType() with impossible types Small
in PR by @steveberdy Extract array of const to static readonly field Medium
in PR by @psxvoid Constructor parameters should match property names Medium
in PR by @NewellClark Prevent wrong usage of string Split Medium
up for grabs Prefer Dictionary.Remove(key, out value) over Dictionary.this[key], followed by Dictionary.Remove(key) Medium
@NewellClark Avoid implicitly allocating params arrays in loops Medium
in PR by @NewellClark Prefer static ReadOnlySpan properties over static readonly byte[] fields Medium
in PR by @NewellClark Do not pass Utf8JsonReader by value Medium
up for grabs Convert argument null checks to ArgumentNullException.ThrowIfNull Medium
up for grabs Simplify calls to string.Substring and Memory/Span.Slice Small
up for grabs Do not discard results of methods that shouldn't be ignored Large
up for grabs Detect the problem of using a System.Type overload instead of the generic overload Medium
up for grabs Treat calls to ROS op_Equality as warning Small
up for grabs Detect non-cancelable Task.Delay passed to Task.WhenAny Medium

Analyzer proposals planned for .NET 8 that ready API review

Other analyzer proposals planned for .NET 8 that needs to be prepared for API review

  1. Add [CallerMustBeUnsafe] attribute to denote APIs which should be called in an unsafe block #31354
  2. Analyzer proposal: EventSource log argument guarding #45215
  3. NonCopyable structs attribute and analyzer #50389
  4. Analyzer: Validate literal arguments to StringSyntaxAttribute parameters/members #64009
  5. Add analyzer for StringBuilder[int] indexer #64545
  6. Add an analyzer that warns on single-use JsonSerializerOptions instances #65396
  7. [Analyzer]: Regex analyzers #68962
  8. [Analyzer Proposal]: Analyzer/fixer for converting Stream.Read calls to ReadAtLeast and ReadExactly #69159
  9. [Analyzer/fixer suggestion]: Exceptions thrown inside async methods should be wrapped by Task.FromException #70905
  10. Warnings for in/readonly ref #77625

CC @stephentoub @jeffhandley @ericstj

Author: buyaa-n
Assignees: -
Labels:

area-Meta, User Story

Milestone: 8.0.0

@aradalvand
Copy link

aradalvand commented Jul 20, 2023

The .Any() => .Length != 0 warning makes no sense whatsoever.

.Any() should check under the hood whether the incoming IEnumerable is in fact an array, in which case it should do .Length != 0 automatically. LINQ methods already do this sort of optimization-based-on-the-type extensively, as far as I know.

We shouldn't be telling users to change their code in these cases, .Any() is without doubt clearer and more readable than .Length != 0. This was such a strange decision.

@stephentoub
Copy link
Member

.Any() should check under the hood whether the incoming IEnumerable is in fact an array, in which case it should do .Length != 0 automatically. LINQ methods already do this sort of optimization-based-on-the-type extensively, as far as I know.

It does, and it's still more costly than the caller just doing it itself.

@aradalvand
Copy link

aradalvand commented Jul 20, 2023

It does

Good to know.

and it's still more costly than the caller just doing it itself.

Obviously, but then again if Any() merits an analyzer/warning, so does every other LINQ method that does type-checking under the hood — in cases where the caller knows the specific type of the collection, no? Why make an exception for .Any() in this regard?
If you're in such a hot path that even the performance overhead of .Any() makes a difference, you shouldn't be using any LINQ anyway. But in most cases, the code clarity that LINQ methods offer (including .Any()) trumps the performance cost, that's why they exist in the first place. I don't understand why .Any() is being singled out here.

@mrahhal
Copy link

mrahhal commented Jul 20, 2023

I don't understand why .Any() is being singled out here.

Likely because it's the most common usage in linq.

But I agree that, at least, this probably should have a lower diagnostic level than Suggestion by default. Most code won't benefit from this particular optimization, any code that is sensitive enough to benefit would probably be already aware of this. I also feel that Any() has become more readable overtime than Length/Count == 0. Maybe because it's a common interface that doesn't need to change when you're switching impls from arrays to collections and vice versa as you're developing. Having largely irrelevant squiggles randomly popping up isn't pretty, I always end up silencing this particular family of diags.

@Bellarmine-Head
Copy link

Just found this (via Nick Chapsas).

Re. #78606 - it's a pity that the word "ordinal" does not appear even once... especially wrt ignore-case. I think it should have featured in the discussion.

This table looks wrong to me. In every case, the assumption is that the string is linguistic, or is "linguistically relevant".

I refer you to:- https://learn.microsoft.com/en-us/dotnet/standard/base-types/best-practices-strings

Specifially I refer you to: https://learn.microsoft.com/en-us/dotnet/standard/base-types/best-practices-strings#choosing-a-stringcomparison-member-for-your-method-call

For over 20 years, my case-insensitve string comparisons have been wrt

  • Case-insensitive internal identifiers.
  • Case-insensitive identifiers in standards such as XML and HTTP.
  • File paths.
  • Registry keys and values.
  • Environment variables.
  • Resource identifiers (for example, handle names).
  • Case-insensitive security-related settings.

ergo... StringComparison.OrdinalIgnoreCase.

The number of times I have had to worry about case-insensitivity in linguistic data that's input by the user or displayed to the user: zero.

The more common case (by far) for case-insensitive string comparisons is for non-linguistic strings. And getting it right for non-linguistic strings is important for security.

In every single case in the table, I'd go for OrdinalIgnoreCase as the default analyzer recommendation, with the caveat that "if the string is linguistic, go for culture-ignore-case".

@mirmostafa
Copy link

Instead of sending warning when using .Any() in arrays and list, why not to add the following bunch of code to the .Net sources:

    public static bool Any<T>(this IList<T> source) =>
        source.Count != 0;
    public static bool Any<T>(this T[] source) =>
        source.Length != 0;
    public static bool Any<T>(this ICollection source) =>
        source.Count != 0;

@buyaa-n
Copy link
Member Author

buyaa-n commented Aug 14, 2023

As .NET 8 development is wrapping up closing this issue. Thank you, all the contributors, for the hard work and hope you will continue contributing into new analyzers in .NET 9 as we will continue the effort in .NET 9, analyzers that haven't completed with this issue will be moved to a new issue for .NET 9.

@buyaa-n buyaa-n closed this as completed Aug 14, 2023
@Bellarmine-Head
Copy link

Bellarmine-Head commented Aug 15, 2023

@buyaa-n

I am concerned that analyzers #78606 and dotnet/roslyn-analyzers#6720 and possibly others that are related have been implemented incorrectly, and I left multiple comments 3 weeks ago explaining why - all of which have been ignored.

I would like to see some comment on these concerns. We don't want to wrap things up in an incorrect state.

My concerns are expressed here:-

@buyaa-n
Copy link
Member Author

buyaa-n commented Aug 15, 2023

@Bellarmine-Head I assume your comments related to this issue which is not closed and still have 8.0 milestone. I expect @carlossanlop would take care of it within 8.0 timeframe.

@Bellarmine-Head
Copy link

@buyaa-n I added my comments to #89740 because you asked me to. I'm not sure what the issue is all about tbh (!) nor how it relates to my main concerns over issues:-

... which are closed/merged, but I believe have been implemented incorrectly for the reasons that I have stated multiple times.

It would be great if @carlossanlop and other experts could comment on these concerns.

@Bellarmine-Head
Copy link

Bellarmine-Head commented Aug 17, 2023

@buyaa-n

To expand on my concerns...

Let us take the example of #78606 which was opened by @stephentoub

It's fixed by dotnet/roslyn-analyzers#6662

Nowhere in the discussion in those two pages is the word "ordinal". This was the first red flag for me.

Now, to take a small example, it seems to me that the analyzer will recommend that where you previously did:-

str.ToLowerInvariant().IndexOf(substr)

you should now do:-

str.IndexOf(substr, StringComparison.InvariantCultureIgnoreCase)

This seems reasonable. What you had before, you have afterwards, except arguably in a better form.

But you have overlooked something. This section in this long-standing (and correct) article, which has been around in one form or another since 2005, is very clear on the matter. For comparing:-

  • Case-insensitive internal identifiers.
  • Case-insensitive identifiers in standards such as XML and HTTP.
  • File paths.
  • Registry keys and values.
  • Environment variables.
  • Resource identifiers (for example, handle names).
  • Case-insensitive security-related settings.

you should be using StringComparison.OrdinalIgnoreCase.

For me this boils down to:-

"Case-insensitive string comparisons should always use StringComparison.OrdinalIgnoreCase except in the very rare cases where the strings you are comparing are linguistic in some shape or form."

Examples of linguistic strings: a blog piece the user has typed in; a date-time formatted in the user's culture for end-user display.

Why you would ever want to compare linguistic strings case-insensitively, I don't know. In 20 years, I've never done it. Comparing (non-linguistic) identifiers, on the other hand, is something you do all day, every day.

If our analyzer (and related ones; see links in my post above) doesn't even consider or mention or recommend the StringComparison.OrdinalIgnoreCase option, then in my view it has to be deemed as grossly deficient and represents three problems:-

Now, all of this is my opinion. The lack of response to these concerns, which I have written about in sundry places (see links above) concerns me.

Talk of wrapping things up for .NET 8 without addressing this issue concerns me.

I'm putting in a lot of effort here in an attempt to help... and I'm not getting much/any response from it.

@MichalPetryka
Copy link
Contributor

@buyaa-n

To expand on my concerns...

Let us take the example of #78606 which was opened by @stephentoub

It's fixed by dotnet/roslyn-analyzers#6662

Nowhere in the discussion in those two pages is the word "ordinal". This was the first red flag for me.

Now, to take a small example, it seems to me that the analyzer will recommend that where you previously did:-

str.ToLowerInvariant().IndexOf(substr)

you should now do:-

str.IndexOf(substr, StringComparison.InvariantCultureIgnoreCase)

This seems reasonable. What you had before, you have afterwards, except arguably in a better form.

But you have overlooked something. This section in this long-standing (and correct) article, which has been around in one form or another since 2005, is very clear on the matter. For comparing:-

  • Case-insensitive internal identifiers.
  • Case-insensitive identifiers in standards such as XML and HTTP.
  • File paths.
  • Registry keys and values.
  • Environment variables.
  • Resource identifiers (for example, handle names).
  • Case-insensitive security-related settings.

you should be using StringComparison.OrdinalIgnoreCase.

For me this boils down to:-

"Case-insensitive string comparisons should always use StringComparison.OrdinalIgnoreCase except in the very rare cases where the strings you are comparing are linguistic in some shape or form."

Examples of linguistic strings: a blog piece the user has typed in; a date-time formatted in the user's culture for end-user display.

Why you would ever want to compare linguistic strings case-insensitively, I don't know. In 20 years, I've never done it. Comparing (non-linguistic) identifiers, on the other hand, is something you do all day, every day.

If our analyzer (and related ones; see links in my post above) doesn't even consider or mention or recommend the StringComparison.OrdinalIgnoreCase option, then in my view it has to be deemed as grossly deficient and represents three problems:-

Now, all of this is my opinion. The lack of response to these concerns, which I have written about in sundry places (see links above) concerns me.

Talk of wrapping things up for .NET 8 without addressing this issue concerns me.

I'm putting in a lot of effort here in an attempt to help... and I'm not getting much/any response from it.

The issue would be that changing stuff to Ordinal in such case would be a visible behavioural change, which is something that analyzers shouldn't always promote.

@Bellarmine-Head
Copy link

Bellarmine-Head commented Aug 17, 2023

@MichalPetryka

The issue would be that changing stuff to Ordinal in such case would be a visible behavioural change, which is something that analyzers shouldn't always promote.

I suspected that something of this sort was involved in the reasoning behind the analyzers, and I get your point.

The thing is, if you write code like:-

    if (stringOne.ToUpperInvariant() == "SOME_CONSTANT")

or

    if (stringOne.ToUpperInvariant() == stringTwo.ToUpperInvariant())

then you don't have any kind of OrdinalIgnoreCase option or opportunity (as far as I'm aware). It's always a cultural/linguistic comparison, even if it does use the InvariantCulture.

We've all written the code above (in some form or another) a thousand times, and it's been fine!!

But converting the code to the better String.Equals and StringComparison.XYZ alternative, as the analyzers correctly promote, gives programmers the much-needed opportunity to think about the context... are the strings cultural/linguistic in any way? or are they identifiers? If identifiers, then actually a much better way if we need to be case-insensitive (and the recommended Microsoft way for security and correct operation generally, since 2005) is to use OrdinalIgnoreCase.

For the analyzers to not mention OrdinalIgnoreCase seems to me to occlude this option, and seems to not urge programmers to think about the context. Which I think would be a big shame... an opportunity will have been missed. I worry that we will start to see a proliferation of instances of InvariantCultureIgnoreCase in codebases when OrdinalIgnoreCase is so obviously the correct choice.

I can tell you that in 20 years I have never had cause to use StringComparison options:-

  • CurrentCulture
  • CurrentCultureIgnoreCase
  • InvariantCulture
  • InvariantCultureIgnoreCase

As I said above, when on earth would you knowingly want to compare cultural / linguistically-relevant strings? Probably never, even though (oh the irony!) you probably did it all the time with:-

    if (stringOne.ToUpperInvariant() == stringTwo.ToUpperInvariant())

and everything was fine.

Bottom line: when comparing two strings case-insensitively, OrdinalIgnoreCase is always the correct choice, except for those ultra-rare occasions when you really do want to compare two linguistic strings in this way, even if you used to do if (stringOne.ToUpperInvariant() == stringTwo.ToUpperInvariant()) in the bad old days.

The analyzers should at least clue developers into the OrdinalIgnoreCase option, and in what contexts it should be used.

@Bellarmine-Head
Copy link

Bellarmine-Head commented Aug 17, 2023

"Hey programmer...

You're using:-

if (stringOne.ToUpperInvariant() == stringTwo.ToUpperInvariant())

to do a case-insensitive comparison of two strings, which is fine up to a point. But we'd like you use:-

if (stringOne.Equals(stringTwo, StringComparison.InvariantCultureIgnoreCase))

instead, because [insert reasons].

This is better [see reasons], and is the same as what you were doing before. So it's a safe change to make. But wait: if your strings are identifiers (including file paths, Registry keys, case-insensitive internal identifiers, case-insensitive identifiers in standards such as XML and HTTP and case-insensitive security-related settings, etc. see here for the complete list then an even better option is to use:-

if (stringOne.Equals(stringTwo, StringComparison.OrdinalIgnoreCase))

but doing so will require you to test your code carefully in case the behaviour has changed.

For new code, it's best to use the StringComparison.OrdinalIgnoreCase option for comparing case-insensitive identifiers from the outset.

Thanks."

@Bellarmine-Head
Copy link

For case-sensitive string comparisons, we do

if (str1 == str2)

and this gives an ordinal comparison (i.e. StringComparison.Ordinal).

So these analyzers are concerned with case-insensitive comparisons for the most part.

Recommending to use String.Equals instead of if (stringOne.ToUpperInvariant() == stringTwo.ToUpperInvariant()) for performance reasons, while retaining maximum backwards compatibility (no change in behaviour) via use of StringComparison.InvariantCultureIgnoreCase is a worthy thing.

But a higher, better, nobler and more worthy thing is to recommend the use of StringComparison.OrdinalIgnoreCase for correct operation and good security - in a word: "safety".

Safety of implementation beats out perf any day of the week.

The downside is that ppl have to check that there is no change in behaviour, I get that.

@ghost ghost locked as resolved and limited conversation to collaborators Sep 17, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-Meta User Story A single user-facing feature. Can be grouped under an epic.
Projects
None yet
Development

No branches or pull requests

7 participants