Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New string-specific options for string equivalency assertions #2364

Closed
vbreuss opened this issue Oct 5, 2023 · 37 comments · Fixed by #2413
Closed

New string-specific options for string equivalency assertions #2364

vbreuss opened this issue Oct 5, 2023 · 37 comments · Fixed by #2413
Labels
api-approved API was approved, it can be implemented

Comments

@vbreuss
Copy link
Contributor

vbreuss commented Oct 5, 2023

Background and motivation

Initial discussion in #2339 and #2064

I would like to add some overloads to most string equivalency assertions to tackle various use cases (e.g. ignore white space, ignore casing, ignore newline, etc.) with the goal to tackle most existing use cases and to allow for extensibility of the provided mechanism. In order to achieve this, I would add an overload with a Func<EquivalencyAssertionOptions<string>, EquivalencyAssertionOptions<string>> to specify additional options on the string comparison like ignoring leading or trailing white space or newlines to the following string assertions:

  • BeEquivalentTo
  • ContainEquivalentOf
  • EndWithEquivalentOf
  • MatchEquivalentOf
  • NotBeEquivalentTo
  • NotContainEquivalentOf
  • NotEndWithEquivalentOf
  • NotMatchEquivalentOf
  • NotStartWithEquivalentOf
  • StartWithEquivalentOf

This also allows specifying an IEqualityComparer<string> as requested in #2339 via the .Using method on the assertion options.

API Proposal

Example for BeEquivalentTo:

    public AndConstraint<TAssertions> BeEquivalentTo(string expected,
        Func<EquivalencyAssertionOptions<string>, EquivalencyAssertionOptions<string>> config,
        string because = "", params object[] becauseArgs)
    {
      // ...
    }

    public AndConstraint<TAssertions> NotBeEquivalentTo(string unexpected,
        Func<EquivalencyAssertionOptions<string>, EquivalencyAssertionOptions<string>> config,
        string because = "", params object[] becauseArgs)
    {
      // ...
    }

and add the following methods to the EquivalencyAssertionOptions<T> class:

    public EquivalencyAssertionOptions<T> IgnoringLeadingWhitespace()
    {
        // ...
        return this;
    }

    public EquivalencyAssertionOptions<T> IgnoringTrailingWhitespace()
    {
        // ...
        return this;
    }

    public EquivalencyAssertionOptions<T> IgnoringCase()
    {
        // ...
        return this;
    }

API Usage

subject.Should().BeEquivalentTo(expected, o => o
    .IgnoringLeadingWhitespace()
    .IgnoringTrailingWhitespace()
    .IgnoringCase());
subject.Should().BeEquivalentTo(expected, o => o
    .Using(StringComparer.InvariantCultureIgnoreCase));

Alternative Designs

No response

Risks

As I would support arbitrary IEqualityComparer<string>, I can no longer use the IndexOfFirstMismatch extension method, but would have to rewrite it, e.g. as follows:

    public static int IndexOfFirstMismatch(this string value, string expected, IEqualityComparer<string> comparer)
    {
        for (int index = 0; index < value.Length; index++)
        {
            if (index >= expected.Length || !comparer.Equals(value[index..(index + 1)], expected[index..(index + 1)]))
            {
                return index;
            }
        }

        return -1;
    }

The difference is, that I have to slice the strings and get substrings, instead of comparing char by char. This could have a performance impact...

Are you willing to help with a proof-of-concept (as PR in that or a separate repo) first and as pull-request later on?

Yes, please assign this issue to me.

@vbreuss vbreuss added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Oct 5, 2023
@dennisdoomen dennisdoomen changed the title [API Proposal]: New overloads for StringAssertions.BeEquivalentTo New string-specific options for StringAssertions.BeEquivalentTo Oct 6, 2023
@dennisdoomen
Copy link
Member

dennisdoomen commented Oct 6, 2023

I like 1 and 3, but I don't like 2. Instead, we also need to add these options to the general BeEquivalentTo. Otherwise, we would not offer an inconsistent surface area.

@vbreuss
Copy link
Contributor Author

vbreuss commented Oct 6, 2023

@dennisdoomen : I adjusted the API proposal to exclude point 2.

I am not sure, I understand what you mean with "to add these options to the general BeEquivalentTo". Do you refer to the ObjectAssertions.BeEquivalentTo method?

@dennisdoomen
Copy link
Member

Yes, and all the others that take a EquivalencyAssertionOptions<T>. If you add those options to that options calls it'll become available everywhere.

@vbreuss
Copy link
Contributor Author

vbreuss commented Oct 9, 2023

@dennisdoomen :
I see your point and adjusted the API proposal accordingly.

I would split it into two pull requests:

  1. One that adds the IEqualityComparer overload (as requested in Allow IEqualityComparer in string equality assertion #2339)
  2. Another one that adds the overload with EquivalencyAssertionOptions<string> and adds additional options for string comparison there.

Is this okay for you?

@dennisdoomen
Copy link
Member

Yeah, that sounds like a great plan.

@dennisdoomen
Copy link
Member

@jnyrup I'm fine marking this as "api approved"

@IT-VBFK
Copy link
Contributor

IT-VBFK commented Oct 14, 2023

Love this proposal :)

@vbreuss
Copy link
Contributor Author

vbreuss commented Oct 14, 2023

@dennisdoomen : As requested in the #2372 I also added the two API changes for NotBeEquivalentTo.

@jnyrup
Copy link
Member

jnyrup commented Oct 22, 2023

I've been thinking about if this needs some more upfront designing here?
I.e. how this should (not) influence the other Equivalent methods in StringAssertions

BeEquivalentTo(string expected, ...)
ContainEquivalentOf(string expected, ...)
ContainEquivalentOf(string expected, OccurrenceConstraint occurrenceConstraint, ...)
EndWithEquivalentOf(string expected, ...)
MatchEquivalentOf(string wildcardPattern, ...)
NotBeEquivalentTo(string unexpected, ...)
NotContainEquivalentOf(string unexpected, ...)
NotEndWithEquivalentOf(string unexpected, ...)
NotMatchEquivalentOf(string wildcardPattern, ...)
NotStartWithEquivalentOf(string unexpected, ...)
StartWithEquivalentOf(string expected, ...)

I like 1 and 3, but I don't like 2. Instead, we also need to add these options to the general BeEquivalentTo. Otherwise, we would not offer an inconsistent surface area.

Doesn't one currently has do some extra work to make the general BeEquivalentTo use case-insensitive string comparison?
The first thing that comes to my mind is this:

new { Value = "abc" }.Should().BeEquivalentTo(new { Value = "ABC" }, opt => opt
    .Using<string>(ctx => ctx.Subject.Should().BeEquivalentTo(ctx.Expectation))
    .WhenTypeIs<string>());

Which with the options overload to StringAssertions.BeEquivalentTo would become

new { Value = "abc" }.Should().BeEquivalentTo(new { Value = "ABC" }, opt => opt
    .Using<string>(ctx => ctx.Subject.Should().BeEquivalentTo(ctx.Expectation, o => o.IgnoringNewlines()))
    .WhenTypeIs<string>());

Or are you saying that one should have the same/similar EquivalencyAssertionOptions directly available no matter if you're asserting directly on a string or an object having a string member?

"abc".Should().BeEquivalentTo("ABC", opt => opt.IgnoringNewlines())
new { Value = "abc" }.Should().BeEquivalentTo(new { Value = "ABC" }, opt => opt.IgnoringNewlines())

@vbreuss
Copy link
Contributor Author

vbreuss commented Oct 23, 2023

@jnyrup :

Or are you saying that one should have the same/similar EquivalencyAssertionOptions directly available no matter if you're asserting directly on a string or an object having a string member?

The idea is exactly this, that I add the additional methods (IgnoringNewLines, IgnoringLeadingWhitespace, ...) directly to the EquivalencyAssertionOptions, so that they can be used everywhere and are applied, when comparing a string.

@vbreuss
Copy link
Contributor Author

vbreuss commented Oct 23, 2023

I've been thinking about if this needs some more upfront designing here? I.e. how this should (not) influence the other Equivalent methods in StringAssertions

BeEquivalentTo(string expected, ...)
ContainEquivalentOf(string expected, ...)
ContainEquivalentOf(string expected, OccurrenceConstraint occurrenceConstraint, ...)
EndWithEquivalentOf(string expected, ...)
MatchEquivalentOf(string wildcardPattern, ...)
NotBeEquivalentTo(string unexpected, ...)
NotContainEquivalentOf(string unexpected, ...)
NotEndWithEquivalentOf(string unexpected, ...)
NotMatchEquivalentOf(string wildcardPattern, ...)
NotStartWithEquivalentOf(string unexpected, ...)
StartWithEquivalentOf(string expected, ...)

@jnyrup : Do you mean, that I should also add the corresponding overload with IEqualityComparer<string> comparer and IEqualityComparer<string> comparer to these methods?

@jnyrup
Copy link
Member

jnyrup commented Oct 23, 2023

Do you mean, that I should also add the corresponding overload with IEqualityComparer<string> comparer and IEqualityComparer<string> comparer to these methods?

It was meant as an open question.
In general have avoided overloads taking IEqualityComparer<T>, StringComparer, StringComparison or similar.
#2243 Adding IEqualityComparer<T> and #1145+#1869+#1880 adding OccurrenceConstraint parameters are notable exceptions to this.

So instead of looking solely at BeEquivalentTo I wanted to draw attention to the similar string assertions to include more brains.

I also just recalled that with #1284 we have support for passing a custom IEqualityComparer<T> to the non-string BeEquivalentTo, such that one can write

new { Value = "abc" }.Should().BeEquivalentTo(new { Value = "ABC" }, opt => opt
    .Using(StringComparer.OrdinalIgnoreCase));

@vbreuss
Copy link
Contributor Author

vbreuss commented Oct 24, 2023

I often had the case, that I needed to compare strings case sensitive or case insensitive and for me it would read better to specify this explicitely than to rely on the difference between .Be and .BeEquivalentTo.

"foo".Should().Be("FOO", StringComparer.OrdinalIgnoreCase);

vs.

"foo".Should().BeEquivalentTo("FOO");

In the latter example I often got review comments, why this works, as it is not immediately clear, that "BeEquivalentTo" is case-insensitive.

So for me adding the IEqualityComparer<string> to the string assertions is a good idea and I am happy to expand it also to the other string assertion methods you mention.

@dennisdoomen
Copy link
Member

Oof, this "enhancement" is blowing out of proportions. Happy that @vbreuss is taken the lead here.

@vbreuss
Copy link
Contributor Author

vbreuss commented Oct 27, 2023

@dennisdoomen , @jnyrup
I am happy to also work on a bigger improvement and add both suggested overloads to all relevant string assertions, but would like to make sure first, that the approach is okay.

So I would add the suggested additional overloads also to

  • Be
  • ContainEquivalentOf
  • EndWithEquivalentOf
  • MatchEquivalentOf
  • NotBe
  • NotContainEquivalentOf
  • NotEndWithEquivalentOf
  • NotMatchEquivalentOf
  • NotStartWithEquivalentOf
  • StartWithEquivalentOf

Is this okay?

@dennisdoomen
Copy link
Member

Is this okay?

I might be opening up an earlier decision (my brain has been very pre-occupied with conferences and some personal stuff), but what if you just added an overload to the methods @jnyrup listed that takes an EquivalencyAssertionOptions<string> and exposes the new options you mentioned in the opening of this post? That mechanism already supports a Using<T>(IEqualityComparer<T>. Then you would cover all scenarios. Right?

@vbreuss
Copy link
Contributor Author

vbreuss commented Oct 29, 2023

That mechanism already supports a Using<T>(IEqualityComparer<T>

The difficulty here is, that I can't get access to this comparer. The Using-Method just adds an IEquivalencyStep. I would need to store this separately in the EquivalencyAssertionOptions class.

But I will give your idea a try and create a pull request!

@vbreuss
Copy link
Contributor Author

vbreuss commented Oct 29, 2023

@dennisdoomen
I created a draft in #2413 that implements your idea for BeEquivalentTo and NotBeEquivalentTo. It is thus comparable to #2372.

If you want me to continue this approach, I would add the same overloads to the other string assertion methods.

@dennisdoomen
Copy link
Member

I love it! What do you think @jnyrup ?

@jnyrup
Copy link
Member

jnyrup commented Oct 29, 2023

Primarily looked at the tests and they look great!

One thing I have never given a thought so far is whether the new options should be applied to only the subject, only the expectation or both 🤔

@IT-VBFK
Copy link
Contributor

IT-VBFK commented Oct 29, 2023

One thing I have never given a thought so far is whether the new options should be applied to only the subject, only the expectation or both

I would vote for "only subject". Why one would construct an expectation where he adds leading/trailing whitespaces manually? That makes no sense to me.

@vbreuss vbreuss changed the title New string-specific options for StringAssertions.BeEquivalentTo New string-specific options for string equivalency assertions Oct 31, 2023
@vbreuss
Copy link
Contributor Author

vbreuss commented Oct 31, 2023

I updated the description of this issue to reflect the latest suggestion, which is according to the suggested implementation in #2413.

@dennisdoomen
Copy link
Member

I would vote for "only subject". Why one would construct an expectation where he adds leading/trailing whitespaces manually? That makes no sense to me.

Why not just assume it applies to both? I can't imagine a scenario where you ignore newlines on the subject, but do care about them in the expectation. That just doesn't make sense.

@jnyrup
Copy link
Member

jnyrup commented Oct 31, 2023

I can't imagine a scenario where you ignore newlines on the subject, but do care about them in the expectation. That just doesn't make sense.

What I had in mind was something like this, but I realize that was just an example where it doesn't need to ignore newlines from expectation, but also doesn't hurt to do so.

var subject = GetMessage(); // returns "some\r\n value" due to some line breaking logic
subject.Should().BeEquivalentTo("some value", opt => opt.IgnoringNewlines()); // in this particular test I don't care about the line breaks, but that the message contains "some value".

So I'm good with applying it to both 👍

@vbreuss
Copy link
Contributor Author

vbreuss commented Nov 4, 2023

@dennisdoomen, @jnyrup :

Is this proposal okay now?

I implemented it in #2413, but currently the issue doesn't yet have the "api-approved" label...

@jnyrup
Copy link
Member

jnyrup commented Nov 4, 2023

Is this proposal okay now?

In the draft PR StringAssertions.Be and StringAssertions.BeEquivalentTo are identical.
Would it be enough with just the new BeEquivalentTo?

If we add NotContainEquivalentOf(unexpected, occurrenceConstraint, config) (which I approve), let's also add NotContainEquivalentOf(unexpected, occurrenceConstraint) to have the same four overloads as ContainEquivalentOf.

@vbreuss
Copy link
Contributor Author

vbreuss commented Nov 4, 2023

In the draft PR StringAssertions.Be and StringAssertions.BeEquivalentTo are identical. Would it be enough with just the new BeEquivalentTo?

I think so, but then I would have to also change StringEqualityEquivalencyStep to use BeEquivalentTo instead of Be, but I don't think this is a problem...

@vbreuss
Copy link
Contributor Author

vbreuss commented Nov 4, 2023

If we add NotContainEquivalentOf(unexpected, occurrenceConstraint, config) (which I approve), let's also add NotContainEquivalentOf(unexpected, occurrenceConstraint) to have the same four overloads as ContainEquivalentOf.

I agree, but as this has nothing to do with the config changes, I implemented it in a separate pull request (#2436), to keep the changes in #2413 more focused.

@vbreuss
Copy link
Contributor Author

vbreuss commented Nov 4, 2023

In the draft PR StringAssertions.Be and StringAssertions.BeEquivalentTo are identical. Would it be enough with just the new BeEquivalentTo?

I think so, but then I would have to also change StringEqualityEquivalencyStep to use BeEquivalentTo instead of Be, but I don't think this is a problem...

I changed it accordingly in #2413!

@jnyrup
Copy link
Member

jnyrup commented Nov 4, 2023

Some more thinking on the behavior of IgnoreNewlines.

#1247 is about considering "\n" and "\r\n" equivalent, i.e. ignoring OS newline style.

My example in #2339 (comment) was going one step further and considered \n, \r\n and the same.
E.g. ignoring line breaks on spaces.

The current implementation removes newlines, such that

""""
cash
ew nuts
""""

is is considered equivalent to

""""
cashew nuts
""""

Any proper line-breaking algorithm would insert a hyphen when breaking a line inside a word

""""
cash-
ew nuts
""""

What do you think?

@vbreuss
Copy link
Contributor Author

vbreuss commented Nov 4, 2023

I think, that when I read IgnoringNewlines, I would expect all newlines to be removed, as it is implemented.
However, I believe that the most common use case would be to ignore OS-specific differences.
So, what about renaming the method from IgnoringNewlines to IgnoringNewlineStyle or UsingCrossPlatformNewlineStyle or similar and the implementation just replaces "\r\n" with "\n"?

BTW, when I look at the example in #1247, it could also be fixed with IgnoringTrailingWhitespace, as this also removes newlines, tabs and blanks.

@dennisdoomen
Copy link
Member

What do you think?

I'm not entirely sure what you mean to say with the last example (about the hyphen).

@jnyrup
Copy link
Member

jnyrup commented Nov 8, 2023

What do you think?

I'm not entirely sure what you mean to say with the last example (about the hyphen).

That I don't think "cash\r\new" and "cashew" should be considered equivalent.
If "cashew" had been split across a line by some line-splitting algorithm, it should have inserted a hyphen such that it becomes e.g. `"cash-\r\new".

@eNeRGy164
Copy link
Contributor

eNeRGy164 commented Nov 8, 2023

So the question is, does IgnoreNewLines mean
compare as if there are no newlines at all
or
compare but replace 1+ newline characters as the same single newline char on both sides

@vbreuss
Copy link
Contributor Author

vbreuss commented Nov 8, 2023

What about adding two options:

  • IgnoreAllNewlines
    Which compares as if there are no newlines (current behavior)
  • IgnoreNewlineStyle
    Which replaces "\r\n" with "\n" so that OS-Specific newline differences are ignored

@dennisdoomen
Copy link
Member

What about adding two options:

I honestly don't know 🤷‍♂️

@vbreuss
Copy link
Contributor Author

vbreuss commented Dec 4, 2023

As this issue is already quite large, I removed the option IgnoringNewlines from this issue and the corresponding pull request #2413.
I hope that this allows us to finish this issue and then focus the discussion of how to work with newlines in a focused thread (see #2496).

@dennisdoomen dennisdoomen added api-approved API was approved, it can be implemented and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Jan 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-approved API was approved, it can be implemented
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants