Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add string-options for ignoring newline differences #2496

Closed
vbreuss opened this issue Dec 4, 2023 · 9 comments
Closed

Add string-options for ignoring newline differences #2496

vbreuss opened this issue Dec 4, 2023 · 9 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation

Comments

@vbreuss
Copy link
Contributor

vbreuss commented Dec 4, 2023

Background and motivation

Remaining issue from #2364
(see this comment for more details)

This should also fix #1247

API Proposal

Add the following options to the EquivalencyAssertionOptions<T> for strings

public EquivalencyAssertionOptions<T> IgnoringAllNewlines()
{
    // This will remove all newlines from actual and expected before comparison
    return this;
}
public EquivalencyAssertionOptions<T> IgnoringNewlineStyle()
{
    // This will replace "\r\n" with "\n" in actual and expected before comparison
    return this;
}

API Usage

subject.Should().BeEquivalentTo(expected, o => o.IgnoringNewlineStyle());

Alternative Designs

No response

Risks

No response

Are you willing to help with a proof-of-concept (as PR in that or a separate repo) first and as pull-request later on?

Yes, please assign this issue to me.

@vbreuss vbreuss added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Dec 4, 2023
@dennisdoomen
Copy link
Member

I'm fine with this proposal. If @jnyrup is too, we can mark it as approved.

@vbreuss

This comment was marked as resolved.

@jnyrup
Copy link
Member

jnyrup commented Feb 1, 2024

Wondering whether enabling IgnoringAllNewlines means that "cashew nuts" and cash\r\new nuts are considered equivalent.

In implementation details, whether "\n" and "\r\n" are replaced with:

  • "" or
  • " "

My first thought is that IgnoringAllNewlines would consider "cashew\r\nnuts" and "cashew nuts" to be equivalent.

(This can probably be considered my litmus test for this feature)

@vbreuss
Copy link
Contributor Author

vbreuss commented Feb 1, 2024

Yes, when IgnoringAllNewlines is set, "cash\r\new nuts" and "cashew nuts" would be equivalent, but "cashew\r\nnuts" and "cashew nuts" not, as the first would not have any blank between the words.

@bart-vmware
Copy link

Wondering whether enabling IgnoringAllNewlines means that "cashew nuts" and cash\r\new nuts are considered equivalent.

I agree. If there's such a need, probably introduce IgnoreLeadingAndTrailingWhitespace for that.

@dennisdoomen
Copy link
Member

Yes, when IgnoringAllNewlines is set, "cash\r\new nuts" and "cashew nuts" would be equivalent, but "cashew\r\nnuts" and "cashew nuts" not, as the first would not have any blank between the words.

That's also what I would expect

@jnyrup
Copy link
Member

jnyrup commented Mar 23, 2024

I don't think we have come to a conclusion yet. (I apologize for the long delay from my side)

I've moved the proposal IgnoringNewlineStyle to #2612 to separate the discussion and eventual approval of the two proposed APIs.

Back to IgnoringAllNewlines.

If the intended implementation of IgnoringAllNewlines is "remove all newlines and then do simple equality", then e.g. "every\r\nday" and everyday would be considered equivalent even though "every day" and "everyday" are different language constructs and not interchangeable.

To my knowledge in regular prose line breaking occurs instead of a space, i.e. "some value" is broken into "some\r\nvalue" to fit within the margins.
To obtain a justified right margin words may be split up using a hyphen+newline, i.e "some longer sentence" is broken into "some long-\r\ner sentence".

I can't off the top of my head come up with a realistic case, where newlines don't replace something.
I'm not saying the case doesn't exists, just that I haven't found it.

We shouldn't try to guess whether a hyphen was inserted manually or by the line-breaking algorithm.
I.e. we shouldn't consider "e\r\nmail" and "e-\r\nmail" as equivalent.

I hope this more clearly expresses why I don't think "remove all newlines and then do simple equality" is the way to go.

@dennisdoomen
Copy link
Member

Good point. Now I'm wondering what this was supposed to fix in the first place.

@vbreuss
Copy link
Contributor Author

vbreuss commented Mar 24, 2024

My original idea for this feature came from this comment 😄

I agree, that completely ignoring newlines might have some unintended consequences. For the use cases, that I see, the IgnoringNewlineStyle would be sufficient. I will adapt #2565 to only implement the IgnoringNewlineStyle option and close this issue.

@vbreuss vbreuss closed this as completed Mar 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation
Projects
None yet
Development

No branches or pull requests

4 participants