Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prettifier escapes strings #2320

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

som-snytt
Copy link

A user on discord reported confusion when a test failed because of an embedded "null" character but the string was displayed "normal".

That case was a "contains" test, so a string "diff" would not help, unless a user could request a diff against every element in the collection under test.

This commit prettifies strings as Scala strings with the usual escapes, including Unicode escape if isControl.

Copy link

cla-bot bot commented Mar 30, 2024

Hi @som-snytt, we require contributors to sign our Contributor License Agreement, and we don't have yours on file. In order for us to review and merge your code, please access https://www.artima.com/cla/choose-type to sign our Contributor License Agreement. Your effort is highly appreciated. Thank you.

@artimasites
Copy link

@cla-bot[bot] check

@cla-bot cla-bot bot added the cla-signed label Mar 30, 2024
Copy link

cla-bot bot commented Mar 30, 2024

The cla-bot has been summoned, and re-checked this pull request!

@cheeseng cheeseng changed the base branch from main to 3.2.x-new March 31, 2024 01:17
@cheeseng cheeseng changed the base branch from 3.2.x-new to main March 31, 2024 01:18
@cheeseng
Copy link
Contributor

@som-snytt Hmm, I am not 100% if it is desired as default for everyone, how about if we make it as different flavor of Prettifier, e.g. Prettifier.escaping like Prettifier.basic that we have?

@som-snytt
Copy link
Author

It's never desirable to conceal information. I happened to notice that javap shows string constants in escaped form (as does the scalac compiler).

I'm unable to reason about "100%". The Scala REPL doesn't display strings in quotes, like your basic prettifier, because of decision paralysis. Also I don't know about ScalaTest expectations.

The "ideal" might be test --explain which would give detailed diffs of compared elements, including why no element of a sequence matched the test element.

The motivating case was https://discord.com/channels/632150470000902164/632150470000902166/1222916152771084298

It took that person only 15 mins (after posting their question) to realize that their data had embedded null bytes. I should have asked if they used a debugger. But it should have been obvious from the test output.

My test fails with «Array("date", "campaign", "revenue", "secondary_revenue") did not contain element "date"», what am I doing wrong? It isn't making any sense to me.

Apparently, there were null-bytes involved, but they're not shown in the error output

If it has said Array("\0date", "campaign", "revenue", "secondary_revenue\0") I wouldn't have been so confused

@som-snytt
Copy link
Author

som-snytt commented Mar 31, 2024

The failure is a friendly reminder that 2.11 did not yet understand how quotes work.

raw"\""

Force-pushed just that tweak.

@cheeseng
Copy link
Contributor

cheeseng commented Apr 1, 2024

@som-snytt We want to avoid breaking somebody tests when new version is released, while I do think your code makes sense. We also have an Differ to do the diff analysis, may be that's the better place to do the job. I'll try to take a deeper look today.

@bvenners
Copy link
Contributor

bvenners commented Apr 2, 2024

I'm not sure I ever tried inserting null characters in strings in Java or Scala:

scala> "cat\u0000dog"
val res0: String = cat?dog

The REPL puts a ? there, but toString just skips over it:

scala> println(res0)
catdog

ScalaTest should probably indicate there's an unprintable character in there also, but not sure in what way. A question mark would help point out that there's a hidden character, but would not reveal exaclty what it was. So if there were two different hidden characters in the exact same spot, you'd still see something like "cat?dog" did not equal "cat?dog". I wonder how other popular test frameworks deal with unprintable characters in strings. Perhaps we should do a quick survey.

@cheeseng
Copy link
Contributor

cheeseng commented Apr 2, 2024

@som-snytt @bvenners Fyi I have an experimental branch here to display the escaped string in the analysis section:

https://github.com/cheeseng/scalatest/tree/feature-escaping-string-differ

For the following example code:

val a = "\u0000test"
val b = "test"
List(a) should contain (b)

It will fail with:

[info] - test *** FAILED *** (0 milliseconds)
[info]   List("test") did not contain element "test" (ListShouldContainSpec.scala:45)
[info]   Analysis:
[info]   LHS element contains characters that might cause problem: \u0000test

@som-snytt
Copy link
Author

som-snytt commented Apr 2, 2024

I was about to comment that I'll leave this PR open as a placeholder; feel free to close at your convenience.

I'm taking a look at a couple of usual frameworks to see if they produce interesting diffs (and also to remember how to use them). I'm interested to see if expecty or a derivative has a visually verbose diff.

junit4

org.junit.ComparisonFailure: expected:<[]abc> but was:<[]abc>

@cheeseng
Copy link
Contributor

@som-snytt Just fyi I have submitted the following PR:

#2323

We may switch it to main branch for the upcoming 3.3.0 release since the changes shall break binary-compat.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants