
Use of PRNG can cause pathological performance in 'bisect' tools #2720

Open
geeksam opened this issue Feb 25, 2023 · 4 comments


geeksam commented Feb 25, 2023

TL;DR

Using Faker to generate data for tests can (in combination with complex app behavior) create test flakiness that is extremely difficult to track down. To y'all's credit, the "Deterministic Random" feature can mitigate some of this flakiness. Unfortunately, Faker's use of a PRNG creates unavoidable order dependencies between tests, because the number of times a PRNG has been called is global state.

This can horribly break RSpec's "bisect" feature (which is specifically designed to make it easier to isolate flaky tests!) by giving it pathological data.
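The mechanism can be illustrated without Faker at all. Here's a minimal sketch using Ruby's own `Random` as a stand-in for Faker's deterministic mode (which likewise draws from one seeded PRNG per run), showing that the value a test sees depends on how many draws preceded it:

```ruby
# Sketch of the order dependency: a seeded Random stands in for Faker's
# deterministic mode. The value each test sees is determined by how many
# draws preceded it -- global state -- not by which test is asking.

def values_seen_by(run_order, seed: 1234)
  rng = Random.new(seed)  # one "deterministic" PRNG for the whole run
  run_order.map { |test| [test, rng.rand(1_000_000)] }.to_h
end

run_a = values_seen_by(%i[test_1 test_2])
run_b = values_seen_by(%i[test_2 test_1])

# Same seed, same tests -- but swapping the order changes what test_2 sees,
# because test_2 now consumes the first draw instead of the second.
puts run_a[:test_2] != run_b[:test_2]
```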

To Reproduce

I've published a repository at https://github.com/geeksam/flaker with a Very Smol Ruby Project to reproduce the behavior. There's also quite a bit of additional discussion in the README there.

For posterity and/or convenience, however, here's a recipe/thought experiment:

  • Write two tests. Make sure they always run in the same order with the same seed.
  • Make the first test (a) generate a value using Faker, and (b) assert that the generated value matches the first value returned by Faker. (You'll need to run the suite once, then copy/paste the generated value into the test.)
  • Make the second test (a) generate a value using Faker, and (b) assert that the generated value DOES NOT match the second value returned by Faker. (As before, you'll need to copy/paste.)

If you do this, the second test will only fail if it's the second test to run.
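The recipe above can be sketched in plain Ruby: a seeded `Random` stands in for Faker (the exact pasted values depend on your Faker version), and lambdas returning pass/fail stand in for RSpec examples:

```ruby
# Sketch of the two-test recipe. A seeded Random stands in for Faker's
# deterministic mode; each lambda returns true (pass) or false (fail).

def run_suite(order, seed: 42)
  rng = Random.new(seed)                    # fresh PRNG per suite run
  faker_value = -> { rng.rand(1_000_000) }  # stand-in for e.g. Faker::Name.name

  # The values you would "run once and copy/paste" into the tests:
  first_value  = Random.new(seed).rand(1_000_000)
  r = Random.new(seed); r.rand(1_000_000)
  second_value = r.rand(1_000_000)

  tests = {
    test_1: -> { faker_value.call == first_value  },  # matches the 1st value
    test_2: -> { faker_value.call != second_value },  # does NOT match the 2nd
  }
  order.map { |name| [name, tests[name].call] }.to_h
end

# test_2 fails only when it runs second and therefore draws the 2nd value:
run_suite(%i[test_1 test_2])
run_suite(%i[test_2 test_1])
```

Assuming the first two draws differ (true for any reasonable seed), `test_2` passes when it runs first and fails when it runs second, which is exactly the order dependency that confuses bisection.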

Now imagine that the failing test is the 107th in the sequence, and running the previous 106 takes five minutes, and you think "hey, I'll use rspec --bisect=verbose!" (I hope it's cold in your office, because your computer is now a space heater.)

...again, this is all already in https://github.com/geeksam/flaker, so feel free to go there rather than following the above recipe. :)

Thanks!

I don't know of any way to solve the PRNG problem in the general case, and the only workaround I've been able to come up with is highly impractical. As such, I fully expect this issue to stay open forever—or, more likely, be closed as "won't fix". :)

I've derived a great deal of entertainment from seeing Faker-generated data in my test suite over the years, and it saddens me to have to stop using it because of this issue... but if I ever have to populate a database for a demo, I will absolutely use Faker to do it.

Thanks for all the laughs, not to mention the free library!

@thdaraujo
Contributor

Hi, Sam!

Thanks for the in-depth report, I really appreciate it! It was fun to read, and I learned a thing or two about how you use faker and why even the deterministic random generator doesn't solve your problem.

That's a fascinating topic for discussion, though. I don't have an answer for you right now, I have to think about it more and play around with the code you shared to see what we could do about it. Documenting this issue could be a first step.

(or maybe we should use Dilbert's random number generator 😄)

@thdaraujo thdaraujo linked a pull request Mar 25, 2023 that will close this issue
@thdaraujo thdaraujo removed a link to a pull request Mar 25, 2023
@stefannibrasil
Contributor

Hi @geeksam thanks for this funny writeup! Loved "flaker" 😆

I've derived a great deal of entertainment from seeing Faker-generated data in my test suite over the years, and it saddens me to have to stop using it because of this issue... but if I ever have to populate a database for a demo, I will absolutely use Faker to do it.

I don't actually recommend using any sort of generated data for testing purposes :D faker is a great tool for populating DBs for apps, demos, etc. Tests are complex enough already, and using random data to write them is not something I do.

Some resources I follow and recommend about using faker for seed data rather than in tests:
https://thoughtbot.com/blog/anonymizing-user-company-and-location-data-using
https://thoughtbot.com/blog/a-journey-towards-better-testing-practices#avoiding-randomness

@geeksam
Author

geeksam commented Oct 14, 2023

@stefannibrasil Oh, I am 100% Team "Tests MUST Be Deterministic" (with the possible exceptions of fuzz testing or property-based testing). Unfortunately, I've joined more than one team with test suites that already had Faker in them, and it took me an embarrassingly long time to notice that that was a problem. :)

@geeksam
Author

geeksam commented Oct 14, 2023

It occurred to me a month or two back that this issue could potentially be mitigated by providing a deterministic "random" generator that returns a value based on a hash (e.g., Digest::SHA1) of caller. Unfortunately, I can't bring myself to actually write such a tool: it would feel like an implicit endorsement of the idea that it's okay to use Faker in a test suite, which (IMO) it isn't.
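For the curious, the idea sketches as follows. This is a hypothetical helper, not a real Faker API: it derives the "random" value from a `Digest::SHA1` hash of the call site reported by `Kernel#caller`, so the value depends on *where* the call appears rather than on how many calls came before it:

```ruby
require "digest"

# Hypothetical sketch -- NOT a real Faker API. The "random" value is
# derived from a hash of the call site, so it depends on WHERE the call
# appears in the source, not on how many PRNG calls preceded it.
def caller_seeded_value(limit)
  call_site = caller(1..1).first  # e.g. "path/to/file.rb:12:in `...'"
  Digest::SHA1.hexdigest(call_site).to_i(16) % limit
end

def probe
  caller_seeded_value(1_000_000)  # fixed call site inside this method
end

# Repeated calls from the same line always agree, regardless of test order:
probe == probe  # => true
```

One obvious reason this is impractical: two calls on the same source line (or the same line run in a loop) would always collide, so it trades order dependence for location dependence.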

I think that if I were going to contribute anything related to this issue, it would be an update to the README saying "please don't use randomness in tests," possibly with a link to my 'flaker' repo. 😜

Given that this seems like an intractable problem, feel free to close this issue unless you'd rather leave it open forever as a warning to future generations...
