
Too many generated examples on odd number of arguments to "one_of" #2087

Closed
Stranger6667 opened this issue Sep 7, 2019 · 1 comment
Labels
question not sure it's a bug? questions welcome


@Stranger6667
Collaborator

Stranger6667 commented Sep 7, 2019

Hello!

I am working on a pytest plugin that generates test data from Open API / Swagger schemas. It is built on top of hypothesis and hypothesis_jsonschema. I first observed this behavior in strategies generated by hypothesis_jsonschema, but the issue seems to be in the one_of behavior itself.

I noticed that the number of generated examples for one_of strategies depends on the number and type of its arguments in a way that is inconsistent with what I'd expect.

Consider this example:

from hypothesis.strategies import just, one_of
from hypothesis import given


@given(x=one_of(just(1), just(2)))
def test_two(x):
    pass


@given(x=one_of(just(1), just(2), just(3)))
def test_three(x):
    pass


@given(x=one_of(just(1), just(2), just(3), just(4)))
def test_four(x):
    pass


@given(x=one_of(just(1), just(2), just(3), just(4), just(5)))
def test_five(x):
    pass

Environment:

  • Python 3.7.3
  • Hypothesis 4.35.0
  • pytest 5.1.2
  • Linux 5.1.15 (Arch)

Running this:

pytest example.py --hypothesis-show-statistics

produces the following statistics report:

================================ Hypothesis Statistics ================================

example.py::test_two:

  - 2 passing examples, 0 failing examples, 0 invalid examples
  - Typical runtimes: < 1ms
  - Fraction of time spent in data generation: ~ 34%
  - Stopped because nothing left to do

example.py::test_three:

  - 63 passing examples, 0 failing examples, 1 invalid examples
  - Typical runtimes: < 1ms
  - Fraction of time spent in data generation: ~ 61%
  - Stopped because nothing left to do

example.py::test_four:

  - 4 passing examples, 0 failing examples, 0 invalid examples
  - Typical runtimes: < 1ms
  - Fraction of time spent in data generation: ~ 36%
  - Stopped because nothing left to do

example.py::test_five:

  - 100 passing examples, 0 failing examples, 0 invalid examples
  - Typical runtimes: < 1ms
  - Fraction of time spent in data generation: ~ 50%
  - Stopped because settings.max_examples=100

I'd expect that if there are 2 st.just strategies in st.one_of, then there will be 2 examples; if 3 st.just, then 3 examples; and so on. The actual results differ from that expectation:

  • for 3 arguments it produces 64 examples
  • for 5 arguments, even after adding @settings(max_examples=100000), it ran all 100K examples and stopped only because of the max_examples option

Strategies other than just also trigger the issue, though I haven't found any pattern yet (a standalone repro of the first case follows the list):

  • one_of(none(), none(), booleans()) - 85 examples
  • one_of(none(), booleans(), booleans()) - 3 examples (as expected)
  • one_of(booleans(), booleans(), booleans()) - 2 examples (as expected)
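
For reference, the first case can be reproduced in isolation like this (run with --hypothesis-show-statistics as above; the name test_none_none_bool is just for illustration, and the exact count of 85 may differ between versions):

from hypothesis import given
from hypothesis.strategies import booleans, none, one_of


# Expected 3 examples (None, True, False), but observed 85 passing examples.
@given(x=one_of(none(), none(), booleans()))
def test_none_none_bool(x):
    pass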

This doesn't seem like expected behavior to me. Am I missing something? Is there a way to define such strategies differently so that they produce the expected number of examples?

Thank you for the wonderful library, and I'm looking forward to your feedback :)

@Zac-HD added the "question not sure it's a bug? questions welcome" label Sep 8, 2019

@Zac-HD
Member

Zac-HD commented Sep 8, 2019

Hey @Stranger6667! Nice writeup, and thanks for asking. I always enjoy hearing from people who are using Hypothesis (and even my hypothesis-jsonschema project 😍)!

The short answer is that while we do our best to detect redundant arguments and stop when we've exhausted all possible inputs, designing strategies to shrink well means there is often some redundancy in how choices are represented.

Specifically, choices between power-of-two numbers of options can be made by drawing a certain number of bits (non-redundant), but other numbers can't. Hence the odd cases for three and five alternatives! This is a known perf issue we'd like to fix, but it's fairly low priority as it doesn't make much difference in any common use-case. See #1864 for more details!
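
To make the power-of-two point concrete, here is an illustrative sketch - not Hypothesis's actual engine code, and the modulo mapping is just one way the surplus bitstrings could be handled - of why a three-way choice drawn from two bits is redundant:

import math


def choose(n_options, bits):
    """Map a drawn bitstring (as an int) onto one of n_options choices."""
    n_bits = max(1, math.ceil(math.log2(n_options)))
    assert 0 <= bits < 2 ** n_bits
    return bits % n_options  # distinct bitstrings can collide on one option


# For 3 options we must draw 2 bits, so 4 bitstrings cover 3 options:
print([choose(3, b) for b in range(4)])  # [0, 1, 2, 0] - option 0 twice
# For 4 options, 2 bits map one-to-one, so exhaustion is easy to detect:
print([choose(4, b) for b in range(4)])  # [0, 1, 2, 3]

Because two distinct bitstrings land on the same option in the three-way case, the engine cannot simply count drawn bitstrings to know when every distinct input has been tried.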

For the latter examples, one_of is de-duplicating its inputs internally - the last case collapses to just booleans(). The first two should be equivalent, and I'd expect the right behaviour there, since "None or bool? If bool, True or False?" should draw either one or two bits. That one_of(none(), none(), booleans()) doesn't is probably something we can fix!
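
For the curious, here is a minimal sketch of that one-or-two-bit decision tree - again illustrative rather than Hypothesis internals:

from itertools import product


def draw_none_or_bool(draw_bit):
    """Draw None, False, or True using one or two bits, with no redundancy."""
    if draw_bit() == 0:      # first bit: None vs. bool
        return None
    return draw_bit() == 1   # second bit, drawn only if needed: False vs. True


# Enumerating every possible bitstream yields exactly three distinct outcomes:
outcomes = set()
for bits in product([0, 1], repeat=2):
    stream = iter(bits)
    outcomes.add(draw_none_or_bool(lambda: next(stream)))
print(outcomes)  # the three values None, False, True (set order may vary)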
