Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Boundary of search space on pairs of floats is reliably limited #2671

Closed
rsokl opened this issue Nov 18, 2020 · 10 comments
Closed

Boundary of search space on pairs of floats is reliably limited #2671

rsokl opened this issue Nov 18, 2020 · 10 comments
Labels
enhancement it's not broken, but we want it to be better internals Stuff that only Hypothesis devs should ever see

Comments

@rsokl
Copy link
Contributor

rsokl commented Nov 18, 2020

Generating a pair of floats with @given(x=st.floats(x_min, x_max), y=st.floats(y_min, y_max)) fails to ever generate cases along the "vertical strip" (x_min, y), other than the point (x_min, y_min).

This is not true for the transpose – the "horizontal strip" (x, y_min) is explored in a robust way.

The following tests demonstrate this:

import hypothesis.strategies as st
from hypothesis import given

@given(x=st.floats(0, 10), y=st.floats(0, 10))
def test_explore_xmin_y(x, y):
    # always passes, even if `max_examples` is cranked up
    if x == 0:
        assert y == 0


@given(x=st.floats(0, 10), y=st.floats(0, 10))
def test_explore_x_ymin(x, y):
    # reliably fails
    if y == 0:
        assert x == 0

I found this because I noticed that a test that I had written should have been failing along (x_min, y) other than at (x_min, y_min), but it never did. I was surprised that simply swapping the order of my parameters got the test to fail.

@Zac-HD Zac-HD added enhancement it's not broken, but we want it to be better internals Stuff that only Hypothesis devs should ever see labels Nov 18, 2020
@Zac-HD
Copy link
Member

Zac-HD commented Nov 18, 2020

I think this might be related to our zero-blocks heuristics? @DRMacIver or @Zalathar can probably say more.

@rsokl
Copy link
Contributor Author

rsokl commented Nov 18, 2020

To summarize this visually, the following plots 10,000 ceil'd (x, y) values generated by x=st.floats(0, 10), y=st.floats(0, 10) (on a log(cnt + 1) scale)

image

@Zalathar
Copy link
Contributor

I had a brief look into this, and the impression I have so far is that our strategy for bounded floats doesn't have any explicit bias towards zeros (unlike the unbounded-floats strategy).

And since Conjecture currently doesn't go out of its way to generate low-level zeros (IIRC), the end result is that zeros end up being astronomically rare outside of the guaranteed all-zeros example.

(I haven't yet investigated what's going on behind the skew towards one axis working and the other axis not working.)

@Zac-HD
Copy link
Member

Zac-HD commented Nov 22, 2020

Hmm, probably related to #1704 then - in that we may want to reengineer the bounded floats strategy.

@rsokl
Copy link
Contributor Author

rsokl commented Nov 23, 2020

And since Conjecture currently doesn't go out of its way to generate low-level zeros (IIRC), the end result is that zeros end up being astronomically rare outside of the guaranteed all-zeros example.

@Zalathar I don't quite understand how this would explain the asymmetry that we see. i.e. Why would we see zeroes along the "vertical" strip but not the horizontal?

Sorry, just saw your statement in parenthesis

@Zalathar
Copy link
Contributor

OK, I believe the observed asymmetry happens as a side-effect of #2523.

After generating a novel prefix, the engine first tries zero-extending it, to get an estimate of how long a “short” example should be. In the case of two bounded-float arguments, this process is pretty much guaranteed to generate a non-minimal value for the first, and a minimal value for the second.

This matches my observation that the (correctly) reliably-failing test would always fail on its second example (i.e. the first one after the all-zeros example).

(Aside: I managed to greatly confuse myself by assuming that Hypothesis generates argument values in left-to-right program order, which turns out to not be the case in general. Something to keep an eye on when doing this sort of low-level detective work.)

@Zac-HD
Copy link
Member

Zac-HD commented Mar 13, 2021

I think a good general solution to this would be to add another mutation pass. Currently we just try to duplicate segments (causing the bright diagonal line in @rsokl's figure above), but shuffling equivalent segments would also make sense - albeit probably with different heuristics about when to stop.

@Zalathar
Copy link
Contributor

There were two things that I had vaguely intended to do about this, but never ended up prioritizing:

  • Re-engineer the bounded-floats strategy to be more similar to the unbounded-floats strategy (especially for large ranges), so there is less risk of unintuitive behaviour when switching between the two.
  • Reintroduce some of the fancier mutator features or biased-random features (e.g. deliberately generating zeros with higher probability) that we used to have.

For the latter, my understanding is that David mostly removed them in the spirit of “let's rip out some annoying complexity and see if we get away with it”. So if this issue constitutes evidence that we didn't actually get away with it, then it would make sense to thoughtfully bring some of them back.

@Zac-HD
Copy link
Member

Zac-HD commented Mar 13, 2021

Sounds good!

Re-engineer the bounded-floats strategy to be more similar to the unbounded-floats strategy (especially for large ranges), so there is less risk of unintuitive behaviour when switching between the two.

I'm actually planning out a #2878-style merging of our bounded and unbounded floating-point strategies, with the intent of improving the distribution and shrinking behaviour - especially for 32bit and 16bit floats where rejection sampling is hilariously inefficient.

Reintroduce some of the fancier mutator features or biased-random features (e.g. deliberately generating zeros with higher probability) that we used to have.

I'm interested in building mutators which work for very long runs (~millions of test cases) without e.g. blowing through memory limits, to use in HypoFuzz, including many ideas which don't 'pay for their overhead' in Hypothesis' typically short runs. Agreed that the biased-random tricks might be worth bringing back though.

@Zac-HD
Copy link
Member

Zac-HD commented Jun 20, 2022

Fixed by #3327 - we now add the min, next_up(min), min+1, and symetrically for max to our "interesting" floats which are generated more often.

@Zac-HD Zac-HD closed this as completed Jun 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement it's not broken, but we want it to be better internals Stuff that only Hypothesis devs should ever see
Projects
None yet
Development

No branches or pull requests

3 participants