Overhauling the st.floats() internals #2907

Closed
Zac-HD opened this issue Mar 15, 2021 · 4 comments · Fixed by #3327
Labels
internals Stuff that only Hypothesis devs should ever see

Comments

@Zac-HD
Member

Zac-HD commented Mar 15, 2021

Floating-point numbers are a fundamental datatype, but our backend for st.floats() could be better. As per #1704, bounded floats have some weird distributional problems, and much worse shrinking behaviour than unbounded floats. On a related-ish note, the helper functions in hypothesis.internal.conjecture.floats are not width-aware, which makes 32-bit and 16-bit float generation much less efficient than it could be.

Fortunately, @rsokl and I are pretty sure that we can exploit a combination of rejection sampling, our existing custom bitwise float encoding, and bitmasks to solve both of these problems - and as a side effect, we'll have a single #2878-style FloatsStrategy which can grow #2701 filter rewriting in a future PR. The implementation trick:

  1. Draw a sign bit, then 64 bits as an integer using our usual custom encoding.
  2. Convert it to a float; if it is non-finite, either return it (if allowed) or discard it and go to step 1.
  3. If the finite float is disallowed, use a bitmask to set any forced bits of the exponent and mantissa to zero or one.
  4. If this rewritten value is still disallowed, discard it and go to step 1; otherwise discard the original draw, write the integer encoding of the allowed value so we'll get lucky next time, and return it. (The engine will abort this loop if it gets too long.)

It's fiddly, but ensures that

  • all floats shrink the same way - simplifying the fractional part, and then as if for integers
  • varying the width (e.g. NumPy dtype) doesn't invalidate the rest of the test, as would happen if we only drew 32 bits for 32-bit floats
  • there's no change to unbounded 64-bit floats, which work just fine at the moment
  • we dodge nasty precision issues with adding offsets to floating-point ranges
  • generating narrow floats involves no rejection sampling (unless bounded by non-bit-level bounds)
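
For illustration, the four-step loop above might look roughly like the sketch below. This is not the Hypothesis implementation: it swaps our custom lexicographic float encoding for the plain IEEE 754 bit pattern, assumes strictly positive finite bounds, skips the "write the encoding back" part of step 4, and every name in it (`forced_bit_masks`, `draw_bounded_float`, `max_attempts`) is hypothetical.

```python
import math
import random
import struct

ALL_BITS = (1 << 64) - 1


def bits_to_float(bits):
    """Reinterpret a 64-bit integer as an IEEE 754 double."""
    return struct.unpack("!d", struct.pack("!Q", bits))[0]


def float_to_bits(value):
    """Reinterpret an IEEE 754 double as a 64-bit integer."""
    return struct.unpack("!Q", struct.pack("!d", value))[0]


def forced_bit_masks(lo, hi):
    """Return (mask, forced): the bits shared by every double in [lo, hi].

    Assumption: 0.0 < lo <= hi < inf, so bit patterns sort like the floats.
    """
    assert 0.0 < lo <= hi < math.inf
    differing = float_to_bits(lo) ^ float_to_bits(hi)
    free = (1 << differing.bit_length()) - 1  # low bits that may vary
    mask = ALL_BITS & ~free
    return mask, float_to_bits(lo) & mask


def draw_bounded_float(rng, lo, hi, allow_nan=False, max_attempts=1000):
    mask, forced = forced_bit_masks(lo, hi)
    for _ in range(max_attempts):
        # Step 1: draw a sign bit, then the remaining bits as an integer.
        bits = (rng.getrandbits(1) << 63) | rng.getrandbits(63)
        value = bits_to_float(bits)
        # Step 2: non-finite values are returned if allowed, else discarded.
        if not math.isfinite(value):
            if math.isnan(value) and allow_nan:
                return value
            continue
        if lo <= value <= hi:
            return value
        # Step 3: overwrite the forced bits (here this includes the sign bit).
        rewritten = bits_to_float((bits & ~mask) | forced)
        # Step 4: keep the rewritten value only if it is now allowed.
        if lo <= rewritten <= hi:
            return rewritten
    # Stand-in for the engine aborting a loop that runs too long.
    raise RuntimeError("too many rejected draws")
```

With bounds such as 1.0 and 1.5, the sign and all eleven exponent bits are forced, so roughly half of the rewritten draws land in range. Bounds that span many exponents force fewer bits and reject more often.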
@honno

This comment has been minimized.

@Zac-HD
Member Author

Zac-HD commented Nov 17, 2021

If you can reproduce, that sounds like a bug that we'll want to patch on a way shorter timeframe than overhauling the whole floats() internals, so a new issue would be 👍

@Zac-HD
Member Author

Zac-HD commented Jan 7, 2022

We'll also want to support non-IEEE float types like bfloat32 and bfloat16 at some point, likely after this overhaul.

@Zac-HD
Member Author

Zac-HD commented May 12, 2022

This is mostly solved by #3327, with only "mask out bits that can't be set in narrower float types" remaining (plus the careful handling required for non-finite numbers under this plan).
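
As a rough illustration of what that remaining masking step could mean for 32-bit floats: a double has 52 mantissa bits and a float32 has 23, so the low 29 mantissa bits of a double can never be set in a value that is exactly representable as a float32. The sketch below uses only the standard library; the helper names are hypothetical, it handles the mantissa only, and values whose exponent falls outside the normal float32 range would still need separate handling.

```python
import struct

# A double has 52 mantissa bits, a float32 has 23: the low 29 must be zero.
FLOAT32_UNUSED_MANTISSA = (1 << 29) - 1


def mask_to_float32_mantissa(value):
    """Clear the mantissa bits of a double that a float32 cannot store."""
    bits = struct.unpack("!Q", struct.pack("!d", value))[0]
    bits &= ~FLOAT32_UNUSED_MANTISSA
    return struct.unpack("!d", struct.pack("!Q", bits))[0]


def fits_float32(value):
    """True if value survives a round trip through the 32-bit format."""
    try:
        return struct.unpack("!f", struct.pack("!f", value))[0] == value
    except OverflowError:  # magnitude too large for a float32
        return False
```

For example, `0.1` as a double does not fit in a float32, but `mask_to_float32_mantissa(0.1)` does, at the cost of moving the value slightly toward zero.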
