Improve arrays() performance with unique-sampled-list tricks #3066

Closed
Zac-HD opened this issue Aug 24, 2021 · 0 comments · Fixed by #3076
Zac-HD commented Aug 24, 2021

UniqueSampledListStrategy performs significantly better than the approach arrays() currently takes when choosing the indices at which to insert elements. We can and should reuse that approach for index selection, and apply similar tricks when selecting unique elements for eight-bit dtypes, where collisions are likely. This also implies that we should usually choose an element first and then its index, to avoid wasting entropy if the element is rejected.

Note that the same considerations apply to #3065 as well as hypothesis.extra.numpy; the relevant code is identical.
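
To make the idea concrete, here is a minimal sketch in plain Python of sampling without replacement by swapping the chosen item to the end of a pool and popping it, so no draw is ever rejected as a duplicate. It is an illustration only, not the Hypothesis implementation: the helper names `draw_without_replacement` and `sparse_unique_fill` are hypothetical, and `random.Random` stands in for Hypothesis's internal data source.

```python
import random


def draw_without_replacement(pool, rng):
    """Remove and return a uniformly chosen item from ``pool``.

    The chosen item is swapped to the end of the list and popped, which
    takes constant time and can never yield a duplicate.
    """
    j = rng.randrange(len(pool))
    pool[j], pool[-1] = pool[-1], pool[j]
    return pool.pop()


def sparse_unique_fill(size, n_values, count, rng, fill=None):
    """Place ``count`` distinct values from range(n_values) at distinct indices.

    Hypothetical helper: ``n_values`` would be 256 for an eight-bit dtype,
    where drawing elements independently would collide often.
    """
    if count > min(size, n_values):
        raise ValueError("not enough distinct values or indices")
    values = list(range(n_values))
    indices = list(range(size))
    result = [fill] * size
    for _ in range(count):
        # Element first, then index: if the element were rejected (say by
        # a filter, not modelled here), no index draw would be wasted.
        value = draw_without_replacement(values, rng)
        index = draw_without_replacement(indices, rng)
        result[index] = value
    return result


if __name__ == "__main__":
    rng = random.Random(0)
    out = sparse_unique_fill(size=1000, n_values=256, count=200, rng=rng)
    placed = [v for v in out if v is not None]
    print(len(placed), len(set(placed)))
```

Because both pools shrink as items are drawn, the loop runs exactly `count` times regardless of how small the value domain is, whereas draw-and-reject slows down as the set of used values approaches the size of the domain.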

Zac-HD added the performance ("go faster! use less memory!") and internals ("Stuff that only Hypothesis devs should ever see") labels Aug 24, 2021
Zac-HD self-assigned this Aug 26, 2021