Improve `arrays()` performance with unique-sampled-list tricks #3066

Zac-HD · 2021-08-24T14:25:32Z

`UniqueSampledListStrategy` performs significantly better than the approach `arrays()` currently takes when choosing the indices at which to insert elements. We can and should leverage that, and use similar tricks to select unique elements for eight-bit dtypes, where collisions are likely. This also implies that we should usually choose an element first and then the index, so that no entropy is wasted on an index if the element is rejected.

Note that the same considerations apply to #3065 as well as hypothesis.extra.numpy; the relevant code is identical.
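To make the two ideas above concrete, here is a minimal pure-Python sketch. It is not Hypothesis's implementation: `sample_unique`, `fill_unique`, `draw_element`, and `max_attempts` are hypothetical names invented for illustration, and `random.Random` stands in for Hypothesis's internal data source. The first function draws distinct items by swap-and-pop, so no draw is ever rejected. The second draws the element first and picks an index only once the element is known to be new.

```python
import random


def sample_unique(pool, k, rng):
    """Draw up to k distinct items from pool without rejection sampling.

    Each step swaps a randomly chosen item to the end and pops it,
    so every random draw yields a usable, not-yet-seen item.
    """
    remaining = list(pool)
    result = []
    for _ in range(min(k, len(remaining))):
        last = len(remaining) - 1
        j = rng.randint(0, last)  # inclusive on both ends
        remaining[last], remaining[j] = remaining[j], remaining[last]
        result.append(remaining.pop())
    return result


def fill_unique(size, n_fill, draw_element, rng, max_attempts=1000):
    """Place up to n_fill distinct values at distinct indices in range(size).

    The element is drawn first. An index is only drawn after the element
    passes the uniqueness check, so a rejected element costs one draw
    rather than two.
    """
    free = list(range(size))  # indices not yet filled
    seen = set()              # values already placed
    placed = {}               # index -> value
    attempts = 0
    while free and len(placed) < n_fill and attempts < max_attempts:
        attempts += 1
        value = draw_element(rng)
        if value in seen:
            continue  # rejected before any index was chosen
        seen.add(value)
        last = len(free) - 1
        j = rng.randint(0, last)
        free[last], free[j] = free[j], free[last]
        placed[free.pop()] = value
    return placed


rng = random.Random(0)
picked = sample_unique(range(256), 256, rng)
filled = fill_unique(10, 5, lambda r: r.randint(0, 255), rng)
```

For a domain of at most 256 values, such as an eight-bit dtype, `sample_unique` can enumerate the whole pool up front, which is why the swap-and-pop approach suits the high-collision case.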


Zac-HD added the `performance` (go faster! use less memory!) and `internals` (Stuff that only Hypothesis devs should ever see) labels on Aug 24, 2021

Zac-HD mentioned this issue Aug 26, 2021

Array API extra #3065

Merged

Zac-HD self-assigned this Aug 26, 2021

Zac-HD mentioned this issue Aug 29, 2021

Faster strategy for arrays(..., unique=True) #3076

Merged

Zac-HD closed this as completed in #3076 Aug 31, 2021
