
CHANGE: Performance Optimizations for IteratorRandom::choose() and related methods #1266

Open
wainwrightmark opened this issue Nov 16, 2022 · 7 comments

Comments

@wainwrightmark
Contributor

Summary

IteratorRandom::choose() and related methods call gen_range() much more frequently than strictly necessary when choosing from iterators without size hints.
By reducing the number of these calls, the relevant benchmarks can be sped up by about 1.5-2x.

Specifically:

# Using Pcg32
test seq_iter_unhinted_choose_from_1000      ... bench:       4,049 ns/iter (+/- 377) #old
test seq_iter_unhinted_choose_from_1000      ... bench:       2,730 ns/iter (+/- 335) #new

# Using ChaChaRng
test seq_iter_unhinted_choose_from_1000      ... bench:       6,831 ns/iter (+/- 317) #old
test seq_iter_unhinted_choose_from_1000      ... bench:       3,447 ns/iter (+/- 627) #new

This would not require a breaking change to the API or violate any explicit promises. However, if this change is made, choose() will return different items for a given seed and the Rng will be left in a different state afterwards, which may affect some users.

Details

The IteratorRandom::choose() and related methods would have to change.
The way they work now (for iterators without size hints) is: for every item, generate a random number in the range 0..n, where n is the number of items seen so far. If the generated number is zero (a 1/n chance), the current result is replaced with the new item.
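For reference, that per-item scheme is essentially reservoir sampling with a sample size of one. A minimal self-contained sketch, using a toy LCG in place of a real rand `Rng` (all names here are illustrative, not the actual rand internals):

```rust
// Toy LCG standing in for a real RNG; `gen_range` here is a plain
// modulo reduction (biased, but adequate for a sketch).
struct Lcg(u64);
impl Lcg {
    fn next_u64(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0
    }
    fn gen_range(&mut self, n: u64) -> u64 {
        self.next_u64() % n
    }
}

// Current scheme: one RNG call per item. Item n replaces the held
// result with probability 1/n, so every item is equally likely overall.
fn choose_unhinted<I: Iterator>(rng: &mut Lcg, iter: I) -> Option<I::Item> {
    let mut result = None;
    let mut n = 0u64;
    for item in iter {
        n += 1;
        if rng.gen_range(n) == 0 {
            result = Some(item);
        }
    }
    result
}

fn main() {
    let mut rng = Lcg(42);
    let chosen = choose_unhinted(&mut rng, 0..1000u64);
    assert!(matches!(chosen, Some(x) if x < 1000));
    assert!(choose_unhinted(&mut rng, 0..0u64).is_none());
}
```

The cost is one `gen_range` call per item, which is exactly what the proposal below reduces.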

This means that for the first twenty items, twenty random numbers are generated. My suggestion is to instead generate a single number in the range 0..20! (20 factorial being the largest factorial less than u64::MAX) and use it in place of those first twenty random numbers.

A number in that range has a 1 in 2 chance of being 0 mod 2; after division by 2, the quotient has a 1 in 3 chance of being 0 mod 3, and so on up to 20.
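That peel-off step can be checked exhaustively on a small product range. For a value uniform over 0..6 (= 2 × 3), the mod-2 test and the post-division mod-3 test are each exact, and the six values cover all six (remainder, quotient) combinations, so the two tests are independent:

```rust
// Exhaustively verify the peel-off claim on the range 0..6 = 2 * 3:
// v % 2 is an exact 1-in-2 test, and after dividing v by 2 the
// quotient is uniform over 0..3, giving an exact 1-in-3 test.
fn main() {
    let mut zeros_mod2 = 0;
    let mut zeros_mod3 = 0;
    for v in 0..6u64 {
        if v % 2 == 0 {
            zeros_mod2 += 1;
        }
        if (v / 2) % 3 == 0 {
            zeros_mod3 += 1;
        }
    }
    assert_eq!(zeros_mod2, 3); // exactly half of the 6 values
    assert_eq!(zeros_mod3, 2); // exactly a third of the 6 values
}
```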

After the first twenty items, the next batch has to use 33!/20! (the product 21 × 22 × ⋯ × 33, the largest such product below u64::MAX), which only covers thirteen items, and the efficiency gradually decreases with larger n.
This sounds like a lot of work, but it is still much cheaper than generating fresh random numbers, especially with cryptographically secure RNGs.
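Putting those pieces together, here is a hedged, self-contained sketch of the batched approach (again with a toy LCG in place of a real rand `Rng`; all names are illustrative, not the actual PR code):

```rust
// Toy LCG standing in for a real RNG (modulo reduction is biased,
// but adequate for a sketch).
struct Lcg(u64);
impl Lcg {
    fn next_u64(&mut self) -> u64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0
    }
    fn gen_range(&mut self, n: u64) -> u64 {
        self.next_u64() % n
    }
}

// Batched variant: one draw covers as many consecutive 1-in-n tests
// as fit in a u64. For items 2..=20 a single draw spans 0..20!, the
// next refill spans 21 * 22 * ... * 33 (= 33!/20!), and so on.
fn choose_batched<I: Iterator>(rng: &mut Lcg, iter: I) -> Option<I::Item> {
    let mut result = None;
    let mut n: u64 = 0;     // items seen so far
    let mut v: u64 = 0;     // leftover randomness from the current batch
    let mut cover: u64 = 1; // v is uniform over 0..cover
    for item in iter {
        n += 1;
        if cover < n {
            // Refill: the largest product n * (n+1) * ... that fits in u64.
            cover = 1;
            let mut m = n;
            while let Some(next) = cover.checked_mul(m) {
                cover = next;
                m += 1;
            }
            v = rng.gen_range(cover);
        }
        // 1-in-n test; the division keeps v uniform over 0..cover/n.
        if v % n == 0 {
            result = Some(item);
        }
        v /= n;
        cover /= n;
    }
    result
}

fn main() {
    let mut rng = Lcg(7);
    let chosen = choose_batched(&mut rng, 0..1000u64);
    assert!(matches!(chosen, Some(x) if x < 1000));
    assert!(choose_batched(&mut rng, 0..0u64).is_none());
}
```

Since `cover` is always a product of the consecutive moduli still to be consumed, each division is exact, and each item is still kept with probability exactly 1/n under a uniform draw.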

There is no measurable performance change when using choose() with a fixed slice iterator (the code for that part is unchanged).
When using an iterator with a size hint but not a fixed size, performance seems to have degraded very slightly. It would be possible to apply this optimization to that case as well though.

I believe the results of the functions would be consistent between 32 and 64 bit architectures (I am generating u64, not usize). I don't have 32 bit hardware to test on so I can't comment on the performance differences.

The following methods would also probably benefit from similar optimizations:

IteratorRandom::choose_stable()
IteratorRandom::choose_multiple_fill()
IteratorRandom::choose_multiple()
SliceRandom::shuffle()
SliceRandom::partial_shuffle()

Motivation

Performance.

Alternatives

The code could be left as it is, and performance-hungry users could use my crate, which has a fast version of the choose function (I made the crate just to have a random_max function but got carried away optimizing it).

If this is considered worth doing, I'm happy to submit a PR. I have written most of the code already for my own crate.

Have a lovely day, Mark

@dhardy
Member

dhardy commented Nov 17, 2022

I like the idea, Mark. This would be a value-breaking change (acceptable at this time).

Could you run a couple more benchmarks, including for very short iterators and multiple RNGs? If they look reasonable (not too much penalty with very short iterators) then I'd like to see a PR.

@wainwrightmark
Contributor Author

I made a PR. I actually found an even faster way of doing it, that uses the minimum possible number of gens (average of 2 per iterator item). That method won't work for shuffle() and partial_shuffle() though and I'm unsure which is better for choose_multiple() so I might make a separate PR for those.

@TheIronBorn
Collaborator

choose_max from your crate sounds useful as well. I've written my own simple versions multiple times, but it'd be nice to have a robust one.

@wainwrightmark
Contributor Author

> choose_max from your crate sounds useful as well. I've written my own simple versions multiple times but it'd be nice to have a robust one

Good idea. Will submit a PR at some point.

@dhardy
Member

dhardy commented Jan 5, 2023

I was going to close this now that #1268 is merged, but we should answer this first:

> choose_max from your crate sounds useful as well. I've written my own simple versions multiple times but it'd be nice to have a robust one

But is this useful enough to include in rand when it already exists in the kindness crate? Rand is already quite big. On the other hand, it's not much new code. Arguments for inclusion, please; otherwise it's a no.

@TheIronBorn
Collaborator

It's quite common in machine learning and mathematical optimization, at least. Before kindness I was writing my own, but it wasn't as robust.

@wainwrightmark
Contributor Author

Probably not a huge deal but the coin_flipper code which would be used by choose_max and friends is private (as it should be) so it has to be duplicated in the kindness and rand crates. All the methods are #[inline] though so this probably won't have binary size implications.
