Future of SmallRng #767

vks · 2019-04-08T08:12:05Z

SmallRng is an RNG unsuitable for applications where unpredictable randomness is required, making it a poor default choice. There are currently two reasons to choose it over StdRng:

Better performance. I think this advantage can be rendered obsolete by using a state-of-the-art SIMD implementation of ChaCha, which has comparable performance (Would this repo be open to a SIMD implementation of ChaCha20? #667).
Smaller state size. However, ChaCha is only 8.5 times larger than StdRng, making it a suitable choice even for embedded applications. Remaining applications are massively parallel calculations, where the small state size might still be preferable, but this seems like a very niche use case that could be supported by a different crate.

Once we use a good SIMD implementation of ChaCha, there are few use cases remaining where SmallRng would be preferable. I think we could remove it, which has the following advantages:

Less API and code in rand.
Only secure RNG options in rand, requiring a more explicit opt-in to use a predictable RNG (by using an additional crate).

Note that the current implementation of StdRng is significantly slower and larger than SmallRng, so I don't think we should deprecate SmallRng just yet.

The text was updated successfully, but these errors were encountered:

dhardy · 2019-04-08T09:09:39Z

Well reasoned. As I understand, micro-benchmarks of RNGs (especially of simple ones like PCG) are not all that representative of real-world usage. Additionally, my understanding is that SIMD instructions can have varying performance depending on CPU, power provision, thermal limits and use of hyperthreading, so a SIMD CSPRNG which can win micro-benchmarks on modern x86_64 CPUs does not make this the fastest option for everyone.

All the same, I agree with the aim. Many users can switch to StdRng without issue; other users can easily switch to rand_pcg or another crate, getting more control over which PRNG they use.

vks · 2019-04-08T10:18:37Z

Benchmarks can be found here: https://bench.cr.yp.to/results-stream.html

Notably, benchmarks for generating 8 bytes are included, where SmallRng likely has an edge. (Note that our current choice, HC128, is usually the slowest (behind HC256) for generating 8 bytes.)

dhardy · 2019-04-08T11:23:11Z

Encrypting an 8-byte message is quite a different application to, say, generating a boolean with f64 probability. Part of the attraction of simple RNG algorithms is that the optimiser may be able to inline the RNG code within its application, whereas block RNGs must generate a whole block in one step. So SIMD ChaCha20 doing well in one benchmark isn't sufficient evidence to recommend it for everything (even then, it appears 20-100 times slower for 8-byte messages than for long messages).

Not really sure what your point is though, since we seem to agree on what we should do? (The first step being to optimise our ChaCha impl.)

vks · 2019-04-08T12:46:51Z

even then, it appears 20-100 times slower for 8-byte messages than for long messages

We are always generating a 64 byte block, where the ChaCha slow-down is less than 16 times, according to the benchmarks.

Not really sure what your point is though, since we seem to agree on what we should do?

I guess the point is that we have to be careful about deprecating SmallRng with regard to next_u64 performance. We should probably advertise fill more and make it work with f64 and f32.

dhardy · 2019-04-08T14:04:24Z

IIRC our advice on performance has always been roughly as follows: StdRng and thread_rng should be fast enough for most purposes; if their performance is not satisfactory then benchmark multiple PRNGs in your application. The point being, I don't think we should worry too much about speed when deprecating SmallRng.

Is your last point about SIMD float operations? #269 is probably relevant here, but even more so the SIMD implementations for Uniform. Potentially other distributions could get SIMD implementations too (but most are currently limited to f64 output only).

However, SIMD float operations are only useful operations for applications which can consume them; e.g. agent based sims & games would need agents to be batched and avoid most branching logic. This kind of optimisation wouldn't help everyone.

vks · 2019-04-10T12:17:25Z

There is another permutation that was designed to be smaller and faster for short messages than ChaCha: https://gimli.cr.yp.to

It uses 384 bytes of state, so it is only 3 times larger than SmallRng.

dhardy · 2019-06-04T07:57:31Z

#792 introduced a feature-gate for SmallRng. I think we can now close this issue?

vks · 2019-06-04T08:56:19Z

Yes. Based on how much the feature is used, we can decide whether to keep it.

vks added B-API Breakage: API T-RNG Topic: generators labels Apr 8, 2019

vks closed this as completed Jun 4, 2019

tkaitchuck mentioned this issue Jan 11, 2021

Evaluate Mwc256XXA64 rust-random/rngs#18

Closed

roderickvd mentioned this issue Oct 18, 2021

Improve dithering CPU usage librespot-org/librespot#866

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Future of SmallRng #767

Future of SmallRng #767

vks commented Apr 8, 2019 •

edited

dhardy commented Apr 8, 2019

vks commented Apr 8, 2019

dhardy commented Apr 8, 2019

vks commented Apr 8, 2019

dhardy commented Apr 8, 2019

vks commented Apr 10, 2019

dhardy commented Jun 4, 2019

vks commented Jun 4, 2019

Future of SmallRng #767

Future of SmallRng #767

Comments

vks commented Apr 8, 2019 • edited

dhardy commented Apr 8, 2019

vks commented Apr 8, 2019

dhardy commented Apr 8, 2019

vks commented Apr 8, 2019

dhardy commented Apr 8, 2019

vks commented Apr 10, 2019

dhardy commented Jun 4, 2019

vks commented Jun 4, 2019

vks commented Apr 8, 2019 •

edited