SmallRng: Replace PCG algorithm with xoshiro{128,256}++ #1038

vks · 2020-09-06T22:00:48Z

Due to close correlations of PCG streams (#907) and lack of right-state propagation (#905), the SmallRng algorithm is switched to xoshiro{128,256}++. The implementation is taken from the rand_xoshiro crate and slightly simplified.

Fixes #910.

src/rngs/mod.rs

vks · 2020-09-08T11:14:50Z

I decided to use the default impl of seed_from_u64 instead of pulling in the splitmix RNG as well. This means that the results will be different than with rand_xoshiro. I think this is okay, since we don't guarantee value stability, but maybe we should avoid this gotcha.

dhardy

Looks good to me. I like the simplified implementations and agree with the decision not to use SplitMix for seed_from_u64 (IIRC the official advice is simply to use a significantly different PRNG, so our existing PCG32 impl should be fine).

src/rngs/small.rs

dhardy · 2020-09-08T13:48:38Z

src/rngs/xoshiro256plusplus.rs

+impl RngCore for Xoshiro256PlusPlus {
+    #[inline]
+    fn next_u32(&mut self) -> u32 {
+        self.next_u64() as u32
+    }


This takes the low-order bytes. @vigna is it preferable to take the high-order bytes given that we're discarding half anyway? Or is there no point caring for the ++ variant?

Well, I'd take the high bits (longer carry chain propagation, which means higher linear complexity), but, really, the difference in quality is not detectable.

Note that if you use PCG32 to initialize the state using a 64-bit seed, seeds with a large number of lower bits in common will initialize the state in a similar way (try with 0 and 2^63), and consequently the very first outputs will be correlated. The effect might, however, be very mild in practice.

You can avoid this problem by mixing the seed with a MurmurHash mix (or any other quick bijective mixing function) before using it as a seed for PCG32 (maybe you're already doing something like that, in which case forget this comment).

Of course, there will always be seeds leading to similar initialization if you use PCG32, but at least they will be random-looking.
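To make the 0 versus 2^63 example concrete, here is a minimal sketch of a PCG32-style seed expansion. It is not the actual `seed_from_u64` code: `pcg32_expand` and `rotated_distance` are hypothetical names, and the increment is an arbitrary odd constant chosen for illustration. Because the multiplier is odd, two seeds that differ only in the top bit keep LCG states that differ only in the top bit, so in this sketch each pair of output words is the same word rotated by 16 bits with a single bit flipped.

```rust
// Assumption: MUL is the commonly used 64-bit LCG multiplier; INC is an
// arbitrary odd increment picked for this sketch (any odd value behaves alike).
const MUL: u64 = 6364136223846793005;
const INC: u64 = 1442695040888963407;

/// Expand `seed` into `N` 32-bit words: advance the LCG, then apply the
/// PCG XSH-RR output function (xorshift, then a state-dependent rotation).
pub fn pcg32_expand<const N: usize>(seed: u64) -> [u32; N] {
    let mut state = seed;
    let mut out = [0u32; N];
    for word in out.iter_mut() {
        state = state.wrapping_mul(MUL).wrapping_add(INC);
        let xorshifted = (((state >> 18) ^ state) >> 27) as u32;
        let rot = (state >> 59) as u32;
        *word = xorshifted.rotate_right(rot);
    }
    out
}

/// Number of bits in which `b` differs from `a` rotated by 16 positions.
pub fn rotated_distance(a: u32, b: u32) -> u32 {
    (a.rotate_right(16) ^ b).count_ones()
}
```

Calling `pcg32_expand::<8>(0)` and `pcg32_expand::<8>(1 << 63)` and comparing the words pairwise with `rotated_distance` shows the repeated bit pattern directly.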

Taking the higher bits makes sense (this should also be done for the implementations in rand_xoshiro).
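For reference, a minimal sketch of the two truncation choices being discussed. The state update follows the published xoshiro256++ algorithm; the struct layout and the method names `next_u32_low` and `next_u32_high` are hypothetical and only serve to put the two variants side by side.

```rust
/// Sketch of xoshiro256++ with both ways of deriving a `u32` from a `u64`.
#[derive(Clone, Debug)]
pub struct Xoshiro256PlusPlus {
    pub s: [u64; 4],
}

impl Xoshiro256PlusPlus {
    /// One step of xoshiro256++: "++" scrambler, then the linear state update.
    pub fn next_u64(&mut self) -> u64 {
        let result = self.s[0]
            .wrapping_add(self.s[3])
            .rotate_left(23)
            .wrapping_add(self.s[0]);

        let t = self.s[1] << 17;
        self.s[2] ^= self.s[0];
        self.s[3] ^= self.s[1];
        self.s[1] ^= self.s[2];
        self.s[0] ^= self.s[3];
        self.s[2] ^= t;
        self.s[3] = self.s[3].rotate_left(45);

        result
    }

    /// Variant in the diff above: keep the low 32 bits.
    pub fn next_u32_low(&mut self) -> u32 {
        self.next_u64() as u32
    }

    /// Variant suggested in this thread: keep the high 32 bits.
    pub fn next_u32_high(&mut self) -> u32 {
        (self.next_u64() >> 32) as u32
    }
}
```

Both variants consume exactly one 64-bit output per call, so the choice only affects which half is discarded.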

@vigna Would switching to splitmix for 64-bit seeding also avoid this problem?

Yes, using SplitMix avoids visible dependence on the 64-bit seed.
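A minimal sketch of what SplitMix64-based seeding could look like, assuming the usual SplitMix64 constants. The helper `xoshiro256_state_from_u64` is a hypothetical name and not necessarily how this PR or `rand_xoshiro` wires it up.

```rust
/// SplitMix64: a Weyl sequence (add a fixed odd constant) followed by a
/// bijective 64-bit mixing function.
pub struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical helper: fill a four-word xoshiro256 state from a 64-bit seed.
pub fn xoshiro256_state_from_u64(seed: u64) -> [u64; 4] {
    let mut sm = SplitMix64::new(seed);
    // Array elements are evaluated left to right, so this takes four
    // consecutive SplitMix64 outputs.
    [sm.next_u64(), sm.next_u64(), sm.next_u64(), sm.next_u64()]
}
```

Since consecutive SplitMix64 states are distinct and the mix is a bijection, the four words can never all be zero, which is the one state xoshiro must avoid.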

If there's an issue with our seed_from_u64 method, I'd prefer we deal with that directly.

Oh I see. So the issue propagates to every generator using that method. I would have used something with less dependence on the initial value, but I agree that significant problems are unlikely to surface.

@dhardy

I'm tempted to ignore it: it's not very significant.

What is the argument for continuing to use PCG? Avoiding churn? I think it would be nice to avoid issues such as #905, even if it does not matter that much in this case.

We could just use a cryptographic hash function here; it's not performance critical, though we prefer to minimise code size.

Switching to splitmix is approximately the same number of lines and does not suffer from the correlation problem. As far as I can tell, the only downside is that this is a value-breaking change for everyone using seed_from_u64.

@vks #905 is not an "issue" so much as incorrect documentation. To quote @vigna:

Personally, I don't believe in the "middle ground" of "difficult-to-predict-but-not-crypto": usually it just means it is so easy to break that nobody tries. I mean, I can claim that xoshiro256++ is "difficult to predict", and maybe that's even true. But until someone tries for real, that's just a vague and unsubstantiated claim.

The main issue with PCG, as far as I am aware, is that streams tend to be quite similar; this is (part of) the issue in #1032, for example. This really doesn't matter for seed_from_u64 since it only uses a single stream.

If we really want a perfect initialiser we should use a proven cryptographic PRNG or hash function, but to some extent this is overkill considering we only start with (a maximum of) 64-bits of entropy. It seems to me it is "good enough" for intended uses. If @vigna strongly disagrees, I will listen to his advice (though what I understood from the above is that we should use a hash function such as murmurhash and a PRNG).

We also are trying to keep the code-size of rand_core minimal however. I'm not entirely convinced the split between rand and rand_core still makes sense, but changing that now is more churn than justified in a mostly-mature library.

Between SplitMix64 and PCG32, I remain unconvinced that either is significantly better than the other. PCG does more mixing overall (since it emits only half the state per round, and does both multiplication and addition vs just addition), while SplitMix may have a stronger output function. Either way, I think getting good avalanche from low-order bits is much more important than from high-order bits due to the way users are likely to use this, and in this case I think both PRNGs are acceptable. But if @vigna knows better, please do correct me.

#905 is not an "issue" so much as incorrect documentation.

I was referring to the issue with similar sequences for different seeds.

This really doesn't matter for seed_from_u64 since it only uses a single stream.

Yes, but a similar issue persists anyway: Different (but similar) seeds give very similar sequences, as discussed above.

I'm not entirely convinced the split between rand and rand_core still makes sense, but changing that now is more churn than justified in a mostly-mature library.

I agree. For Rand 1.0 I could imagine merging the two crates and making rand without default features equivalent to the old rand_core.

Between SplitMix64 and PCG32, I remain unconvinced that either is significantly better than the other. PCG does more mixing overall (since it emits only half the state per round, and does both multiplication and addition vs just addition), while SplitMix may have a stronger output function. Either way, I think getting good avalanche from low-order bits is much more important than from high-order bits due to the way users are likely to use this, and in this case I think both PRNGs are acceptable. But if @vigna knows better, please do correct me.

Alright, I will revert the splitmix commit. If necessary, we can perform this change in the future.

One of the issues with PCG is that some subsequences of the same generator are very correlated (similar bit patterns), as shown above. You're confusing this issue with the issue that sequences from different streams are similar. The first one is obviously proven by the example you're showing. This is the issue at hand.

The persistent bit patterns of PCG32 for correlated seeds cannot happen with SplitMix, so, yes, there is a significant and measurable difference. You can take two states of PCG32 and get highly correlated subsequences in which repeated bit patterns appear. You cannot do the same with SplitMix. You might be unconvinced, but it's mathematics. I'm not claiming that SplitMix will have uncorrelated subsequences, because the state mixing happens only leftward, as with PCG, but the problem is mitigated enormously by the powerful mix function.

If you're OK with repeated bit patterns in the initial state of your generators, that's another question—it's your library, if you want them, keep them.

vks · 2020-09-10T21:30:54Z

@dhardy I restored the PCG32-based implementation of seed_from_u64. Is this ok to merge?

dhardy · 2020-09-14T09:41:22Z

@vks thanks for the quick changes, but we should finish planning first. @vigna has a good argument that SplitMix is slightly better for seed_from_u64 and I cannot counter that, but...

Breaking changes should be minimised. This isn't applicable for SmallRng since it's changing anyway, but is for other RNGs.
There is now some vague possibility of a significant redesign which might impact what we ultimately wish to do here (after v0.8) by allowing the use of a stronger RNG/hash function.

Perhaps, then, the best option is to override seed_from_u64 to use SplitMix for the new SmallRng but not change anything else for now: that way, users of other RNGs are not impacted for now.
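The "override locally" idea can be sketched with a trait that has a default method. Everything here is hypothetical: `SeedFromU64`, `OtherRng` and `SmallRngSketch` are stand-ins rather than the real `rand_core` API, and the PCG32-style increment is an arbitrary odd constant.

```rust
/// Default expansion shared by all implementors (PCG32-style, two u32 per word).
pub fn default_expand(seed: u64) -> [u64; 4] {
    let mut state = seed;
    let mut next_u32 = || {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let xorshifted = (((state >> 18) ^ state) >> 27) as u32;
        xorshifted.rotate_right((state >> 59) as u32)
    };
    let mut words = [0u64; 4];
    for w in words.iter_mut() {
        let lo = next_u32() as u64;
        let hi = next_u32() as u64;
        *w = lo | (hi << 32);
    }
    words
}

/// SplitMix64-based expansion used only by the overriding type.
pub fn splitmix_expand(seed: u64) -> [u64; 4] {
    let mut state = seed;
    let mut words = [0u64; 4];
    for w in words.iter_mut() {
        state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        *w = z ^ (z >> 31);
    }
    words
}

/// Hypothetical stand-in for a seedable-RNG trait.
pub trait SeedFromU64: Sized {
    fn from_words(words: [u64; 4]) -> Self;

    /// Default path: unchanged for every RNG that does not override it.
    fn seed_from_u64(seed: u64) -> Self {
        Self::from_words(default_expand(seed))
    }
}

#[derive(Debug, PartialEq)]
pub struct OtherRng(pub [u64; 4]);

#[derive(Debug, PartialEq)]
pub struct SmallRngSketch(pub [u64; 4]);

impl SeedFromU64 for OtherRng {
    fn from_words(words: [u64; 4]) -> Self {
        OtherRng(words)
    }
}

impl SeedFromU64 for SmallRngSketch {
    fn from_words(words: [u64; 4]) -> Self {
        SmallRngSketch(words)
    }

    /// Local override: only this type switches to SplitMix64.
    fn seed_from_u64(seed: u64) -> Self {
        Self::from_words(splitmix_expand(seed))
    }
}
```

With this shape, `OtherRng::seed_from_u64` keeps producing the same values as before, while only `SmallRngSketch` changes, which is the trade-off proposed above.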

vigna · 2020-09-14T09:52:15Z

Yes, that's probably the best solution. Fix it locally for now, put together a more comprehensive solution for the future.

Note that if you use a 64-bit seed to initialize a larger state, there are some things which are mathematically impossible to avoid. For example, there will always be seeds leading to a similar first word of state (or any fixed word of state): if the initializing generator you are using does not output every possible value (e.g., PCG32), there will even be some seeds leading to an identical first word of state; and even if the initializing generator outputs every possible 64-bit value exactly once, there will be first words of state sharing, say, the lower 63 bits. This is a fact of life: you cannot avoid this even with a crypto-strength initializing generator.
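The counting argument can be checked exhaustively at a smaller scale. This sketch shrinks the seed and the first word of state to 16 bits and uses a hypothetical bijective mixer `mix16` (the constants are arbitrary): since every 16-bit value is produced exactly once, every pattern of the lower 15 bits is shared by exactly two seeds.

```rust
/// A bijective mixer on 16-bit words: right xorshifts and multiplication by
/// an odd constant are each invertible modulo 2^16. Constants are arbitrary.
pub fn mix16(mut x: u16) -> u16 {
    x ^= x >> 7;
    x = x.wrapping_mul(0x2C1B);
    x ^= x >> 9;
    x
}

/// For each possible value of the lower 15 bits of the first word,
/// count how many 16-bit seeds lead to it.
pub fn low15_bucket_counts() -> Vec<u32> {
    let mut counts = vec![0u32; 1 << 15];
    for seed in 0..=u16::MAX {
        counts[(mix16(seed) & 0x7FFF) as usize] += 1;
    }
    counts
}
```

The same pigeonhole reasoning carries over to 64-bit seeds and the lower 63 bits, regardless of how strong the mixing function is.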

What you can avoid is that more than one word of state is correlated, because you are tapping into a much larger state space. A crypto-strength initializing generator will do that for you. SplitMix will provide words that have no visible similarity artifacts, and that are sufficiently uncorrelated to initialize a generator. It is true, however, that if you use two seeds differing only in the highest bit, the difference in initialization will be due only to the spreading of the influence of that bit by the mixing function (which, once again, is more than sufficient IMHO).

vks · 2020-09-14T16:55:01Z

Alright, I migrated SmallRng::seed_from_u64 to SplitMix64, avoiding the value-breaking change to rand_core, which can be considered after Rand 0.8 is released. I think this PR can be merged now?

dhardy · 2020-09-14T17:09:19Z

src/rngs/small.rs

+/// Note that depending on the application, [`StdRng`] is faster on many modern
+/// platforms while providing higher-quality randomness. Furthermore, `SmallRng`


Can we really say it is faster on many platforms? Micro-benchmarks aren't fully representative, and I don't think we have enough other data to go on, so better simply to say that it may be faster?

dhardy · 2020-09-14T17:09:48Z

src/rngs/small.rs

+/// - Security against prediction or reproducibility are important.
+///   Use [`StdRng`] instead.


This is two separate things: don't recommend StdRng for reproducibility!

I'll just remove "reproducibility", because this is discussed in detail below.

This update in particular changes the SmallRng to xoshiro, which I explored in an experimental branch. Benchmarking shows that this change is performance-neutral and the output looks the same. See rust-random/rand#1038 for more.

The experimental branch was more complicated because it replaced the use of thread_rng with a custom rayon pool. I may eventually pick up that branch again because it offers more control, but for now this new rand version gets me the better PRNG.

AMD Ryzen 9 3900X 12-Core Processor (AMD64 Family 23 Model 113 Stepping 0)

tracescene/10x10x4
time: [295.42 us 295.79 us 296.17 us]
change: [-1.4129% -1.1793% -0.9318%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low severe
5 (5.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe

This update in particular changes `SmallRng` to xoshiro. `SmallRng` is only used for scene construction, as the core continues to use `thread_rng`, which wraps `StdRng`. See rust-random/rand#1038 for more.

As expected, benchmarking shows that this change is performance-neutral and the output looks correct.

There is an ongoing experimental branch that effectively implements a custom thread-local RNG solution using a custom rayon pool. This will in particular allow using xoshiro in the core trace loop.

Benchmarking: AMD Ryzen 9 3900X 12-Core Processor (AMD64 Family 23 Model 113 Stepping 0)

tracescene/10x10x4
time: [295.42 us 295.79 us 296.17 us]
change: [-1.4129% -1.1793% -0.9318%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low severe
5 (5.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe

bjorn3 reviewed Sep 6, 2020

src/rngs/mod.rs

dhardy approved these changes Sep 8, 2020

dhardy reviewed Sep 14, 2020

dhardy approved these changes Sep 15, 2020

vks added 8 commits September 15, 2020 15:40

SmallRng: Recommend rand_chacha for performance

dd95631

Also fix a wrong link.

Update changelog

616ea87

Revise SmallRng advise

25ea6c1

Fix dead links

0c8dc1c

xoshiro256++: Prefer upper bits

530cd27

Migrate SmallRng::seed_from_u64 from PCG32 to SplitMix64

02aafbd

Improve SmallRng advise

a260f2c

vks force-pushed the replace-pcg branch from ae35664 to a260f2c September 15, 2020 13:40

vks merged commit f01b65d into rust-random:master Sep 15, 2020

vks deleted the replace-pcg branch September 15, 2020 13:56

dhardy mentioned this pull request Sep 29, 2020

SmallRng seed is too big, and seedability #1054

Closed

CAD97 mentioned this pull request Dec 21, 2020

Update rand requirement from 0.7 to 0.8 bevyengine/bevy#1114

Merged

sync-by-unito bot mentioned this pull request Dec 21, 2020

Bump rand from 0.7.3 to 0.8.0 AleoHQ/leo#510

Closed

This was referenced Mar 9, 2021

Update rand requirement from 0.7 to 0.8 hacspec/hacspec#80

Merged

chore(deps): update rand requirement from 0.7 to 0.8 transparencies/zip#2

Open

Bump rand from 0.7.3 to 0.8.3 Lionjudge9061-corp/mobilecoin#2

Open

dependabot bot mentioned this pull request Mar 16, 2021

Bump rand from 0.7.3 to 0.8.3 ZeusWPI/zauth#65

Closed

dhardy mentioned this pull request Nov 22, 2021

SmallRng uses wrong seed_from_u64 implementation #1203

Merged

vks commented Sep 6, 2020

vks commented Sep 8, 2020

dhardy left a comment

dhardy Sep 8, 2020

vigna Sep 8, 2020

vks Sep 8, 2020

vigna Sep 8, 2020

dhardy Sep 8, 2020

vigna Sep 9, 2020

vks Sep 10, 2020

dhardy Sep 10, 2020

vks Sep 10, 2020

vigna Sep 10, 2020 •

edited

vks commented Sep 10, 2020

dhardy commented Sep 14, 2020

vigna commented Sep 14, 2020

vks commented Sep 14, 2020

dhardy Sep 14, 2020

dhardy Sep 14, 2020

vks Sep 14, 2020
