rt: add rng_seed option to `runtime::Builder` #4910

hds · 2022-08-12T22:32:36Z

Motivation

In certain circumstances (e.g. running tests with loom), it may be desirable
to have deterministic behavior when running tokio.

Partly this may be achieved by seeding the random number generator used
by the tokio::select! macro.

This PR is more a proof of concept than a real proposed change; it needs work
before it could be merged.

Refs: #4879

Open Questions

There are a number of open questions (or rather things that I need help with):

- Currently we seed all threads with the same value. This is important because
  when using a multi-threaded runtime, the thread that a task will be scheduled
  on is not deterministic. Is this acceptable?
  Answer: Threads are now each seeded with a value generated by a seed generator.
  This makes the values deterministic, but not the same.
- Is there a way to make the thread that a task is run on deterministic?
  Answer: This is outside the scope of this change.
- The API provided takes a u64 as a seed. This is not very flexible, but avoids
  the need to hash or otherwise reduce some other value to a u64 to seed the
  actual RNG. This was done deliberately, because it's the easiest thing to change
  in this PR before merging. Opinions?
  Answer: Changed the value to an opaque RngSeed type which can be constructed
  from a byte slice.
- Starting multiple runtimes from a single thread will overwrite the RNG seed. This
  feels ugly to me, but I don't see a better way without sacrificing performance by,
  e.g. making the RNG global. Suggestions?
  Answer: The Handle::enter pattern is used to set the seed when a runtime is entered
  from a thread. When the thread leaves the runtime, the previous thread-local value is
  restored.

Solution

The tokio::select! macro polls branches in a random order. While this
is desirable in production, for testing purposes a more deterministic
approach can be useful.

This change adds an additional parameter to the runtime Builder to set
the random number generator seed. This value is then used to reset the
seed on the current thread when the runtime is entered into (restoring the
previous value when the thread leaves the runtime). All threads created
explicitly by the runtime also have a seed set as the runtime is built. Each
thread is set with a seed from a deterministic sequence.

This guarantees that calls to the tokio::select! macro which are
performed in the same order on the same thread will poll branches in the
same order.

Both the builder parameter as well as the RngSeed struct are marked
unstable initially.
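
To illustrate the guarantee described above, here is a minimal, std-only sketch. It assumes that each `select!` call picks a random starting branch and then polls the remaining branches in wrap-around order; that detail is an assumption of the sketch, not something stated in this description. `SketchRng`, `poll_order` and `poll_orders` are hypothetical names, and SplitMix64 is only a stand-in for whatever generator tokio uses internally.

```rust
/// Stand-in generator (SplitMix64). This is NOT tokio's internal RNG.
pub struct SketchRng {
    state: u64,
}

impl SketchRng {
    pub fn from_seed(seed: u64) -> Self {
        Self { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Returns a value in `0..n`. Panics if `n` is zero.
    pub fn next_below(&mut self, n: usize) -> usize {
        assert!(n > 0, "n must be non-zero");
        (self.next_u64() % n as u64) as usize
    }
}

/// Order in which `branches` branches would be polled for one call:
/// a random starting branch, then wrap around (assumed behaviour).
pub fn poll_order(rng: &mut SketchRng, branches: usize) -> Vec<usize> {
    let start = rng.next_below(branches);
    (0..branches).map(|i| (start + i) % branches).collect()
}

/// Poll orders for `calls` consecutive calls made on one thread whose
/// RNG was seeded with `seed`.
pub fn poll_orders(seed: u64, calls: usize, branches: usize) -> Vec<Vec<usize>> {
    let mut rng = SketchRng::from_seed(seed);
    (0..calls).map(|_| poll_order(&mut rng, branches)).collect()
}
```

Under these assumptions, two threads seeded identically and making the same sequence of calls see the same sequence of poll orders, which is the property the seed is meant to provide.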

The `tokio::select!` macro polls branches in a random order. While this is desirable in production, for testing purposes a more deterministic approach can be useful. This change adds an additional parameter to the runtime `Builder` to set the random number generator seed. This value is then used to reset the seed on all threads associated with the runtime being built. This guarantees that calls to the `tokio::select!` macro which are performed in the same order on the same thread will poll branches in the same order.

carllerche · 2022-08-15T20:58:37Z

Thanks for taking this on! I will look shortly, but first I will answer some of the questions.

> Currently we seed all threads with the same value. This is important because
> when using a multi-threaded runtime, the thread that a task will be scheduled
> on is not deterministic. Is this acceptable?

We probably don't want to do this as it will make the order of rands identical on each thread. What we can do is use the seed for a "seed random generator" and use that to generate a per-thread seed.

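Pseudo code:

```rust
// Pseudo code: `Rng`, `next_seed` and `spawn_thread_with_seed` are placeholders.
let mut seed_rng = Rng::new(seed_from_builder);

for _ in thread_to_spawn.iter() {
    // Each thread gets its own seed, drawn from the deterministic seed RNG.
    let thread_seed = seed_rng.next_seed();
    spawn_thread_with_seed(thread_seed);
}
```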

Hopefully, this makes some sense.
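
A runnable, std-only version of that idea might look like the sketch below. `SeedGenerator` and `worker_seeds` are hypothetical names, a `u64` stands in for the seed type, and SplitMix64 is used purely as an illustrative mixing function; none of this is taken from the tokio implementation.

```rust
use std::sync::Mutex;

/// Stand-in mixing step (SplitMix64); not the algorithm tokio uses.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Hypothetical thread-safe seed generator: hands out a deterministic
/// series of seeds derived from a single initial seed.
pub struct SeedGenerator {
    state: Mutex<u64>,
}

impl SeedGenerator {
    pub fn new(initial_seed: u64) -> Self {
        Self {
            state: Mutex::new(initial_seed),
        }
    }

    /// Returns the next seed in the series.
    pub fn next_seed(&self) -> u64 {
        let mut state = self.state.lock().unwrap();
        splitmix64(&mut *state)
    }
}

/// Derives one seed per worker thread, in spawn order.
pub fn worker_seeds(initial_seed: u64, workers: usize) -> Vec<u64> {
    let generator = SeedGenerator::new(initial_seed);
    (0..workers).map(|_| generator.next_seed()).collect()
}
```

The same initial seed always yields the same list, while the individual threads no longer share a seed, so their random sequences are not identical.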

> Is there a way to make the thread that a task is run on deterministic?

Not entirely, but this is out of scope. For the use cases in question, they control enough to make sure it is deterministic. In theory, if you use the current_thread scheduler and strictly control I/O and other input, you can make it deterministic. Even better is to mock out I/O.

carllerche · 2022-08-15T21:05:31Z

> The API provided takes a u64 as a seed. This is not very flexible, but avoids
> the need to hash or otherwise reduce some other value to a u64 to seed the
> actual RNG. This was done deliberately, because it's the easiest thing to change
> in this PR before merging. Opinions?

Hmm, I don't think we want to expose u64 there as each rng algorithm has a different seed format. I am not sure what we do want to expose yet, but what we can do is start by flagging this setting as tokio_unstable and punt :). My guess is we will want to define an opaque type RngSeed or something and conversions to it.

carllerche · 2022-08-15T21:06:44Z

> Starting multiple runtimes from a single thread will overwrite the RNG seed. This
> feels ugly to me, but I don't see a better way without sacrificing performance by,
> e.g. making the RNG global. Suggestions?

Can you use the Handle::enter pattern here? When one "enters" a runtime, the rng is set in the thread local. Any prior rng is stored and replaced when the enter guard drops.
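
As a rough illustration of that pattern, the std-only sketch below swaps a seed into a thread-local on entry and puts the previous state back when the guard drops. The names `THREAD_RNG`, `EnterGuard`, `enter`, `current_state` and `next_u64` are hypothetical, and a bare `u64` stands in for the real RNG state; tokio's actual guard also carries the runtime handle.

```rust
use std::cell::Cell;

thread_local! {
    // Per-thread RNG state. A bare u64 stands in for the real RNG state.
    static THREAD_RNG: Cell<u64> = Cell::new(1);
}

/// Hypothetical guard that remembers the RNG state active before entering.
pub struct EnterGuard {
    previous: u64,
}

/// Installs `seed` as this thread's RNG state and returns a guard that
/// restores the prior state when dropped.
pub fn enter(seed: u64) -> EnterGuard {
    let previous = THREAD_RNG.with(|rng| rng.replace(seed));
    EnterGuard { previous }
}

impl Drop for EnterGuard {
    fn drop(&mut self) {
        // Put back whatever state was active before `enter` was called.
        THREAD_RNG.with(|rng| rng.set(self.previous));
    }
}

/// Reads the current thread-local RNG state without advancing it.
pub fn current_state() -> u64 {
    THREAD_RNG.with(|rng| rng.get())
}

/// Advances the thread-local state and returns a value
/// (SplitMix64, used here only as a stand-in generator).
pub fn next_u64() -> u64 {
    THREAD_RNG.with(|rng| {
        let state = rng.get().wrapping_add(0x9E37_79B9_7F4A_7C15);
        rng.set(state);
        let mut z = state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    })
}
```

Because each guard stores the state it replaced, nested entries unwind in order: dropping an inner guard restores the outer runtime's state, and dropping the outer guard restores whatever the thread had before.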

NOTE: This change doesn't correct the handling of the RngSeed on the thread the runtime is started from, so it's all likely to change. Just pushing for safety. Instead of exposing the width of the seed we're currently using, expose an opaque struct which can generate a seed from a byte slice. Additionally, each thread is given a unique seed based on its `id` (`worker_thread_index`). This should enable more deterministic behavior, while ensuring that each thread does not begin with the same seed.

In order to properly clean up after ourselves, we set a specific seed into the thread local RNG when entering into a runtime context. The previous seed (RNG state) is stored in the `EnterGuard` together with the previous context (runtime handle). Upon dropping the guard, the previously stored seed is returned to the thread local RNG. To achieve this in a deterministic, but fair way, we now store a seed generator in the runtime handle, and another in the blocking thread spawner. These seed generators are thread safe (as the one in the handle may be passed across thread boundaries) and will produce a deterministic series of seeds when the initial seed provided to the seed generator is the same.

hds · 2022-08-18T22:09:48Z

OK, that was a fun rabbit hole I ended up running down following the Handle::enter pattern. Thanks for that. (-;

We now have a public RngSeed which can be created from a byte slice, and also a RngSeedGenerator which is stored on the runtime handle and the spawner for the BlockingPool in order to seed the RNG for the runtime and the blocking threads respectively.

For the runtime, we do set the seed when entering the runtime and keep the old seed in the EnterGuard (not sure how happy everyone is to extend the size and functionality of that struct). When the guard is dropped we reset the RNG back to the state it had prior to entering the runtime. Because the runtime may be accessed in parallel from multiple threads, we make no attempt to update its handle with the state of local RNG upon dropping the enter guard. Instead, each time a thread "enters" a runtime, the next seed from the seed generator is used. This should provide the necessary deterministic behavior, although there are caveats such as that calling select! twice within a single call to block_on is not equivalent to calling select! within two sequential calls to block_on.
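
The caveat at the end of that paragraph can be modelled with a small std-only sketch. It assumes, as described above, that every "enter" takes the next seed from the generator and that random draws then advance only the thread-local state. `SeedGenerator`, `two_draws_one_enter` and `one_draw_per_enter` are hypothetical names, and SplitMix64 is a stand-in generator.

```rust
/// Stand-in mixing step (SplitMix64); not the algorithm tokio uses.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Hypothetical seed generator (single-threaded here for brevity).
pub struct SeedGenerator {
    state: u64,
}

impl SeedGenerator {
    pub fn new(initial_seed: u64) -> Self {
        Self { state: initial_seed }
    }

    pub fn next_seed(&mut self) -> u64 {
        splitmix64(&mut self.state)
    }
}

/// Models two random draws made inside a single "enter":
/// one seed is taken, then the local state is advanced twice.
pub fn two_draws_one_enter(initial_seed: u64) -> [u64; 2] {
    let mut generator = SeedGenerator::new(initial_seed);
    let mut local = generator.next_seed();
    [splitmix64(&mut local), splitmix64(&mut local)]
}

/// Models one random draw in each of two sequential "enters":
/// every enter starts from a fresh seed taken from the generator.
pub fn one_draw_per_enter(initial_seed: u64) -> [u64; 2] {
    let mut generator = SeedGenerator::new(initial_seed);
    let mut first = generator.next_seed();
    let a = splitmix64(&mut first);
    let mut second = generator.next_seed();
    let b = splitmix64(&mut second);
    [a, b]
}
```

Both functions are deterministic for a given initial seed and agree on the first draw, but the second draw comes from a different state in each case, so the two sequences will in general differ.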

Some questions:

- Do we want to make this API unstable initially (perhaps not a bad idea)?
- Should this functionality be dependent on the macros feature (the only place it's being used currently - this affects how I fix the failing CI jobs)?
- Do we want to extend the deterministic seeding to the RNG used to pick a random neighbor worker to steal work from (would not be a big change)?

Extended the implementation to also seed the random number generator used by workers to pick the initial peer to attempt to steal work from. This change was included in the builder function docs.

hds · 2022-09-06T10:46:13Z

@carllerche I think I've covered everything in your comments and added the necessary documentation, so this PR is ready for review again when you've got a moment.

carllerche

I skimmed it, and it looks fine to me. I would suggest keeping the new APIs as unstable initially as we let consumers try out the API. To do this, I would just make the public APIs unstable, the implementation can stay as it is.

tokio/src/runtime/builder.rs

tokio/src/runtime/mod.rs

In order to test out the new API before fixing on it, make it unstable first.

hds · 2022-09-07T22:13:15Z

I've made the new API unstable, now just waiting to see whether I've got the right visibility everywhere to satisfy clippy on stable and unstable builds.

carllerche · 2022-09-14T22:00:45Z

Is this ready to go? It looks like there are a few merge conflicts now.

hds · 2022-09-15T11:58:36Z

Yep, it's ready to go once approved (and now once I fix the merge conflicts). I'll try to merge master in it this afternoon.

carllerche

LGTM, a couple of suggestions that you can apply if you want.

tokio/src/util/mod.rs

tokio/src/runtime/builder.rs

Co-authored-by: Carl Lerche <me@carllerche.com>

Now that the runtime module is present even without the rt feature.

Also fixed a couple of internal comments.

The `tokio::select!` macro polls branches in a random order. While this is desirable in production, for testing purposes a more deterministic approach can be useful. This change adds an additional parameter `rng_seed` to the runtime `Builder` to set the random number generator seed. This value is then used to reset the seed on the current thread when the runtime is entered into (restoring the previous value when the thread leaves the runtime). All threads created explicitly by the runtime also have a seed set as the runtime is built. Each thread is set with a seed from a deterministic sequence. This guarantees that calls to the `tokio::select!` macro which are performed in the same order on the same thread will poll branches in the same order. Additionally, the peer chosen to attempt to steal work from also uses a deterministic sequence if `rng_seed` is set. Both the builder parameter as well as the `RngSeed` struct are marked unstable initially.

The original tests for the `Builder::rng_seed` added in #4910 were a bit fragile. There have already been a couple of instances where internal refactoring caused the tests to fail and need to be modified. While it is expected that internal refactoring may cause the random values to change, this shouldn't cause the tests to break. The tests should be more robust and not be affected by internal refactoring or changes in the Rust compiler version. The tests are changed to perform the same operation in 2 runtimes created with the same seed; the expectation is that the values that result from each runtime are the same.

github-actions bot added the R-loom Run loom tests on this PR label Aug 12, 2022

hds requested a review from carllerche August 12, 2022 22:32

Darksonn added A-tokio Area: The main tokio crate M-runtime Module: tokio/runtime labels Aug 14, 2022

hds added 2 commits August 17, 2022 19:05

hds marked this pull request as ready for review August 19, 2022 21:31

hds added 10 commits August 24, 2022 18:03

extended to seed worker RNG

471f870

Extended the implementation to also seed the random number generator used by workers to pick the initial peer to attempt to steal work from. This change was included in the builder function docs.

Merge branch 'master' of github.com:tokio-rs/tokio into hds/runtime-builder-rng-seed

3fdff28

fix dead code warnings

ea633c6

fix code fmt

f78c7cc

fix visibility of rand module

dbdba4f

fix visibility with different feature settings

3cd5146

Merge branch 'master' of github.com:tokio-rs/tokio into hds/runtime-builder-rng-seed

5bf557f

Merge branch 'master' of github.com:tokio-rs/tokio into hds/runtime-builder-rng-seed

690e067

Merge branch 'master' of github.com:tokio-rs/tokio into hds/runtime-builder-rng-seed

046c7ba

exclude wasi from multi-thread select test

379dadb

mcches mentioned this pull request Sep 2, 2022

Add distinct connections over Io tokio-rs/turmoil#18

Merged

Merge branch 'master' of github.com:tokio-rs/tokio into hds/runtime-builder-rng-seed

4600701

Merge branch 'master' into hds/runtime-builder-rng-seed

cf50fb2

carllerche reviewed Sep 7, 2022

tokio/src/runtime/builder.rs (outdated)

tokio/src/runtime/mod.rs (outdated)

hds added 2 commits September 8, 2022 00:02

make Builder::rng_seed and RngSeed unstable

1ee9c80

In order to test out the new API before fixing on it, make it unstable first.

Merge branch 'master' of github.com:tokio-rs/tokio into hds/runtime-builder-rng-seed

a2b633c

hds added 4 commits September 8, 2022 08:28

Merge branch 'master' into hds/runtime-builder-rng-seed

3f75e30

Merge branch 'master' into hds/runtime-builder-rng-seed

e2c97cc

make RngSeed not public in builder mod

03bbd6b

It shouldn't have been in the first place.

publish RngSeed from crate::util

8981f72

hds added 2 commits September 15, 2022 16:57

Merge branch 'master' of github.com:tokio-rs/tokio into hds/runtime-builder-rng-seed

370d9e6

fix issues arising from merging with Carl's refactoring

3bdbec7

carllerche approved these changes Sep 15, 2022

tokio/src/util/mod.rs (outdated)

tokio/src/runtime/builder.rs (outdated)

hds and others added 10 commits September 15, 2022 18:06

only allow unreachable_pub when tokio_unstable isn't set

18ba0b6

Co-authored-by: Carl Lerche <me@carllerche.com>

fix comment formatting inside macro

3482562

Co-authored-by: Carl Lerche <me@carllerche.com>

fix visibility when default features disabled

c1111fc

Merge branch 'hds/runtime-builder-rng-seed' of github.com:tokio-rs/tokio into hds/runtime-builder-rng-seed

68fa544

fix visibility issues

04e2fb4

Now that the runtime module is present even without the rt feature.

another offering to the gods of visibility

e0c944d

next offering to visibility

c18463c

Merge branch 'master' of github.com:tokio-rs/tokio into hds/runtime-builder-rng-seed

7471dc2

RngSeed isn't need for the macros feature

d85e129

describe conditions for determinism in rng_seed docs

a04044e

Also fixed a couple of internal comments.

hds merged commit b5709ba into master Sep 16, 2022

hds deleted the hds/runtime-builder-rng-seed branch September 16, 2022 15:41

hds mentioned this pull request Oct 4, 2022

rt: improve rng_seed test robustness #5075

Merged
