Deprecate Rng::sample and Rng::sample_iter #552

sicking · 2018-07-13T01:46:22Z

In #483 we deprecated Rng::choose and Rng::choose_mut in favor of SliceRandom::choose and SliceRandom::choose_mut. The argument (presented here) was that we can think of the the rng object as a source of data. So functions (like gen_range) which generate data should live on the Rng trait. Whereas when we're operating on other data (like choose), we should stick functions on those data objects and take the Rng as a parameter.

I.e. vec.choose(rng) rather than rng.choose(vec).

I think that argument also argues for doing distribution.sample(rng) rather than rng.sample(distribution). The name sample reinforces that the distribution is the source of the data that we're sampling from.

I also think it's generally a bad idea to have two functions that do the same thing. Feels like a source of confusion where it's easy to wonder what the difference is or why one should choose one over the other.

Same goes for sample_iter.

This PR deprecates Rng::sample and Rng::sample_iter and changes all internal callers (mostly tests) to use Distribution::sample instead (there were no internal callers of Rng::sample_iter).

With this change, all functions on Rng generate new values, making the trait feel more coherent. The values are either returned (most functions) or written to a passed in buffer (fill, try_fill).

…sample and Distribution::sample_iter

sicking · 2018-07-13T03:28:58Z

Any ideas what the test failure is? Looks like a tool problem.

thread 'main' panicked at 'unsupported custom section: '.debug_info'', src/wasm_context.rs:646:25

dhardy · 2018-07-13T07:48:28Z

There is one difference IIRC: rng.sample(d) takes d by value, not by reference.

I agree that these are redundant, though I'm still unsure about removing them.

sicking · 2018-07-13T08:47:22Z

Yeah, I've always found that pretty weird. Especially since you usually want to sample from a Distribution multiple times.

Fortunately, because of this, you can still pass a reference to a Distribution to Rng::sample. Though I'm not really sure why we have that setup.

dhardy · 2018-07-13T09:36:59Z

Yeah, it's weird. It's because some distributions have no parameters, thus you don't really need an instance: rng.sample(Uniform). But it's probably unnecessary.

sicking · 2018-07-13T20:04:18Z

Ah, I see. So would we still need that if this PR lands?

dhardy · 2018-07-14T07:59:55Z

Well, this works:

let x: f32 = Standard.sample(&mut rng);

But there's no easy way to specify the type within the function call (since neither Standard nor sample are templated over type T), so there's some advantage to Rng::sample still:

rng.sample::<f32, _>(Standard)

sicking · 2018-07-15T02:13:38Z

You can always do

Distribution::<f32>::sample(Standard, rng)

But if we think that's too verbose, we could add

pub trait DistributionAs {
    fn sample_as<T, R: Rng + ?Sized>(&self, rng: &mut R) -> T
        where Self: Distribution<T>;
}
impl<D> DistributionAs for D {
    fn sample_as<T, R: Rng + ?Sized>(&self, rng: &mut R) -> T
        where Self: Distribution<T> {
        self.sample(rng)
    }
}

which makes it possible to write

Standard.sample_as::<f32, _>(rng)

dhardy · 2018-07-15T07:57:47Z

I don't know, now you're just adding another complication to replace the first 😄

Removing these doesn't seem like a usability win or to have much advantage to be honest.

Besides, your original argument is that Rng functions generate new values — well, rng.sample(Uniform::new(5, 70)) is still generating new values.

Any other rationales? Otherwise I'm going to close this.

sicking · 2018-07-15T09:47:37Z

I don't really see why rng.sample(Uniform::new(0, 10)) is meaningfully different from rng.choose(slice[0..10]) or rng.choose_iter(0..10).

As things stand right now, Rng is neither "do stuff related to randomness", nor "generate random values". On top of that sample and sample_iter are duplicated functions.

I agree that the disambiguation rules for the sample on Rng is different from the one on Distribution in ways that can be useful. However if someone has code like:

let x: f32 = Standard.sample(rng);
do_stuff(x);

and get compilation errors when changing to

do_stuff(Standard.sample(rng));

I don't think it's going to be obvious to that person to switch to

do_stuff(rng.sample::<f32, _>(Standard);

I.e. I don't think Rng::sample is a very good solution to the disambiguation problem in Distribution::sample.

I definitely think that the simplest solution to the problem of disambiguation is to simply rely on the standard rust features of Distribution::<f32>::sample(Standard, rng). It's a few more characters to type, but it's very discoverable and idiomatic.

The other thing to remember is that this is mainly a problem for Open01 and OpenClosed01. Standard is generally used through Rng::gen, and all other distributions only implement Distribution<T> for one T. So most developers are likely not going to run into this problem.

vks · 2018-07-16T07:14:56Z

I don't understand why you want to deprecate Rng::sample but keep Rng::gen_*. The latter are special cases of Rng::sample. (FWIW, I would be fine with deprecating all of them.)

dhardy · 2018-07-16T08:12:59Z

Rng is about convenience. The fact that the only original functionality on Rng is fill and try_fill (and the only originality here is applying fill_bytes / try_fill_bytes to different types) should tell you that.

I don't think we should remove rng.gen_bool(p) because the alternative, Bernoulli::new(p).sample(&mut rng), is not very concise.

Maybe I was wrong in removing choose and shuffle; I did so because these implementations were in the way of building a nice API around extension traits, but they could be re-introduced as convenience wrappers around SliceRandom::choose etc.

Technically, yes, we could remove Rng entirely. I see Rng as a place to conveniently expose common functionality to make life easier for users, and especially to make Rand easier to learn.

With that in mind, I suppose we could un-deprecate choose/choose_mut/shuffle and keep the current wrapper implementations forever.

vks · 2018-07-16T12:52:38Z

Rng is about convenience. The fact that the only original functionality on Rng is fill and try_fill [...] should tell you that.

This is not really reflected by the documentation. It does not clarify that only fill and try_fill are original.

With that in mind, I suppose we could un-deprecate choose/choose_mut/shuffle and keep the current wrapper implementations forever.

Alternatively, we could move fill and friends to SliceRandom as well and declare Rng completely as a convenience wrapper. This is a bit orthogonal to what SliceRandom currently does, but the name is generic enough to allow it.

dhardy · 2018-07-16T13:01:19Z

fill is only a thin wrapper around fill_bytes already.

vks · 2018-07-16T13:06:35Z

We also have to consider that Rng is the first exposure for new Rust programmers to the crates.io ecosystem in TRPL:

let secret_number = rand::thread_rng().gen_range(1, 101);

If we want to reduce the API surface, we should probably discuss (maybe even with the docs team) whether something like

let mut rng = rand::thread_rng();
let secret_number = rand::Uniform::new(1, 101).sample(&mut rng);

is newcomer-friendly enough.

dhardy · 2018-07-16T13:32:07Z

If we want to reduce the API surface

I don't want to advocate a large and complex API, but in this case I don't think a small amount of redundancy (in the form of convenience functions wrapping other implementations) adds much complexity, and I think it does add a significant amount of convenience (look at the example just posted; even ignoring repeated symbols and local variables, it goes from 3 identifiers to 5: rand, thread_rng, Uniform, new, sample).

There's a case for teaching users the building blocks of a toolkit and a case for teaching how to accomplish basic tasks; I think good design has to consider trade-offs between the two.

vks · 2018-07-16T14:00:30Z

it goes from 3 identifiers to 5

Arguably gen_range just hardcodes an identifier into another, but I guess it helps with type inference.

There's a case for teaching users the building blocks of a toolkit and a case for teaching how to accomplish basic tasks; I think good design has to consider trade-offs between the two.

Fair enough, I'm only slightly worried that this is not made clear by the current documentation.

pitdicker · 2018-07-17T05:34:11Z

In #483 we deprecated Rng::choose and Rng::choose_mut in favor of SliceRandom::choose and SliceRandom::choose_mut.

Oops, missed those changes. Deprecating choose, choose_mut and shuffle seems reasonable to me. They are very much algorithms on data, that need an RNG for their operations. I do think we should consider a bit more careful how much user code this may break before the release.

Like @dhardy I very much see Rng as a convenience trait. For a random number generator trait any method that 'generates' values seems appropriate to me.

gen,
fill and try_fill,
gen_range (and I liked the idea of gen_below, but that is another discussion),
sample and sample_iter,
gen_bool and gen_ratio (or some other name).

I think that argument also argues for doing distribution.sample(rng) rather than rng.sample(distribution). The name sample reinforces that the distribution is the source of the data that we're sampling from.

In this case I don't see the distribution as a source of data, but as a transformation. Not really sure how to put it though.

sicking · 2018-07-17T05:49:25Z

Ok, seems I'm the only one that feel these are out of place, and are sufficiently bothered by the redundancy to think that we should remove them.

sicking force-pushed the nosample branch 2 times, most recently from f7ebf1e to 8fd4d0d Compare July 13, 2018 02:54

sicking mentioned this pull request Jul 13, 2018

Tracker: 0.6 release #520

Closed

28 tasks

Deprecate Rng::sample and Rng::sample_iter in favor of Distribution::…

89ec7b4

…sample and Distribution::sample_iter

sicking force-pushed the nosample branch from 8fd4d0d to 89ec7b4 Compare July 13, 2018 03:08

sicking closed this Jul 17, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deprecate Rng::sample and Rng::sample_iter #552

Deprecate Rng::sample and Rng::sample_iter #552

sicking commented Jul 13, 2018 •

edited

sicking commented Jul 13, 2018

dhardy commented Jul 13, 2018

sicking commented Jul 13, 2018

dhardy commented Jul 13, 2018

sicking commented Jul 13, 2018

dhardy commented Jul 14, 2018

sicking commented Jul 15, 2018 •

edited

dhardy commented Jul 15, 2018

sicking commented Jul 15, 2018

vks commented Jul 16, 2018

dhardy commented Jul 16, 2018 •

edited

vks commented Jul 16, 2018

dhardy commented Jul 16, 2018

vks commented Jul 16, 2018

dhardy commented Jul 16, 2018

vks commented Jul 16, 2018

pitdicker commented Jul 17, 2018

sicking commented Jul 17, 2018

Deprecate Rng::sample and Rng::sample_iter #552

Deprecate Rng::sample and Rng::sample_iter #552

Conversation

sicking commented Jul 13, 2018 • edited

sicking commented Jul 13, 2018

dhardy commented Jul 13, 2018

sicking commented Jul 13, 2018

dhardy commented Jul 13, 2018

sicking commented Jul 13, 2018

dhardy commented Jul 14, 2018

sicking commented Jul 15, 2018 • edited

dhardy commented Jul 15, 2018

sicking commented Jul 15, 2018

vks commented Jul 16, 2018

dhardy commented Jul 16, 2018 • edited

vks commented Jul 16, 2018

dhardy commented Jul 16, 2018

vks commented Jul 16, 2018

dhardy commented Jul 16, 2018

vks commented Jul 16, 2018

pitdicker commented Jul 17, 2018

sicking commented Jul 17, 2018

sicking commented Jul 13, 2018 •

edited

sicking commented Jul 15, 2018 •

edited

dhardy commented Jul 16, 2018 •

edited