
WeightedIndex: Make it possible to update a subset of weights #866

Merged
merged 9 commits into from Aug 22, 2019

Conversation

@vks (Collaborator) commented Aug 14, 2019

This could be useful for crates like droprate.

I had to add an additional field, total_weight, to WeightedIndex. It is redundant with the field weight_distribution, but I cannot use the latter without making the end of the sampled range public.
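A minimal, hypothetical sketch of the layout being described (simplified stand-in names, not rand's actual code): the cumulative table omits the final sum, so the total has to live in its own field for sampling to know its range.

```rust
// Simplified sketch of the struct layout discussed in this PR
// (names and types are illustrative, not rand's real definitions).
struct WeightedIndexSketch {
    cumulative_weights: Vec<u32>, // cumulative sums, with the final total omitted
    total_weight: u32,            // redundant with the sampling range, hence the extra field
}

impl WeightedIndexSketch {
    fn new(weights: &[u32]) -> Self {
        let mut cumulative = Vec::with_capacity(weights.len().saturating_sub(1));
        let mut total = 0;
        for (i, &w) in weights.iter().enumerate() {
            total += w;
            if i + 1 < weights.len() {
                cumulative.push(total); // the last sum (== total) is not stored
            }
        }
        WeightedIndexSketch { cumulative_weights: cumulative, total_weight: total }
    }
}
```

For weights `[2, 3, 5]` this stores `[2, 5]` and a total of 10; the 10 appears only in `total_weight`.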

@dhardy (Member) left a comment:

Sure, we can add this.

return Err(WeightedError::InvalidWeight);
}
if i >= self.cumulative_weights.len() {
return Err(WeightedError::TooMany);
@dhardy (Member):

It would be worth adding InvalidIndex, except that it's a breaking change. Perhaps do so in a separate PR that we don't land until we start preparing the next Rand version?

@vks (Collaborator, Author):

Yeah, I thought about this as well. Will do once this is merged.

src/distributions/weighted/mod.rs (outdated; resolved)
old_w -= &self.cumulative_weights[i - 1];
}

for j in i..self.cumulative_weights.len() {
@dhardy (Member):

This is O(n*m) where n = cumulative_weights.len() - min_index; m = new_weights.len().

Instead we should sort the new_weights by index, then apply in-turn (like in new); this is O(m*log(m) + n).

Also, we can just take total_weight = cumulative_weights.last().unwrap().
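A sketch of the suggested approach, under the assumption that updates arrive as (index, new_weight) pairs (names hypothetical): sort the updates by index, then rebuild the cumulative table in a single pass, consuming each update when its index is reached.

```rust
// Sketch (not rand's implementation): apply sorted (index, new_weight)
// updates while rebuilding the cumulative table in one pass.
// Sorting is O(m log m); the rebuild is O(n), so O(m log m + n) overall.
fn apply_updates(weights: &mut [u32], updates: &mut [(usize, u32)]) -> (Vec<u32>, u32) {
    updates.sort_unstable_by_key(|&(i, _)| i);
    let n = weights.len();
    let mut next = updates.iter().peekable();
    let mut cumulative = Vec::with_capacity(n.saturating_sub(1));
    let mut total = 0;
    for (i, w) in weights.iter_mut().enumerate() {
        // Consume the update aimed at index i, if there is one.
        if let Some(&&(j, new_w)) = next.peek() {
            if j == i {
                *w = new_w;
                next.next();
            }
        }
        total += *w;
        if i + 1 < n {
            cumulative.push(total); // final weight omitted, as in the PR
        }
    }
    (cumulative, total)
}
```

For weights `[1, 2, 3, 4]` and updates `[(3, 1), (1, 5)]`, this yields weights `[1, 5, 3, 1]`, a cumulative table `[1, 6, 9]`, and a total of 10.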

@vks (Collaborator, Author):

> Instead we should sort the new_weights by index, then apply in-turn (like in new); this is O(m*log(m) + n).

I'll look into this.

> Also, we can just take total_weight = cumulative_weights.last().unwrap().

I don't think so; the last cumulative weight is not stored in the vector. Or are you saying we should change it so that it is?

@dhardy (Member):

Aha, binary_search_by is happy to return an index one past the last item, therefore the final weight is not needed. (And we have a motive for not including the final weight: it guarantees we will never exceed the last index of the input weights list.)

Then yes, we need to store either the last weight or the total as an extra field.
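A small sketch of why the omitted final weight is safe, using std's `partition_point` in place of the `binary_search_by` mentioned above (same effect on a sorted cumulative table):

```rust
// Sketch: with the final cumulative sum omitted, a search for any sample
// x in 0..total returns an index in 0..weights.len(). The "one past the
// table" result is exactly the last valid weight index, so the search can
// never exceed the input range.
fn sample_index(cumulative: &[u32], x: u32) -> usize {
    // Count of stored cumulative sums <= x; may be cumulative.len(),
    // i.e. one past the stored entries, which maps to the last weight.
    cumulative.partition_point(|&c| c <= x)
}
```

With weights `[2, 3, 5]` the stored table is `[2, 5]` (total 10, not stored): samples 0..2 map to index 0, 2..5 to index 1, and 5..10 to index 2, even though no entry for 10 exists.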

@vks (Collaborator, Author):

> Instead we should sort the new_weights by index, then apply in-turn (like in new); this is O(m*log(m) + n).

I implemented that. It's a bit messy, because the index type might be unsigned.

@dhardy (Member) left a comment:

Yes, that did get messy! Eventually I convinced myself that your implementation is probably right.

Fortunately we can clean it up a lot (at the cost of two clones and one extra subtraction per unadjusted weight). I think the clones will be cheap for all the types we care about. Granted, this is probably slower than your method when only updating a small subset of many indices, but not hugely so, and it's still O(n + m).

I'll leave it to your preference to require ordered input vs sorting.

Finally, do we need two loops? Only if we care about not changing self when given invalid parameters.

src/distributions/weighted/mod.rs (outdated; resolved)
src/distributions/weighted/mod.rs (outdated; resolved)
@vks (Collaborator, Author) commented Aug 16, 2019

> I'll leave it to your preference to require ordered input vs sorting.

I think it is better to require sorted input, because usually it's trivial for the user to provide.

> Finally, do we need two loops? Only if we care about not changing self when given invalid parameters.

The problem is that this would leave self in an invalid state, which I wanted to avoid. (This would not be a problem if we just panicked.)
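A sketch of the two-loop structure under discussion, simplified to plain weights (illustrative names, not rand's code): the first loop only validates, so an error returns before anything is mutated and the state can never end up half-updated.

```rust
// Error type mirroring the variant used in the PR's snippet above.
#[derive(Debug, PartialEq)]
enum WeightedError {
    TooMany,
}

// Sketch of the validate-then-apply pattern: loop 1 checks every update
// without mutating, so on error the weights are untouched; loop 2 applies
// updates that are now known to be in range.
fn update_weights(weights: &mut [u32], updates: &[(usize, u32)]) -> Result<(), WeightedError> {
    // Loop 1: validate all updates before touching anything.
    for &(i, _) in updates {
        if i >= weights.len() {
            return Err(WeightedError::TooMany);
        }
    }
    // Loop 2: apply; all indices are valid at this point.
    for &(i, w) in updates {
        weights[i] = w;
    }
    Ok(())
}
```

With a single loop, a bad index midway through the updates would leave earlier entries already modified; the two-loop version trades one extra pass for that all-or-nothing guarantee.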

@vks (Collaborator, Author) commented Aug 16, 2019

I simplified the code as you suggested. The performance seems similar enough.

@dhardy (Member) commented Aug 17, 2019

Thanks; then I think this is good to go. I won't have very much time available for this for a few weeks, so I'll leave you to merge.

@vks (Collaborator, Author) commented Aug 18, 2019

@dhardy Unfortunately, I'm not authorized to merge.

@dhardy (Member) commented Aug 22, 2019

One timeout, one Redox failure. Good enough, I guess.

@dhardy dhardy merged commit 8616945 into rust-random:master Aug 22, 2019
@vks vks deleted the update-weights branch August 22, 2019 11:42