Improve description of initial value behavior in estimator
Rather than describing this as "debiasing", instead make clear that we
are properly normalizing a weighted average with a sum of weights less
than 1.
afontenot committed Jun 3, 2023
1 parent e31cbf3 commit d1cbe7f
Showing 1 changed file with 32 additions and 25 deletions.
57 changes: 32 additions & 25 deletions src/state.rs
@@ -430,15 +430,19 @@ impl Estimator {
self.smoothed_steps_per_sec =
self.smoothed_steps_per_sec * weight + new_steps_per_second * (1.0 - weight);

-// Get an unbiased estimate of `smoothed_steps_per_sec` to serve as the data source for the
-// double smoothed estimate. See comment on debiasing in `steps_per_second` for details.
+// An iterative estimate like `smoothed_steps_per_sec` is supposed to be an exponentially
+// weighted average from t=0 back to t=-inf; Since we initialize it to 0, we neglect the
+// (non-existent) samples in the weighted average prior to the first one, so the resulting
+// average must be normalized. We normalize the single estimate here in order to use it as
+// a source for the double smoothed estimate. See comment on normalization in
+// `steps_per_second` for details.
let delta_t_start = duration_to_secs(now - self.start_time);
-let debias = 1.0 - estimator_weight(delta_t_start);
-let debiased_smoothed_steps_per_sec = self.smoothed_steps_per_sec / debias;
+let total_weight = 1.0 - estimator_weight(delta_t_start);
+let normalized_smoothed_steps_per_sec = self.smoothed_steps_per_sec / total_weight;

// determine the double smoothed value (EWA smoothing of the single EWA)
self.double_smoothed_steps_per_sec = self.double_smoothed_steps_per_sec * weight
-    + debiased_smoothed_steps_per_sec * (1.0 - weight);
+    + normalized_smoothed_steps_per_sec * (1.0 - weight);

self.prev_steps = new_steps;
self.prev_time = now;
@@ -464,32 +468,35 @@ impl Estimator {
let delta_t = duration_to_secs(now - self.prev_time);
let reweight = estimator_weight(delta_t);

-// Debiasing:
+// Normalization of estimates:
//
-// Our exponentially weighted estimate is a single value (smoothed_steps_per_second) that
-// is iteratively updated. At each update, the previous value of the estimate is
-// re-weighted according its age. At any point in time, the raw value of this estimate
-// reflects the assumption that it contains properly weighted sample values going back
-// indefinitely in time. But this assumption is false.
+// The raw estimate is a single value (smoothed_steps_per_second) that is iteratively
+// updated. At each update, the previous value of the estimate is re-weighted according to
+// its age. The resulting estimate is therefore a weighted average, where the weight of
+// each sample is the difference W(t1) - W(t2), where t1 and t2 are the time deltas since
+// the end and beginning of the sample (respectively), and W(t) is a function describing
+// the appropriate total weight for data older than t, e.g. W(t) = 0.1 ^ (t/15).
//
-// The value is initialized with some value when the estimator starts. The raw value of the
-// estimator treats this as an appropriately weighted sample average across all times
-// before t=0. Of course, the value is actually arbitrary. In other words, because the raw
-// estimate gives a positive weight to this initial value, the resulting estimate will be
-// *biased* towards the initial value.
+// Note that since t = Sum(t_n) implies W(t) = Prod(W(t_n)), the formula for the total
+// weight of data older than t is just the normal iterative weighting function.
//
-// A debiased estimate is the result of correcting the raw estimate by assigning 0 weight
-// to the initial value. We can do this with a simple trick: set the initial value to 0,
-// and then divide the raw estimate by the estimator weight for all time *since* t=0.
+// Since W(0) = 1, and Limit(W(t)) -> 0 as t -> inf, the sum of the weights in the weighted
+// average will be less than 1 unless there are samples going back to t = inf. Since this
+// isn't possible, we initialize the raw estimate to 0, meaning that we have a finite
+// number of samples with a total weight less than 1.
+//
+// Therefore, the raw estimate must be normalized by dividing it by the sum of the weights
+// in the weighted average. This sum is just W(0) - W(t_f), where t_f is the time since the
+// first sample, and W(0) = 1.
let delta_t_start = duration_to_secs(now - self.start_time);
-let debias = 1.0 - estimator_weight(delta_t_start);
+let total_weight = 1.0 - estimator_weight(delta_t_start);

// Generate updated values for `smoothed_steps_per_sec` and `double_smoothed_steps_per_sec`
-// (sps and dsps) without storing them. Note that we debias sps when using it as a source
-// to update dsps, and then debias dsps itself before returning it.
-let sps = self.smoothed_steps_per_sec * reweight / debias;
+// (sps and dsps) without storing them. Note that we normalize sps when using it as a
+// source to update dsps, and then normalize dsps itself before returning it.
+let sps = self.smoothed_steps_per_sec * reweight / total_weight;
let dsps = self.double_smoothed_steps_per_sec * reweight + sps * (1.0 - reweight);
-dsps / debias
+dsps / total_weight
}
}

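The normalization argument in the comment above can be checked numerically. What follows is a minimal sketch of a single-level estimator, not the code in src/state.rs: `Ewa`, `record`, and `normalized` are hypothetical names invented for illustration, and `estimator_weight` uses the example W(t) = 0.1 ^ (t/15) quoted in the comment, which is an assumption here rather than a confirmed definition. Because the per-sample weights W(t1) - W(t2) telescope to 1 - W(t_f), dividing the raw value by that total should recover a constant input rate.

```rust
/// Assumed weight function, taken from the example in the diff comment:
/// W(t) = 0.1 ^ (t / 15). The real definition may differ.
fn estimator_weight(age_secs: f64) -> f64 {
    0.1_f64.powf(age_secs / 15.0)
}

/// Hypothetical single-level estimator, for illustration only.
struct Ewa {
    /// Raw iterative estimate, initialized to 0 so that the (non-existent)
    /// samples before the first one contribute nothing.
    raw: f64,
    /// Seconds elapsed since the first sample began (t_f).
    elapsed: f64,
}

impl Ewa {
    fn new() -> Self {
        Ewa { raw: 0.0, elapsed: 0.0 }
    }

    /// Fold in a sample that lasted `delta_t` seconds at `rate` steps per second.
    fn record(&mut self, rate: f64, delta_t: f64) {
        let weight = estimator_weight(delta_t);
        self.raw = self.raw * weight + rate * (1.0 - weight);
        self.elapsed += delta_t;
    }

    /// Divide by the total weight actually present, W(0) - W(t_f) = 1 - W(t_f).
    /// This sketch does not guard against `elapsed == 0.0`, where the
    /// division is 0 / 0 and yields NaN.
    fn normalized(&self) -> f64 {
        let total_weight = 1.0 - estimator_weight(self.elapsed);
        self.raw / total_weight
    }
}

fn main() {
    let mut ewa = Ewa::new();
    // Three 5-second samples at a constant 2 steps per second.
    for _ in 0..3 {
        ewa.record(2.0, 5.0);
    }
    println!("raw = {}, normalized = {}", ewa.raw, ewa.normalized());
}
```

With a constant input of 2 steps per second over 15 seconds, the total weight is 0.9, so the raw value comes out near 1.8 and the normalized value near 2.0, up to floating-point rounding.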
@@ -713,7 +720,7 @@ mod tests {

// The first level EWA:
// -> 90% weight @ 0 eps, 9% weight @ 1 eps, 1% weight @ 0 eps
-// -> then debiased by deweighting the 1% weight (before -30 seconds)
+// -> then normalized by deweighting the 1% weight (before -30 seconds)
let single_target = 0.09 / 0.99;

// The second level EWA:
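The arithmetic behind `single_target` can be reproduced from the weight function. The sketch below assumes the example W(t) = 0.1 ^ (t/15) from the comment in `steps_per_second`, and infers the timeline from the test comment: a rate of 1 between 30 and 15 seconds ago, a rate of 0 over the last 15 seconds, and no data before that. `sample_weight` is a hypothetical helper, and `single_target` is written here as a function only for illustration.

```rust
/// Assumed weight function from the diff comment: W(t) = 0.1 ^ (t / 15).
fn estimator_weight(age_secs: f64) -> f64 {
    0.1_f64.powf(age_secs / 15.0)
}

/// Weight of a sample that ended `end_age` seconds ago and began
/// `start_age` seconds ago: W(t1) - W(t2). Hypothetical helper.
fn sample_weight(end_age: f64, start_age: f64) -> f64 {
    estimator_weight(end_age) - estimator_weight(start_age)
}

/// First-level target under the assumed timeline.
fn single_target() -> f64 {
    let recent = sample_weight(0.0, 15.0); // about 0.90, at a rate of 0
    let older = sample_weight(15.0, 30.0); // about 0.09, at a rate of 1
    let raw = recent * 0.0 + older * 1.0;
    // The remaining ~1% of weight belongs to the time before -30 seconds,
    // where there are no samples, so it is divided out.
    let total_weight = 1.0 - estimator_weight(30.0); // about 0.99
    raw / total_weight
}

fn main() {
    println!("single_target = {}", single_target());
}
```

Under these assumptions the result agrees with `0.09 / 0.99` up to floating-point rounding.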