
Fix incorrect seconds_per_step calculation #535

Closed
wants to merge 1 commit

Conversation

afontenot
Contributor

@afontenot afontenot commented Apr 24, 2023

seconds_per_step looks at a ring buffer with samples taken each tick. Each sample is

<duration of the tick> / <number of progress steps>

This results in an incorrect seconds_per_step when it is calculated as a raw average of the buffer values. This is because we have no reason to assume any tick is a more representative sample than any other tick, yet the raw average treats ticks with very few steps as more representative than ticks with many steps, because the value of the denominator is lost in the sample.

The result is a very poor ETA / bitrate estimation when step production follows a burst-wait pattern, since ticks with few or no steps are to be expected.
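
To make the bias concrete, here is a small standalone sketch (not the crate's actual code) comparing the raw average of per-tick samples against the total-time-over-total-steps rate for a hypothetical burst-wait workload:

    // Each tick contributes one sample of <tick duration> / <steps in tick>;
    // averaging those samples lets the slow, low-step tick dominate.
    fn main() {
        // Hypothetical burst-wait pattern: (tick duration in seconds, steps in that tick).
        let ticks = [(0.1_f64, 100_u64), (0.1, 100), (1.0, 1)];

        // Old approach: raw average of the per-tick seconds-per-step samples.
        let naive: f64 =
            ticks.iter().map(|(d, s)| *d / *s as f64).sum::<f64>() / ticks.len() as f64;

        // Unbiased rate over the same window: total time divided by total steps.
        let total_time: f64 = ticks.iter().map(|(d, _)| *d).sum();
        let total_steps: u64 = ticks.iter().map(|(_, s)| *s).sum();
        let actual = total_time / total_steps as f64;

        println!("naive seconds_per_step:  {naive:.4}");  // ~0.3340, dominated by the slow tick
        println!("actual seconds_per_step: {actual:.4}"); // ~0.0060
    }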

A better estimate could be achieved by averaging over the step-rate of each tick, but this would still not be ideal since it would treat short ticks as equally representative as long ticks in the average.

Instead, we make two changes:

First, we change the buffer to store the total number of steps rather than the difference between steps. This necessitates changing the new / reset code to set the first buffer item to zero.

Second, we create a second ring buffer to store the timestamp of each tick. As a result, we can calculate the seconds_per_step rate by simply looking at the earliest and latest values in the buffers, and returning (now() - first_time) / (last_steps - first_steps).

By using now() instead of last_time in the above equation, we also fix an additional issue. Since we only add items to the buffer when new steps occur, the progress bar freezes and the ETA does not update when progress stalls. By always using the current time, we get a more accurate seconds_per_step estimate with no additional overhead.
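
For concreteness, a rough sketch of this scheme, using VecDeque as a stand-in ring buffer (the type and field names here are illustrative, not the actual Estimator internals), could look like:

    use std::collections::VecDeque;
    use std::time::Instant;

    struct RateEstimator {
        steps: VecDeque<u64>,             // cumulative step totals recorded at each tick
        step_instants: VecDeque<Instant>, // timestamp of each recorded tick
        capacity: usize,
    }

    impl RateEstimator {
        fn new(now: Instant) -> Self {
            let mut est = RateEstimator {
                steps: VecDeque::new(),
                step_instants: VecDeque::new(),
                capacity: 16,
            };
            // The first buffer item is zero, as described above.
            est.steps.push_back(0);
            est.step_instants.push_back(now);
            est
        }

        fn record(&mut self, total_steps: u64, now: Instant) {
            // Evict the oldest entry once the ring buffers are full.
            if self.steps.len() == self.capacity {
                self.steps.pop_front();
                self.step_instants.pop_front();
            }
            self.steps.push_back(total_steps);
            self.step_instants.push_back(now);
        }

        // (now() - first_time) / (last_steps - first_steps); when no steps have
        // been recorded this is 0.0 / 0.0 == NaN, which callers must handle.
        fn seconds_per_step(&self, now: Instant) -> f64 {
            let steps = self.steps.back().unwrap() - self.steps.front().unwrap();
            let mut duration = 0.0;
            if steps != 0 {
                duration = (now - *self.step_instants.front().unwrap()).as_secs_f64();
            }
            duration / steps as f64
        }
    }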

Fixes #534.

@afontenot
Contributor Author

The following two changes required modifying the tests:

    changing the new / reset code to set the first buffer item to zero.

This means that a newly initialized ring buffer contains 1 element, not 0.

    using now() instead of last_time in the above equation

Depending on the latest time when calculating seconds_per_step means that the test which checks the rate calculation can no longer be exact. Instead, I check that it falls between the rates expected for two very close Instant values, taken immediately before and after the call to seconds_per_step.
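
Roughly, the new assertion follows this pattern (written here against the illustrative RateEstimator sketched in the description rather than the actual Estimator, so the names are assumptions):

    use std::time::{Duration, Instant};

    #[test]
    fn seconds_per_step_is_bracketed() {
        let start = Instant::now();
        let mut est = RateEstimator::new(start);
        // Pretend 100 steps had been recorded one second after creation.
        est.record(100, start + Duration::from_secs(1));

        // Sample the clock just before and just after the call under test.
        let before = Instant::now();
        let rate = est.seconds_per_step(Instant::now());
        let after = Instant::now();

        // seconds_per_step uses the current time internally, so the result
        // must lie between the rates implied by the two bracketing instants.
        let lower = (before - start).as_secs_f64() / 100.0;
        let upper = (after - start).as_secs_f64() / 100.0;
        assert!(lower <= rate && rate <= upper);
    }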

@djc
Collaborator

djc commented Apr 26, 2023

I'm not able to review this in detail for the next week or two, but here is some early feedback:

  • Given the sensitivity of this code, it would be great if you could add a whole bunch more comments to the parts of the Estimator that you changed to elaborate what is going on and why.
  • It looks like you modified existing tests to work but didn't add much test coverage? In particular, it would be nice to explicitly test behavior for the motivating use case of ticks with more steps weighing more than ticks with fewer steps as well as the additional issue you noted (potential freezes).
  • Tests for any overflow/underflow/divide-by-zero scenarios would also be great to have.

Thanks again for working on this, your contribution is much appreciated!

@afontenot
Contributor Author

@djc thanks, I'll add some comments and improve the tests

Quick note on this:

Tests for any overflow/underflow/divide-by-zero scenarios would also be great to have.

Yep, the status quo here is a bit of a mess actually. You can easily end up dividing by zero if, e.g., time passes but no steps occur. The code actually relies on ensuring that the numerator in the rate calculation is always zero whenever the denominator is, so that you get a NaN float, and it then explicitly tests for this elsewhere in the code.

I believe I imitated the existing behavior in this area exactly, which is the reasoning behind the strange code here:

    // If no steps occurred, leave duration at 0.0 so the rate works out
    // to 0.0 / 0.0 == NaN, matching the pre-existing behavior.
    let mut duration: f64 = 0.0;
    if steps != 0 {
        duration = duration_to_secs(now - self.step_instants[first_pos]);
    }

But it would be good to test for this, or maybe change the behavior to not rely on NaN floats. Of course the latter would require having an opinion on the correct value to show for the ETA in the case that time has passed but no progress has been made. As I didn't have an opinion, I fell back on maintaining the status quo.
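
As a tiny standalone illustration of that convention (the real guard lives elsewhere in the crate and may differ in detail):

    fn main() {
        // Time passed, but no steps occurred.
        let steps: u64 = 0;
        // Force the numerator to zero whenever the denominator is zero...
        let duration = if steps != 0 { 5.0 } else { 0.0 };
        // ...so the division produces NaN instead of +infinity.
        let seconds_per_step = duration / steps as f64;
        assert!(seconds_per_step.is_nan());

        // A caller rendering an ETA can then detect the stalled case explicitly;
        // the fallback value shown here is purely hypothetical.
        let remaining_steps = 42.0;
        let eta_secs = if seconds_per_step.is_nan() {
            0.0
        } else {
            seconds_per_step * remaining_steps
        };
        println!("eta: {eta_secs}s");
    }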

@afontenot
Contributor Author

Just a heads up that I have put improvements for this PR on hold until my work on an exponentially weighted average can see some review. If we end up switching the estimator to use that approach, this PR will become irrelevant and need to be closed.

@afontenot
Contributor Author

Going to close this for now as it looks like we will be able to merge some form of an exponentially weighted average, which solves this problem in a different way.

@afontenot afontenot closed this May 21, 2023
Successfully merging this pull request may close the following issue:

With burst-heavy writes, eta and bitrate show incorrect values when using enable_steady_tick (#534)