Progress bar bitrate intermittently drops on the sending side #192

Open
afontenot opened this issue Apr 17, 2023 · 10 comments

@afontenot
Contributor

Note: I refer to bitrate in this issue (I patch the progress bar format to display it), but the issue can be observed in the ETA, which shows sudden upward movements on the sending end. The left-to-right movement of the bar is also jerky on the sending side.

I frequently (multiple times over the course of any large transfer) observe the following behavior:

  1. The sending and receiving end show roughly the same bitrate and ETA.

  2. Over the course of a few seconds, the sending side bitrate drops to about 10% of normal. It slowly recovers.

  3. Despite the fact that the sending side bitrate has supposedly dropped, the receiving side bitrate never drops! It shows (roughly) the same, consistent bitrate until the download completes.

Obviously, the receiving end is not actually receiving faster than the transferring side is transferring.

I suspect the issue is with a buffer somewhere - the receiver pulls data out of this buffer at a constant rate, but the sender is not necessarily filling it at a constant rate. If the time to drain the buffer significantly exceeds the amount of time over which the progress bar averages the transfer rate, then it will fluctuate unexpectedly.

I note that if I set the enable_steady_tick duration to 1 sec (instead of 250 ms), it seems to behave much better. This appears to cause indicatif to average the performance over a long enough time to remove this buffer-filling noise.
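
A toy model of this hypothesis, as a sketch only. It assumes (hypothetically) that the buffer accepts data from the sender in fixed-size bursts while the receiver drains it at a constant rate, and that the estimator averages over the trailing 15 samples. The function names, burst size, and burst period are illustrative assumptions and are not taken from indicatif or from this project.

```rust
/// Bytes the sender has handed to the buffer by time `t_ms`, in a toy model
/// where the buffer accepts one burst of `burst_bytes` every `period_ms`.
fn sender_position(t_ms: u64, burst_bytes: u64, period_ms: u64) -> u64 {
    (t_ms / period_ms + 1) * burst_bytes
}

/// Rates (bytes/s) seen by an estimator that samples the sender position
/// every `tick_ms` and averages over the trailing `window_ticks` samples.
fn windowed_rates(
    total_ms: u64,
    tick_ms: u64,
    window_ticks: usize,
    burst_bytes: u64,
    period_ms: u64,
) -> Vec<f64> {
    let positions: Vec<u64> = (0..total_ms / tick_ms)
        .map(|i| sender_position(i * tick_ms, burst_bytes, period_ms))
        .collect();
    let window_secs = (window_ticks as u64 * tick_ms) as f64 / 1000.0;
    positions
        .windows(window_ticks + 1)
        .map(|w| (w[window_ticks] - w[0]) as f64 / window_secs)
        .collect()
}

/// Difference between the highest and lowest rate (0.0 for an empty slice).
fn spread(rates: &[f64]) -> f64 {
    if rates.is_empty() {
        return 0.0;
    }
    let max = rates.iter().cloned().fold(f64::MIN, f64::max);
    let min = rates.iter().cloned().fold(f64::MAX, f64::min);
    max - min
}
```

Calling `windowed_rates(60_000, 250, 15, 8_640_000, 3_200)` models a 2.7 MB/s drain with 250 ms ticks, and swapping 250 for 1_000 models 1 sec ticks. In this model the shorter window has the larger `spread`, because it covers only one or two bursts at a time. The model does not reproduce the drop to about 10% of normal, so it can only be part of the explanation.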

This is all just my speculation - if it's bogus, steps to debug this further would be appreciated.

@piegamesde
Member

Yes, you are basically just observing standard network buffering. You can even observe that the sender and receiver report slightly different progress; the difference is the size of the network buffer. This should be smaller with a direct connection than with a relay. (This used to be even more pronounced because, for a long time, the relay server had a bug where it did not apply back pressure, i.e. it had an unbounded buffer.)

This cannot be fixed, only hidden, and I'm not sure if that's worth the effort. Users should already be used to it: Many file transfer operations between local drives start quick and then drop in speed when the OS buffer is full and actually needs to start writing.

Note that receivers may experience this too if their network connection is faster than their file system, though that is pretty rare given how common SSDs are.

@afontenot
Contributor Author

Thanks - I think the thing I'm not used to seeing in other transfer tools is that the transfer speed drops significantly below the actual long-term average, and then recovers. If it was just a matter of "starting fast" and then the progress bar slowing down to the true rate once the buffer was full, that would be understandable.

Looks like there's an upstream bug in indicatif here: console-rs/indicatif#394

The issue seems to be that they just use the last 15 ticks for an average. On the sending end that's only 3.75 seconds, a small enough amount of time that you end up waiting with a full buffer a significant amount of the time. Maybe the right move is just to wait for them to fix it (there's discussion of switching to an exponential average).
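
To make the two approaches concrete, here is a minimal sketch of a fixed-window mean next to an exponential average. Both types (`WindowAverage`, `ExpAverage`) and the `alpha` parameter are hypothetical names for illustration; this is not indicatif's actual code, only the general technique under discussion.

```rust
use std::collections::VecDeque;

/// Mean of the most recent `capacity` samples (e.g. 15 per-tick rates).
struct WindowAverage {
    samples: VecDeque<f64>,
    capacity: usize,
}

impl WindowAverage {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "window must hold at least one sample");
        Self { samples: VecDeque::with_capacity(capacity), capacity }
    }

    /// Adds a sample, evicting the oldest if the window is full,
    /// and returns the current mean.
    fn push(&mut self, sample: f64) -> f64 {
        if self.samples.len() == self.capacity {
            self.samples.pop_front();
        }
        self.samples.push_back(sample);
        self.samples.iter().sum::<f64>() / self.samples.len() as f64
    }
}

/// Exponentially weighted average: old samples fade out gradually
/// instead of dropping out of a window all at once.
struct ExpAverage {
    alpha: f64, // weight of the newest sample, in (0, 1]
    value: Option<f64>,
}

impl ExpAverage {
    fn new(alpha: f64) -> Self {
        assert!(alpha > 0.0 && alpha <= 1.0, "alpha must be in (0, 1]");
        Self { alpha, value: None }
    }

    /// Adds a sample and returns the updated average.
    fn push(&mut self, sample: f64) -> f64 {
        let next = match self.value {
            None => sample,
            Some(prev) => self.alpha * sample + (1.0 - self.alpha) * prev,
        };
        self.value = Some(next);
        next
    }
}
```

With a window of 15 samples at 250 ms per tick, `WindowAverage` only ever sees the last 3.75 seconds. A small `alpha` in `ExpAverage` gives a longer effective memory without storing more samples.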

Incidentally, all my tests have been with direct connections, not through the relay.

@piegamesde
Member

I have trouble reproducing what you are describing here. What kind of setup are you using for testing?

@afontenot
Contributor Author

On the receiving end, a virtual server in a nearby Google Cloud region. On the sending end, a server with a direct ethernet connection to a cable modem with ~25 Mbps upload. Using sftp to the server gives me an extremely steady 2.7 MB/s estimated bitrate throughout the transfer, which is typical for me to any reasonably close server.

I've recorded a video so you can see what I'm talking about. Actually, in this case not only does the sending end (on the bottom) have quite a few downward fluctuations in the bitrate estimate, it seems to spend enough of its time at 2 Mbps or below that I'm concerned it couldn't possibly be accurate. Note that the ETA gets similarly messed up whenever the bitrate is wrong.

Calculating the true bitrate from the total transfer time, I get ~2.7 MB/sec, as expected.
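
For reference, the arithmetic behind that check can be sketched as below. The file size and duration in the usage note are hypothetical, since the actual numbers are not given in this thread, and the helper names are made up for illustration.

```rust
use std::time::Duration;

/// Long-term average rate in bytes per second.
fn average_rate(total_bytes: u64, elapsed: Duration) -> f64 {
    total_bytes as f64 / elapsed.as_secs_f64()
}

/// Converts a link speed in megabits per second to megabytes per second.
fn megabits_to_megabytes(mbps: f64) -> f64 {
    mbps / 8.0
}
```

For example, a hypothetical 810 MB file sent in 300 seconds gives `average_rate(810_000_000, Duration::from_secs(300))`, which is 2.7 MB/s. A 25 Mbps uplink tops out at `megabits_to_megabytes(25.0)`, i.e. 3.125 MB/s, so 2.7 MB/s is a plausible sustained rate for this connection.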

mwrs_issue.mp4

This is MWRS built from the master branch with two patches: one to change the visual output to include bitrate, and the other to get the receiving end to update every 250 ms rather than updating constantly.

@afontenot
Contributor Author

I just now noticed from the video that the top progress bar updates very regularly, almost exactly every 1/4 second (as expected). The bottom progress bar (sending side) is much more irregular, and furthermore only updates (on average) every ~1/2 second. This suggests to me

  1. that there's some issue causing the steady tick (enable_steady_tick) to not function as expected, and
  2. that indicatif is probably dividing the bytes transferred by the wrong amount of time as a result, so the bitrate shown is about 1/2 the correct value (on average).

Probably some aspect of this explains the abrupt bitrate crashes as well.

@piegamesde
Member

As far as I can tell this is an indicatif issue, and the best thing that we can do for now is to continue not displaying the transfer bitrate.

@afontenot
Contributor Author

Do you have any ideas for debugging this? Note that the ETA is similarly affected. If this issue doesn't affect most users, I don't see the harm in enabling the bitrate display.

@piegamesde
Member

Maybe start by sanity-checking that the send side is calling the progress update at a regular, stable interval with sane values. The rest should be in indicatif, so I have no idea beyond that.
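
One way such a check could look, as a sketch: log a timestamp and byte position at each progress update on the send side, then summarize the gaps. The `Update` and `Report` types and the `check_updates` function are hypothetical and not part of this project or of indicatif.

```rust
/// One logged progress update: when it happened and the reported position.
struct Update {
    at_ms: u64,
    position: u64,
}

/// Summary of how regular the logged updates were.
struct Report {
    max_gap_ms: u64,
    mean_gap_ms: f64,
    /// True if the reported position never went backwards.
    monotonic: bool,
}

/// Summarizes the gaps between consecutive updates.
/// Returns `None` when there are fewer than two updates to compare.
fn check_updates(updates: &[Update]) -> Option<Report> {
    if updates.len() < 2 {
        return None;
    }
    let mut max_gap_ms = 0;
    let mut total_gap_ms = 0;
    let mut monotonic = true;
    for pair in updates.windows(2) {
        // saturating_sub avoids a panic if timestamps are out of order.
        let gap = pair[1].at_ms.saturating_sub(pair[0].at_ms);
        max_gap_ms = max_gap_ms.max(gap);
        total_gap_ms += gap;
        if pair[1].position < pair[0].position {
            monotonic = false;
        }
    }
    let mean_gap_ms = total_gap_ms as f64 / (updates.len() - 1) as f64;
    Some(Report { max_gap_ms, mean_gap_ms, monotonic })
}
```

With a 250 ms steady tick, a `mean_gap_ms` near 500 or a `max_gap_ms` above one second would point at irregular calls on the send side rather than at the averaging itself.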

@afontenot
Contributor Author

afontenot commented Apr 23, 2023

As expected, my write patterns are extremely bursty. I regularly see waits of over half a second, sometimes over one second. However, this doesn't appear to be enough to explain the problem: when I attempt to recreate indicatif's moving average, it's reasonably smooth, and the values are otherwise sane (e.g. the reads total the true size of the file).

I've made a simple tool to allow replaying indicatif logs, and filed a bug with them here: console-rs/indicatif#534

@afontenot
Contributor Author

This should be fixed in the latest release of indicatif: https://github.com/console-rs/indicatif/releases/tag/0.17.5
