Dynamic CatchupParallelBlocks #5785

Open
winder opened this issue Oct 17, 2023 · 0 comments

Status

The CatchupParallelBlocks configuration tells the block download service how many blocks to download at the same time. Setting this value to 32 or higher causes many blocks to be downloaded simultaneously, which greatly increases the speed at which a node can be initialized from round 0. On the flip side, once a node has caught up, it's a waste of time to request the next 32 blocks.

For most nodes, this is not a problem. During normal operation on a node that is caught up to the current round, the block download service is only used as a fallback in the event that consensus messages are not received.

This changed with the introduction of follower nodes. This type of node does not run the consensus service; instead, it depends entirely on downloading blocks from its peers using the block download service. As a result, even once caught up, the follower node makes a number of requests equal to CatchupParallelBlocks every time Conduit (or some other application) updates the sync round, most of which are unnecessary.

So far we have recommended setting CatchupParallelBlocks to 32. This means there are 31 superfluous requests being made every round.

Expected

There should be fewer superfluous requests being made.

Solution

The block download service should have an algorithm that dynamically adjusts the effective parallelism, up to the CatchupParallelBlocks limit, based on a rolling average of successful block downloads over time. With a simple rolling average, we should be able to scale the actual parallel downloads down to 1 when the node is caught up and up to CatchupParallelBlocks when the node is initializing.
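A minimal sketch of the idea, in Go. This is not go-algorand's actual API; the `dynamicFetcher` type, its fields, and its methods are hypothetical, illustrating only the rolling-average clamp between 1 and the configured ceiling:

```go
package main

import "fmt"

// dynamicFetcher is a hypothetical sketch of scaling parallel block
// downloads using a rolling average of recent successful downloads.
type dynamicFetcher struct {
	maxParallel int       // configured CatchupParallelBlocks ceiling
	window      []float64 // successful downloads per recent round
	windowSize  int       // number of rounds in the rolling window
}

// record notes how many blocks were successfully downloaded last round.
func (d *dynamicFetcher) record(successes int) {
	d.window = append(d.window, float64(successes))
	if len(d.window) > d.windowSize {
		d.window = d.window[1:]
	}
}

// parallelism returns how many blocks to request next: the rolling
// average, clamped between 1 (caught up) and maxParallel (initializing).
func (d *dynamicFetcher) parallelism() int {
	if len(d.window) == 0 {
		return d.maxParallel // no history yet: assume we are initializing
	}
	sum := 0.0
	for _, v := range d.window {
		sum += v
	}
	avg := int(sum / float64(len(d.window)))
	if avg < 1 {
		return 1
	}
	if avg > d.maxParallel {
		return d.maxParallel
	}
	return avg
}

func main() {
	f := &dynamicFetcher{maxParallel: 32, windowSize: 8}
	// During initial catchup every parallel request succeeds,
	// so the fetcher stays at the ceiling.
	for i := 0; i < 8; i++ {
		f.record(32)
	}
	fmt.Println(f.parallelism()) // 32
	// Once caught up, only ~1 new block appears per sync-round update,
	// so the fetcher scales down to a single request.
	for i := 0; i < 8; i++ {
		f.record(1)
	}
	fmt.Println(f.parallelism()) // 1
}
```

An exponentially weighted moving average would work equally well here and avoids storing the window; the key design point is just that the signal (recent successful downloads) decays quickly once the node is caught up.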

Urgency

Medium-High

We think this may be causing additional latency in follower nodes, which translates to delayed data delivery to Conduit and the Indexer API.
