p2p/discover: improved node revalidation #29572

Open · wants to merge 30 commits into master
Conversation

@fjl (Contributor) commented Apr 18, 2024

Node discovery periodically revalidates the nodes in its table by sending PING, checking if they are still alive. I recently noticed some issues with the implementation of this process, which can cause strange results such as nodes dropping unexpectedly, certain nodes not getting revalidated often enough, and bad results being returned to incoming FINDNODE queries.

Let me first describe how the revalidation process worked previously:

  • We set a randomized timer (< 10s). When the timer expires, a random bucket is chosen, and the last node within that bucket is revalidated. (A simplified sketch of this loop follows after the list.)
  • The idea of revalidating the last node was taken from the original Kademlia paper. Certain contacts, such as an incoming PING, move a node to the first position of the bucket. Other events (e.g. adding nodes from NODES responses) put it at the back of the bucket. This is supposed to play out such that we always pick the node that most requires revalidation, because any successful contact moves it back to the front. The bucket basically behaves like a queue.
  • We first send a PING message to the chosen node. If it responds, we increase its livenessChecks value by one. Since the PONG also carries the node's ENR sequence number, we request the node's new ENR when the sequence number has changed.
  • If the node does not respond to PING, we immediately remove it from the bucket. In its place, we put a random node from the bucket's replacement cache (a list of recently-encountered nodes). However, this only happens if the node is still the last one in the bucket after revalidation. This condition exists because another event may have updated the node's position in the meantime, in which case it shouldn't be removed.
  • Finally, note there are some edge cases to consider. When we fetch an updated ENR from the node, it can have an updated endpoint, which might not fit into the bucket/table IP limits anymore. In that case, we can't apply the update and just stick with the older ENR. We could also drop the node at that point, but it will be dropped later anyway if it really isn't reachable on the old endpoint anymore.
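To make the old flow concrete, here is a heavily simplified sketch of one revalidation tick in Go. The types and helpers below (node, bucket, table, ping) are illustrative placeholders only, not the actual p2p/discover code:

package discover // illustrative package name, not the real one

import "math/rand"

// Placeholder types; the real p2p/discover types carry much more state.
type node struct {
	livenessChecks uint
}

type bucket struct {
	entries      []*node // live entries, front = most recently active
	replacements []*node // recently-seen candidate replacements
}

type table struct {
	buckets []*bucket
	ping    func(*node) error // sends PING, returns nil when a PONG arrives
}

// doRevalidateOld sketches one tick of the old scheme: pick a random bucket,
// then revalidate the last (least recently active) node in it.
func (tab *table) doRevalidateOld() {
	b := tab.buckets[rand.Intn(len(tab.buckets))]
	if len(b.entries) == 0 {
		return
	}
	last := b.entries[len(b.entries)-1]

	if tab.ping(last) == nil {
		// The node responded: count the check and move it to the front,
		// so it won't be picked again for a while.
		last.livenessChecks++
		b.entries = append([]*node{last}, b.entries[:len(b.entries)-1]...)
		return
	}
	// No response: remove the node, but only if it is still the last entry
	// (another event may have moved it in the meantime), and promote a
	// random replacement in its place.
	if b.entries[len(b.entries)-1] == last {
		b.entries = b.entries[:len(b.entries)-1]
		if len(b.replacements) > 0 {
			r := b.replacements[rand.Intn(len(b.replacements))]
			b.entries = append(b.entries, r)
		}
	}
}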

Now on to issues with the above process:

  • Revalidation is too slow. We check one node every ~5s on average, and the table's top 10 buckets of 16 nodes are expected to be full at all times. Assuming an even distribution across all table members, each node is checked roughly every 160 * 5s == 13.3 min. Note this interval applies to all nodes alike, even ones freshly added to the table from a query. It's just too slow to maintain a healthy table.
  • And the distribution isn't even. The concept of moving nodes around within the bucket made less sense the longer I looked at it, because it just complicates the implementation. Also, since the process chooses a random bucket first and only then picks a node, nodes in deeper buckets are revalidated more often simply because those buckets are usually less full. The distribution of revalidation requests across table nodes should be even, because all nodes may go offline with equal probability.
  • Node replacement edge cases are mostly handled correctly by the current implementation, but the code is really hard to follow, and I had a lot of trouble seeing through it. The part about not replacing the node if it's no longer last is just useless. There is also at least one code path where nodes are deleted without choosing a replacement.

Here is my proposed design for the new revalidation process:

  • We maintain two 'revalidation lists' containing the table nodes. The lists could be named 'fast' and 'slow'. (A rough sketch of this structure follows after the list.)
  • The process chooses a random node from each list on a randomized interval, the interval being shorter for the 'fast' list, and revalidates the chosen node.
  • Whenever a node is newly inserted into the table, it goes into the 'fast' list. Once validation passes, it transfers to the 'slow' list. If a request fails, or the node changes endpoint, it transfers back into 'fast'.
  • livenessChecks is incremented by one for each successful check. Unlike the old implementation, we will not drop the node on the first failing check. Instead, we quickly decay livenessChecks by dividing it by 5 or so, giving the node another chance.
  • The order of nodes within a bucket no longer matters.
  • I intend to write the implementation in a way that makes it easy to dynamically adjust the rate of revalidation requests if needed. This is important because the implementation also uses revalidation requests as an input to the endpoint predictor. We could increase activity if the predictor doesn't have enough statements, for example.
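As a rough illustration of the proposed structure, here is a sketch of the two revalidation lists, reusing the placeholder node and table types from the earlier sketch (and additionally requiring the "time" import). The 'fast'/'slow' naming, the randomized intervals and the /5 decay come from the description above; all other names and helpers are made up for illustration and are not the actual PR code:

// revalidationList groups table nodes that are checked on a shared,
// randomized interval.
type revalidationList struct {
	nodes    []*node
	interval time.Duration // base interval; shorter for the 'fast' list
}

// tableRevalidation holds the two lists from the proposal.
type tableRevalidation struct {
	fast revalidationList // newly added nodes and nodes that recently failed a check
	slow revalidationList // nodes that have passed at least one check
}

// revalidate checks one random node from the given list.
func (tr *tableRevalidation) revalidate(tab *table, list *revalidationList) {
	if len(list.nodes) == 0 {
		return
	}
	n := list.nodes[rand.Intn(len(list.nodes))]

	if tab.ping(n) == nil {
		// Successful check: count it and keep the node on the 'slow' list.
		n.livenessChecks++
		tr.moveTo(&tr.slow, n)
		return
	}
	// Failed check: don't drop the node right away. Decay its liveness score
	// (the review thread discusses /5 vs /3) and move it back to 'fast'
	// so it is rechecked sooner.
	n.livenessChecks /= 5
	tr.moveTo(&tr.fast, n)
}

// moveTo removes n from both lists and appends it to dst.
func (tr *tableRevalidation) moveTo(dst *revalidationList, n *node) {
	for _, l := range []*revalidationList{&tr.fast, &tr.slow} {
		for i, m := range l.nodes {
			if m == n {
				l.nodes = append(l.nodes[:i], l.nodes[i+1:]...)
				break
			}
		}
	}
	dst.nodes = append(dst.nodes, n)
}

Because selection is random within each list rather than position-based within a bucket, the revalidation load spreads evenly across all table nodes, which is the distribution property the proposal aims for.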

@fjl fjl requested a review from zsfelfoldi as a code owner April 18, 2024 07:59
@fjl fjl changed the title p2p/discover: new node revalidation logic p2p/discover: improved node revalidation Apr 18, 2024
Comment on lines +79 to +85
if tr.fast.nextTime == never {
return tr.slow.nextTime
}
if tr.slow.nextTime == never {
return tr.fast.nextTime
}
return min(tr.fast.nextTime, tr.slow.nextTime)
Contributor

Suggested change
if tr.fast.nextTime == never {
return tr.slow.nextTime
}
if tr.slow.nextTime == never {
return tr.fast.nextTime
}
return min(tr.fast.nextTime, tr.slow.nextTime)
return min(tr.fast.nextTime, tr.slow.nextTime)

Isn't this sufficient?


if !resp.didRespond {
// Revalidation failed.
n.livenessChecks /= 5
Contributor Author

maybe use / 3

fjl and others added 3 commits April 24, 2024 16:20
Co-authored-by: Martin HS <martin@swende.se>
This is to better reflect their purpose. The previous naming of 'seen' and 'verified'
was kind of arbitrary, especially since 'verified' was the stricter one.