p2p/discover: improved node revalidation #29572
Open
fjl wants to merge 30 commits into ethereum:master from fjl:discover-reval-new
+875 −512
fjl changed the title from "p2p/discover: new node revalidation logic" to "p2p/discover: improved node revalidation" on Apr 18, 2024
holiman reviewed on Apr 24, 2024
Comment on lines +79 to +85
	if tr.fast.nextTime == never {
		return tr.slow.nextTime
	}
	if tr.slow.nextTime == never {
		return tr.fast.nextTime
	}
	return min(tr.fast.nextTime, tr.slow.nextTime)
Suggested change
-	if tr.fast.nextTime == never {
-		return tr.slow.nextTime
-	}
-	if tr.slow.nextTime == never {
-		return tr.fast.nextTime
-	}
-	return min(tr.fast.nextTime, tr.slow.nextTime)
+	return min(tr.fast.nextTime, tr.slow.nextTime)
Isn't this sufficient?
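Whether the one-line form is sufficient depends on how never is defined, which this excerpt does not show. Here is a minimal sketch, assuming never is the largest representable time value; the absTime type and both function names are hypothetical stand-ins, not the PR's code. Under that assumption, a list that is never due can only win the minimum when both lists are never due, so the two forms agree.

```go
package main

import (
	"fmt"
	"math"
)

// absTime is a hypothetical stand-in for the clock type used in the PR.
type absTime int64

// never is ASSUMED to be the largest representable time.
const never = absTime(math.MaxInt64)

// minTime returns the smaller of two times.
func minTime(a, b absTime) absTime {
	if a < b {
		return a
	}
	return b
}

// nextTimeLong mirrors the seven-line version under review.
func nextTimeLong(fast, slow absTime) absTime {
	if fast == never {
		return slow
	}
	if slow == never {
		return fast
	}
	return minTime(fast, slow)
}

// nextTimeShort mirrors the suggested one-line version.
func nextTimeShort(fast, slow absTime) absTime {
	return minTime(fast, slow)
}

func main() {
	cases := [][2]absTime{{never, 5}, {5, never}, {never, never}, {3, 7}}
	for _, c := range cases {
		long := nextTimeLong(c[0], c[1])
		short := nextTimeShort(c[0], c[1])
		fmt.Printf("fast=%d slow=%d agree=%v\n", c[0], c[1], long == short)
	}
}
```

If never were instead a zero or negative sentinel, the explicit checks would still be needed, because a plain minimum would then always pick the sentinel.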
fjl commented on Apr 24, 2024
	if !resp.didRespond {
		// Revalidation failed.
		n.livenessChecks /= 5
Maybe use / 3 instead of / 5?
Co-authored-by: Martin HS <martin@swende.se>
This is to better reflect their purpose. The previous naming of 'seen' and 'verified' was kind of arbitrary, especially since 'verified' was the stricter one.
Node discovery periodically revalidates the nodes in its table by sending PING, checking if they are still alive. I recently noticed some issues with the implementation of this process, which can cause strange results such as nodes dropping unexpectedly, certain nodes not getting revalidated often enough, and bad results being returned to incoming FINDNODE queries.
Let me first describe how the revalidation process worked previously:
[...] livenessChecks value by one. Since PONG also carries the node's ENR sequence number, we request the node's new ENR when it has changed.
Now on to issues with the above process:
[...] 160 * 5s == 13.3 min. Note that this time also applies to all nodes, even the ones freshly added to the table from a query. It's just too slow to maintain a healthy table.
Here is my proposed design for the new revalidation process:
[...] livenessChecks is incremented by one for successful checks. Unlike the old implementation, we will not drop the node on the first failing check. We instead quickly decay the livenessChecks by / 5 or so to give it another chance.
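The decay rule can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the node type, handleResult, and decayDivisor are hypothetical names, and dropping a node once its counter reaches zero is an assumption that the description above does not spell out.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a table entry.
type node struct {
	livenessChecks uint
}

// decayDivisor is the "/ 5 or so" factor from the description.
// The review thread also floats 3 as an alternative.
const decayDivisor = 5

// handleResult applies one revalidation result to n and reports whether
// the node should stay in the table. Dropping at zero is an ASSUMPTION.
func handleResult(n *node, didRespond bool) (keep bool) {
	if didRespond {
		// Successful check: count it.
		n.livenessChecks++
		return true
	}
	// Failed check: decay quickly instead of dropping immediately.
	n.livenessChecks /= decayDivisor
	return n.livenessChecks > 0
}

func main() {
	n := &node{livenessChecks: 12}
	for i := 1; ; i++ {
		keep := handleResult(n, false)
		fmt.Printf("failure %d: livenessChecks=%d keep=%v\n", i, n.livenessChecks, keep)
		if !keep {
			break
		}
	}
}
```

With these numbers, a node that has passed 12 checks survives one failure (12 / 5 = 2) and is dropped on the second (2 / 5 = 0), while a node with fewer than 5 passed checks is dropped on its first failure. A smaller divisor such as 3 would give established nodes more chances before removal.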