Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

eth/downloader: implement beacon sync #23982

Merged
merged 22 commits into from Mar 11, 2022
Merged
Changes from 1 commit
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
0f78c4f
eth/downloader: implement beacon sync
karalabe Sep 30, 2021
0d22561
eth/downloader: fix a crash if the beacon chain is reduced in length
karalabe Jan 11, 2022
786308d
eth/downloader: fix beacon sync start/stop thrashing data race
karalabe Jan 11, 2022
49b7771
eth/downloader: use a non-nil pivot even in degenerate sync requests
karalabe Jan 11, 2022
3f16180
eth/downloader: don't touch internal state on beacon Head retrieval
karalabe Jan 11, 2022
1c5a698
eth/downloader: fix spelling mistakes
karalabe Jan 14, 2022
344d81a
eth/downloader: fix some typos
karalabe Jan 18, 2022
944e050
eth: integrate legacy/beacon sync switchover and UX
karalabe Nov 2, 2021
0a50767
eth: handle UX wise being stuck on post-merge TTD
karalabe Feb 1, 2022
f6a427d
core, eth: integrate the beacon client with the beacon sync
karalabe Feb 2, 2022
ba9e91f
eth/catalyst: make some warning messages nicer
karalabe Feb 2, 2022
ef4d465
eth/downloader: remove Ethereum 1&2 notions in favor of merge
karalabe Feb 2, 2022
92020c9
core/beacon, eth: clean up engine API returns a bit
karalabe Feb 3, 2022
46c0e3f
eth/downloader: add skeleton extension tests
karalabe Feb 3, 2022
336b5df
eth/catalyst: keep non-kiln spec, handle mining on ttd
karalabe Feb 3, 2022
38114ad
eth/downloader: add beacon header retrieval tests
karalabe Feb 23, 2022
71861e7
eth: fixed spelling, commented failing tests out
MariusVanDerWijden Mar 7, 2022
7ac324b
eth/downloader: review fixes
karalabe Mar 9, 2022
4f2c4e9
eth/downloader: drop peers failing to deliver beacon headers
karalabe Mar 11, 2022
2cb92eb
core/rawdb: track beacon sync data in db inspect
karalabe Mar 11, 2022
bec4fd6
eth: fix review concerns
karalabe Mar 11, 2022
7c12053
internal/web3ext: nit
karalabe Mar 11, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
25 changes: 19 additions & 6 deletions eth/downloader/skeleton.go
Expand Up @@ -391,7 +391,7 @@ func (s *skeleton) sync(head *types.Header) (*types.Header, error) {
}
for {
// Something happened, try to assign new tasks to any idle peers
s.assingTasks(responses, requestFails, cancel)
s.assignTasks(responses, requestFails, cancel)

// Wait for something to happen
select {
Expand Down Expand Up @@ -615,8 +615,8 @@ func (s *skeleton) processNewHead(head *types.Header, force bool) bool {
return false
}

// assingTasks attempts to match idle peers to pending header retrievals.
func (s *skeleton) assingTasks(success chan *headerResponse, fail chan *headerRequest, cancel chan struct{}) {
// assignTasks attempts to match idle peers to pending header retrievals.
func (s *skeleton) assignTasks(success chan *headerResponse, fail chan *headerRequest, cancel chan struct{}) {
// Sort the peers by download capacity to use faster ones if many available
idlers := &peerCapacitySort{
peers: make([]*peerConnection, 0, len(s.idles)),
Expand Down Expand Up @@ -722,11 +722,22 @@ func (s *skeleton) executeTask(peer *peerConnection, req *headerRequest) {

case <-timeoutTimer.C:
// Header retrieval timed out, update the metrics
peer.log.Trace("Header request timed out", "elapsed", ttl)
peer.log.Warn("Header request timed out, dropping peer", "elapsed", ttl)
headerTimeoutMeter.Mark(1)
s.peers.rates.Update(peer.id, eth.BlockHeadersMsg, 0, 0)
s.scheduleRevertRequest(req)

// At this point we either need to drop the offending peer, or we need a
// mechanism to allow waiting for the response and not cancel it. For now
// lets go with dropping since the header sizes are deterministic and the
// beacon sync runs exclusive (downloader is idle) so there should be no
// other load to make timeouts probable. If we notice that timeouts happen
// more often than we'd like, we can introduce a tracker for the requests
// gone stale and monitor them. However, in that case too, we need a way
// to protect against malicious peers never responding, so it would need
// a second, hard-timeout mechanism.
s.drop(peer.id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it's implemented here, the timeout is based on ttl := s.peers.rates.TargetTimeout(). So this might feasibly happen:

  • One super-fast peer with good connection obtains a ttl of 5ms.
  • 10 laggy peers obtain ttls of 500ms.
  • The fast peer hits a slow-down, and takes 100ms for a delivery.
  • We drop it the fast peer, and only keep the slow ones.

Haven't checked if the numbers I made up make any sense, but you probably understand the idea.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message rate tracker tries to track the median RTT across all peers. The timeout is the median RTT multiplied by some confidence number.

In your specific scenario, if you have one super fast peer and many laggy ones, the TTL will be the laggy value since that's the median. Even if your fast peer returns a bit slow once, it should still be fine.


case res := <-resCh:
// Headers successfully retrieved, update the metrics
headers := *res.Res.(*eth.BlockHeadersPacket)
Expand Down Expand Up @@ -851,8 +862,10 @@ func (s *skeleton) processResponse(res *headerResponse) bool {

// Ensure the response is for a valid request
if _, ok := s.requests[res.reqid]; !ok {
// Request stale, perhaps the peer timed out but came through in the end
res.peer.log.Warn("Unexpected header packet")
// Some internal accounting is broken. A request either times out or it
// gets fulfilled successfully. It should not be possible to deliver a
// response to a non-existing request.
res.peer.log.Error("Unexpected header packet")
return false
}
delete(s.requests, res.reqid)
Expand Down