
p2p: improve PEX reactor #6305

Merged
merged 27 commits into master on Apr 26, 2021

Conversation

cmwaters (Contributor) commented Apr 1, 2021

Decisions / Changes Made

SeedMode:

  • Previously, seed mode was a configuration parameter to the PEX reactor that caused nodes to crawl the network and drop connections after sharing the addresses in their peer store. Taking into account the changes made in ADR-052, we have decided to remove seed mode from the PEX reactor, so that all nodes running the PEX reactor follow the same pattern. A seed node is simply a node that runs only the PEX reactor and can therefore afford a larger number of open connections and max peers.

Discovery Algorithm:

  • I have developed a more intuitive method for deciding when we should request new peers. The previous method was to compare the address book size with an arbitrary constant (1000), which had no bearing on the size of the network or the needs of the node. Now, the PEX reactor adjusts how often it requests more addresses in inverse proportion to the number of new peers it receives: if responses contain many new addresses (a high ratio), the interval between requests decreases (a higher frequency); when a node starts to see more repeated addresses than new ones (a low ratio), the interval between requests grows and the node largely stops asking for more peers. If anyone is familiar with electronics, this is akin to pulse-width modulation. NOTE: the PEX reactor never completely stops sending requests for new peers.

DOS Protection:

  • The number of addresses in a request is capped at 100 (both sending and receiving). (We could make this a config variable.)
  • I've carried over the requestsSent map to stop peers from spuriously sending responses when no request was sent.
  • I've also carried over the lastReceivedRequests map to prevent peers from sending requests too often.

Testing:

  • Still in progress

Closes: #6271

Outstanding Work

  • Introduce a blacklist. When peers behave maliciously we give them a low score. They will likely get evicted and eventually pruned from the peer store, which then allows them to be added again. I believe we had the notion of a blacklist (or at least a banlist) in the legacy P2P stack. We should probably do the same here and keep track of peers who were actively malicious (not just poorly performing).

Comment on lines 21 to 37
const (
	// the minimum time one peer can send another request to the same peer
	minReceiveRequestInterval = 300 * time.Millisecond

	// the maximum number of addresses that can be included in a response
	maxAddresses uint16 = 100

	// allocated time to resolve a node address into a set of endpoints
	resolveTimeout = 3 * time.Second

	// how long to wait when there are no peers available before trying again
	noAvailablePeersWaitPeriod = 1 * time.Second

	// the ping rate of the PEX reactor when the peer store is full.
	// The reactor should still look to add new peers in order to flush out low
	// scoring peers that are still in the peer store
	fullCapacityInterval = 10 * time.Minute
)
cmwaters (Contributor, Author) commented Apr 1, 2021
Do we want to make any of these variables configurable?

cmwaters and others added 4 commits April 1, 2021 16:06
I made a few little cosmetic changes, and swapped the setup order so
that the test starts up (no deadlock) but it still fails.
cmwaters (Contributor, Author) commented Apr 6, 2021

So at the moment we have a disparity between a PEX address:

type PexAddress struct {
	ID   string
	IP   string
	Port uint32
}

and an Endpoint:

type Endpoint struct {
	Protocol Protocol
	IP net.IP
	Port uint16
	Path string
}

The peer manager stores NodeAddresses, which can be resolved into an array of endpoints. The PEX reactor takes this array of endpoints and creates the PexAddress that it sends to other peers like so:

pexAddresses = append(pexAddresses, protop2p.PexAddress{
	ID:   string(address.NodeID),
	IP:   endpoint.IP.String(),
	Port: uint32(endpoint.Port),
})

As you can see, this truncates the Protocol. When the PEX reactor receives a PexAddress, it turns it into a scheme-less URL string like so:

peerAddress, err := p2p.ParseNodeAddress(
	fmt.Sprintf("%s@%s:%d", pexAddress.ID, pexAddress.IP, pexAddress.Port))

As we don't have a Protocol, we default to using mconn. All this is to say that the PEX reactor is incapable of supporting any transport other than mconn. There are two options that I see: either we change the messages the PEX reactor sends to be URLs (a breaking change and something I know we've wanted to avoid), or we do a sort of hack where we attach the protocol to the node ID like so:

pexAddresses = append(pexAddresses, protop2p.PexAddress{
	ID:   fmt.Sprintf("%s://%v", endpoint.Protocol, address.NodeID),
	IP:   endpoint.IP.String(),
	Port: uint32(endpoint.Port),
})

Erik already mentions this problem in the code here:

// resolve resolves a set of peer addresses into PEX addresses.
//
// FIXME: This is necessary because the current PEX protocol only supports
// IP/port pairs, while the P2P stack uses NodeAddress URLs. The PEX protocol
// should really use URLs too, to exchange DNS names instead of IPs and allow
// different transport protocols (e.g. QUIC and MemoryTransport).

What do people think?

alexanderbez (Contributor) commented Apr 12, 2021

Either we change the messages the PEX reactor sends to be URLs (this is a breaking change and something I know we've wanted to avoid) or we do a sort of hack where we attach the protocol to the nodeID like so:

Let's avoid breaking the protocol for sure. But I'm also not totally thrilled with the idea of the hack, because the result doesn't really end up being an ID at that point.

Can we instead introduce new message(s) into the PEX reactor, where the legacy/current messages imply mconn by default and the new messages support URLs thus various possible transports?

cmwaters (Contributor, Author):
@alexanderbez would something like this work?

message PexAddressV2 {
  string url = 1;
}

message PexRequestV2 {}

message PexResponseV2 {
  repeated PexAddressV2 addresses = 1 [(gogoproto.nullable) = false];
}

message PexMessage {
  oneof sum {
    PexRequest    pex_request     = 1;
    PexResponse   pex_response    = 2;
    PexRequestV2  pex_request_v2  = 3;
    PexResponseV2 pex_response_v2 = 4;
  }
}

alexanderbez (Contributor):
@cmwaters looks good!

@cmwaters cmwaters marked this pull request as ready for review April 16, 2021 15:52
@cmwaters cmwaters requested a review from tychoish as a code owner April 22, 2021 10:55
@@ -214,7 +224,7 @@ type Node struct {
// MakeNode creates a new Node configured for the network with a
// running peer manager, but does not add it to the existing
// network. Callers are responsible for updating peering relationships.
-func (n *Network) MakeNode(t *testing.T) *Node {
+func (n *Network) MakeNode(t *testing.T, opts NodeOptions) *Node {
Contributor:
are there cases where the NodeOptions are different for different members of the network, or is it safe to just set it (and the options) on the network itself?

Contributor Author:

Hmmm, probs not. I was thinking that you may want to test a network with differently configured nodes, with the default being that they were all the same. Are you suggesting that the Network saves a copy of the options and just uses that whenever MakeNode is called?

Contributor:

I think this change is fine, but the more options you have to pass around, the harder it is to get the right thing to happen in your tests, so I'm just generically/reflexively coming down on the side of "make interfaces simpler/smaller".

-func (m *PeerManager) Add(address NodeAddress) error {
+// exists, the address is added to it if it isn't already present. This will push
+// low scoring peers out of the address book if it exceeds the maximum size.
+func (m *PeerManager) Add(address NodeAddress) (bool, error) {
Contributor:

do we ever handle the false, nil case differently than the true, nil case?

Contributor Author:

Yup, we count how many addresses are new vs. how many are already in the peer store and use that as a heuristic for calculating how often we should ping nodes for more addresses.

The thinking is that if I'm seeing a lot of new addresses, I should be pinging more often to fill up the peer store, and once I stop seeing new addresses I slow down the cadence with which I send requests.

tychoish (Contributor) left a comment:

comments/questions aside, I think once the tests pass this is in a mergeable state.

// to its upper bound.
// MaxInterval = 100 * 100 * baseTime ~= 16mins for 10 peers
// CONTRACT: Must use a write lock as nextRequestTime is updated
func (r *ReactorV2) calculateNextRequestTime() {
Contributor:
I haven't fully reviewed this in detail but I trust your judgement here 👍

Contributor Author:

Thanks. I'm hoping I've tested/thought about the heuristics here sufficiently for this to be solid.

codecov bot commented Apr 26, 2021

Codecov Report

Merging #6305 (e5467f6) into master (ee70430) will increase coverage by 0.42%.
The diff coverage is 80.35%.

@@            Coverage Diff             @@
##           master    #6305      +/-   ##
==========================================
+ Coverage   60.70%   61.12%   +0.42%     
==========================================
  Files         283      283              
  Lines       26866    26992     +126     
==========================================
+ Hits        16309    16500     +191     
+ Misses       8856     8775      -81     
- Partials     1701     1717      +16     
Impacted Files Coverage Δ
node/node.go 57.07% <0.00%> (ø)
p2p/peermanager.go 83.39% <50.00%> (-0.76%) ⬇️
p2p/transport_memory.go 85.52% <50.00%> (+0.39%) ⬆️
p2p/pex/reactor.go 81.63% <84.56%> (+81.63%) ⬆️
privval/signer_server.go 89.47% <0.00%> (-5.27%) ⬇️
evidence/reactor.go 71.68% <0.00%> (-3.54%) ⬇️
blockchain/v0/reactor.go 63.20% <0.00%> (-1.60%) ⬇️
p2p/switch.go 59.90% <0.00%> (-0.47%) ⬇️
consensus/state.go 66.75% <0.00%> (-0.10%) ⬇️
mempool/reactor.go 73.04% <0.00%> (ø)
... and 12 more

@cmwaters cmwaters merged commit 9efc20c into master Apr 26, 2021
@cmwaters cmwaters deleted the callum/p2p-seed branch April 26, 2021 11:03
tychoish pushed a commit to tychoish/tendermint that referenced this pull request May 5, 2021
Successfully merging this pull request may close these issues.

p2p: seed mode support
3 participants