
p2p: improve PEX reactor #6305

Merged
merged 27 commits into master on Apr 26, 2021

Conversation

cmwaters (Contributor) commented Apr 1, 2021

Decisions / Changes Made

SeedMode:

  • Previously, seed mode was a configuration parameter to the PEX reactor that caused nodes to crawl the network and drop connections after sharing the addresses in their peer store. Taking into account the changes made in ADR-052, we have decided to remove seed mode from the PEX reactor, so that all nodes running the PEX reactor follow the same pattern. A seed node is simply a node that runs only the PEX reactor and can therefore afford a larger number of open connections and max peers.

Discovery Algorithm:

  • I have developed a more intuitive method for deciding when we should request new peers. The previous method was to compare the address book size with an arbitrary constant (1000), which had no bearing on the size of the network or the needs of the node. Now, the PEX reactor adjusts how often it requests more addresses in inverse proportion to the number of new peers it receives: if responses contain many new addresses (a high ratio), the interval between requests decreases (a higher frequency); when a node starts to see more repeated addresses than new ones (a low ratio), the interval between requests grows and the node largely stops asking for more peers. If anyone is familiar with electronics, this is akin to pulse-width modulation. NOTE: the PEX reactor never completely stops sending requests for new peers.

DOS Protection:

  • The number of addresses in a request is capped at 100 (both sending and receiving). (We could make this a config variable.)
  • I've carried over the requestsSent map to stop peers from spuriously sending responses when no request was sent.
  • I've also carried over the lastReceivedRequests map to prevent peers from sending requests too often.

Testing:

  • Still in progress

Closes: #6271

Outstanding Work

  • Introduce a blacklist. When peers behave maliciously we give them a low score. They will likely get evicted and eventually pruned from the peer store, which then allows them to be added again. I believe we had the notion of a blacklist (or at least a banlist) in the legacy P2P stack. We should probably do the same here and keep track of peers who were actively malicious (not just poorly performing).

Comment on lines 21 to 37
const (
	// the minimum time one peer can send another request to the same peer
	minReceiveRequestInterval = 300 * time.Millisecond

	// the maximum number of addresses that can be included in a response
	maxAddresses uint16 = 100

	// allocated time to resolve a node address into a set of endpoints
	resolveTimeout = 3 * time.Second

	// how long to wait when there are no peers available before trying again
	noAvailablePeersWaitPeriod = 1 * time.Second

	// the ping rate of the PEX reactor when the peer store is full.
	// The reactor should still look to add new peers in order to flush out low
	// scoring peers that are still in the peer store
	fullCapacityInterval = 10 * time.Minute
)
cmwaters (Contributor, Author) commented Apr 1, 2021
Do we want to make any of these variables configurable?

cmwaters and others added 4 commits April 1, 2021 16:06
I made a few little cosmetic changes, and swapped the setup order so
that the test starts up (no deadlock) but it still fails.
cmwaters (Contributor, Author) commented Apr 6, 2021

So at the moment we have a disparity between a PEX address:

type PexAddress struct {
	ID   string
	IP   string
	Port uint32
}

and an Endpoint:

type Endpoint struct {
	Protocol Protocol
	IP net.IP
	Port uint16
	Path string
}

The peer manager stores NodeAddresses, which can be resolved into an array of endpoints. The PEX reactor takes this array of endpoints and creates the PexAddress that it sends to other peers like so:

pexAddresses = append(pexAddresses, protop2p.PexAddress{
	ID:   string(address.NodeID),
	IP:   endpoint.IP.String(),
	Port: uint32(endpoint.Port),
})

As you can see, this truncates the Protocol. When the PEX reactor receives a PexAddress, it turns it into a scheme-less URL string like so:

peerAddress, err := p2p.ParseNodeAddress(
	fmt.Sprintf("%s@%s:%d", pexAddress.ID, pexAddress.IP, pexAddress.Port))

As we don't have a Protocol, we default to using mconn. All this is to say that the PEX reactor is incapable of supporting any transport other than mconn. There are two options that I see: either we change the messages the PEX reactor sends to be URLs (a breaking change and something I know we've wanted to avoid), or we do a sort of hack where we attach the protocol to the node ID like so:

pexAddresses = append(pexAddresses, protop2p.PexAddress{
	ID:   fmt.Sprintf("%s://%v", endpoint.Protocol, address.NodeID),
	IP:   endpoint.IP.String(),
	Port: uint32(endpoint.Port),
})

Erik already mentions this problem in the code here:

// resolve resolves a set of peer addresses into PEX addresses.
//
// FIXME: This is necessary because the current PEX protocol only supports
// IP/port pairs, while the P2P stack uses NodeAddress URLs. The PEX protocol
// should really use URLs too, to exchange DNS names instead of IPs and allow
// different transport protocols (e.g. QUIC and MemoryTransport).

What do people think?

alexanderbez (Contributor) commented Apr 12, 2021

Either we change the messages the PEX reactor sends to be URLs (this is a breaking change and something I know we've wanted to avoid) or we do a sort of hack where we attach the protocol to the nodeID like so:

Let's avoid breaking the protocol for sure. But I'm also not totally thrilled with the idea of the hack, because the result doesn't really end up being an ID at that point.

Can we instead introduce new message(s) into the PEX reactor, where the legacy/current messages imply mconn by default and the new messages support URLs thus various possible transports?

cmwaters (Contributor, Author):
@alexanderbez would something like this work?

message PexAddressV2 {
  string url = 1;
}

message PexRequestV2 {}

message PexResponseV2 {
  repeated PexAddressV2 addresses = 1 [(gogoproto.nullable) = false];
}

message PexMessage {
  oneof sum {
    PexRequest    pex_request     = 1;
    PexResponse   pex_response    = 2;
    PexRequestV2  pex_request_v2  = 3;
    PexResponseV2 pex_response_v2 = 4;
  }
}

alexanderbez (Contributor):
@cmwaters looks good!

@cmwaters cmwaters marked this pull request as ready for review April 16, 2021 15:52
@cmwaters cmwaters requested a review from tychoish as a code owner April 22, 2021 10:55
@@ -214,7 +224,7 @@ type Node struct {
// MakeNode creates a new Node configured for the network with a
// running peer manager, but does not add it to the existing
// network. Callers are responsible for updating peering relationships.
-func (n *Network) MakeNode(t *testing.T) *Node {
+func (n *Network) MakeNode(t *testing.T, opts NodeOptions) *Node {
Contributor:
are there cases where the NodeOptions are different for different members of the network, or is it safe to just set it (and the options) on the network itself?

Contributor Author:

Hmmm, probs not. I was thinking that you may want to test a network with differently configured nodes, with the default being that they were all the same. Are you suggesting that the Network saves a copy of the options and just uses that whenever MakeNode is called?

Contributor:

I think this change is fine, but the more options you have to pass around, the harder it is to get the right thing to happen in your tests, so I'm just generically/reflexively coming down on the side of "make interfaces simpler/smaller".

-func (m *PeerManager) Add(address NodeAddress) error {
+// exists, the address is added to it if it isn't already present. This will push
+// low scoring peers out of the address book if it exceeds the maximum size.
+func (m *PeerManager) Add(address NodeAddress) (bool, error) {
Contributor:

do we ever handle the false, nil case differently than the true, nil case?

Contributor Author:

Yup, we count how many addresses are new vs. how many are already in the peer store and use that as a heuristic for calculating how often we should ping nodes for more addresses.

The thinking is that if I'm seeing a lot of new addresses, I should be pinging more often to fill up the peer store, and once I stop seeing new addresses I slow down the cadence with which I send requests.

tychoish (Contributor) left a comment:

comments/questions aside, I think once the tests pass this is in a mergeable state.

// to its upper bound.
// MaxInterval = 100 * 100 * baseTime ~= 16mins for 10 peers
// CONTRACT: Must use a write lock as nextRequestTime is updated
func (r *ReactorV2) calculateNextRequestTime() {
Contributor:
I haven't fully reviewed this in detail but I trust your judgement here 👍

Contributor Author:

Thanks. I'm hoping I've tested/thought about the heuristics here sufficiently for this to be solid.

codecov bot commented Apr 26, 2021

Codecov Report

Merging #6305 (e5467f6) into master (ee70430) will increase coverage by 0.42%.
The diff coverage is 80.35%.

@@            Coverage Diff             @@
##           master    #6305      +/-   ##
==========================================
+ Coverage   60.70%   61.12%   +0.42%     
==========================================
  Files         283      283              
  Lines       26866    26992     +126     
==========================================
+ Hits        16309    16500     +191     
+ Misses       8856     8775      -81     
- Partials     1701     1717      +16     
Impacted Files Coverage Δ
node/node.go 57.07% <0.00%> (ø)
p2p/peermanager.go 83.39% <50.00%> (-0.76%) ⬇️
p2p/transport_memory.go 85.52% <50.00%> (+0.39%) ⬆️
p2p/pex/reactor.go 81.63% <84.56%> (+81.63%) ⬆️
privval/signer_server.go 89.47% <0.00%> (-5.27%) ⬇️
evidence/reactor.go 71.68% <0.00%> (-3.54%) ⬇️
blockchain/v0/reactor.go 63.20% <0.00%> (-1.60%) ⬇️
p2p/switch.go 59.90% <0.00%> (-0.47%) ⬇️
consensus/state.go 66.75% <0.00%> (-0.10%) ⬇️
mempool/reactor.go 73.04% <0.00%> (ø)
... and 12 more

@cmwaters cmwaters merged commit 9efc20c into master Apr 26, 2021
@cmwaters cmwaters deleted the callum/p2p-seed branch April 26, 2021 11:03
tychoish pushed a commit to tychoish/tendermint that referenced this pull request May 5, 2021
Successfully merging this pull request may close these issues.

p2p: seed mode support
3 participants