kad: consume `FromSwarm::NewExternalAddrOfPeer` #5313

thomaseizinger · 2024-04-15T12:30:29Z

Counter-part to #5103. We should consume this event and extend our routing table.

Lines 454 to 455 in bee8199

    
           /// Informs the behaviour that we have discovered a new external address for a remote peer. 
        
           NewExternalAddrOfPeer(NewExternalAddrOfPeer<'a>),

Probably here:

rust-libp2p/protocols/kad/src/behaviour.rs

Lines 2614 to 2634 in bee8199

    
               fn on_swarm_event(&mut self, event: FromSwarm) { 
        
                   self.listen_addresses.on_swarm_event(&event); 
        
                   let external_addresses_changed = self.external_addresses.on_swarm_event(&event); 
        
                   if self.auto_mode && external_addresses_changed { 
        
                       self.determine_mode_from_external_addresses(); 
        
                   } 
        
                   match event { 
        
                       FromSwarm::ConnectionEstablished(connection_established) => { 
        
                           self.on_connection_established(connection_established) 
        
                       } 
        
                       FromSwarm::ConnectionClosed(connection_closed) => { 
        
                           self.on_connection_closed(connection_closed) 
        
                       } 
        
                       FromSwarm::DialFailure(dial_failure) => self.on_dial_failure(dial_failure), 
        
                       FromSwarm::AddressChange(address_change) => self.on_address_change(address_change), 
        
                       _ => {} 
        
                   } 
        
               } 
        
           }

drHuangMHT · 2024-04-18T02:42:38Z

I manually hooked it up to the swarm :) Glad to see this implemented in the protocol. However there's the question whether the peer wants to participate in the network when local peer participates in two or more separate DHT networks.

dariusc93 · 2024-04-26T11:51:53Z

I manually hooked it up to the swarm :) Glad to see this implemented in the protocol. However there's the question whether the peer wants to participate in the network when local peer participates in two or more separate DHT networks.

If a peer does not wish to participate in the DHT network, they can change the mode to Mode::Client. The other option would be to only add the behaviour with the respective protocol id since adding multiple protocol id to a single kad behaviour is now deprecated for supporting multiple behaviours (see #5122).

drHuangMHT · 2024-04-26T13:03:35Z

If a peer does not wish to participate in the DHT network, they can change the mode to Mode::Client. The other option would be to only add the behaviour with the respective protocol id since adding multiple protocol id to a single kad behaviour is now deprecated for supporting multiple behaviours (see #5122).

The problem is, both behaviours that use different protocols will receive NewExternalAddrOfPeer and add that peer to their routing tables, which I assume will cause the peer to be added to the wrong table.

dariusc93 · 2024-05-01T22:58:03Z

As I think about it, we could still add the peer to the routing table since in client mode, they wont be able to participate in the DHT until it is switched to server mode (either manually or if auto is set, by confirmation of an external address). This way, two or more dht implementations wont be fragmented.

EDIT: We could probably just do the following

diff --git a/protocols/kad/src/behaviour.rs b/protocols/kad/src/behaviour.rs
index fb77f3c0e..64c3a6c14 100644
--- a/protocols/kad/src/behaviour.rs
+++ b/protocols/kad/src/behaviour.rs
@@ -42,6 +42,7 @@ use libp2p_identity::PeerId;
 use libp2p_swarm::behaviour::{
     AddressChange, ConnectionClosed, ConnectionEstablished, DialFailure, FromSwarm,
 };
+use libp2p_swarm::NewExternalAddrOfPeer;
 use libp2p_swarm::{
     dial_opts::{self, DialOpts},
     ConnectionDenied, ConnectionHandler, ConnectionId, DialError, ExternalAddresses,
@@ -2050,6 +2051,13 @@ where
         }
     }
 
+    fn on_new_external_peer(
+        &mut self,
+        NewExternalAddrOfPeer { peer_id, addr }: NewExternalAddrOfPeer,
+    ) {
+        self.add_address(&peer_id, addr.clone());
+    }
+
     fn on_dial_failure(&mut self, DialFailure { peer_id, error, .. }: DialFailure) {
         let Some(peer_id) = peer_id else { return };
 
@@ -2628,6 +2636,9 @@ where
             }
             FromSwarm::DialFailure(dial_failure) => self.on_dial_failure(dial_failure),
             FromSwarm::AddressChange(address_change) => self.on_address_change(address_change),
+            FromSwarm::NewExternalAddrOfPeer(new_external_peer) => {
+                self.on_new_external_peer(new_external_peer)
+            }
             _ => {}
         }
     }

guillaumemichel · 2024-05-02T05:13:19Z

IIUC the goal here is to update the address of a peer that is already a kad routing table, not to add new peers to it.

dariusc93 · 2024-05-05T06:18:46Z

IIUC the goal here is to update the address of a peer that is already a kad routing table, not to add new peers to it.

Possible, though based on the OP, it sounds like it could be both, in which case i recall Behaviour::add_address would add a peer if it doesnt exist in the routing table or update any current entries. Not a problem to add if its just current entries through.

@thomaseizinger thoughts?

thomaseizinger · 2024-05-05T07:18:46Z

Yes, I am pretty sure add_address updates existing entries. I don't really understand the point about multiple DHTs. If an app can participate in two or more, both kademlia instances should know about it. Other protocols are not isolated from that. A node may very well speak identify, ping and many other protocols in addition to two DHT protocols. If not all discovered peers support all protocols, then you have different problems anyway. Every protocol should be designed to gracefully handle the case of the remote peer not supporting it.

If a protocol is essential for a node to operate, you should disconnect it on the Swarm level because only the swarm knows about the composition of protocols.

drHuangMHT · 2024-05-05T09:52:30Z

Yes, I am pretty sure add_address updates existing entries. I don't really understand the point about multiple DHTs. If an app can participate in two or more, both kademlia instances should know about it. Other protocols are not isolated from that. A node may very well speak identify, ping and many other protocols in addition to two DHT protocols. If not all discovered peers support all protocols, then you have different problems anyway. Every protocol should be designed to gracefully handle the case of the remote peer not supporting it.

If a protocol is essential for a node to operate, you should disconnect it on the Swarm level because only the swarm knows about the composition of protocols.

This is my understanding of the problem: Suppose there is a node sit in-between two different DHTs. Then a peer from one of the DHTs triggers NewExternalAddrOfPeer and updates the DHT it participates. Because both behaviours listen to the same swarm event source, the other behaviour will also update its own DHT, essentially making that peer participate in the other DHT, which it doesn't necessarily want to.
Because the update is triggered by NewExternalAddrOfPeer, there is no way of telling if that peer supports the other protocol due to the fact that negotiation may not actually happen or the failure is not acknowledged(lost in time), similar to manually insert a peer that the local node currently cannot reach.

thomaseizinger · 2024-05-05T11:34:59Z

essentially making that peer participate in the other DHT, which it doesn't necessarily want to.

Is there much harm in that? Would you want to validate first that a node speaks the corresponding kad protocol before returning it in queries from other nodes?

drHuangMHT · 2024-05-05T12:01:34Z

essentially making that peer participate in the other DHT, which it doesn't necessarily want to.

Is there much harm in that? Would you want to validate first that a node speaks the corresponding kad protocol before returning it in queries from other nodes?

There is essentially no harm to that single peer, but it can do harm to the network. We marked deprecated for default configuration with IPFS protocol name for the behaviour earlier with the intention of preventing other DHT networks from accidentially merging with IPFS' DHT. If both networks are large enough and different enough, records from the other network will only be unnecessary overhead since those peers contributes little.
Then there will be the problem of why participate in two networks when the other network helps little. You see, I'm running my own little project that also uses DHT for peer discovery. But with relatively low number of nodes it doesn't work well. So my idea is instead of only relying on my own network, I can also ask for help in IPFS' DHT network(which sounds parasitic isn't it). So I raised the question.
For the latter, I would say yes and no. Yes because it resolves the problem, and no because it brings in extra complexity that you usually don't need.

thomaseizinger · 2024-05-05T14:05:40Z

Currently, we only add nodes to the DHT that we dialed successfully. That guarantees that at least the IP is valid and reachable. In addition, we also check that we actually speak the same kademlia protocol, see

rust-libp2p/protocols/kad/src/behaviour.rs

Line 2239 in a91c48f

self.connection_updated(source, address, NodeStatus::Connected);

So currently, only nodes that speak the correct protocol are added to the DHT, unless you use add_address. (Other implementations might behave differently, so you cannot trust this)

Perhaps the better solution would be to dial the peer upon receiving this address to test if the peer speaks kademlia? If it does, each kademlia behaviour will add it to the corresponding routing table. Or maybe dialing should not happen in the protocols but is something that the user should do?

guillaumemichel · 2024-05-07T08:06:07Z

Ideally when a peer is dialed or changes its addresses, the swarm should emit an event, mentioning the peer and its supported protocols. Different protocols could then decide what to do with this information.

E.g if a peer supports 2 DHTs, and a remote peer only supporting 1 DHT changes its external address, the routing table of the DHT in which this remote peer participates must be updated, while the other DHT will simply ignore the event. But if this remote peer starts supporting the other DHT, then it should be added to the routing table of the second DHT.

thomaseizinger · 2024-05-07T08:55:09Z

the swarm should emit an event, mentioning the peer and its supported protocol

In rust-libp2p, identify is decoupled from the swarm so this is currently not possible and unless major changes are done, will unlikely be possible in the future.

drHuangMHT · 2024-05-07T13:53:18Z

We should probably leave this to the user because it's a very rare case. I'm just bringing it to attention.

thomaseizinger changed the title ~~kad: Consume FromSwarm::NewExternalAddrOfPeer~~ kad: consume FromSwarm::NewExternalAddrOfPeer Apr 15, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

kad: consume `FromSwarm::NewExternalAddrOfPeer` #5313

kad: consume `FromSwarm::NewExternalAddrOfPeer` #5313

thomaseizinger commented Apr 15, 2024 •

edited

drHuangMHT commented Apr 18, 2024

dariusc93 commented Apr 26, 2024

drHuangMHT commented Apr 26, 2024

dariusc93 commented May 1, 2024 •

edited

guillaumemichel commented May 2, 2024

dariusc93 commented May 5, 2024

thomaseizinger commented May 5, 2024

drHuangMHT commented May 5, 2024

thomaseizinger commented May 5, 2024

drHuangMHT commented May 5, 2024

thomaseizinger commented May 5, 2024

guillaumemichel commented May 7, 2024

thomaseizinger commented May 7, 2024

drHuangMHT commented May 7, 2024

kad: consume FromSwarm::NewExternalAddrOfPeer #5313

kad: consume FromSwarm::NewExternalAddrOfPeer #5313

Comments

thomaseizinger commented Apr 15, 2024 • edited

drHuangMHT commented Apr 18, 2024

dariusc93 commented Apr 26, 2024

drHuangMHT commented Apr 26, 2024

dariusc93 commented May 1, 2024 • edited

guillaumemichel commented May 2, 2024

dariusc93 commented May 5, 2024

thomaseizinger commented May 5, 2024

drHuangMHT commented May 5, 2024

thomaseizinger commented May 5, 2024

drHuangMHT commented May 5, 2024

thomaseizinger commented May 5, 2024

guillaumemichel commented May 7, 2024

thomaseizinger commented May 7, 2024

drHuangMHT commented May 7, 2024

kad: consume `FromSwarm::NewExternalAddrOfPeer` #5313

kad: consume `FromSwarm::NewExternalAddrOfPeer` #5313

thomaseizinger commented Apr 15, 2024 •

edited

dariusc93 commented May 1, 2024 •

edited