NAT traversal: QUIC Hole Punching #1015

aarshkshah1992 · 2020-11-05T03:25:38Z

We now have full-fledged support for QUIC, which is a UDP-based protocol.
We also have a variant of the “STUN” protocol in the (Identify + AutoNAT) implementation which can inform peers about their publicly dialable addresses with some confidence.
We need to introduce hole punching for the QUIC transport by getting two peers to attempt to simultaneously connect to each other on their advertised public addresses to punch a hole in their NAT for the other peer.
We can use Circuit Relays for this co-ordination as detailed here.
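The simultaneous-connect idea can be sketched with plain UDP sockets. This is only an illustration under loud assumptions: it uses raw datagrams on loopback rather than QUIC, there is no NAT in between, and `punch` and `simultaneousPunch` are hypothetical names, not go-libp2p APIs. The point it shows is that each peer sends first (which, behind a real NAT, is what would open the mapping) and then waits for the other peer's datagram.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// punch sends probe datagrams to remote and waits for one to arrive from it.
// Behind a real NAT, the outbound probe is what would open the mapping that
// lets the other peer's probe through.
func punch(conn *net.UDPConn, remote *net.UDPAddr, attempts int) (string, error) {
	buf := make([]byte, 64)
	for i := 0; i < attempts; i++ {
		if _, err := conn.WriteToUDP([]byte("punch"), remote); err != nil {
			return "", err
		}
		if err := conn.SetReadDeadline(time.Now().Add(200 * time.Millisecond)); err != nil {
			return "", err
		}
		n, from, err := conn.ReadFromUDP(buf)
		if err != nil {
			continue // most likely a read timeout: send another probe
		}
		if from.IP.Equal(remote.IP) && from.Port == remote.Port {
			return string(buf[:n]), nil
		}
	}
	return "", fmt.Errorf("no datagram from %s after %d attempts", remote, attempts)
}

// simultaneousPunch binds two loopback sockets standing in for Alice and Bob
// and has each of them punch towards the other at the same time.
func simultaneousPunch() (fromBob, fromAlice string, err error) {
	loopback := &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1)} // port 0: the OS picks one
	alice, err := net.ListenUDP("udp4", loopback)
	if err != nil {
		return "", "", err
	}
	defer alice.Close()
	bob, err := net.ListenUDP("udp4", loopback)
	if err != nil {
		return "", "", err
	}
	defer bob.Close()

	aliceAddr := alice.LocalAddr().(*net.UDPAddr)
	bobAddr := bob.LocalAddr().(*net.UDPAddr)

	type result struct {
		msg string
		err error
	}
	bobDone := make(chan result, 1)
	go func() {
		msg, err := punch(bob, aliceAddr, 5)
		bobDone <- result{msg, err}
	}()

	fromBob, err = punch(alice, bobAddr, 5)
	r := <-bobDone // always wait for Bob before the sockets are closed
	if err != nil {
		return "", "", err
	}
	if r.err != nil {
		return "", "", r.err
	}
	return fromBob, r.msg, nil
}
```

Calling `simultaneousPunch()` from a `main` function exercises both sides. In a real deployment the two addresses would be the public addresses learned via Identify + AutoNAT, and the timing would be co-ordinated through the relay.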

aschmahmann · 2020-11-05T08:12:31Z

@aarshkshah1992 what is the advantage of doing a connection upgrade over a relay vs a specific coordination protocol?

Perhaps I'm missing something, but here's an example of how I'm looking at it. Say we have three peers, Alice, Bob, and Relay, where Alice is trying to connect to Bob using Relay.

Connection Upgrade:

Strategy
- Alice asks Relay to connect her to Bob
- Alice + Bob coordinate hole punching and a separate direct connection using some new protocol (e.g. /p2p/holepunch/1.0.0)
- Alice is now connected to Bob + Relay
Advantages
- We have a relay transport
- It allows us to upgrade the holepunch protocol on Alice + Bob without touching the Relay
- Allows fallback to full communication over Relay if holepunching fails
Disadvantages
- Requires us to figure out how to limit relay abilities to prevent abuse (e.g. R is only going to let A send X bytes/second) where X is normally really small
- Likely requires us to figure out how R should tell A its terms + conditions, such as
  - that it's not a full relay and that A should not be trying to use it as a full relay
  - that it will only serve connections that it feels responsible for (e.g. R might decide it'll help anyone in the world connect to B, but not waste bandwidth letting other people connect to A)
- Need to figure out how to get go-libp2p to deal with multiple connections to the same PeerID

Separate Protocol:

Strategy (rough draft)
- Alice asks Relay to do a holepunch with Bob via some new protocol (e.g. /p2p/holepunch/1.0.0)
- Relay either responds "I don't know/am not connected to Bob", or "ok"
- Bob tries to directly dial Alice, and if that fails Bob asks Relay to orchestrate a holepunch with Alice
Advantages
- New protocol that only orchestrates NAT traversal means no one should be attempting to use it for communication
  - Perhaps the protocol ends up with user-data such that it technically "could" be used for communication, but that abuse seems pretty easy to prevent
- No need to do any libp2p plumbing related to getting Alice + Bob to talk to each other + upgrade, just for the low level UDP holepunching itself
Disadvantages
- Makes the new protocol a little more complicated/state-machine like, or requires multiple new protocols
- Alice and/or Bob have to find a fallback relay to talk to in the event holepunching is unsuccessful
  - Although at least here they know they can't use Relay, whereas above they might be confused unless we upgrade the circuit-relay protocol
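The rough-draft strategy for the separate protocol can be sketched as a small decision flow. Everything named here (`Relay`, `RequestPunch`, `Connect`, the reply strings, `ErrUnreachable`) is hypothetical and only mirrors the steps listed above; no wire format or real libp2p stream handling is implied.

```go
package main

import "errors"

// PeerID is a stand-in for a libp2p peer ID (hypothetical, for illustration).
type PeerID string

// Reply is what the relay answers to a holepunch request.
type Reply string

const (
	ReplyOK           Reply = "ok"
	ReplyNotConnected Reply = "not connected"
)

// ErrUnreachable is a sample error a failed direct dial might return.
var ErrUnreachable = errors.New("peer unreachable")

// Relay only tracks which peers currently hold a connection to it.
// It never forwards application data.
type Relay struct {
	connected map[PeerID]bool
}

// NewRelay builds a relay that is connected to the given peers.
func NewRelay(peers ...PeerID) *Relay {
	r := &Relay{connected: make(map[PeerID]bool)}
	for _, p := range peers {
		r.connected[p] = true
	}
	return r
}

// RequestPunch answers a request from `from` to holepunch with `target`.
func (r *Relay) RequestPunch(from, target PeerID) Reply {
	if !r.connected[from] || !r.connected[target] {
		return ReplyNotConnected
	}
	return ReplyOK
}

// Connect tries a direct dial first and only asks the relay to orchestrate
// a holepunch when that dial fails. The returned string names the path taken.
func Connect(self, target PeerID, dial func(PeerID) error, relay *Relay) string {
	if err := dial(target); err == nil {
		return "direct"
	}
	if relay.RequestPunch(self, target) == ReplyOK {
		return "holepunch"
	}
	return "need fallback relay"
}
```

The last branch corresponds to the disadvantage noted above: when the relay cannot help, the peers have to find some other relay to talk through.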

aarshkshah1992 · 2020-11-05T10:28:29Z

@aschmahmann I am not sure what you mean.

The circuit upgrade over Relay will use a new protocol but the bytes for that protocol will be relayed over the Relay server.
In the existing solution, Alice will first try to dial Bob directly using the addresses it sees in the Identify protocol and then use the Relay to co-ordinate hole punching if that dial fails.

> Separate Protocol:
>
> Strategy (rough draft)
> - Alice asks Relay to do a holepunch with Bob via some new protocol (e.g. /p2p/holepunch/1.0.0)
> - Relay either responds "I don't know/am not connected to Bob", or "ok"
> - Bob tries to directly dial Alice, and if that fails Bob asks Relay to orchestrate a holepunch with Alice

Even this needs a Relay that Bob would be connected to, right? Remember, both Alice and Bob could be behind a NAT, which means they need to be connected to a common publicly reachable server to co-ordinate the hole punch. What we are saying here is that since we already have the circuit relay infra and code, why not use that to co-ordinate the hole punch, albeit over a protocol "layered" on top of the Circuit Relay.

aschmahmann · 2020-11-05T15:51:47Z

> Even this needs a Relay that Bob would be connected to, right? Remember, both Alice and Bob could be behind a NAT, which means they need to be connected to a common publicly reachable server to co-ordinate the hole punch. What we are saying here is that since we already have the circuit relay infra and code, why not use that to co-ordinate the hole punch, albeit over a protocol "layered" on top of the Circuit Relay.

By reusing the existing circuit relay code, we are in a position where we need to resolve some issues with circuit relays before doing anything involving hole punching, at least before we can deploy this to people and make, say, every DHT server node a holepunching relay. Issues include:

- Limit bandwidth per user through relay
- R needs to tell users "I'm not a full relay, you have reduced bandwidth"
- Allow go-libp2p to connect to Bob twice, once over circuit-relay and another time directly
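As an illustration of the first issue, a per-peer bandwidth cap on a relay could look roughly like a token bucket keyed by peer. This is a sketch under assumptions: `Limiter`, `Allow` and `atSecond` are made-up names, the rate and burst values are arbitrary, and nothing here is taken from the circuit-relay code.

```go
package main

import (
	"math"
	"sync"
	"time"
)

// bucket holds the remaining byte allowance for one peer.
type bucket struct {
	tokens float64
	last   time.Time
}

// Limiter is a per-peer token bucket: each peer may send `rate` bytes per
// second on average, with bursts of up to `burst` bytes.
type Limiter struct {
	mu      sync.Mutex
	rate    float64
	burst   float64
	buckets map[string]*bucket
}

// NewLimiter creates a limiter with the given rate (bytes/second) and burst.
func NewLimiter(rate, burst float64) *Limiter {
	return &Limiter{rate: rate, burst: burst, buckets: make(map[string]*bucket)}
}

// Allow reports whether peer may relay n bytes at time now, and if so
// deducts them from that peer's allowance.
func (l *Limiter) Allow(peer string, n int, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()

	b, ok := l.buckets[peer]
	if !ok {
		b = &bucket{tokens: l.burst, last: now} // new peers start with a full bucket
		l.buckets[peer] = b
	}
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens = math.Min(l.burst, b.tokens+elapsed*l.rate)
		b.last = now
	}
	if float64(n) > b.tokens {
		return false
	}
	b.tokens -= float64(n)
	return true
}

// atSecond is a small helper for building deterministic timestamps.
func atSecond(s int64) time.Time {
	return time.Unix(s, 0)
}
```

Passing the timestamp in explicitly keeps the sketch deterministic. With `NewLimiter(100, 200)`, a peer that has just sent 150 bytes is refused another 100 at the same instant but allowed them one second later, while other peers are unaffected.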

aarshkshah1992 · 2020-11-05T17:25:45Z

Discussed offline with @aschmahmann :

Yes, we will have to restrict bandwidth but that problem is orthogonal to NAT traversal. We should have already done that.
But I discussed this with @jacobheun today, and we decided that we will add bandwidth limiting once hole punching is in place, so that Relays are used ONLY for co-ordinating hole punching and not for data transfer.
Using Circuit Relays to co-ordinate that hole punch is really the quickest path to get to QUIC hole punching (which is our main goal right now) given that we have a lot of the Infra and code in place already. It shouldn't be hard to change the co-ordination protocol once we have the hole punching delivered.
I agree, we will have to make a change to go-libp2p to allow multiple connections between peers temporarily (shouldn't be that hard) for this approach to work. We have an issue for it: "NAT traversal: Swarm should allow creating a new connection between peers even if one already exists" (#1014).

aarshkshah1992 · 2021-01-21T06:38:28Z

This is now being tracked as part of #1039.

aarshkshah1992 added feature nat-traversal labels Nov 5, 2020

aarshkshah1992 self-assigned this Nov 5, 2020

aarshkshah1992 mentioned this issue Nov 5, 2020

NAT traversal: Use QUIC connection migration after we have direct a hole punched connection between peers #1016

Closed

aarshkshah1992 removed the feature label Nov 5, 2020

aarshkshah1992 closed this as completed Nov 5, 2020

aarshkshah1992 reopened this Nov 5, 2020

aarshkshah1992 closed this as completed Jan 21, 2021
