consider removing / disabling keep-alives #44

marten-seemann · 2021-02-16T05:26:30Z

NAT mappings are kept from expiring if the NAT occasionally sees a connection being used.

Architecturally, sending keep-alives is a responsibility of the underlying transport, not of the stream multiplexer. For TCP, we can set the SO_KEEPALIVE socket option in go-tcp-transport, and the kernel will take care of keeping the connection alive. When running yamux on top of any other transport, we can't make any assumptions about the necessity and the frequency of keep-alive intervals anyway.

I suggest adding SO_KEEPALIVE support to go-tcp-transport and removing the respective code from go-yamux. Open question: Do we only deprecate Config.EnableKeepAlive and Config.KeepAliveInterval, or do we remove them?

WDYT, @Stebalien and @willscott?


willscott · 2021-02-16T05:29:32Z

if our release isn't doing anything with them, I'd probably be in favor of a major version update that removes them entirely.

Stebalien · 2021-02-16T18:14:48Z

Yeah, I'd love to remove this from the stream multiplexer (especially given that mplex doesn't support it).

However, we need keepalives in all transports before we should consider removing them here.

* Websocket.
* Relay (so the intermediate node can't "stall" the connection). I'm not sure of a clean way to handle this. Maybe:
  1. Expose whether or not the transports support keepalives via Stat().
  2. When keepalives are not supported, spin up a keepalive process in the host (keeping a keepalive stream open).
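The two steps above could be sketched roughly as follows. Everything here is hypothetical: `ConnStat`, its `TransportKeepAlive` field, the `Conn` interface and `needsHostKeepAlive` are invented for illustration and are not existing libp2p APIs. The sketch only shows the decision the host would make per connection.

```go
package main

import "fmt"

// ConnStat is a hypothetical, trimmed-down stand-in for what Stat() returns.
type ConnStat struct {
	// TransportKeepAlive is an assumed field: true when the underlying
	// transport already sends its own keepalives.
	TransportKeepAlive bool
}

// Conn is a hypothetical connection interface exposing Stat().
type Conn interface {
	Stat() ConnStat
}

// needsHostKeepAlive reports whether the host has to run its own keepalive
// (for example over a dedicated stream) for this connection.
func needsHostKeepAlive(c Conn) bool {
	return !c.Stat().TransportKeepAlive
}

// fakeConn is a test double used only for this illustration.
type fakeConn struct{ keepAlive bool }

func (f fakeConn) Stat() ConnStat { return ConnStat{TransportKeepAlive: f.keepAlive} }

func main() {
	conns := map[string]Conn{
		"tcp":   fakeConn{keepAlive: true},
		"relay": fakeConn{keepAlive: false},
	}
	for name, c := range conns {
		fmt.Printf("%s: host keepalive needed = %v\n", name, needsHostKeepAlive(c))
	}
}
```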

marten-seemann · 2021-02-17T02:44:38Z

> Websocket.

On the protocol level, Websockets should have support for keep-alive: https://tools.ietf.org/html/rfc6455#section-5.5.2. That said, isn't it guaranteed that we always run Websockets over TCP or QUIC, both of which already support keep-alives?
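For reference, a minimal sketch of the Ping/Pong control frames that RFC 6455 defines (opcode 0x9 for Ping, 0xA for Pong, FIN bit set, payload of at most 125 bytes). The function name `controlFrame` is made up for this example, and it builds unmasked frames only, which is the server-to-client form; a client would additionally have to mask the payload.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	opPing = 0x9  // Ping opcode (RFC 6455, section 5.5.2)
	opPong = 0xA  // Pong opcode (RFC 6455, section 5.5.3)
	finBit = 0x80 // control frames must not be fragmented, so FIN is always set
)

// controlFrame builds an unmasked control frame with the given opcode.
// Control frame payloads are limited to 125 bytes, so the length always
// fits in the 7-bit length field of the second header byte.
func controlFrame(opcode byte, payload []byte) ([]byte, error) {
	if len(payload) > 125 {
		return nil, errors.New("control frame payload exceeds 125 bytes")
	}
	frame := make([]byte, 0, 2+len(payload))
	frame = append(frame, finBit|opcode, byte(len(payload)))
	return append(frame, payload...), nil
}

func main() {
	ping, err := controlFrame(opPing, []byte("hi"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", ping) // prints "89 02 68 69"
}
```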

> Relay (so the intermediate node can't "stall" the connection).

I'm not sure there's a problem here. The relay will also run the TCP / QUIC transport, so they'll do keep-alives to both sides of the relayed connection. Even if they disable that (don't really see a reason why they'd do that, but let's just assume they do), receiving and acknowledging a keep-alive packet from both connection endpoints should be sufficient to keep any NAT binding alive.

Stebalien · 2021-02-17T02:59:51Z

> That said, isn't it guaranteed that we always run Websockets over TCP or QUIC, both of which already support keep-alives?

Yes, we just need to enable them on the websocket transport. That is, the websocket transport doesn't, IIRC, use the TCP transport under the covers. It uses a shared "reuseport transport".

Stebalien · 2021-02-17T03:01:14Z

> I'm not sure there's a problem here. The relay will also run the TCP / QUIC transport, so they'll do keep-alives to both sides of the relayed connection. Even if they disable that (don't really see a reason why they'd do that, but let's just assume they do), receiving and acknowledging a keep-alive packet from both connection endpoints should be sufficient to keep any NAT binding alive.

The problem is the relay. The relay can let the connection start, then DoS both sides by hanging (not necessarily due to malice). Without keepalives, we won't notice that the connection simply doesn't work. With keepalives, the relay will need to let some traffic through sometimes.

(obviously, we should try to upgrade anyways; but we still want to detect hung relayed connections)

marten-seemann · 2021-02-17T03:06:39Z

> The problem is the relay. The relay can let the connection start, then DoS both sides by hanging (not necessarily due to malice). Without keepalives, we won't notice that the connection simply doesn't work.

A malicious relay can always stall the connection, so that's nothing we can defend against. The only thing we can and should do is detect it and time out in a timely manner.
In general, to keep NAT bindings fresh, only one of the peers has to use keep-alives: as a keep-alive involves sending a packet and receiving an acknowledgement for that packet, every NAT on the way will see both an incoming and an outgoing packet. We only enable keep-alives on both sides to speed up the connection timeout in case the other peer goes offline.

Stebalien · 2021-02-17T05:45:25Z

> A malicious relay can always stall the connection, so that's nothing we can defend against. The only thing we can and should do is detect it and time out in a timely manner

Yep, that's my point. We need some form of keepalive to detect stalled connections.

marten-seemann · 2021-02-17T06:26:17Z

> Yep, that's my point. We need some form of keepalive to detect stalled connections.

Maybe I'm misunderstanding you. Don't we have that, because the two peers send keepalives to the relay?

Stebalien · 2021-02-17T20:26:13Z

No. We have keepalives _to_ the relay but that assumes that the relay is behaving correctly. We need keepalives that go end-to-end. The relay might:

* Be malicious. We can't fully protect against this, but we can at least ensure that the relay allows _some_ traffic through.
* Have a stuck/frozen libp2p node. TCP keepalives will continue to work because the kernel is fine.

This was referenced Feb 16, 2021

remove keep-alives #46

Closed

enable TCP keepalives libp2p/go-tcp-transport#73

Merged
