Peers are unable to reconnect and stuck in a deadlock after IP address change #5377

Closed
viaj3ro opened this issue Jun 11, 2021 · 5 comments · Fixed by #5538
Labels: beginner (issues suitable for new developers), bug (unintended code behaviour), gossip, intermediate (issues suitable for developers moderately familiar with the codebase and LN), networking, p2p (code related to the peer-to-peer behaviour)

Comments

@viaj3ro

viaj3ro commented Jun 11, 2021

As described by @t-bast in https://gitter.im/ACINQ/eclair, LND and eclair (possibly c-lightning) peers are unable to reconnect after an IP address change because they do not trust node_announcement alone:

"if you change your IP address, your peers won't reconnect to you. They'll wait for you to connect to them and then will store your new IP address for future reconnections (i.e. they don't trust the new IP address from node_announcement alone)"

This leads to a situation where even some clearnet peers have to be reconnected manually, which can be quite time-consuming, and Tor peers are stuck in a deadlock that can only be resolved manually by the operator of each and every Tor peer. They would have to look up the new IP address of their offline peer on one of the lightning explorer websites and manually feed it to their node. This is unlikely to happen quickly, if at all.

I changed the IP of my node more than 2 days ago and still have 65 disconnected Tor peers. Most, if not all, of them seem to be running LND.

I'd suggest using the IP address from node_announcement if a connection to the last known IP fails. The node could then alternate between the announced address and the last known IP until one succeeds.
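
The suggestion above can be sketched roughly as follows. This is a minimal illustration, not lnd's actual reconnection code: `reconnect`, `dialFunc`, and `errRefused` are hypothetical names, the addresses are documentation placeholders, and the dial step is injected as a function so the sketch runs without a network.

```go
package main

import (
	"errors"
	"fmt"
)

// errRefused is a hypothetical error returned by the fake dialer below.
var errRefused = errors.New("connection refused")

// dialFunc stands in for a real network dial (an assumption of this sketch).
type dialFunc func(addr string) error

// reconnect tries the last known address first, then every address from the
// peer's latest node_announcement, and repeats that cycle up to maxRounds
// times. It returns the first address that accepted a connection.
func reconnect(lastKnown string, announced []string, dial dialFunc, maxRounds int) (string, error) {
	// Build the candidate list, skipping announced duplicates of lastKnown.
	candidates := []string{lastKnown}
	for _, a := range announced {
		if a != lastKnown {
			candidates = append(candidates, a)
		}
	}

	lastErr := errors.New("no dial attempted")
	for round := 0; round < maxRounds; round++ {
		for _, addr := range candidates {
			err := dial(addr)
			if err == nil {
				return addr, nil
			}
			lastErr = err
		}
	}
	return "", fmt.Errorf("peer unreachable after %d rounds: %w", maxRounds, lastErr)
}

func main() {
	// Fake dialer: only the newly announced address is reachable.
	dial := func(addr string) error {
		if addr == "198.51.100.7:9735" {
			return nil
		}
		return errRefused
	}
	addr, err := reconnect("203.0.113.5:9735", []string{"198.51.100.7:9735"}, dial, 3)
	fmt.Println(addr, err) // prints "198.51.100.7:9735 <nil>"
}
```

Keeping the last known address in the rotation means a stale or spoofed announcement cannot permanently displace an address that previously worked.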

Also reported to eclair: ACINQ/eclair#1842

@Roasbeef
Member

Seems all we need to do here is refresh the target peer's IP/addr from the node announcement: https://github.com/lightningnetwork/lnd/blob/master/server.go#L3564

Roasbeef added the beginner and intermediate labels Jun 15, 2021
@ellemouton
Collaborator

ellemouton commented Jun 28, 2021

Is it necessary to only refresh peer advertised addresses here if the peer is an inbound peer? Why not for all peers?

Also, is an itest a good idea for something like this?

Roasbeef added this to To do in v0.14.0-beta via automation Jul 1, 2021
Roasbeef moved this from To do to In progress in v0.14.0-beta Jul 1, 2021
@Roasbeef
Member

Roasbeef commented Jul 1, 2021

"Is it necessary to only refresh peer advertised addresses here if the peer is an inbound peer? Why not for all peers?"

We do it for inbound peers there because, with an inbound connection, we may not have the port they're actually listening on to connect back to (we only see the random source port assigned by the kernel).
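
To illustrate the point about ports, here is a small sketch using only Go's standard `net` package on loopback. The function name `observePeerPorts` is made up for this example. It shows that the remote address seen on an accepted connection carries the dialer's ephemeral source port, not the port the dialer itself listens on.

```go
package main

import (
	"fmt"
	"net"
)

// observePeerPorts returns the port the "peer" listens on and the port our
// listener observes when that peer connects to us.
func observePeerPorts() (listenPort, observedPort int, err error) {
	// Our node's listener; port 0 lets the kernel pick a free port.
	ours, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, 0, err
	}
	defer ours.Close()

	// The peer's own listener: the port it would put in node_announcement.
	peer, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, 0, err
	}
	defer peer.Close()

	// The peer dials us, so the connection is inbound from our side.
	out, err := net.Dial("tcp", ours.Addr().String())
	if err != nil {
		return 0, 0, err
	}
	defer out.Close()

	in, err := ours.Accept()
	if err != nil {
		return 0, 0, err
	}
	defer in.Close()

	listenPort = peer.Addr().(*net.TCPAddr).Port
	observedPort = in.RemoteAddr().(*net.TCPAddr).Port
	return listenPort, observedPort, nil
}

func main() {
	listenPort, observedPort, err := observePeerPorts()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("peer listens on %d, but we observed source port %d\n", listenPort, observedPort)
}
```

Dialing back to the observed port would fail, which is why the advertised address is the useful one for inbound peers.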

"Also, is an itest a good idea for something like this?"

I think so, to ensure that things are working as expected end to end.

A sample test would look something like:

  • connect to a peer inbound
  • disconnect
  • restart inbound node and change the listening port
  • initial peer should eventually reconnect
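
The four steps above can be mimicked with plain TCP listeners on loopback. This is only a simulation under stated assumptions, not an lnd itest: it does not use lnd's test harness, and `graph`, `listenAndAnnounce`, `dialFirst`, and `runScenario` are hypothetical names. A map stands in for gossip data, with "alice" as the initial peer and "bob" as the inbound node.

```go
package main

import (
	"fmt"
	"net"
)

// graph is a stand-in for gossip data: node name -> latest announced address.
type graph map[string]string

// listenAndAnnounce starts a listener on a kernel-chosen loopback port and
// records its address in the graph, as a node_announcement would.
func listenAndAnnounce(name string, g graph) (net.Listener, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	g[name] = l.Addr().String()
	return l, nil
}

// dialFirst returns the first address in addrs that accepts a connection.
func dialFirst(addrs ...string) (string, error) {
	for _, addr := range addrs {
		conn, err := net.Dial("tcp", addr)
		if err != nil {
			continue
		}
		conn.Close()
		return addr, nil
	}
	return "", fmt.Errorf("no address reachable: %v", addrs)
}

// runScenario returns the address Alice reconnected on and Bob's address
// from his latest announcement.
func runScenario() (string, string, error) {
	g := graph{}
	alice, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", "", err
	}
	defer alice.Close()
	bob, err := listenAndAnnounce("bob", g)
	if err != nil {
		return "", "", err
	}
	defer bob.Close() // closing twice is harmless; the error is ignored

	// Step 1: Bob connects to Alice (inbound from Alice's point of view).
	out, err := net.Dial("tcp", alice.Addr().String())
	if err != nil {
		return "", "", err
	}
	defer out.Close()
	in, err := alice.Accept()
	if err != nil {
		return "", "", err
	}
	stored := g["bob"] // Alice keeps Bob's advertised address

	// Step 2: disconnect.
	in.Close()
	out.Close()

	// Step 3: Bob restarts on a new listening port and re-announces.
	bob.Close()
	bob, err = listenAndAnnounce("bob", g)
	if err != nil {
		return "", "", err
	}
	defer bob.Close()

	// Step 4: Alice tries the stored address, then the refreshed one.
	reached, err := dialFirst(stored, g["bob"])
	return reached, g["bob"], err
}

func main() {
	reached, announced, err := runScenario()
	fmt.Println(reached == announced, err)
}
```

A real integration test would additionally need to wait for the updated announcement to propagate before asserting that the reconnection happened.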

Roasbeef moved this from In progress to Review in progress in v0.14.0-beta Aug 17, 2021
Roasbeef moved this from Review in progress to Reviewer approved in v0.14.0-beta Aug 26, 2021
Roasbeef added the P2 (should be fixed if one has time) label Aug 31, 2021
Roasbeef moved this from Reviewer approved to Blocked in v0.14.0-beta Sep 23, 2021
@viaj3ro
Author

viaj3ro commented Sep 27, 2021

Any chance of this issue being resolved any time soon? I still have a handful of Tor peers that refuse to connect back to me even after almost 4 months. Since many nodes have IP address changes once in a while, I assume they have the same issue without ever noticing it.

@t-bast
Contributor

t-bast commented Sep 27, 2021

Note that there are related discussions at the spec level: lightning/bolts#911
Independently of that, I believe there are improvements that can be made to implementations (we haven't had time to fix this on the eclair side yet either).

v0.14.0-beta automation moved this from Blocked to Done Oct 4, 2021
HannahMR added the P2 (should be fixed if one has time) label and removed the P2 (should be fixed if one has time) label Jan 17, 2022
6 participants