Websocket subscriptions do not get closed quickly if network connection is severed #23528

Closed
samsondav opened this issue Sep 3, 2021 · 3 comments · Fixed by #23556

Comments

@samsondav

System information

Geth version: 1.10.8
OS & Version: All

Expected behaviour

If a websocket connection is severed, I would expect the subscription to return an error immediately, allowing re-subscription attempts.

Actual behaviour

If you cut the websocket connection between the client and the remote go-ethereum node (e.g. by firewall or VPN disconnection), the websocket does not close and instead hangs silently. It appears to hang for a long time, although occasionally, after 10-15 minutes or so, we see "Error in new head subscription, unsubscribed: websocket: close 1006 (abnormal closure): unexpected EOF" and our software recovers.
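Until the subscription itself reports the failure, a client can guard against this kind of silent hang with its own watchdog. The following is a minimal sketch using only the standard library, not go-ethereum's API: `watch`, `errStalled`, and the two channels are hypothetical stand-ins for whatever the real subscription exposes, and the silence threshold is an assumption that would need to be tuned to the expected block interval.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errStalled is a hypothetical sentinel meaning "nothing arrived in time".
var errStalled = errors.New("subscription stalled: no event within deadline")

// watch consumes events until the subscription reports an error or stays
// silent for longer than maxSilence. Both channels are stand-ins for the
// event and error channels of a real subscription.
func watch(events <-chan string, subErr <-chan error, maxSilence time.Duration) error {
	for {
		select {
		case err := <-subErr:
			// The subscription noticed the failure by itself.
			return err
		case ev := <-events:
			fmt.Println("event:", ev)
		case <-time.After(maxSilence):
			// Nothing arrived for too long; treat the link as dead.
			return errStalled
		}
	}
}

// stalledOnSilence simulates a severed link: neither channel ever delivers.
func stalledOnSilence() bool {
	events := make(chan string)
	subErr := make(chan error)
	return errors.Is(watch(events, subErr, 50*time.Millisecond), errStalled)
}

// forwardsSubError checks that a reported subscription error is passed through.
func forwardsSubError() bool {
	want := errors.New("websocket: close 1006")
	subErr := make(chan error, 1)
	subErr <- want
	return errors.Is(watch(make(chan string), subErr, time.Second), want)
}

func main() {
	fmt.Println("stall detected:", stalledOnSilence())
	fmt.Println("error forwarded:", forwardsSubError())
}
```

When `watch` returns, the caller would tear down the old connection and re-subscribe. The timer is recreated on every loop iteration, which is simple and adequate for low event rates such as new heads.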

Steps to reproduce the behaviour

  1. Open a subscription to listen for new heads
  2. Cut the connection (I connected through a VPN and then disabled the VPN to simulate this, but you could probably do it with iptables etc.)

Backtrace

No backtrace, but the debugger shows this:

[Image: Screenshot 2021-09-03 at 16 04 37]

and also

[Image: Screenshot 2021-09-03 at 16 10 57]

@samsondav changed the title from "Websocket subscriptions do not get closed if network connection goes away" to "Websocket subscriptions do not get closed quickly if network connection is severed" on Sep 3, 2021
@holiman
Contributor

holiman commented Sep 9, 2021

Could you try with this diff?

diff --git a/rpc/websocket.go b/rpc/websocket.go
index afeb4c2081..bd511af91e 100644
--- a/rpc/websocket.go
+++ b/rpc/websocket.go
@@ -277,6 +277,7 @@ func (wc *websocketCodec) pingLoop() {
 	for {
 		select {
 		case <-wc.closed():
+			log.Info("websocket closed, exiting")
 			return
 		case <-wc.pingReset:
 			if !timer.Stop() {
@@ -284,9 +285,11 @@ func (wc *websocketCodec) pingLoop() {
 			}
 			timer.Reset(wsPingInterval)
 		case <-timer.C:
+			log.Info("Sending ping")
 			wc.jsonCodec.encMu.Lock()
 			wc.conn.SetWriteDeadline(time.Now().Add(wsPingWriteTimeout))
-			wc.conn.WriteMessage(websocket.PingMessage, nil)
+			err := wc.conn.WriteMessage(websocket.PingMessage, nil)
+			log.Info("Ping sent", "err", err)
 			wc.jsonCodec.encMu.Unlock()
 			timer.Reset(wsPingInterval)
 		}

In theory, the client should detect the drop within one minute, when it tries to send a ping. The failed send should close the connection.

@holiman
Contributor

holiman commented Sep 9, 2021

It would be interesting to find out:

  1. If the ping is properly sent from the client,
  2. If so, whether it triggers a write error,
  3. If so, how long it takes after the write error to trigger a close.

@jmank88
Contributor

jmank88 commented Sep 9, 2021

The behavior we are seeing is that the ping writes do not return an error. This is simulated in TestClientWebsocketSevered in #23556.
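That ping writes do not error is consistent with how TCP writes generally behave: a small write returns as soon as the data is handed to the local kernel buffer, whether or not the peer ever receives it. A common way to detect a dead peer is therefore to put a deadline on the expected reply rather than rely on the write failing. The sketch below illustrates this with plain TCP from the standard library; it is not the code from #23556, `probe` and `demoSilentPeer` are hypothetical names, and a local peer that accepts but never answers is only an approximation of a severed network link.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// probe writes a small "ping" payload, then waits up to pongWait for any reply.
// It returns the write error and the read error separately.
func probe(conn net.Conn, pongWait time.Duration) (writeErr, readErr error) {
	if err := conn.SetWriteDeadline(time.Now().Add(pongWait)); err != nil {
		return err, nil
	}
	if _, err := conn.Write([]byte("ping")); err != nil {
		return err, nil
	}
	if err := conn.SetReadDeadline(time.Now().Add(pongWait)); err != nil {
		return nil, err
	}
	buf := make([]byte, 4)
	_, readErr = conn.Read(buf)
	return nil, readErr
}

// demoSilentPeer connects to a local listener that accepts the connection
// but never replies, and reports what probe observed.
func demoSilentPeer() (writeOK, readTimedOut bool, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, false, err
	}
	defer ln.Close()

	accepted := make(chan net.Conn, 1)
	go func() {
		c, err := ln.Accept()
		if err != nil {
			close(accepted)
			return
		}
		accepted <- c
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return false, false, err
	}
	defer conn.Close()

	// Keep the server side open but silent for the duration of the probe.
	if srv, ok := <-accepted; ok {
		defer srv.Close()
	}

	writeErr, readErr := probe(conn, 100*time.Millisecond)
	return writeErr == nil, errors.Is(readErr, os.ErrDeadlineExceeded), nil
}

func main() {
	writeOK, timedOut, err := demoSilentPeer()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("write succeeded:", writeOK)
	fmt.Println("read deadline exceeded:", timedOut)
}
```

In a websocket client the same idea is usually expressed as a read deadline that is extended whenever a pong (or any other frame) arrives, so a missing pong surfaces as a read timeout that closes the connection.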
