Sockets do not reconnect after a short network disconnection #3161

Closed
in19farkt opened this issue Nov 13, 2018 · 12 comments

Comments

@in19farkt

Environment

  • JS lib version (phoenix.js): 1.4.0
  • Operating system: macOS
  • Browser: Chrome 70.0.3538.77

Expected behavior

Sockets try to reconnect after short network disconnection (5-10 seconds).

Actual behavior

Sockets remain disconnected and never attempt to reconnect.

Debug results

onConnClose is called with event.code === 1000, so this.reconnectTimer.scheduleTimeout is never called.

@chrismccord
Member

Can you provide steps to reproduce this? I cannot reproduce on 1.4.0. Please also verify you are running an updated phoenix.js; as a sanity check, you can run rm -rf assets/node_modules/phoenix and then cd assets && npm i.

@chrismccord
Member

With respect to the 1000 close code, we intentionally don't reconnect if the server tells us it's ending the connection on purpose. How are you triggering your network disconnects?

@in19farkt
Author

This problem appeared after upgrading to version 1.4.

Steps to reproduce:

  • Create a socket instance:
    const instance = new Socket(WS_URL, { heartbeatIntervalMs: 5000 });
    instance.connect();
  • Disable the network using Chrome developer tools
    [screenshot]
  • Turn the network back on after 7 seconds
  • The last heartbeat remains unanswered and the socket does not try to reconnect

If you turn the network off for 40 seconds instead, reconnection works correctly.
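
To see which close code the client receives while following these steps, one option is to attach logging callbacks before calling connect(). This is a minimal sketch, assuming the onOpen, onClose, and onError hooks of the phoenix.js Socket. The helper name attachCloseLogger and the log format are hypothetical, and a stub stands in for a real socket so the snippet runs on its own:

```javascript
// Hypothetical helper: records socket lifecycle events so the close code is visible.
// `socket` is expected to expose onOpen/onClose/onError like the phoenix.js Socket.
function attachCloseLogger(socket, log = console.log) {
  socket.onOpen(() => log("socket open"));
  socket.onClose((event) => {
    // A close code of 1000 is the "normal closure" case discussed in this issue.
    const code = event && event.code;
    log(`socket closed, code=${code}`);
  });
  socket.onError(() => log("socket error"));
}

// Stub standing in for a real Socket (assumption, for illustration only).
function makeStubSocket() {
  const handlers = { open: [], close: [], error: [] };
  return {
    onOpen: (cb) => handlers.open.push(cb),
    onClose: (cb) => handlers.close.push(cb),
    onError: (cb) => handlers.error.push(cb),
    emit: (name, event) => handlers[name].forEach((cb) => cb(event)),
  };
}

const lines = [];
const stub = makeStubSocket();
attachCloseLogger(stub, (line) => lines.push(line));
stub.emit("open");
stub.emit("close", { code: 1000 });
console.log(lines.join("\n"));
```

With a real socket, the same helper would be called on the instance from the first step, before instance.connect().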

@JacquiManzi

@in19farkt how on earth did you actually get Chrome dev tools offline mode to work for WebSocket connections? The only way I have been able to test this is by turning off my wifi.

Issue I'm referring to:
https://stackoverflow.com/questions/38729050/chrome-disabling-web-sockets-or-closing-a-web-socket-connection

More on topic, I am experiencing the same issue after upgrading to Phoenix 1.4 and I'm actually experiencing this on longer disconnects. I'll update this with more information as I debug.

@sgtpepper43
Contributor

sgtpepper43 commented Jan 30, 2019

We're running into similar issues. For me I can get it to disconnect (and not reconnect) just by locking my computer and letting it sit for a while. I was digging into this and noticed that we don't reconnect for WS_CLOSE_NORMAL error codes, which is fine, but that seems at odds with how the heartbeat works:

  sendHeartbeat(){
    if(!this.isConnected()){ return }
    if(this.pendingHeartbeatRef){
      this.pendingHeartbeatRef = null
      if (this.hasLogger()) this.log("transport", "heartbeat timeout. Attempting to re-establish connection")
      this.conn.close(WS_CLOSE_NORMAL, "hearbeat timeout")
      return
    }
    this.pendingHeartbeatRef = this.makeRef()
    this.push({topic: "phoenix", event: "heartbeat", payload: {}, ref: this.pendingHeartbeatRef})
  }

Here it's saying that it will attempt to re-establish the connection, but we send WS_CLOSE_NORMAL, which explicitly does not re-establish the connection. So... should we send a different error code?
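
The interaction can be illustrated with a simplified model. This is only a sketch of the behaviour described in this thread, not the phoenix.js implementation: ModelSocket, simulateMissedHeartbeat, and the code 4000 are hypothetical. 4000 is an arbitrary pick from the 3000-4999 range that the browser WebSocket close() call accepts besides 1000, and is not a value Phoenix is known to use.

```javascript
const WS_CLOSE_NORMAL = 1000;
// Arbitrary value from the 3000-4999 range (assumption, illustration only).
const HYPOTHETICAL_HEARTBEAT_TIMEOUT_CODE = 4000;

// Simplified model of the client, not the phoenix.js implementation.
class ModelSocket {
  constructor(heartbeatCloseCode) {
    this.heartbeatCloseCode = heartbeatCloseCode;
    this.pendingHeartbeatRef = null;
    this.reconnectScheduled = false;
    this.ref = 0;
  }

  // Follows the quoted logic: a tick with an unanswered heartbeat closes the connection.
  sendHeartbeat() {
    if (this.pendingHeartbeatRef !== null) {
      this.pendingHeartbeatRef = null;
      this.onConnClose({ code: this.heartbeatCloseCode });
      return;
    }
    this.pendingHeartbeatRef = String(++this.ref);
  }

  // Follows the reported 1.4.0 behaviour: code 1000 does not schedule a reconnect.
  onConnClose(event) {
    if (event.code !== WS_CLOSE_NORMAL) {
      this.reconnectScheduled = true;
    }
  }
}

function simulateMissedHeartbeat(closeCode) {
  const socket = new ModelSocket(closeCode);
  socket.sendHeartbeat(); // heartbeat sent, no reply arrives
  socket.sendHeartbeat(); // next tick detects the timeout and closes
  return socket.reconnectScheduled;
}

console.log(simulateMissedHeartbeat(WS_CLOSE_NORMAL)); // false
console.log(simulateMissedHeartbeat(HYPOTHETICAL_HEARTBEAT_TIMEOUT_CODE)); // true
```

In this model, a heartbeat timeout that closes with 1000 is indistinguishable from an intentional close, so no reconnect is scheduled. Any distinct code would let the close handler tell the two cases apart.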

Though in my case I'm not actually even seeing the heartbeat log. So I don't know if this has anything to do with heartbeats or not.

It also should be noted that if I kill my server and then turn it back on, everything works great. Also, we have a script that loads and connects to a different phoenix app, and it's running 1.3 (which was before we stopped reconnecting on 1000 errors). When we experience the issue, that script reconnects just fine, while the local script running 1.4 doesn't.

@Gazomba

Gazomba commented Feb 12, 2019

We are experiencing similar behaviour. When I cut off the internet for just a second, the socket is closed and does not try to reconnect, i.e. the onOpen callback is never triggered. But when I cut off the internet for a longer period and the timeout is triggered, it does reconnect and everything works fine.

Xiaobin0860 added a commit to Xiaobin0860/phoenix that referenced this issue Feb 16, 2019
* phx/master: (26 commits)
  Revert reconnect optimizations which introduced regressions. Fixes phoenixframework#3161 (phoenixframework#3272)
  ...

@simpers

simpers commented Mar 21, 2019

This still happens sometimes. Not sure what the difference is in this case versus the first, as the fix that was introduced really helped.

It seems like the socket will close itself, intentionally, if one of the heartbeats it sent isn't answered properly. This results in the socket's onClose being called, and the socket will not reconnect again. The channels, however, are still there and they still try to reconnect. This confusion comes from the fact that they use the same reconnectAfterMs. So for now I can force-leave these manually in my socket.onClose, but should the channels not be handled in the same way as the socket here?
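
The manual workaround could look roughly like the sketch below. It assumes only the onClose hook of the phoenix.js Socket and the leave() method of Channel. The helper name leaveChannelsOnSocketClose is hypothetical, and stubs replace the real objects so the snippet runs standalone:

```javascript
// Hypothetical workaround helper; `socket` and `channels` are expected to behave like
// the phoenix.js Socket and Channel (onClose and leave are the only calls used).
function leaveChannelsOnSocketClose(socket, channels) {
  socket.onClose(() => {
    // Stop each channel's own rejoin attempts once the socket has closed.
    channels.forEach((channel) => channel.leave());
  });
}

// Stubs for illustration only (assumptions, not phoenix.js objects).
const left = [];
const closeHandlers = [];
const stubSocket = { onClose: (cb) => closeHandlers.push(cb) };
const stubChannels = ["room:a", "room:b"].map((topic) => ({
  topic,
  leave: () => left.push(topic),
}));

leaveChannelsOnSocketClose(stubSocket, stubChannels);
closeHandlers.forEach((cb) => cb({ code: 1000 })); // simulate the socket closing
console.log(left); // [ 'room:a', 'room:b' ]
```

This is a stopgap: it leaves every channel on any socket close, including closes from which the socket would otherwise recover.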

@chrismccord
Member

Can you open an issue with steps to reproduce? As a sanity check, please also ensure you are on the 1.4.2 JS client and that you have run rm -rf node_modules/phoenix && npm i in your assets dir.

@chrismccord
Member

@simpers I was able to recreate the issue and it has been fixed on master. Please give it a shot and lmk if all is green on your end. Thanks!

@simpers

simpers commented Mar 25, 2019

Everything seems to work perfectly now, as far as I can tell :) Super thanks for the quick help, @chrismccord ! 👍 🎉

@themitigater

themitigater commented Apr 29, 2020

Hey, we're seeing similar behavior after upgrading to Phoenix JS 1.5.1.
The socket doesn't reconnect when we switch networks. This does not happen when Wi-Fi is turned off and then back on.

Environment:
Phoenix JS: 1.5.1
OS: macOS - 10.15.4
Browser: Chrome 81

Actual Behaviour:

  • Error Seen on the console: Websocket is in closed or closing state
  • this.socket.readyState returns 1 - Open
  • this.channel.socket.isConnected() returns true
  • The last push event as seen from the network didn't receive a reply from the server.

Expected Behaviour
Socket reconnections should be attempted.

Steps to reproduce:

  1. Create a socket and connect() // Default opts
  2. Subscribe to channel()
  3. Switch network - Wifi/Lan
  4. Try channel.push(timeout: 25000)
    • onTimeout is triggered
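
The last step can be instrumented to capture the mismatch between the push outcome and what the socket reports about itself. This is a minimal sketch, assuming the phoenix.js channel.push(event, payload, timeout) call and the receive hooks on the returned push. The helper name pushWithDiagnostics, the "ping" event, and the stub channel are hypothetical:

```javascript
// Hypothetical helper: pushes an event and reports how the push resolved, together
// with what the socket claims about its own state at that moment.
function pushWithDiagnostics(channel, event, payload, timeoutMs, report) {
  const snapshot = (outcome) =>
    report({ outcome, socketConnected: channel.socket.isConnected() });
  channel
    .push(event, payload, timeoutMs)
    .receive("ok", () => snapshot("ok"))
    .receive("error", () => snapshot("error"))
    .receive("timeout", () => snapshot("timeout"));
}

// Stub channel for illustration only; it mimics the state described in the report,
// where isConnected() returns true although the push never gets a reply.
function makeStubChannel() {
  const hooks = {};
  const push = {
    receive(status, cb) {
      hooks[status] = cb;
      return push;
    },
  };
  return {
    socket: { isConnected: () => true },
    push: () => push,
    trigger: (status) => hooks[status](),
  };
}

const reports = [];
const channel = makeStubChannel();
pushWithDiagnostics(channel, "ping", {}, 25000, (r) => reports.push(r));
channel.trigger("timeout"); // simulate the push timing out
console.log(reports);
```

A report with outcome "timeout" and socketConnected true would match the behaviour listed above.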

@snewcomer
Contributor

@themitigater Is it possible to reproduce in an example app you can push and share? That would help us debug easily.
