Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EM keeps shutdown when server overloads. #125

Open
martinbarilik opened this issue May 30, 2020 · 6 comments
Open

EM keeps shutdown when server overloads. #125

martinbarilik opened this issue May 30, 2020 · 6 comments

Comments

@martinbarilik
Copy link

Hi there again. We are using this gem to create client, but we can't figure out why client shutdowns EM immediately if server overloads out client with lots of messages. It happens sporadically once a day or two.

We create client as follows:

Thread.abort_on_exception = true
Thread.new {
  EM.run {
    ws = Faye::WebSocket::Client.new('ws://www.example.com/')

    ws.on :open do |event|
      p [:open]
      ws_logger.info "HELLO!"
    end

    ws.on :message do |event|
      ws_logger.info event.data
    end

    ws.on :close do |event|
      ws_logger.info "#{Time.now} code: #{event.code}, reason: #{event.reason}"
      ws = nil
    end
  }
  EM.add_shutdown_hook { ws_logger.info("#{Time.now} EM SHUTDOWN HOOK") }
}

We need to run client socket in dedicated thread so it won't block anything. Problem starts when server overloads our client with messages, we do not see the messages after overload in ws_logger file, but we have this:

[some time] EM SHUTDOWN HOOK
[some time] code: 1006, reason: 

The fact that EM SHUTDOWN HOOK is first and the socket closes with 1006 tells us, something is wrong. Maybe server sents very big bulk of messages client can't process ?

We have to be able to recover from 1006 and restart web-socket to be running again, but with EM being shutdown, recovery code is killed with it and software stops.

I look some info around sockets, maybe we need to set max_length option to client ? But it buggers me that The default value is 2^26 - 1, or 1 byte short of 64 MiB is not enough ?

I do not know is this is relevant, but i even looked at default buffer size in out machine according to this post https://stackoverflow.com/questions/7865069/how-to-find-the-socket-buffer-size-of-linux.

/proc/sys/net/ipv4/tcp_rmem (for read) // prints 4096    131072  6291456
/proc/sys/net/ipv4/tcp_wmem (for write) // prints 4096    16384   4194304

One more thing, can't figure out what is the difference between this gem and websocket-driver-ruby. Isn't the websocket-driver-ruby more suitable for us ?

If you need more info, ask for it.

Thanks in advance.

@jcoglan
Copy link
Collaborator

jcoglan commented May 30, 2020

Can you add this listener to your socket to see what error it prints when the socket closes?

ws.on :error do |error|
  # ...
end

@martinbarilik
Copy link
Author

I have added it, waiting for server to overload the client.

@martinbarilik
Copy link
Author

.on :error listener doesn't seem to catch anything, same old messages

[some time] EM SHUTDOWN HOOK
[some time] code: 1006, reason: 

@jcoglan
Copy link
Collaborator

jcoglan commented Jun 8, 2020

Normally, the close event only triggers with code 1006 when the underlying TCP socket is disconnected without a WebSocket close message being received, and the code that detects this disconnection triggers an error event first. The only other time this happens is if EventMachine itself is stopped, which you're detecting here.

Why are you stopping EventMachine, and why are you wrapping it in your own threads?

@martinbarilik
Copy link
Author

martinbarilik commented Jun 9, 2020

I am working on a project which depends on synchronous code but is using EM and websocket in it. We changed a code a bit, for now we use thread to only start EM so it won't break the main thread. We had some errors in functionality that caused an errors but they were never propagated to EM, because we used own Threads. We rewrote it to use EM.defer, for now it's working but we need to test it more.

If there is any "best-way-to" start EM on server start ( like in rack, or cron or something ) so everytime user wants to start its websocket EM will already be running, i am listening.

Anyway, thanks for your time.

@jcoglan
Copy link
Collaborator

jcoglan commented Jun 10, 2020

Yeah, mixing synchronous code with EM is painful. This package uses this function to start the reactor: https://github.com/faye/faye-websocket-ruby/blob/master/lib/faye/websocket.rb#L39-L42. Don't call that, it's not part of the public API -- write your own version of it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants