Puma throwing "Transport endpoint is not connected" errors over and over. #2335

Closed
vrmerlin opened this issue Aug 10, 2020 · 6 comments · Fixed by #2375

@vrmerlin

Describe the bug
We are evaluating Puppet Enterprise, which uses Puma. At some random point, puma errors are being generated over and over again until the entire storage space is full.

Here are the errors we are seeing over and over again:

Error in reactor loop escaped: Transport endpoint is not connected - getpeername(2) (Errno::ENOTCONN)
/opt/puppetlabs/server/apps/bolt-server/lib/ruby/gems/puma-4.3.1/lib/puma/minissl.rb:156:in ‘peeraddr’
/opt/puppetlabs/server/apps/bolt-server/lib/ruby/gems/puma-4.3.1/lib/puma/minissl.rb:156:in ‘peeraddr’
/opt/puppetlabs/server/apps/bolt-server/lib/ruby/gems/puma-4.3.1/lib/puma/reactor.rb:239:in ‘rescue in block in run_internal’
/opt/puppetlabs/server/apps/bolt-server/lib/ruby/gems/puma-4.3.1/lib/puma/reactor.rb:218:in ‘block in run_internal’
/opt/puppetlabs/server/apps/bolt-server/lib/ruby/gems/puma-4.3.1/lib/puma/reactor.rb:157:in ‘each’
/opt/puppetlabs/server/apps/bolt-server/lib/ruby/gems/puma-4.3.1/lib/puma/reactor.rb:157:in ‘run_internal’
/opt/puppetlabs/server/apps/bolt-server/lib/ruby/gems/puma-4.3.1/lib/puma/reactor.rb:313:in ‘block in run_in_thread’
/opt/puppetlabs/server/apps/bolt-server/lib/ruby/gems/logging-2.2.2/lib/logging/diagnostic_context.rb:474:in ‘block in create_with_logging_context’

Any idea why these errors are being generated? If needed I can work with Puppet, but we are just evaluating their product and don't yet have access to tech support. Since the failures are in puma code I thought I'd try here first.

Puma config:

PE uses puma-4.3.1.

@nateberkopec
Member

This looks like Puma's fault, a bad response to a bad/misbehaving client. During an SSL handshake failure, it looks like your client is disconnecting (possibly the reason the SSL handshake failed too). That's causing an exception when we try to get the address of the socket here.

We only need the socket address for logging purposes, so if we can't get it due to ENOTCONN we should just set it to an unknown address.

So we should probably just add it to the rescue clause there.
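
A minimal sketch of that idea, not Puma's actual code: `peer_address_for_log` and `DisconnectedSocket` are hypothetical names made up for illustration. The stub only mimics a socket whose client has already disconnected, so that `peeraddr` raises `Errno::ENOTCONN` as in the backtrace above.

```ruby
# Hypothetical helper (not part of Puma): return the peer address for
# logging, or a placeholder when the address can no longer be read.
def peer_address_for_log(io)
  io.peeraddr.last
rescue IOError, Errno::EINVAL, Errno::ENOTCONN
  "<unknown>"
end

# Hypothetical stand-in for a socket whose client disconnected mid-handshake.
class DisconnectedSocket
  def peeraddr
    raise Errno::ENOTCONN, "getpeername(2)"
  end
end

puts peer_address_for_log(DisconnectedSocket.new)
```

Because the address is only used in a log line, falling back to a placeholder loses nothing important and keeps the exception from escaping the reactor loop.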

@MSP-Greg
Member

@vrmerlin

What @nateberkopec mentioned.

Additionally, anything that might be different about your ssl configuration/setup? I'm surprised it hasn't come up before.

@vrmerlin
Author

Yeah, and I think I've narrowed the problem down. It looks like Puppet Enterprise has a default flag to report diagnostics back to their company site once a day (not thrilled about that). Our company injects a custom SSL certificate for all external communication. So the Ruby code is likely rejecting the SSL cert and causing this problem at the 24-hour mark, when the PE server reports in.

I turned the flag off, and hopefully the entire system won't crash tonight, as it has for the last few days.

@MSP-Greg
Member

@vrmerlin

Thanks for looking into the issue. I think we'd still like to catch the error in a normal manner. The current code is as follows:

ssl_socket = c.io
begin
  addr = ssl_socket.peeraddr.last
# EINVAL can happen when browser closes socket w/security exception
rescue IOError, Errno::EINVAL
  addr = "<unknown>"
end

cert = ssl_socket.peercert

c.close

I'm guessing the following should work, although it will still generate an error...

ssl_socket = c.io
begin
  addr = ssl_socket.peeraddr.last
  cert = ssl_socket.peercert
# EINVAL can happen when browser closes socket w/security exception
rescue IOError, Errno::EINVAL, Errno::ENOTCONN => e_inner
  addr = "<#{e_inner.class}>"
  cert = "<#{e_inner.message}>"
end
# below has its own rescue
c.close
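
To see what the placeholders in the proposed version would look like, here is a small standalone sketch. `GoneSocket` and `describe_peer` are hypothetical names used only for this illustration, and the stub stands in for `c.io` after the client has disconnected. The exact message text comes from the operating system, so it may differ by platform.

```ruby
# Hypothetical stand-in for c.io once the client has gone away.
class GoneSocket
  def peeraddr
    raise Errno::ENOTCONN, "getpeername(2)"
  end

  def peercert
    nil
  end
end

# Hypothetical wrapper around the proposed begin/rescue so it can be called.
def describe_peer(ssl_socket)
  addr = ssl_socket.peeraddr.last
  cert = ssl_socket.peercert
  [addr, cert]
rescue IOError, Errno::EINVAL, Errno::ENOTCONN => e_inner
  # Same placeholders as the proposal: exception class and message.
  ["<#{e_inner.class}>", "<#{e_inner.message}>"]
end

addr, cert = describe_peer(GoneSocket.new)
puts "addr=#{addr} cert=#{cert}"
```

With this shape the log line records the exception class in place of the address, which makes the cause visible without letting the error propagate.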

@dentarg
Member

dentarg commented Oct 1, 2020

Looks like this one should have been closed by #2375 when it was merged, but the "Close" keyword wasn't in the squash/merge commit I guess (e041d07)

@nateberkopec
Member

@MSP-Greg You can close if you agree ^^
