
Puma listen thread can fail with infinite loop. #2699

Closed
ioquatix opened this issue Sep 15, 2021 · 6 comments · Fixed by #2700
Comments

@ioquatix
Contributor

The following middleware can cause Puma to enter an infinite loop:

require 'objspace'

run lambda { |env|
  ios = ObjectSpace.each_object(IO).to_a.select { |io| io.is_a?(TCPServer) }

  ios.each(&:close)

  [200, [], [ios.inspect]]
}

This probably seems odd, but we are actually seeing some variant of this in a production system. We don't yet know what is calling close (we suspect a threading bug), but it causes an Errno::EBADF in the listen loop, followed by an endless stream of IOErrors:

> puma -w4 -t4
[95960] Puma starting in cluster mode...
[95960] * Puma version: 5.4.0 (ruby 3.0.2-p107) ("Super Flight")
[95960] *  Min threads: 4
[95960] *  Max threads: 4
[95960] *  Environment: development
[95960] *   Master PID: 95960
[95960] *      Workers: 4
[95960] *     Restarts: (✔) hot (✔) phased
[95960] * Listening on http://0.0.0.0:9292
[95960] Use Ctrl-C to stop
[95960] - Worker 1 (PID: 95962) booted in 0.02s, phase: 0
[95960] - Worker 0 (PID: 95961) booted in 0.02s, phase: 0
[95960] - Worker 2 (PID: 95963) booted in 0.02s, phase: 0
[95960] - Worker 3 (PID: 95964) booted in 0.02s, phase: 0
2021-09-15 20:56:09 +1200 Listen loop: #<Errno::EBADF: Bad file descriptor>
2021-09-15 20:56:09 +1200 Listen loop: #<IOError: closed stream>
2021-09-15 20:56:09 +1200 Listen loop: #<IOError: closed stream>
2021-09-15 20:56:09 +1200 Listen loop: #<IOError: closed stream>
2021-09-15 20:56:09 +1200 Listen loop: #<IOError: closed stream>
2021-09-15 20:56:09 +1200 Listen loop: #<IOError: closed stream>
2021-09-15 20:56:09 +1200 Listen loop: #<IOError: closed stream>
2021-09-15 20:56:09 +1200 Listen loop: #<IOError: closed stream>
2021-09-15 20:56:09 +1200 Listen loop: #<IOError: closed stream>
2021-09-15 20:56:09 +1200 Listen loop: #<IOError: closed stream>
2021-09-15 20:56:09 +1200 Listen loop: #<IOError: closed stream>
2021-09-15 20:56:09 +1200 Listen loop: #<IOError: closed stream>
2021-09-15 20:56:09 +1200 Listen loop: #<IOError: closed stream>
2021-09-15 20:56:09 +1200 Listen loop: #<IOError: closed stream>

At the very least, we should probably handle this more gracefully, i.e. exit the loop as soon as the listening socket becomes invalid for any reason, rather than trying to select/accept on it again.
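To make that concrete, here is a minimal sketch (hypothetical, not Puma's actual accept loop) of a listen loop that bails out once the listener becomes invalid instead of retrying forever:

```ruby
require 'socket'

# Hypothetical sketch, not Puma's code: accept connections until the
# listening socket becomes invalid, then exit the loop rather than
# spinning on IOError/EBADF forever.
def run_listen_loop(server)
  accepted = 0
  loop do
    break if server.closed?                     # someone closed it under us
    ready = IO.select([server], nil, nil, 0.1)  # wait briefly for a client
    next unless ready
    client = server.accept_nonblock(exception: false)
    next if client == :wait_readable
    accepted += 1
    client.close
  rescue IOError, Errno::EBADF
    break # the listener is unusable; retrying select/accept would loop forever
  end
  accepted
end
```

The key difference from a catch-all rescue is that IOError and Errno::EBADF terminate the loop instead of being logged and retried.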

@ioquatix
Contributor Author

I just whipped up a quick PR to try to fix this, but we should probably write some tests too.

@MSP-Greg
Member

Need more coffee; I'll see about a test... Thanks.

@MSP-Greg
Member

@ioquatix

I'm only seeing the infinite loop when Puma is shut down. Do you see the same? Using a rackup of:

require 'objspace'

run lambda { |env|
  ios = ObjectSpace.each_object(::TCPServer).to_a.tap { |a| a.each(&:close) }
  [200, [], [ios.inspect]]
}

curl connects the first time and then freezes. In other words, the first connection closes the listener sockets, but Puma keeps running and only detects the closure when it's shut down. Trying to see if there's a simple way to detect the closure while it's running...

@ioquatix
Contributor Author

I'm not sure of the exact sequence of events, but it might depend on how Puma is started.

@MSP-Greg
Member

I started the server, I think in both single (no workers) and cluster (workers) mode, and made requests with curl. The first request closed the listener sockets; the second hung, as there was nothing to accept the connection. When I stopped the server with Ctrl-C, I then got the loop.

I'd like to know how/why the production system closed the listener socket...

Regardless, I think we may need another thread to check whether any listener sockets have closed, plus a means of re-opening them, which would need to work a bit differently for single and clustered modes.
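As a rough illustration of that idea (entirely hypothetical, not Puma code): keep each listener next to its bind address so a monitor pass can re-open sockets that were closed externally.

```ruby
require 'socket'

# Hypothetical sketch: pair each listener with its bind address so a
# periodic monitor pass can re-open any socket closed out from under us.
Listener = Struct.new(:host, :port, :sock)

def reopen_closed_listeners(listeners)
  reopened = 0
  listeners.each do |l|
    next unless l.sock.closed?
    # Re-bind on the recorded address. Real code must remember the
    # concrete port it was assigned (port 0 would pick a fresh ephemeral
    # port on every re-bind).
    l.sock = TCPServer.new(l.host, l.port)
    reopened += 1
  end
  reopened
end
```

A background thread could call this every few hundred milliseconds; in clustered mode the re-opened descriptors would also have to be handed off to the workers, which is the part that differs between single and clustered.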

Or, since it's unlikely that Puma itself closed the listeners, we could consider it an external issue. I don't know. Then again, I have said something like 'nothing should stop Puma' in the past...

@ioquatix
Contributor Author

I'd like to know how/why the production system closed the listener socket...

We suspect thread corruption within a C extension, but we don't know for sure yet; it's just one theory. Another is Ruby code doing something wonky with IOs.

I don't mind if "nothing stops Puma", but my gut feeling is that we should be very careful about having a catch-all exception handler in a non-terminating loop. We can try to solve both problems, but my main concern is the latter.
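That concern can be sketched as a hypothetical classification helper: retry only errors known to be transient, and stop (or re-raise) on anything that leaves the listener unusable, rather than rescuing everything inside the loop:

```ruby
# Hypothetical sketch of narrow exception handling for a server loop:
# transient errors are retried, errors that mean the listener is
# unusable stop the loop, and everything else propagates so a genuine
# bug cannot hide behind a catch-all rescue.
TRANSIENT = [Errno::ECONNABORTED, Errno::EINTR].freeze
FATAL     = [IOError, Errno::EBADF].freeze

def classify(error)
  return :retry if TRANSIENT.any? { |k| error.is_a?(k) }
  return :stop  if FATAL.any?     { |k| error.is_a?(k) }
  :raise
end
```

The point is not the exact lists, which would need tuning, but that the loop makes an explicit decision per error class instead of logging and continuing unconditionally.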

ioquatix added a commit to ioquatix/puma that referenced this issue Sep 16, 2021
MSP-Greg pushed a commit to ioquatix/puma that referenced this issue Sep 16, 2021
nateberkopec pushed a commit that referenced this issue Sep 16, 2021
* More elaborate exception handling. Fixes #2699.

* Add TestIntegrationSingle#test_closed_listener

Co-authored-by: MSP-Greg <Greg.mpls@gmail.com>
JuanitoFatas pushed a commit to JuanitoFatas/puma that referenced this issue Sep 9, 2022