Async callbacks break after forking if async_cb_thread gets started before forking #884

Closed
ivoanjo opened this issue Feb 20, 2021 · 1 comment · Fixed by #888

ivoanjo (Contributor) commented Feb 20, 2021

Hello again! While having breakfast, I got to wondering about the code I saw in #883: what happens to the async_cb_thread after Ruby forks, and would that break async callbacks?

It turns out that there's an issue hiding here.

Using the following test case (I just dropped it into the ffi repository and reused some of the test tools already there):

require_relative './spec/ffi/fixtures/compile'

puts TestLibrary::PATH

module LibTest
  extend FFI::Library
  ffi_lib TestLibrary::PATH
  AsyncIntCallback = callback [ :int ], :void

  @blocking = true
  attach_function :testAsyncCallback, [ AsyncIntCallback, :int ], :void
end

def poke_ffi
  puts "1. Active threads in #{Process.pid}: #{Thread.list}"
  FFI::Function.new(:int, []) { 5 }
  puts "2. Active threads in #{Process.pid}: #{Thread.list}"
  LibTest.testAsyncCallback(proc { puts "Hello from callback in #{Process.pid}" }, 0)
  puts "3. Active threads in #{Process.pid}: #{Thread.list}"
end

poke_ffi if ENV['POKE_FFI_BEFORE_FORK'] == '1'

puts "Forking. Parent is #{Process.pid}"

fork do
  puts "Forked. I'm #{Process.pid}"
  poke_ffi
  puts "Fork finished in #{Process.pid}"
end

puts "After forking, in #{Process.pid}"
poke_ffi
Process.wait
puts "Parent exiting after child finished #{Process.pid}"

When we don't run poke_ffi before the fork, everything works as expected:

$ ruby test.rb 
"make CPU=x86_64 OS=linux"
make: Nothing to be done for 'all'.
/home/knuckles/ruby/ffi/spec/ffi/fixtures/libtest.so
Forking. Parent is 77293
After forking, in 77293
1. Active threads in 77293: [#<Thread:0x0000559a01096d70 run>]
Forked. I'm 77303
2. Active threads in 77293: [#<Thread:0x0000559a01096d70 run>, #<Thread:0x0000559a019f4700 run>]
1. Active threads in 77303: [#<Thread:0x0000559a01096d70 run>]
2. Active threads in 77303: [#<Thread:0x0000559a01096d70 run>, #<Thread:0x0000559a019f4700 run>]
Hello from callback in 77293
3. Active threads in 77293: [#<Thread:0x0000559a01096d70 run>, #<Thread:0x0000559a019f4700 sleep>]
Hello from callback in 77303
3. Active threads in 77303: [#<Thread:0x0000559a01096d70 run>, #<Thread:0x0000559a019f4700 sleep>]
Fork finished in 77303
Parent exiting after child finished 77293

But when we set POKE_FFI_BEFORE_FORK=1, we break ffi in the fork:

"make CPU=x86_64 OS=linux"
make: Nothing to be done for 'all'.
/home/knuckles/ruby/ffi/spec/ffi/fixtures/libtest.so
1. Active threads in 77248: [#<Thread:0x000055e09cb6ed60 run>]
2. Active threads in 77248: [#<Thread:0x000055e09cb6ed60 run>, #<Thread:0x000055e09d4b8858 run>]
Hello from callback in 77248
3. Active threads in 77248: [#<Thread:0x000055e09cb6ed60 run>, #<Thread:0x000055e09d4b8858 sleep>]
Forking. Parent is 77248
After forking, in 77248
1. Active threads in 77248: [#<Thread:0x000055e09cb6ed60 run>, #<Thread:0x000055e09d4b8858 sleep>]
2. Active threads in 77248: [#<Thread:0x000055e09cb6ed60 run>, #<Thread:0x000055e09d4b8858 sleep>]
Forked. I'm 77262
1. Active threads in 77262: [#<Thread:0x000055e09cb6ed60 run>]
2. Active threads in 77262: [#<Thread:0x000055e09cb6ed60 run>] // <-- async_cb_thread is nowhere to be found
Hello from callback in 77248
3. Active threads in 77248: [#<Thread:0x000055e09cb6ed60 run>, #<Thread:0x000055e09d4b8858 sleep>]
// <process hung -- parent is waiting for child, child is waiting for callback to execute, callback never gets called>

I'm guessing this wasn't spotted so far because most Ruby frameworks/apps fork pretty early in their lifecycle, so they get lucky (?).
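
The missing thread in the second run matches how fork behaves in general: only the thread that calls fork keeps running in the child. Here is a minimal sketch of that behavior in plain Ruby, without ffi. It assumes a platform where `fork` is available (so not Windows or JRuby), and the method name `background_thread_state_after_fork` is made up for illustration.

```ruby
# Hypothetical helper (not part of ffi): starts a background thread, forks,
# and reports how that thread looks from inside the child process.
# Returns two strings: [worker.alive?, Thread.list.size] as seen by the child.
def background_thread_state_after_fork
  worker = Thread.new { sleep }              # stand-in for a long-lived helper thread
  Thread.pass until worker.status == "sleep" # wait until the worker is parked

  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.puts worker.alive?     # the Thread object is inherited, the running thread is not
    writer.puts Thread.list.size  # only the forking thread survives in the child
    writer.close
    exit!(0)                      # skip at_exit handlers inherited from the parent
  end

  writer.close
  report = reader.read.split
  Process.wait(pid)
  worker.kill
  report
end

alive, count = background_thread_state_after_fork
puts "worker alive in child: #{alive}, threads in child: #{count}"
```

Any state that says "the helper thread has already been started" is copied into the child along with the rest of memory, which is presumably why the child in the failing run waits forever instead of starting a new thread.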

larskanis added a commit to larskanis/ffi that referenced this issue Mar 5, 2021
After fork() the dispatcher thread is no longer running, so it needs to be restarted in the child process.

Fixes ffi#884
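
To illustrate the idea in the commit message, here is a rough pure-Ruby sketch of a worker thread that gets restarted when the process has forked. This is not the code from the actual fix (that lives in ffi's C extension and is not shown in this thread). `ForkAwareDispatcher` and its methods are hypothetical names, and the sketch assumes `fork` is available and that `dispatch` is called from one thread at a time.

```ruby
# Hypothetical dispatcher, NOT ffi's implementation: runs jobs on a background
# thread and restarts that thread if the current process is a forked child.
class ForkAwareDispatcher
  def initialize
    @queue = Queue.new
    @thread = nil
    @pid = nil
  end

  # Runs the block on the worker thread and returns its result.
  def dispatch(&job)
    ensure_running
    done = Queue.new
    @queue << [job, done]
    done.pop
  end

  private

  # Starts the worker if it never ran, has died, or belongs to the parent process.
  def ensure_running
    return if @pid == Process.pid && @thread&.alive?

    @pid = Process.pid
    @queue = Queue.new # drop any queue state inherited from the parent
    queue = @queue
    @thread = Thread.new do
      loop do
        job, done = queue.pop
        done << job.call
      end
    end
  end
end

dispatcher = ForkAwareDispatcher.new
dispatcher.dispatch { puts "Hello from worker in #{Process.pid}" }
pid = fork { dispatcher.dispatch { puts "Hello from worker in #{Process.pid}" } }
Process.wait(pid)
```

Checking the pid on every dispatch keeps the sketch simple. A real implementation could just as well hook into fork itself, and it would also need to handle jobs that raise.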

larskanis (Member) commented

Thank you for spotting this and providing the repro script!
