Add support for new thread scheduler of Ruby-3.0 #799
The feature is described in: https://bugs.ruby-lang.org/issues/16786

To avoid blocking the current ruby thread during calls into C, such calls can be executed in a dedicated pthread. This happens when the current thread has a scheduler assigned by Thread.current.scheduler= . A pipe is used to signal the end of a call, and the scheduler is invoked to wait for readability of the pipe. This way the scheduler can yield to another fiber or do other work instead of blocking the thread until the C call finishes.

The current implementation does not yield any callbacks back to the calling thread. Instead, all callbacks invoked in this way are handled as asynchronous callbacks. This means that each callback is executed in a dedicated ruby thread.
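The pipe-based completion signal described above can be illustrated in plain Ruby. This is only a minimal sketch, not the code of this PR: the helper name `call_in_dedicated_thread` is hypothetical, a Ruby thread stands in for the dedicated pthread, and `IO.select` stands in for the point where a scheduler would be asked to wait for readability.

```ruby
# Hypothetical helper: run a blocking operation in a dedicated thread and
# signal its completion through a pipe.
def call_in_dedicated_thread(&blocking_call)
  reader, writer = IO.pipe
  result = nil

  worker = Thread.new do
    begin
      result = blocking_call.call   # stands in for the blocking C call
    ensure
      writer.write("x")             # one byte means "the call has finished"
      writer.close
    end
  end

  # With a scheduler assigned, this wait would be delegated to the scheduler,
  # which could run other fibers until the pipe becomes readable.
  IO.select([reader])
  reader.close
  worker.join
  result
end

value = call_in_dedicated_thread { sleep 0.1; 42 }
p value
```

The calling side only ever waits on a file descriptor, which is the kind of event a fiber scheduler already knows how to handle.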
The feature can be tested by something like this:

    require "ffi"

    class Scheduler
      def for_fd(fd)
        ::IO.for_fd(fd, autoclose: false)
      end

      def wait_readable_fd(fd)
        wait_readable(for_fd(fd))
      end

      def wait_readable(io)
        p wait_readable_start: io
        IO.select([io])
        p wait_readable_end: io
      end

      def enter_blocking_region
        puts "Enter blocking region: #{caller.first}"
      end

      def exit_blocking_region
        puts "Exit blocking region: #{caller.first}"
      end

      def fiber(&block)
        fiber = Fiber.new(blocking: false, &block)
        fiber.resume
        return fiber
      end
    end

    Thread.current.scheduler = Scheduler.new

    module Native
      extend FFI::Library
      ffi_lib :c
      attach_function :sleep, [:uint], :uint, blocking: true
      callback :qsort_cmp, [ :pointer, :pointer ], :int
      attach_function :qsort, [ :pointer, :int, :int, :qsort_cmp ], :int, blocking: true
    end

    Fiber do
      p native_sleep: :start
      r = Native.sleep 1
      p native_sleep: r
    end

    Fiber do
      arr = [2, 1, 3]
      pa = FFI::MemoryPointer.new(:int, arr.size)
      pa.write_array_of_int32(arr)
      Native.qsort(pa, arr.size, FFI.find_type(:int).size) do |p1, p2|
        p Thread.current
        p1.read_int <=> p2.read_int
      end
      p pa.read_array_of_int32(arr.size)
    end

It prints:
This is very cool.
Currently callback blocks are executed in a dedicated ruby thread if they use this PR's feature. I don't think this is desired for fiber-based event loops. So I think it makes sense to pass callbacks back to the same thread that made the C call which invoked the callback pointer. So given this script:

    p Thread.current
    Native.qsort(pa, arr.size, FFI.find_type(:int).size) do |p1, p2|
      p Thread.current
      p1.read_int <=> p2.read_int
    end

It should use only one thread and the output should be something like:
This could be achieved by using the call frame that ruby-ffi manages for each thread and each call into C. This way we can trace back from the callback to the originating ruby thread and invoke the callback block by passing this information through the same pipe that the originating ruby thread is waiting on. Then the scheduler is notified about the pending callback and resumes the related fiber, which then executes the callback block (instead of returning from the C call). A very similar mechanism is also used in Eventbox.

If there's no call frame, or it doesn't have a pipe to signal this callback, the callback would be executed in a dedicated thread as it is currently. In this case the C library invoked the callback from a non-ruby thread or from a ruby thread without a scheduler, and we have no way to deliver it to a related thread/fiber.

@ioquatix What do you think about routing callbacks back to the originating C call? Or is that useless?
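The routing idea can be sketched in plain Ruby. This is an illustration under assumptions, not ruby-ffi code: `call_with_routed_callbacks` is a hypothetical helper, a Ruby thread and `Array#sort` stand in for the pthread and for `qsort`, and a `Queue` carries the callback arguments while the pipe carries only the wake-up byte.

```ruby
# Hypothetical helper: run `c_call` in a worker thread, but execute every
# callback it triggers on the thread that made the call.
def call_with_routed_callbacks(callback, &c_call)
  reader, writer = IO.pipe
  events = Queue.new             # payloads; the pipe only carries wake-ups

  # Proxy handed to the "C" side in place of the real block.
  proxy = lambda do |*args|
    reply = Queue.new
    events << [:callback, args, reply]
    writer.write("c")            # wake up the calling thread
    reply.pop                    # block the worker until the caller answers
  end

  worker = Thread.new do
    result = c_call.call(proxy)
    events << [:return, result]
    writer.write("r")
  end

  loop do
    IO.select([reader])          # a scheduler would wait here instead
    reader.read(1)
    kind, payload, reply = events.pop
    if kind == :callback
      reply << callback.call(*payload)   # runs on the calling thread
    else
      worker.join
      reader.close
      writer.close
      return payload
    end
  end
end

caller_thread = Thread.current
seen = []
compare = lambda do |a, b|
  seen << Thread.current
  a <=> b
end

sorted = call_with_routed_callbacks(compare) do |cmp|
  [2, 1, 3].sort { |a, b| cmp.call(a, b) }   # stands in for qsort
end
p sorted
p seen.uniq == [caller_thread]
```

The same wake-up channel serves both "callback pending" and "call returned", so the waiting side needs only one readable file descriptor. Error handling (for example an exception raised inside the worker) is left out of this sketch.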
I think it's better if it runs in the same thread. There has been some discussion about how to send events from different threads into a scheduler. We don't have a firm plan yet, but at least it's being considered. Having some use cases like this can help immensely with firming up a specific interface and implementation, so when I circle back to the scheduler interface (hopefully before the end of this month) I'll try to consider how this should work. Regarding your specific implementation, I feel very strongly that you have a good opinion about how this should work, so I welcome your feedback and direction w.r.t. this functionality.
Looks like a nice prototype, I have a few questions:
It's not possible to call C functions in a non-blocking fashion from a fiber/scheduler based thread. These are conflicting computation models. So there are two alternative options:
I don't think it's the task of FFI to avoid data races in the C library or sequentialize any calls to C functions. Different libraries have different requirements and FFI should be flexible enough to allow all of them. If the ruby library calls C functions concurrently, it's the task of the developer to verify that the C library allows this. So there is no such guarantee and I don't think we should enforce it.
All callbacks invoked by the mechanism of this PR are handled as asynchronous callbacks, because they are identified as coming from a pthread (and not from a ruby thread). That means that each callback is executed in a dedicated ruby thread. To avoid data races on the ruby side, I posted my ideas in the comment above.
I experimented with thread pooling in Eventbox. My result was that, due to the management overhead, a thread pool is not significantly faster than dedicated threads. On the downside, it can lead to hard-to-reproduce deadlocks if the thread pool is limited in size. For now I would like to keep it at one pthread per C call. A thread pool is something we could implement and benchmark in the future.
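The deadlock risk of a size-limited pool can be illustrated with a small generic sketch. It is not taken from Eventbox or ruby-ffi; `run_nested` is a hypothetical helper in which a job waits for a second job that it queued on the same pool, similar to a call whose callback triggers another blocking call.

```ruby
# Hypothetical demonstration: returns true if the outer job completes
# within the timeout, false if it is stuck.
def run_nested(pool_size)
  jobs = Queue.new
  workers = pool_size.times.map do
    Thread.new do
      while (job = jobs.pop)
        job.call
      end
    end
  end

  done = Queue.new
  inner_done = Queue.new

  jobs << lambda do
    jobs << -> { inner_done << :inner }   # nested job queued on the same pool
    inner_done.pop                        # outer job waits for the nested one
    done << :outer
  end

  waiter = Thread.new { done.pop }
  finished = !waiter.join(0.5).nil?       # join returns nil on timeout
  (workers + [waiter]).each(&:kill)
  finished
end

p run_nested(1)
p run_nested(2)
```

With a single worker, the outer job occupies the only thread, so the nested job never gets to run. With one dedicated thread per call this situation cannot arise, at the cost of creating more threads.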
Re 1., Right, BTW, what's the effect of
Reading about this again, I think it would be safer to have a new option (not just `blocking: true`) to opt in to executing blocking calls on a separate thread.
The scheduler feature is described in: https://bugs.ruby-lang.org/issues/16786
To avoid blocking the current ruby thread during calls into C, such calls can be executed in a dedicated pthread. This happens when the current thread has a scheduler assigned by Thread.current.scheduler= and the function is marked as blocking: true. A pipe is used to signal the end of a call, and the scheduler is invoked to wait for readability of the pipe. This way the scheduler can yield to another fiber or do other work instead of blocking the thread until the C call finishes.
The current implementation does not yield any callbacks back to the calling thread. Instead all callbacks invoked in this way are handled as asynchronous callbacks. This means that each callback is executed in a dedicated ruby thread.
cc: @ioquatix