Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrent::Hash and Concurrent::Array are not fully threadsafe on CRuby #929

Open
jacobat opened this issue Dec 14, 2021 · 9 comments
Open
Assignees
Labels
high-priority Should be done ASAP.

Comments

@jacobat
Copy link
Contributor

jacobat commented Dec 14, 2021

Concurrent::Hash and Concurrent::Array are not fully threadsafe on CRuby.

This can be demonstrated for Array with:

require 'concurrent/array'

array = Concurrent::Array.new
array << 0

20.times.map do |i|
  Thread.new { array.map!{|v| sleep 0.001; v + 1} }
end.each(&:join)
p array

This returns [1].

For Hash this code shows the issue:

require 'concurrent/hash'

hash = Concurrent::Hash.new
hash[:a] = 0

20.times.map do |i|
  Thread.new { hash.transform_values!{|v| sleep 0.001; v + 1} }
end.each(&:join)
p hash

This returns {a: 1}.

In both cases we would expect to see 20 instead of 1.

* Ruby implementation:             Ruby
* `concurrent-ruby` version:       1.1.9
* `concurrent-ruby-ext` installed: no
* `concurrent-ruby-edge` used:     no
@jacobat jacobat changed the title Concurrent::Hash and Concurrent::Array are not threadsafe on CRuby Concurrent::Hash and Concurrent::Array are not fully threadsafe on CRuby Dec 14, 2021
@chrisseaton
Copy link
Member

Hmmm I can see the misunderstanding. The docs say "ensuring only one thread can be reading or writing at a time" and that's what's going on here - one thread reads at a time, and one thread writes at a time, but not both in a single atomic action. The behaviour you're expecting is implemented on the other VMs, but that's really because they over-compensate and lock everything, which also isn't very healthy. I'd welcome discussion.

@jacobat
Copy link
Contributor Author

jacobat commented Dec 15, 2021

The docs say: "A thread-safe subclass of Hash/Array". I guess the question is: what is meant by thread safe? I tried googling and wasn't able to find a crystal clear definition.

One level of "thread safety" is that the object remains internally consistent while being used concurrently. As an example where this is not the case one can try manipulating a regular Ruby array from multiple threads using JRuby. CRubys Array and Hash classes maintain internal consistency at all times. IMHO this level is necessary but insufficient to be thread safe.

A stronger form of thread safety is when "all methods contain sufficient internal synchronization that instances may be used concurrently without the need for external synchronization"1.

Wikipedia2 states that thread safe means that the "Implementation is guaranteed to be free of race conditions when accessed by multiple threads simultaneously".

I hope we can agree that the Hash and Array classes provided fails to meet both of these criteria for any method that does both reading and writing on CRuby.

I think the part about "need for external synchronization" in the above quote is particularly interesting. As of now there's no way to use Array#map! safely in a multithreaded environment without external synchronization. You would have to wrap every call to map! in synchronization for it to be safe. Which to me would seem like an indication for something being not thread safe.

@chrisseaton
Copy link
Member

If you want these methods to be atomic right now as a workaround, a solution is to use compare-and-swap via Atom or AtomicFixnum.

@bestie
Copy link

bestie commented Dec 15, 2021

This is more than I expect from 'thread safe' also and I think the CRuby behavior is fine.

The problem I see is that the library is inconsistent. On both JRuby and TruffleRuby the thread execution is serialized and the block is atomic.

I think having a synchronised map! / each function would be a nice thing for CRuby to have too.

@jacobat
Copy link
Contributor Author

jacobat commented Dec 15, 2021

I don't have an application depending on this. I was simply trying things and was surprised by the behaviour. It just seems like something that could catch people by surprise if they're unaware of it.

I don't see how I could make an implementation using #compare_and_set with #map! that returns the correct result consistently.

@jacobat
Copy link
Contributor Author

jacobat commented Dec 15, 2021

I don't see how I could make an implementation using #compare_and_set with #map! that returns the correct result consistently.

I see it now :)

@eregon
Copy link
Collaborator

eregon commented Dec 12, 2022

Fixing this would mean adding overhead for Concurrent::Array on CRuby.
Not ideal, but it might be the right thing to do for semantics, unsure.
Holding a lock during a block like the one given to map! can also lead to deadlocks (#627), but that's an existing issue and is probably very hard/impossible to solve on non-CRuby, so maybe a lesser concern here.

Currently only that first level of #929 (comment) thread-safety is guaranteed. I.e., internal consistency, but it does not make Ruby blocks or operations accessing multiple indices atomic. Only individual read and writes are atomic.

@eviljoel
Copy link

If I understand this issue correctly (and I think that I do), I want to point out that the documentation on Array explicitly states that iteration operations like #each locks the object itself ensuring only one thread can be reading or writing at a time for the duration of the method call. (Paraphrasing.) The actual behavior seems to conflict with the documentation. See the actual text here: https://github.com/ruby-concurrency/concurrent-ruby/blob/master/lib/concurrent-ruby/concurrent/array.rb#L10

And, for the record, I appear to be running into this issue in my professional development.

@eregon
Copy link
Collaborator

eregon commented Mar 10, 2023

@eviljoel Right, this is indeed inconsistent. The documentation is the intended behavior but is indeed not the case on CRuby. I plan to fix this issue and so make it match the intended behavior.

@eregon eregon added the high-priority Should be done ASAP. label Mar 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
high-priority Should be done ASAP.
Projects
None yet
Development

No branches or pull requests

5 participants