`Concurrent::Hash` default initialization is not fully thread-safe #970

mensfeld · 2022-11-17T10:29:39Z

Based on the docs:

A thread-safe subclass of Hash. This version locks against the object itself for every method call, ensuring only one thread can be reading or writing at a time. This includes iteration methods like #each, which takes the lock repeatedly when reading an item.

Given this code:

h = Concurrent::Hash.new do |hash, key|
  hash[key] = Concurrent::Array.new
end

the initialization is not thread-safe.

Note from @eregon, the thread-safe variant of this code is:

h = Concurrent::Map.new do |hash, key|
  hash.compute_if_absent(key) { Concurrent::Array.new }
end

Obviously the latter part of the doc indicates that:

ensuring only one thread can be reading or writing at a time

but the initial part makes it confusing:

This version locks against the object itself for every method call

It can be demoed by running this code:

require 'concurrent-ruby'

1000.times do
  h = Concurrent::Hash.new do |hash, key|
    hash[key] = Concurrent::Array.new
  end

  100.times.map do
    Thread.new do
      h[:na] << true
    end
  end.each(&:join)

  raise if h[:na].count != 100
end
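
The demo above depends on thread timing, so it may pass on some runs. A minimal sketch of the same race, made deterministic, might look like the following. It assumes a plain `::Hash` as a stand-in (no gem required), and the `entered` and `release` queues are hypothetical helpers that hold both threads inside the default block before either one writes.

```ruby
# Sketch only: plain ::Hash as a stand-in, with the interleaving forced via queues.
entered = Queue.new   # a thread pushes here once it is inside the default block
release = Queue.new   # the main thread pushes here to let a waiting thread continue

h = Hash.new do |hash, key|
  entered << true
  release.pop          # wait until both threads have seen the key as missing
  hash[key] = []       # check-then-act: each thread stores its own array
end

threads = 2.times.map { Thread.new { h[:na] } }
2.times { entered.pop }      # both threads are now inside the block
2.times { release << true }  # let both perform their write
returned = threads.map(&:value)

stored = h[:na]              # key exists now, so the block is not called again
lost = returned.reject { |array| array.equal?(stored) }

puts "distinct arrays created: #{returned.map(&:object_id).uniq.size}"
puts "arrays not reachable through the hash: #{lost.size}"
```

Each thread gets back the array it created, but only the last write stays in the hash. Anything appended to the other array is lost, which is why the count check in the demo can fail.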

I would expect one of the following:

- Have the initialization block behind a mutex, so there is no conflict
- Have the docs updated (I can do that)

Wrapping the initialization in a mutex makes it work:

require 'concurrent-ruby'

m = Mutex.new

1000.times do
  h = Concurrent::Hash.new do |hash, key|
    m.synchronize do
      break hash[key] if hash.key?(key)

      hash[key] = Concurrent::Array.new
    end
  end

  100.times.map do
    Thread.new do
      h[:na] << true
    end
  end.each(&:join)

  raise if h[:na].count != 100
end

nijikon · 2022-11-17T15:15:26Z

Oh, wow. This is interesting.

mensfeld · 2022-11-19T18:48:47Z

Same applies to the Concurrent::Map:

require 'concurrent-ruby'

1000.times do
  h = Concurrent::Map.new do |hash, key|
    hash[key] = Concurrent::Array.new
  end

  100.times.map do
    Thread.new do
      h[:na] << true
    end
  end.each(&:join)

  raise if h[:na].count != 100
end

This change makes the initialization of the hash upon missing key fully thread-safe. Before this change, initialization that would occur in two threads could overwrite each other, as illustrated here: ruby-concurrency/concurrent-ruby#970

The behavior upon a missing prefix partial name may cause a key to be overwritten when executed in multiple threads at the same time. ref ruby-concurrency/concurrent-ruby#970

Fixes issue described here: ruby-concurrency/concurrent-ruby#970

Initialization upon a miss can lead to hard-to-debug scenarios where a concurrent array may leak out while the value in `@events` is a different one, as described here: ruby-concurrency/concurrent-ruby#970

Without this change, an increment can "go away" and the counter can actually be off by 1 if run in multiple threads at the same time. This can lead to super weird errors where the counter is not as expected (been there, took me ages to debug). ref: ruby-concurrency/concurrent-ruby#970

Under intense threading usage, the `@@users` buffer will not be thread-safe fully. Details here: ref ruby-concurrency/concurrent-ruby#970

In case of extensive concurrent usage, the mutex handed over to two threads under same key may differ as illustrated below:

```ruby
require 'concurrent'

10000.times do
  kind_fetcher_locks = Concurrent::Hash.new { |hash, key| hash[key] = Mutex.new }
  refs = Set.new

  100.times.map do |i|
    Thread.new { refs << kind_fetcher_locks[i % 50].object_id }
  end.each(&:join)

  raise "Not 50 but #{refs.count}" unless refs.size == 50
end
```

This can lead to really weird issues. Works when fixed as above:

```ruby
require 'concurrent'

10000.times do
  mutex = Mutex.new
  # kind_fetcher_locks = Concurrent::Hash.new { |hash, key| hash[key] = Mutex.new }
  kind_fetcher_locks = Concurrent::Hash.new do |hash, key|
    mutex.synchronize do
      break hash[key] if hash.key?(key)

      hash[key] = Mutex.new
    end
  end
  refs = Set.new

  100.times.map do |i|
    Thread.new { refs << kind_fetcher_locks[i % 50].object_id }
  end.each(&:join)

  raise "Not 50 but #{refs.count}" unless refs.size == 50
end
```

ref ruby-concurrency/concurrent-ruby#970

Not locking the default initialization can lead to race conditions. Note: I am not sure whether I should use one or two mutexes, as I am not familiar enough with this code to make the judgment. ref: ruby-concurrency/concurrent-ruby#970

nightpool · 2022-11-24T04:25:31Z

This pattern is widespread in open source code, and it's very clear that developers assume it works. I think it's very important to wrap this initializer block in a mutex rather than just update the docs.

granthusbands · 2022-11-24T18:02:22Z

It's now slightly complicated, as some (as above) have a fix that assumes the initializer is not in the mutex and so call compute_if_absent or such. Unless the mutex is reentrant or there's a test for this, it would then deadlock.

Also, it's suboptimal to use a mutex that's separate from the hash, as above; it would help for Concurrent::Hash to have some of the quality-of-life improvements of Concurrent::Map or at least a way to use the same lock.
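
As a rough illustration of "using the same lock", here is a minimal sketch of a wrapper that owns both the data and the lock. `LockedDefaultHash` is a hypothetical name and not part of concurrent-ruby. It uses stdlib `Monitor` (reentrant) so that an initializer calling `[]=` or `compute_if_absent` on the same object does not lock up.

```ruby
require 'monitor'

# Hypothetical wrapper, not a concurrent-ruby API: one reentrant lock guards
# both the lookup and the default initializer.
class LockedDefaultHash
  def initialize(&initializer)
    @lock = Monitor.new
    @data = {}
    @initializer = initializer
  end

  # Runs the initializer under the lock, so it is called at most once per key.
  def [](key)
    @lock.synchronize do
      @data.fetch(key) { @data[key] = @initializer.call(self, key) }
    end
  end

  def []=(key, value)
    @lock.synchronize { @data[key] = value }
  end

  # Stores the block's value only if the key is missing; returns the stored value.
  def compute_if_absent(key)
    @lock.synchronize { @data.fetch(key) { @data[key] = yield } }
  end
end

calls = 0
h = LockedDefaultHash.new do |hash, key|
  calls += 1            # safe: the initializer runs while the lock is held
  hash[key] = []        # reentrant call into []= on the same thread
end

ids = 20.times.map { Thread.new { h[:na].object_id } }.map(&:value)
puts "initializer calls: #{calls}, distinct values seen: #{ids.uniq.size}"
```

The trade-off is that the initializer runs while the lock is held, so a slow initializer blocks every other access, and initializers of two such objects that refer to one another could still deadlock.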

mensfeld · 2022-11-24T18:11:13Z

@granthusbands if this is not fixable, or if fixing it brings weird problems to the table, maybe we could extend RuboCop to flag the common mistakes.

eregon · 2022-12-12T11:48:58Z

Thank you for the issue report. I generally agree we should fix this if we can. The question is how.

(1) We could (try to) use a lock around the whole initializer, but holding a lock for that long is a typical anti-pattern, and it can lead to deadlock (e.g., if 2 Concurrent::Hash initializer blocks refer to one another, like #627 which uses the block of each but for Hash/Map).
This seems quite difficult given the various backends. Not all backends use a Mutex for instance or even a lock for all operations on a Concurrent::Hash. We'd need to somehow make it work for each of them independently.
As a note, these are the semantics of ConcurrentHashMap#computeIfAbsent in Java. Its documentation also says: "The entire method invocation is performed atomically, so the function is applied at most once per key. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this map." We cannot guarantee that latter part for an arbitrary initializer block, so it feels a bit wrong at least.
Yet another challenge is that Concurrent::Hash is currently just ::Hash on CRuby, but it can't be anymore if we fix this. From that point of view, (a) this might be considered to be caused by CRuby Hash, and (b) it may make sense to actually try to fix this in core Hash.

(2) We could do what I suggested in my PhD thesis to solve basically the same issue but on Hash itself (BTW, CRuby Hash does not guarantee this): https://eregon.me/blog/assets/research/thesis-thread-safe-data-representations-in-dynamic-languages.pdf page 83 Idiomatic Concurrent Hash Operations. In short, it replaces []= calls in the initializer block with put_if_absent by passing a different object than the Concurrent::Hash itself, which overrides []= and delegates the rest.
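
A minimal sketch of the idea in (2), under stated assumptions: `PutIfAbsentProxy` and `hash_with_atomic_default` are hypothetical names, a plain `::Hash` plus a `Mutex` stands in for the real backends, and re-reading the stored value after the block is this sketch's own choice (needed because `hash[key] = value` always evaluates to `value`, whatever `[]=` returns).

```ruby
# Hypothetical proxy handed to the initializer block: []= becomes put-if-absent,
# every other method is delegated to the underlying hash.
class PutIfAbsentProxy
  def initialize(hash, lock)
    @hash = hash
    @lock = lock
  end

  def []=(key, value)
    @lock.synchronize { @hash.store(key, value) unless @hash.key?(key) }
  end

  def respond_to_missing?(name, include_private = false)
    @hash.respond_to?(name, include_private) || super
  end

  def method_missing(name, *args, &block)
    return super unless @hash.respond_to?(name)

    @hash.public_send(name, *args, &block)
  end
end

# Builds a Hash whose default block receives the proxy instead of the hash.
def hash_with_atomic_default(&initializer)
  lock = Mutex.new
  Hash.new do |hash, key|
    initializer.call(PutIfAbsentProxy.new(hash, lock), key)
    lock.synchronize { hash.fetch(key, nil) }  # return what was actually stored
  end
end

entered = Queue.new
release = Queue.new
h = hash_with_atomic_default do |hash, key|
  entered << true
  release.pop          # force both threads into the block before any write
  hash[key] = []
end

threads = 2.times.map { Thread.new { h[:na] } }
2.times { entered.pop }
2.times { release << true }
results = threads.map(&:value)
puts "same object returned to both threads: #{results[0].equal?(results[1])}"
```

The lock is held only around the write and the final read, never around the user's block, so the block may run more than once but only the first write is kept.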

It's a classic "pick 1 or 2 but all 3 seem impossible":

- Write into the Concurrent::Hash only once (required to fix this issue)
- Execute the block only once (nice-to-have)
- No deadlocks caused by this

eregon · 2022-12-15T13:38:35Z

I've filed an issue on the CRuby tracker to see what they think about the same problem for core Hash: https://bugs.ruby-lang.org/issues/19237

eregon · 2023-03-09T21:50:32Z

FWIW CRuby closed that ticket and added documentation that Hash is not thread-safe for that case: https://bugs.ruby-lang.org/issues/19237#note-2 and ruby/ruby@ffd5241.
I think it makes sense to solve this for Concurrent::Hash and Concurrent::Map.
We'll need to pick one of the two approaches above.

Using the lock approach would also fix #929, but makes it prone to deadlocks. For example we'll already need Monitor and not Mutex to let existing usages of compute_if_absent inside the block work fine and not error due to trying to lock the same Mutex again.
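
The Monitor versus Mutex point can be checked with the standard library alone. The sketch below only shows the reentrancy difference. It does not show how any concurrent-ruby backend is implemented.

```ruby
require 'monitor'

# Mutex is not reentrant: locking it again from the owning thread raises ThreadError.
mutex = Mutex.new
mutex_error = begin
  mutex.synchronize { mutex.synchronize { :unreachable } }
rescue ThreadError => e
  e
end

# Monitor is reentrant: the owning thread may enter it again.
monitor = Monitor.new
monitor_result = monitor.synchronize { monitor.synchronize { :ok } }

puts "nested Mutex#synchronize raised: #{mutex_error.class}"
puts "nested Monitor#synchronize returned: #{monitor_result.inspect}"
```

An initializer that calls `compute_if_absent` on the same object is exactly this nested case when the outer lookup already holds the lock.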

Using an object forwarding []= differently might be surprising.

Another option would be to pass a special object to the block, which warns on []= inside the block as that's not atomic, to let people know they should use compute_if_absent instead.

Not locking the default initialization can lead to race-conditions. ref: ruby-concurrency/concurrent-ruby#970 Co-authored-by: Maciej Mensfeld <maciej@mensfeld.pl> resolves puppetlabs#8951

Not locking the default initialization can lead to race conditions. I don't think we can switch to Concurrent::Map and its compute_if_absent method, because insertion order won't be maintained. So synchronize the long way. ref: ruby-concurrency/concurrent-ruby#970 Co-authored-by: Maciej Mensfeld <maciej@mensfeld.pl> resolves puppetlabs#8951

mensfeld changed the title ~~Concurrent::Hash default initialization is not thread-sae~~ Concurrent::Hash default initialization is not fully thread-safe Nov 17, 2022

nijikon mentioned this issue Nov 20, 2022

[BUG] i18n translation loading is not thread-safe ruby-i18n/i18n#643

Closed

mensfeld mentioned this issue Nov 20, 2022

Make the yaml cache fully thread-safe dry-rb/dry-schema#440

Merged

mensfeld mentioned this issue Nov 20, 2022

Remove not needed thread-safe primitive rails/rails#46534

Merged

mensfeld mentioned this issue Nov 20, 2022

Make sure that concurrent map usage is thread-safe rails/rails#46536

Merged

mensfeld added a commit to mensfeld/finite_machine that referenced this issue Nov 20, 2022

Make the state hash fully thread safe

bc0c836

Fixes issue described here: ruby-concurrency/concurrent-ruby#970

mensfeld mentioned this issue Nov 20, 2022

Make the state hash fully thread safe piotrmurach/finite_machine#77

Merged

mensfeld mentioned this issue Nov 20, 2022

Fix events map initialization to prevent race conditions rmosolgo/graphql-ruby#4251

Merged

mensfeld mentioned this issue Nov 20, 2022

Make registry fully thread safe rom-rb/rom-factory#80

Merged

mensfeld added a commit to mensfeld/whimsy that referenced this issue Nov 21, 2022

Make users buffer fully thread-safe

67a92c1

Under intense threading usage, the `@@users` buffer will not be thread-safe fully. Details here: ref ruby-concurrency/concurrent-ruby#970

mensfeld mentioned this issue Nov 21, 2022

Make users buffer fully thread-safe apache/whimsy#169

Merged

mensfeld mentioned this issue Nov 21, 2022

Kind fetcher locks are not fully thread-safe Shopify/krane#911

Open

mensfeld mentioned this issue Nov 21, 2022

Make cache and values fully thread-safe puppetlabs/puppet#8951

Closed

sebbASF pushed a commit to apache/whimsy that referenced this issue Nov 22, 2022

Make users buffer fully thread-safe (#169)

fbe1a99

Under intense threading usage, the `@@users` buffer will not be thread-safe fully. Details here: ref ruby-concurrency/concurrent-ruby#970

eregon self-assigned this Dec 12, 2022

eregon mentioned this issue Dec 12, 2022

Concurrent::Map Performance #882

Closed

eregon added the high-priority Should be done ASAP. label Mar 10, 2023

This was referenced Jul 18, 2023

Java exception escaping Hash#[]= in addressable oracle/truffleruby#3166

Closed

Unsafe concurrent Hash access sporkmonger/addressable#514

Closed

Make ActiveRecord's quoted name caches thread-safe on JRuby/TruffleRuby rails/rails#48773

Merged

piotrmurach pushed a commit to mensfeld/finite_machine that referenced this issue Oct 8, 2023

Make the state hash fully thread safe

9e6344c

Fixes issue described here: ruby-concurrency/concurrent-ruby#970

piotrmurach pushed a commit to mensfeld/finite_machine that referenced this issue Oct 8, 2023

Fix hooks_map initialization to be fully thread-safe

bed686e

Fixes issue described here: ruby-concurrency/concurrent-ruby#970

piotrmurach pushed a commit to piotrmurach/finite_machine that referenced this issue Oct 8, 2023

Fix hooks_map initialization to be fully thread-safe

62ccd68

Fixes issue described here: ruby-concurrency/concurrent-ruby#970

piotrmurach pushed a commit to piotrmurach/finite_machine that referenced this issue Oct 8, 2023

Fix hooks_map initialization to be fully thread-safe

245fbc3

Fixes issue described here: ruby-concurrency/concurrent-ruby#970

joshcooper mentioned this issue Nov 16, 2023

Make cache and values fully thread-safe puppetlabs/puppet#9158

Merged

kamil-kudra mentioned this issue Apr 18, 2024

RuntimeError: can't add a new key into hash during iteration on lib/active_record/connection_adapters/abstract/query_cache.rb rails/rails#45287

Closed
