Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Cequel.uuid thread safe and fork safe #422

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

brendar
Copy link

@brendar brendar commented Jul 19, 2021

Cequel.uuid uses a memoized instance of Cassandra::TimeUuid::Generator which means it will be shared between threads, but Cassandra::TimeUuid::Generator is not thread safe. This could lead to duplicate UUIDs under a race condition. Cassandra's "upsert" behavior would mean duplicates would likely go undetected, and could lead to things like separate users sharing a record in the database. This race condition was difficult to reproduce on MRI, so it may be fairly rare, but is nevertheless possible.

A similar issue exists between a parent process and forked child processes, which would have different generator instances, but sharing the same internal state, so duplicate UUIDs are also possible. These race conditions are easy to reproduce, but would only arise if the parent process called Cequel.uuid before forking.

This PR addresses the thread safety issue by storing thread local instances of Cassandra::TimeUuid::Generator and addresses the fork safety issue by clearing the thread local instance when a PID change is detected. Note that on Linux detecting forking by checking Process.pid carries some overhead, but seems worth it to ensure safety.

Included below are some scripts to demonstrate the issues and benchmark the proposed solution.

threaded_cequel_uuid_tester.rb

require "set"
require "cequel"
require "timecop"

class ThreadedCequelUuidTester
  def self.run(num_threads:, num_uuids:)
    Cequel.uuid # Initialize the memoized Cassandra::TimeUuid::Generator

    # Makes duplicates more likely as it only requires a race condition on incrementing the sequence number,
    # rather than both clock value and sequence number. I found this necessary to observe duplicates when using MRI.
    Timecop.freeze

    threads = num_threads.times.map do
      Thread.new do
        uuids = []
        num_uuids.times do
          uuids << Cequel.uuid
        end
        uuids
      end
    end

    count = 0
    duplicate_count = 0
    threads.map(&:value).flatten.each_with_object(Set.new) do |uuid, set|
      count += 1
      unless set.add?(uuid)
        duplicate_count += 1
      end
    end

    puts "Generated #{count} uuids with #{duplicate_count} duplicates"
  end
end

ThreadedCequelUuidTester.run(num_threads: 8, num_uuids: 1_000_000)

forking_cequel_uuid_tester.rb

require "set"
require "cequel"
require "parallel"

class ForkingCequelUuidTester
  def self.run(num_processes:, num_uuids:)
    Cequel.uuid # Initialize the memoized Cassandra::TimeUuid::Generator

    results = Parallel.map(Array.new(num_processes), in_processes: num_processes) do
      uuids = []
      num_uuids.times do
        uuids << Cequel.uuid
      end
      uuids
    end

    count = 0
    duplicate_count = 0
    results.flatten.each_with_object(Set.new) do |uuid, set|
      count += 1
      unless set.add?(uuid)
        duplicate_count += 1
      end
    end

    puts "Generated #{count} uuids with #{duplicate_count} duplicates"
  end
end

ForkingCequelUuidTester.run(num_processes: 8, num_uuids: 1_000)

benchmark_uuid_generation.rb

require "cassandra"
require "benchmark/ips"

module UUIDTester
  class << self
    def unsafe_uuid
      unsafe_uuid_generator.now
    end

    def threadsafe_uuid
      threadsafe_uuid_generator.now
    end

    def threadsafe_and_forksafe_uuid
      threadsafe_and_forksafe_uuid_generator.now
    end

    def new_every_time_uuid
      Cassandra::TimeUuid::Generator.new.now
    end

    private

    def unsafe_uuid_generator
      @unsafe_uuid_generator ||= Cassandra::TimeUuid::Generator.new
    end

    def threadsafe_uuid_generator
      Thread.current[:threadsafe_uuid_generator] ||= Cassandra::TimeUuid::Generator.new
    end

    def threadsafe_and_forksafe_uuid_generator
      current_pid = Process.pid
      if Thread.current[:threadsafe_and_forksafe_uuid_generator_pid] != current_pid
        Thread.current[:threadsafe_and_forksafe_uuid_generator_pid] = current_pid
        Thread.current[:threadsafe_and_forksafe_uuid_generator] = nil
      end
      Thread.current[:threadsafe_and_forksafe_uuid_generator] ||= Cassandra::TimeUuid::Generator.new
    end
  end
end

puts "#{RUBY_VERSION} #{RUBY_PLATFORM}"

Benchmark.ips do |x|
  x.report("unsafe_uuid") { UUIDTester.unsafe_uuid }
  x.report("threadsafe_uuid") { UUIDTester.threadsafe_uuid }
  x.report("threadsafe_and_forksafe_uuid") { UUIDTester.threadsafe_and_forksafe_uuid }
  x.report("new_every_time_uuid") { UUIDTester.new_every_time_uuid }
  x.compare!
end

Results:

2.7.3 x86_64-linux
Warming up --------------------------------------
         unsafe_uuid    56.290k i/100ms
     threadsafe_uuid    54.898k i/100ms
threadsafe_and_forksafe_uuid
                        37.384k i/100ms
 new_every_time_uuid    18.815k i/100ms
Calculating -------------------------------------
         unsafe_uuid    559.430k (± 4.4%) i/s -      2.814M in   5.042673s
     threadsafe_uuid    541.709k (± 4.9%) i/s -      2.745M in   5.080132s
threadsafe_and_forksafe_uuid
                        366.511k (± 5.0%) i/s -      1.832M in   5.011552s
 new_every_time_uuid    195.445k (± 3.1%) i/s -    978.380k in   5.010802s

Comparison:
         unsafe_uuid:   559430.0 i/s
     threadsafe_uuid:   541708.8 i/s - same-ish: difference falls within error
threadsafe_and_forksafe_uuid:   366511.5 i/s - 1.53x  (± 0.00) slower
 new_every_time_uuid:   195445.4 i/s - 2.86x  (± 0.00) slower

@brendar brendar force-pushed the make-cequel-uuid-thread-and-fork-safe branch from 6476838 to 89bb723 Compare July 19, 2021 16:11
Fixes a case where a thread in the forked child
process resets `@pid` before the main thread,
which means the main thread's generator will
still have the same internal state as the parent
process.
@brendar
Copy link
Author

brendar commented Jul 29, 2021

Realized there was still a case that could lead to generator state being shared between processes. Added a commit to address it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant