ActiveRecord handling during smart spawn breaks schema cache #2461

Open
blowfishpro opened this issue Dec 16, 2022 · 4 comments

@blowfishpro

Issue report

Question 1: What is the problem?

What is the expected behavior?

If the ActiveRecord schema is cached before the app starts up, ActiveRecord can use that schema cache at runtime rather than querying the database for information about the schema.

What is the actual behavior?

If smart spawning is used, worker processes ignore the schema cache and query the database for information about the schema the first time it's needed. This can be very slow depending on the Rails version, database version, and settings.

How can we reproduce it?

  • Generate a new rails app with a database
  • Run bundle exec bin/rails db:schema:cache:dump in the app to dump the schema cache to db/schema_cache.yml (newer Rails versions might compress this)
  • Add this initializer that demonstrates the issue:
    Rails.configuration.after_initialize do
      if ActiveRecord::Base.connection_pool.schema_cache
        puts "DEBUG: after_initialize: table count in schema cache: #{ActiveRecord::Base.connection_pool.schema_cache.instance_variable_get(:@data_sources).size}"
      else
        puts 'DEBUG: after_initialize: no schema cache'
      end
    end
    
    PhusionPassenger.on_event(:starting_worker_process) do
      if ActiveRecord::Base.connection_pool.schema_cache
        puts "DEBUG: starting_worker_process: table count in schema cache: #{ActiveRecord::Base.connection_pool.schema_cache.instance_variable_get(:@data_sources).size}"
      else
        puts 'DEBUG: starting_worker_process: no schema cache'
      end
    end
  • Start the app (i.e. bundle exec passenger start)
    • Use smart spawning (the default)
  • Look for the debug messages in the output:
    App 59390 output: DEBUG: after_initialize: table count in schema cache: 43
    App 59442 output: DEBUG: starting_worker_process: no schema cache
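
The same pattern can be imitated without Rails or Passenger. The following is only a plain-Ruby simulation: FakePool, preloaded_pool and report_from_worker are hypothetical names, and replacing the pool in the child stands in for whatever re-establishing the connection does in the real worker. It shows that a forked child inherits the preloaded cache unless something replaces the pool after the fork.

```ruby
# Plain-Ruby simulation; no Rails or Passenger involved.
# FakePool and the helpers below are hypothetical stand-ins.
class FakePool
  attr_reader :schema_cache

  def initialize(schema_cache = nil)
    @schema_cache = schema_cache
  end
end

# Stand-in for what the preloader holds in memory after boot.
def preloaded_pool
  FakePool.new({ "users" => %w[id name], "posts" => %w[id title] })
end

# Forks a child ("worker") and returns the message it writes to a pipe.
def report_from_worker(pool, reset_on_fork:)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    # Simulates re-establishing the connection: a brand-new pool, no cache.
    pool = FakePool.new if reset_on_fork
    cache = pool.schema_cache
    writer.write(cache ? "table count in schema cache: #{cache.size}" : "no schema cache")
    writer.close
    exit!(0)
  end
  writer.close
  message = reader.read
  reader.close
  Process.wait(pid)
  message
end

puts report_from_worker(preloaded_pool, reset_on_fork: false)
puts report_from_worker(preloaded_pool, reset_on_fork: true)
```

With reset_on_fork: false the child still sees both tables; with reset_on_fork: true it reports no schema cache, which mirrors the two debug lines above.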
    

Additional Explanation and Notes

  • The schema cache gets wiped out when ActiveRecord::Base.establish_connection is called here
    • This is easily demonstrable without passenger (e.g. in the rails console):
      ActiveRecord::Base.connection_pool.schema_cache.instance_variable_get(:@data_sources).size
      # => 43
      
      ActiveRecord::Base.establish_connection
      
      ActiveRecord::Base.connection_pool.schema_cache
      # => nil
  • I would guess this was added in the early days of Rails. Nowadays ActiveRecord detects when it's called from a new process and re-establishes connections itself (while preserving the schema cache). This was added in Rails 4.
  • It makes sense that Passenger wants to maintain wide compatibility even with very old Rails versions, but ideally there would either be a switch so that this only happens on old versions (Rails < 4), or it would be user-configurable somehow.
  • Reading the schema from the database doesn't cause a huge performance penalty in many cases, but we've seen situations where those queries can take tens of seconds.
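
To illustrate the idea of detecting a new process while keeping the schema cache, here is a minimal plain-Ruby sketch. It assumes a pid comparison as the detection mechanism, which is one common approach; it is not ActiveRecord's actual implementation, and ForkAwarePool and pid_source are hypothetical names. The pid is injected so the behavior can be shown without actually forking.

```ruby
# Hypothetical stand-in for a connection pool; not ActiveRecord's code.
class ForkAwarePool
  attr_reader :schema_cache

  # pid_source is injectable so a fork can be simulated deterministically.
  def initialize(schema_cache, pid_source: -> { Process.pid })
    @schema_cache = schema_cache
    @pid_source = pid_source
    @owner_pid = pid_source.call
    @connection = nil
  end

  # Returns a connection owned by the current process. A connection
  # inherited from the parent is dropped; the schema cache is left alone.
  def connection
    discard_inherited_connection! if forked?
    @connection ||= "connection opened by pid #{@pid_source.call}"
  end

  private

  def forked?
    @pid_source.call != @owner_pid
  end

  def discard_inherited_connection!
    @connection = nil
    @owner_pid = @pid_source.call
  end
end

current_pid = 100
pool = ForkAwarePool.new({ "users" => %w[id name] }, pid_source: -> { current_pid })
puts pool.connection          # opened in the "preloader"
current_pid = 200             # pretend we are now in a forked worker
puts pool.connection          # re-opened for the new pid
puts pool.schema_cache.size   # cache is still there
```

The point of the design is that only the per-process resource (the connection) is reset after a fork, while read-only data loaded in the preloader is reused.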

Question 2: Passenger version and integration mode:

open source 6.0.14 standalone

Question 3: OS or Linux distro, platform (including version):

macOS 12.6.2 "Monterey", arm64

Question 4: Passenger installation method:

RubyGems + Gemfile

Question 5: Your app's programming language (including any version managers) and framework (including versions):

Ruby 2.7.5, Rails 6.0.5.1 (tested on newer ruby/rails too though)

Question 6: Are you using a PaaS and/or containerization? If so which one?

I have demonstrated this both with passenger running bare on my laptop and in containers

Question 7: Anything else about your setup that we should know?

This should not be setup dependent.

@blowfishpro
Author

Newer Rails versions have a lazily_load_schema_cache option which looks like it would work around this by loading the schema cache from its file every time a connection pool is initialized (so this would happen when workers are spawned). This still seems less than ideal though. Ideally the cache would be loaded once in the preloader and then never again.
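
For reference, the setting I believe this refers to is config.active_record.lazily_load_schema_cache; treat the exact name and its availability in your Rails version as assumptions to check against the Rails configuration guide. A minimal config sketch, untested against the Passenger smart-spawn case:

```ruby
# config/environments/production.rb (fragment)
# Assumption: the Rails version in use supports this setting.
# The guard only keeps this fragment harmless when run outside a Rails app.
if defined?(Rails) && Rails.respond_to?(:application) && Rails.application
  Rails.application.configure do
    # Load the schema cache file when a connection pool is initialized,
    # instead of once at boot.
    config.active_record.lazily_load_schema_cache = true
  end
end
```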

@blowfishpro
Author

It appears that this is no longer an issue in Rails 7.1. Calling establish_connection does not wipe out the schema cache unless the database configuration has changed. But it's still a concern for older rails versions.
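
Given that, any workaround could be limited to the affected versions. A minimal sketch, assuming the version string comes from something like Rails.version and that 7.1 is the first unaffected release as described above; schema_cache_workaround_needed? is a hypothetical helper name:

```ruby
require "rubygems"

# Returns true when the given Rails version predates the release in which
# establish_connection stopped discarding the schema cache (assumed: 7.1).
def schema_cache_workaround_needed?(rails_version, fixed_in: "7.1")
  Gem::Version.new(rails_version) < Gem::Version.new(fixed_in)
end

puts schema_cache_workaround_needed?("6.0.5.1")
puts schema_cache_workaround_needed?("7.1.2")
```

Gem::Version is used rather than a string comparison so that multi-segment versions such as 6.0.5.1 are ordered correctly.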

@djberg96

> It appears that this is no longer an issue in Rails 7.1. Calling establish_connection does not wipe out the schema cache unless the database configuration has changed. But it's still a concern for older rails versions.

Yep, and we slammed into it. Thanks for the report!

For anyone reading this, here is our workaround:

if defined?(PhusionPassenger)
  PhusionPassenger.on_event(:starting_worker_process) do |forked|
    # Starting a new forked ("smart spawn") process
    connection_pool = ActiveRecord::Base.connection_pool
    schema_cache = connection_pool.schema_cache

    if schema_cache.nil?
      ## The below is copied from the ActiveRecord railtie, to load schema cache
      ## from YML if it exists.
      db_config = ActiveRecord::Base.configurations.configs_for(env_name: Rails.env).first
      filename = ActiveRecord::Tasks::DatabaseTasks.cache_dump_filename(
        db_config.name,
        schema_cache_path: db_config&.schema_cache_path
      )
      new_schema_cache = ActiveRecord::ConnectionAdapters::SchemaCache.load_from(filename)

      unless new_schema_cache.nil?
        connection_pool.set_schema_cache(new_schema_cache)
      end
    end
  end
end

@alexford

We put the workaround above in config.ru for our Rails 6.1/Passenger standalone app.
