An Http Request Through Rails

08. Active Record

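Strictly speaking, this chapter falls outside the scope of An Http Request Through Rails and is included only for study purposes. Because Active Record does not form a single end-to-end flow, the chapter examines several aspects of it through a series of examples.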

This chapter also mixes in a good deal of Active Model. Since Active Model is rarely used on its own, the two are treated together here without distinction.

Finally, because the ORM objects in Active Record are ActiveRecord::Base objects, this text simply calls them Active Record objects.

Find

Find all

Let's start with the most basic piece of code. The database is SQLite3, and the Active Record object used here is User, which has no special attributes:

User.all.to_a

First, the all method comes from ActiveRecord::Querying, which delegates many of the ORM's query methods. The target of every delegation is scoped:

delegate :find, :first, :first!, :last, :last!, :all, :exists?, :any?, :many?, :to => :scoped
delegate :first_or_create, :first_or_create!, :first_or_initialize, :to => :scoped
delegate :destroy, :destroy_all, :delete, :delete_all, :update, :update_all, :to => :scoped
delegate :find_each, :find_in_batches, :to => :scoped
delegate :select, :group, :order, :except, :reorder, :limit, :offset, :joins,
         :where, :preload, :eager_load, :includes, :from, :lock, :readonly,
         :having, :create_with, :uniq, :to => :scoped
delegate :count, :average, :minimum, :maximum, :sum, :calculate, :pluck, :to => :scoped

In addition, the module provides two methods that query with raw SQL directly: find_by_sql and count_by_sql.
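The delegation pattern can be pictured with a small plain-Ruby sketch. MiniRelation, MiniQuerying, and MiniModel are hypothetical names used only for illustration, and define_method stands in for ActiveSupport's delegate macro:

```ruby
# Simplified model of class-level query methods forwarding to `scoped`.
# None of these classes are part of Rails.
class MiniRelation
  def initialize(rows)
    @rows = rows
  end

  def all
    @rows.dup
  end

  def first
    @rows.first
  end

  def count
    @rows.size
  end
end

module MiniQuerying
  # One forwarding method per name; each call goes through `scoped` first.
  %i[all first count].each do |name|
    define_method(name) { |*args, &block| scoped.public_send(name, *args, &block) }
  end
end

class MiniModel
  extend MiniQuerying

  # Stand-in for the real scoped: returns a fresh relation on every call.
  def self.scoped
    MiniRelation.new(%w[alice bob])
  end
end

names = MiniModel.all
```

Because every query method passes through scoped, whatever scoped returns decides which conditions apply to the query.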

Next, scoped is called. It is a class method in the ActiveRecord::Scoping::Named module:

# Returns an anonymous \scope.
#
#   posts = Post.scoped
#   posts.size # Fires "select count(*) from  posts" and returns the count
#   posts.each {|p| puts p.name } # Fires "select * from posts" and loads post objects
#
#   fruits = Fruit.scoped
#   fruits = fruits.where(:color => 'red') if options[:red_only]
#   fruits = fruits.limit(10) if limited?
#
# Anonymous \scopes tend to be useful when procedurally generating complex
# queries, where passing intermediate values (\scopes) around as first-class
# objects is convenient.
#
# You can define a \scope that applies to all finders using
# ActiveRecord::Base.default_scope.
def scoped(options = nil)
  if options
    scoped.apply_finder_options(options)
  else
    if current_scope
      current_scope.clone
    else
      scope = relation
      scope.default_scoped = true
      scope
    end
  end
end

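scoped accepts query options that refine the query, but none are passed here. When a scope is already in effect, calling scoped returns a clone of the current scope. No scope has been set in this case, so scoped calls relation to obtain an ActiveRecord::Relation object, marks it as default scoped, and returns it.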

relation is defined in ActiveRecord::Base, the well-known core class of Active Record. Its implementation is:

def relation
  relation = Relation.new(self, arel_table)

  if finder_needs_type_condition?
    relation.where(type_condition).create_with(inheritance_column.to_sym => sti_name)
  else
    relation
  end
end

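The method first creates an ActiveRecord::Relation object and then, to support STI, adds a where clause when one is needed. Relation temporarily stores all the conditions of the current query and is the core of lazy querying. Its constructor is: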

ASSOCIATION_METHODS = [:includes, :eager_load, :preload]
MULTI_VALUE_METHODS = [:select, :group, :order, :joins, :where, :having, :bind]
SINGLE_VALUE_METHODS = [:limit, :offset, :lock, :readonly, :from, :reordering, :reverse_order, :uniq]

def initialize(klass, table)
  @klass, @table = klass, table

  @implicit_readonly = nil
  @loaded            = false
  @default_scoped    = false

  SINGLE_VALUE_METHODS.each {|v| instance_variable_set(:"@#{v}_value", nil)}
  (ASSOCIATION_METHODS + MULTI_VALUE_METHODS).each {|v| instance_variable_set(:"@#{v}_values", [])}
  @extensions = []
  @create_with_value = {}
end

As shown, the constructor initializes an instance variable for every possible kind of query condition.
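To illustrate why storing conditions in instance variables enables lazy querying, here is a toy relation over an in-memory array. LazyRelation and its block-based where are assumptions made for this sketch and do not mirror the real Relation API:

```ruby
# A toy lazy relation: conditions accumulate and nothing is evaluated until to_a.
class LazyRelation
  attr_reader :where_values, :limit_value, :load_count

  def initialize(rows, where_values = [], limit_value = nil)
    @rows         = rows
    @where_values = where_values
    @limit_value  = limit_value
    @loaded       = false
    @load_count   = 0
  end

  # Each query method returns a new relation and leaves the receiver untouched.
  def where(&condition)
    LazyRelation.new(@rows, @where_values + [condition], @limit_value)
  end

  def limit(count)
    LazyRelation.new(@rows, @where_values, count)
  end

  def loaded?
    @loaded
  end

  # The "query" only runs here, and only once per relation.
  def to_a
    return @records if loaded?

    @load_count += 1
    result = @rows.select { |row| @where_values.all? { |c| c.call(row) } }
    result = result.first(@limit_value) if @limit_value
    @loaded  = true
    @records = result
  end
end

users  = LazyRelation.new([{ name: "a", age: 30 }, { name: "b", age: 17 }, { name: "c", age: 45 }])
adults = users.where { |u| u[:age] >= 18 }.limit(1)
```

Building adults touches no rows; the filtering happens on the first to_a call and the result is reused afterwards.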

Instantiating the Relation calls arel_table, which is implemented in ActiveRecord::Base:

def arel_table
  @arel_table ||= Arel::Table.new(table_name, arel_engine)
end

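This first determines the database table name for the current class through table_name, defined in the ActiveRecord::ModelSchema module in activerecord-3.2.13/lib/active_record/model_schema.rb. The module handles schema-related concerns such as tables, columns, and sequences. table_name is implemented as: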

def table_name
  reset_table_name unless defined?(@table_name)
  @table_name
end

If @table_name has not been set yet, reset_table_name is called first to compute a table name:

# Computes the table name, (re)sets it internally, and returns it.
def reset_table_name
  if abstract_class?
    self.table_name = if superclass == Base || superclass.abstract_class?
                        nil
                      else
                        superclass.table_name
                      end
  elsif superclass.abstract_class?
    self.table_name = superclass.table_name || compute_table_name
  else
    self.table_name = compute_table_name
  end
end

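As shown, an abstract class takes its superclass's table name, or nil when the superclass is Base or is itself abstract. A class whose superclass is abstract uses the superclass's table name if one is set and otherwise computes its own. In every other case, compute_table_name computes the table name: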

# Computes and returns a table name according to default conventions.
def compute_table_name
  base = base_class
  if self == base
    # Nested classes are prefixed with singular parent table name.
    if parent < ActiveRecord::Base && !parent.abstract_class?
      contained = parent.table_name
      contained = contained.singularize if parent.pluralize_table_names
      contained += '_'
    end
    "#{full_table_name_prefix}#{contained}#{undecorated_table_name(name)}#{table_name_suffix}"
  else
    # STI subclasses always use their superclass' table.
    base.table_name
  end
end

The first step is to find the class on which the table name is based, which is done by calling base_class:

# Returns the base AR subclass that this class descends from. If A
# extends AR::Base, A.base_class will return A. If B descends from A
# through some arbitrarily deep hierarchy, B.base_class will return A.
#
# If B < A and C < B and if A is an abstract_class then both B.base_class
# and C.base_class would return B as the answer since A is an abstract_class.
def base_class
  class_of_active_record_descendant(self)
end

# Returns the class descending directly from ActiveRecord::Base or an
# abstract class, if any, in the inheritance hierarchy.
def class_of_active_record_descendant(klass)
  if klass == Base || klass.superclass == Base || klass.superclass.abstract_class?
    klass
  elsif klass.superclass.nil?
    raise ActiveRecordError, "#{name} doesn't belong in a hierarchy descending from ActiveRecord"
  else
    class_of_active_record_descendant(klass.superclass)
  end
end

The rules are essentially those described in the comments on base_class, so they are not repeated here.
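The lookup rules can be tried out with a plain-Ruby model. The Base class below and the sample hierarchy (AbstractModel, Animal, Dog, Puppy, Account) are stand-ins invented for this sketch, and the recursion is written as a loop:

```ruby
# Plain-Ruby model of the base_class lookup; not the Rails classes.
class Base
  class << self
    attr_writer :abstract_class

    # Class-level instance variables are not inherited, so only classes
    # that set the flag themselves count as abstract.
    def abstract_class?
      @abstract_class == true
    end

    def base_class
      klass = self
      # Walk up until the superclass is Base itself or an abstract class.
      until klass == Base || klass.superclass == Base || klass.superclass.abstract_class?
        klass = klass.superclass
      end
      klass
    end
  end
end

class AbstractModel < Base
  self.abstract_class = true
end

class Animal < AbstractModel; end
class Dog < Animal; end
class Puppy < Dog; end
class Account < Base; end

puppy_base = Puppy.base_class
```

Here Puppy and Dog both resolve to Animal, because Animal's superclass is abstract, while Account resolves to itself.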

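Next, note the parent method in compute_table_name. It comes from the core extension in activesupport-3.2.13/lib/active_support/core_ext/module/introspection.rb: when a class is nested inside another class or module, parent returns the enclosing module or class. The code is quite simple and is left for the reader.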

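When the parent is itself a subclass of ActiveRecord::Base and is not abstract, its table name is taken, singularized if the parent pluralizes table names, and prepended to the table name as a prefix followed by an underscore.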

full_table_name_prefix looks through all parents for one that responds to table_name_prefix; if none does, it uses the current class's own table_name_prefix:

def full_table_name_prefix
  (parents.detect{ |p| p.respond_to?(:table_name_prefix) } || self).table_name_prefix
end

parents is the array version of parent: it returns all enclosing modules and classes as an array, ending with Object.

The core method for computing the table name is undecorated_table_name:

# Guesses the table name, but does not decorate it with prefix and suffix information.
def undecorated_table_name(class_name = base_class.name)
  table_name = class_name.to_s.demodulize.underscore
  table_name = table_name.pluralize if pluralize_table_names
  table_name
end
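As a rough illustration of the names this produces, the following sketch uses hand-written stand-ins for ActiveSupport's demodulize, underscore, and pluralize. naive_pluralize only appends "s", which is an assumption of this sketch; the real inflector handles far more cases:

```ruby
# Approximations of the inflections used when guessing a table name.
def demodulize(name)
  name.split("::").last
end

def underscore(word)
  word.gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2')
      .gsub(/([a-z\d])([A-Z])/, '\1_\2')
      .downcase
end

# Deliberately naive: appends "s" and nothing else.
def naive_pluralize(word)
  "#{word}s"
end

def guess_table_name(class_name, pluralize_table_names = true)
  name = underscore(demodulize(class_name))
  pluralize_table_names ? naive_pluralize(name) : name
end

simple     = guess_table_name("User")
namespaced = guess_table_name("Admin::LineItem")
singular   = guess_table_name("User", false)
```

The module namespace is dropped before the name is underscored, so only the last constant segment contributes to the guessed name.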

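undecorated_table_name is simple enough to need no detailed explanation. Note also that an STI subclass always uses its base class's table name. That concludes the discussion of table names. Next comes arel_engine, the second argument needed to initialize Arel::Table: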

def arel_engine
  @arel_engine ||= begin
    if self == ActiveRecord::Base
      ActiveRecord::Base
    else
      connection_handler.retrieve_connection_pool(self) ? self : superclass.arel_engine
    end
  end
end

This is the first mention of a connection, so it is worth covering how Active Record initializes the database. The code is in ActiveRecord::Railtie:

# This sets the database configuration from Configuration#database_configuration
# and then establishes the connection.
initializer "active_record.initialize_database" do |app|
  ActiveSupport.on_load(:active_record) do
    db_connection_type = "DATABASE_URL"
    unless ENV['DATABASE_URL']
      db_connection_type  = "database.yml"
      self.configurations = app.config.database_configuration
    end
    Rails.logger.info "Connecting to database specified by #{db_connection_type}"

    establish_connection
  end
end

The key call here is establish_connection, which initializes the database-related parts:

def self.establish_connection(spec = ENV["DATABASE_URL"])
  resolver = ConnectionSpecification::Resolver.new spec, configurations
  spec = resolver.spec

  unless respond_to?(spec.adapter_method)
    raise AdapterNotFound, "database configuration specifies nonexistent #{spec.config[:adapter]} adapter"
  end

  remove_connection
  connection_handler.establish_connection name, spec
end

ConnectionSpecification::Resolver is defined in activerecord-3.2.13/lib/active_record/connection_adapters/abstract/connection_specification.rb. Its job is to create the required ConnectionSpecification object, and calling spec performs the resolution:

def spec
  case config
  when nil
    raise AdapterNotSpecified unless defined?(Rails.env)
    resolve_string_connection Rails.env
  when Symbol, String
    resolve_string_connection config.to_s
  when Hash
    resolve_hash_connection config
  end
end

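Here config is the spec passed to establish_connection. When it is nil, Rails.env is used, so in the usual case it effectively names the Rails environment. resolve_string_connection then runs: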

def resolve_string_connection(spec) # :nodoc:
  hash = configurations.fetch(spec) do |k|
    connection_url_to_hash(k)
  end

  raise(AdapterNotSpecified, "#{spec} database is not configured") unless hash

  resolve_hash_connection hash
end

This looks up the database settings from database.yml under the Rails environment named above. If no entry is found, spec may be a URL, and connection_url_to_hash is called to parse it:

def connection_url_to_hash(url) # :nodoc:
  config = URI.parse url
  adapter = config.scheme
  adapter = "postgresql" if adapter == "postgres"
  spec = { :adapter  => adapter,
           :username => config.user,
           :password => config.password,
           :port     => config.port,
           :database => config.path.sub(%r{^/},""),
           :host     => config.host }
  spec.reject!{ |_,value| value.blank? }
  spec.map { |key,value| spec[key] = URI.unescape(value) if value.is_a?(String) }
  if config.query
    options = Hash[config.query.split("&").map{ |pair| pair.split("=") }].symbolize_keys
    spec.merge!(options)
  end
  spec
end
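To see what such a parse yields for a typical URL, here is a standalone approximation. url_to_config is a hypothetical helper, the URL is made up, and CGI.unescape stands in for URI.unescape, which newer Ruby versions no longer provide (note that CGI.unescape also turns "+" into a space):

```ruby
require "uri"
require "cgi"

# Approximation of turning a database URL into a configuration hash.
def url_to_config(url)
  uri     = URI.parse(url)
  adapter = uri.scheme == "postgres" ? "postgresql" : uri.scheme

  config = {
    adapter:  adapter,
    username: uri.user,
    password: uri.password,
    port:     uri.port,
    database: uri.path.sub(%r{^/}, ""),
    host:     uri.host
  }

  # Drop missing or empty parts, then percent-decode the string values.
  config.reject! { |_, value| value.nil? || value.to_s.empty? }
  config.each { |key, value| config[key] = CGI.unescape(value) if value.is_a?(String) }

  # Query parameters become additional options, kept as strings.
  if uri.query
    uri.query.split("&").each do |pair|
      key, value = pair.split("=", 2)
      config[key.to_sym] = value
    end
  end

  config
end

config = url_to_config("postgres://app:s%40crt@db.example.com:5433/blog?pool=10&encoding=utf8")
```

The postgres scheme is normalized to postgresql, the percent-encoded password is decoded, and the query string ends up as extra keys next to the connection details.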

resolve_hash_connection is called next:

def resolve_hash_connection(spec) # :nodoc:
  spec = spec.symbolize_keys

  raise(AdapterNotSpecified, "database configuration does not specify adapter") unless spec.key?(:adapter)

  begin
    require "active_record/connection_adapters/#{spec[:adapter]}_adapter"
  rescue LoadError => e
    raise LoadError, "Please install the #{spec[:adapter]} adapter: `gem install activerecord-#{spec[:adapter]}-adapter` (#{e.message})", e.backtrace
  end

  adapter_method = "#{spec[:adapter]}_connection"

  ConnectionSpecification.new(spec, adapter_method)
end

This loads the Adapter class matching the configured adapter and then creates the corresponding ConnectionSpecification object.

Back in establish_connection, the code checks that ActiveRecord::Base has the adapter method "#{adapter_name}_connection" and raises an exception otherwise. To avoid duplicate connections, it then calls remove_connection:

def remove_connection(klass = self)
  connection_handler.remove_connection(klass)
end

This is a proxy for a method on connection_handler, which is an instance of ConnectionAdapters::ConnectionHandler. That method is defined as follows:

# Remove the connection for this class. This will close the active
# connection and the defined connection (if they exist). The result
# can be used as an argument for establish_connection, for easily
# re-establishing the connection.
def remove_connection(klass)
  pool = @class_to_pool.delete(klass.name)
  return nil unless pool

  @connection_pools.delete pool.spec
  pool.automatic_reconnect = false
  pool.disconnect!
  pool.spec.config
end

Since no connection has been made yet at this point, it does nothing in practice. Finally, connection_handler.establish_connection is called to set up the connection:

def establish_connection(name, spec)
  @connection_pools[spec] ||= ConnectionAdapters::ConnectionPool.new(spec)
  @class_to_pool[name] = @connection_pools[spec]
end

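As the code shows, connection_handler's @connection_pools is a Hash with ConnectionSpecification objects as keys and ConnectionAdapters::ConnectionPool objects as values, while @class_to_pool is a Hash with class names as keys and ConnectionAdapters::ConnectionPool objects as values. The role of the ConnectionHandler class is to maintain these two important instance variables.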
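The relationship between the two hashes can be sketched in a few lines. FakePool and MiniHandler are hypothetical, a frozen hash plays the role of the specification object, and "AuditLog" is an invented model name:

```ruby
# Toy version of the two lookup tables kept by the handler.
FakePool = Struct.new(:spec)

class MiniHandler
  attr_reader :connection_pools, :class_to_pool

  def initialize
    @connection_pools = {} # spec       => pool
    @class_to_pool    = {} # class name => pool
  end

  # Reuse the pool for a spec that is already known; otherwise create one.
  def establish_connection(name, spec)
    @connection_pools[spec] ||= FakePool.new(spec)
    @class_to_pool[name] = @connection_pools[spec]
  end
end

handler = MiniHandler.new
sqlite  = { adapter: "sqlite3", database: "db/dev.sqlite3" }.freeze

handler.establish_connection("ActiveRecord::Base", sqlite)
handler.establish_connection("AuditLog", sqlite)
```

Two classes that establish a connection with the same specification end up sharing a single pool, while each class name still has its own entry.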

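Although this method does not actually open a connection to the database, everything is now prepared. The connection is really established the first time ActiveRecord::Base.connection is called: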

# Returns the connection currently associated with the class. This can
# also be used to "borrow" the connection to do database work unrelated
# to any of the specific Active Records.
def connection
  retrieve_connection
end

def retrieve_connection
  connection_handler.retrieve_connection(self)
end

connection_handler.retrieve_connection is implemented as:

# Locate the connection of the nearest super class. This can be an
# active or defined connection: if it is the latter, it will be
# opened and set as the active connection for the class it was defined
# for (not necessarily the current class).
def retrieve_connection(klass)
  pool = retrieve_connection_pool(klass)
  (pool && pool.connection) or raise ConnectionNotEstablished
end

def retrieve_connection_pool(klass)
  pool = @class_to_pool[klass.name]
  return pool if pool
  return nil if ActiveRecord::Base == klass
  retrieve_connection_pool klass.superclass
end

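This is effectively the inverse of the earlier initialization. It fetches the corresponding ConnectionAdapters::ConnectionPool object (falling back to the superclass when none is found, which also supports STI) and then calls connection on it: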

# Retrieve the connection associated with the current thread, or call
# #checkout to obtain one if necessary.
#
# #connection can be called any number of times; the connection is
# held in a hash keyed by the thread id.
def connection
  synchronize do
    @reserved_connections[current_connection_id] ||= checkout
  end
end

@reserved_connections is maintained by ConnectionPool. The current connection ID is determined as follows:

def current_connection_id
  ActiveRecord::Base.connection_id ||= Thread.current.object_id
end

So @reserved_connections tracks the connection of each thread within one ConnectionPool, and different threads never share a connection at the same time.
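A minimal sketch of the per-thread bookkeeping, assuming a hypothetical TinyPool whose "connections" are just strings:

```ruby
require "monitor"

# Toy pool that keeps one connection per thread, keyed by the thread's object_id.
class TinyPool
  include MonitorMixin

  def initialize
    super() # sets up the monitor provided by MonitorMixin
    @reserved_connections = {}
    @counter = 0
  end

  # The same thread always gets the same connection back.
  def connection
    synchronize do
      @reserved_connections[Thread.current.object_id] ||= checkout
    end
  end

  private

  # Stand-in for a real checkout: hands out a new labelled string.
  def checkout
    @counter += 1
    "conn-#{@counter}"
  end
end

pool  = TinyPool.new
main  = pool.connection
again = pool.connection
other = Thread.new { pool.connection }.value
```

Repeated calls from one thread reuse the reserved entry, while a second thread triggers its own checkout.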

The method that returns or creates a connection is checkout:

def checkout
  synchronize do
    waited_time = 0

    loop do
      conn = @connections.find { |c| c.lease }

      unless conn
        if @connections.size < @size
          conn = checkout_new_connection
          conn.lease
        end
      end

      if conn
        checkout_and_verify conn
        return conn
      end

      if waited_time >= @timeout
        raise ConnectionTimeoutError, "could not obtain a database connection#{" within #{@timeout} seconds" if @timeout} (waited #{waited_time} seconds). The max pool size is currently #{@size}; consider increasing it."
      end

      # Sometimes our wait can end because a connection is available,
      # but another thread can snatch it up first. If timeout hasn't
      # passed but no connection is avail, looks like that happened --
      # loop and wait again, for the time remaining on our timeout. 
      before_wait = Time.now
      @queue.wait( [@timeout - waited_time, 0].max )
      waited_time += (Time.now - before_wait)

      # Will go away in Rails 4, when we don't clean up
      # after leaked connections automatically anymore. Right now, clean
      # up after we've returned from a 'wait' if it looks like it's
      # needed, then loop and try again. 
      if(active_connections.size >= @connections.size)
        clear_stale_cached_connections!
      end
    end
  end
end

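As the code shows, checkout first looks in @connections for a connection whose lease returns a truthy value. lease is defined in ActiveRecord::ConnectionAdapters::AbstractAdapter, the base class of all database Adapters: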

def lease
  synchronize do
    unless in_use
      @in_use   = true
      @last_use = Time.now
    end
  end
end

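lease returns a truthy value only when the connection is not in use. If no such connection is found and the number of connections in @connections is below the limit (5 by default), checkout_new_connection creates a new one: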

def checkout_new_connection
  raise ConnectionNotEstablished unless @automatic_reconnect

  c = new_connection
  c.pool = self
  @connections << c
  c
end

@automatic_reconnect must not be false; in other words, the pool must not already have been through remove_connection.

new_connection calls into the Adapter's code:

def new_connection
  ActiveRecord::Base.send(spec.adapter_method, spec.config)
end

This initializes the required Adapter. Note that every Adapter inherits from ActiveRecord::ConnectionAdapters::AbstractAdapter, so let's take a quick look at its initialization code:

def initialize(connection, logger = nil, pool = nil)
  super()

  @active              = nil
  @connection          = connection
  @in_use              = false
  @instrumenter        = ActiveSupport::Notifications.instrumenter
  @last_use            = false
  @logger              = logger
  @open_transactions   = 0
  @pool                = pool
  @query_cache         = Hash.new { |h,sql| h[sql] = {} }
  @query_cache_enabled = false
  @schema_cache        = SchemaCache.new self
  @visitor             = nil
end

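The initialization is simple. What matters here is the creation of the SchemaCache object, which maintains column and primary key information for tables and is defined in activerecord-3.2.13/lib/active_record/connection_adapters/schema_cache.rb: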

def initialize(conn)
  @connection = conn
  @tables     = {}

  @columns = Hash.new do |h, table_name|
    h[table_name] = conn.columns(table_name, "#{table_name} Columns")
  end

  @columns_hash = Hash.new do |h, table_name|
    h[table_name] = Hash[columns[table_name].map { |col|
      [col.name, col]
    }]
  end

  @primary_keys = Hash.new do |h, table_name|
    h[table_name] = table_exists?(table_name) ? conn.primary_key(table_name) : nil
  end
end

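This sets up three Hashes, @columns, @columns_hash, and @primary_keys, whose default blocks call the Adapter's methods to fill in each entry the first time it is looked up.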
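The caching relies on Hash.new with a block, which computes a value on first access and stores it. The sketch below shows that behaviour in isolation; FakeAdapter is a made-up stand-in that counts how often it is asked:

```ruby
# A fake adapter that records how many times column lookup is requested.
class FakeAdapter
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def columns(table_name)
    @calls += 1
    %w[id name].map { |column| "#{table_name}.#{column}" }
  end
end

adapter = FakeAdapter.new

# The block runs only for keys that are missing, and it stores the result.
columns = Hash.new { |hash, table_name| hash[table_name] = adapter.columns(table_name) }

columns["users"]
columns["users"]
```

The second lookup is served from the hash, so the adapter is asked only once per table.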

After the new connection has been created, lease is called on it to mark it as in use. checkout_and_verify then runs:

def checkout_and_verify(c)
  c.run_callbacks :checkout do
    c.verify!
  end
  c
end

This method runs the :checkout callback around a block that calls verify! on the connection. verify! checks whether the connection is still valid and reconnects if it is not:

# Checks whether the connection to the database is still active (i.e. not stale).
# This is done under the hood by calling <tt>active?</tt>. If the connection
# is no longer active, then this method will reconnect to the database.
def verify!(*ignored)
  reconnect! unless active?
end

How active? is determined and how reconnection works depend on the Adapter's implementation and are not covered further here.

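That completes one checkout. If no idle connection was found and @connections is already full, the thread can only wait for a while. It calls wait on the condition variable created by new_cond, which returns either when another thread finishes with its connection and signals the condition variable (see the implementation of checkin) or when the preset time runs out. It then tries to reclaim connections held by threads that have already finished, to free up more usable connections, and repeats the checkout loop until it finally times out and raises an error.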
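The wait-and-retry loop can be sketched with MonitorMixin and a condition variable. BoundedPool is hypothetical and deliberately omits connection verification and the cleanup of stale connections:

```ruby
require "monitor"

# Toy bounded pool: checkout waits until checkin signals or the timeout elapses.
class BoundedPool
  include MonitorMixin

  class TimeoutError < StandardError; end

  def initialize(size, timeout)
    super()
    @size    = size
    @timeout = timeout
    @free    = []
    @count   = 0
    @queue   = new_cond
  end

  def checkout
    synchronize do
      waited = 0.0
      loop do
        return @free.pop unless @free.empty?

        if @count < @size
          @count += 1
          return "conn-#{@count}"
        end

        raise TimeoutError, "no connection after #{@timeout}s" if waited >= @timeout

        # Wait only for the time remaining; another thread may win the race,
        # in which case the loop simply waits again.
        started = Time.now
        @queue.wait([@timeout - waited, 0].max)
        waited += Time.now - started
      end
    end
  end

  def checkin(conn)
    synchronize do
      @free << conn
      @queue.signal
    end
  end
end

pool     = BoundedPool.new(1, 0.2)
first    = pool.checkout
releaser = Thread.new { sleep 0.05; pool.checkin(first) }
second   = pool.checkout # blocks until the other thread checks the connection in
releaser.join
```

With a pool of size one, the second checkout can only succeed after the first connection is returned; if nobody returns it in time, the call raises.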

That covers the initialization of the database connection and the process of connecting. Now back to the relation method:

def relation #:nodoc:
  relation = Relation.new(self, arel_table)

  if finder_needs_type_condition?
    relation.where(type_condition).create_with(inheritance_column.to_sym => sti_name)
  else
    relation
  end
end

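finder_needs_type_condition? decides whether the class uses STI. For a class that does not descend directly from Base, it checks whether the table has the column STI requires, type by default, and if the column exists the class is treated as using STI: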

def finder_needs_type_condition?
  # This is like this because benchmarking justifies the strange :false stuff
  :true == (@finder_needs_type_condition ||= descends_from_active_record? ? :false : :true)
end

def descends_from_active_record?
  if superclass.abstract_class?
    superclass.descends_from_active_record?
  else
    superclass == Base || !columns_hash.include?(inheritance_column)
  end
end

# The name of the column containing the object's class when Single Table Inheritance is used
def inheritance_column
  if self == Base
    'type'
  else
    (@inheritance_column ||= nil) || superclass.inheritance_column
  end
end

Finally, scope.default_scoped is set to true and the Relation object is returned. all is then called on it. all is defined in ActiveRecord::FinderMethods, in activerecord-3.2.13/lib/active_record/relation/finder_methods.rb:

# A convenience wrapper for <tt>find(:all, *args)</tt>. You can pass in all the
# same arguments to this method as you can to <tt>find(:all)</tt>.
def all(*args)
  args.any? ? apply_finder_options(args.first).to_a : to_a
end

apply_finder_options is defined in ActiveRecord::SpawnMethods, a module mainly responsible for merging and copying Relation objects. Its source is:

def apply_finder_options(options)
  relation = clone
  return relation unless options

  options.assert_valid_keys(VALID_FIND_OPTIONS)
  finders = options.dup
  finders.delete_if { |key, value| value.nil? && key != :limit }

  ([:joins, :select, :group, :order, :having, :limit, :offset, :from, :lock, :readonly] & finders.keys).each do |finder|
    relation = relation.send(finder, finders[finder])
  end

  relation = relation.where(finders[:conditions]) if options.has_key?(:conditions)
  relation = relation.includes(finders[:include]) if options.has_key?(:include)
  relation = relation.extending(finders[:extend]) if options.has_key?(:extend)

  relation
end

This implementation needs little explanation, since it will become clear shortly. Let's go straight to to_a:

def to_a
  # We monitor here the entire execution rather than individual SELECTs
  # because from the point of view of the user fetching the records of a
  # relation is a single unit of work. You want to know if this call takes
  # too long, not if the individual queries take too long.
  #
  # It could be the case that none of the queries involved surpass the
  # threshold, and at the same time the sum of them all does. The user
  # should get a query plan logged in that case.
  logging_query_plan do
    exec_queries
  end
end

logging_query_plan relates to SQL EXPLAIN: when a query takes longer than a threshold, it runs the Adapter's explain method. This feature is not explored further here. Next comes exec_queries:

def exec_queries
  return @records if loaded?

  default_scoped = with_default_scope

  if default_scoped.equal?(self)
    @records = if @readonly_value.nil? && !@klass.locking_enabled?
      eager_loading? ? find_with_associations : @klass.find_by_sql(arel, @bind_values)
    else
      IdentityMap.without do
        eager_loading? ? find_with_associations : @klass.find_by_sql(arel, @bind_values)
      end
    end

    preload = @preload_values
    preload +=  @includes_values unless eager_loading?
    preload.each do |associations|
      ActiveRecord::Associations::Preloader.new(@records, associations).run
    end

    # @readonly_value is true only if set explicitly. @implicit_readonly is true if there
    # are JOINS and no explicit SELECT.
    readonly = @readonly_value.nil? ? @implicit_readonly : @readonly_value
    @records.each { |record| record.readonly! } if readonly
  else
    @records = default_scoped.to_a
  end

  @loaded = true
  @records
end

Now look at with_default_scope. Its meaning is: if a default_scope has been specified, return that scope merged with the current relation:

def with_default_scope
  if default_scoped? && default_scope = klass.send(:build_default_scope)
    default_scope = default_scope.merge(self)
    default_scope.default_scoped = false
    default_scope
  else
    self
  end
end

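Here default_scoped? returns true, but build_default_scope returns nil because no default_scope was ever specified, so with_default_scope returns self (given how important this method is, it is analyzed again later in the text). Back in exec_queries, default_scoped is therefore identical to self, and execution reaches the check @readonly_value.nil? && !@klass.locking_enabled?. This check probably exists to work around a bug in IdentityMap; see the comments in activerecord-3.2.13/lib/active_record/identity_map.rb. IdentityMap is not studied in depth here: because it could cause many bugs, it was removed in Rails 4. Since @readonly_value is not set, the first condition is true, and since the table has no special lock_version column, locking_enabled? returns false. The next thing evaluated is eager_loading?: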

def eager_loading?
  @should_eager_load ||=
    @eager_load_values.any? ||
    @includes_values.any? && (joined_includes_values.any? || references_eager_loaded_tables?)
end

None of the variables involved were set for this query, so eager_loading? returns false and @klass.find_by_sql(arel, @bind_values) is executed directly.

First comes arel, a method of ActiveRecord::QueryMethods defined in ruby-1.9.3-p429/gems/activerecord-3.2.13/lib/active_record/relation/query_methods.rb:

def arel
  @arel ||= with_default_scope.build_arel
end

with_default_scope was explained above and gives the same result here. The part of interest is build_arel:

def build_arel
  arel = table.from table

  build_joins(arel, @joins_values) unless @joins_values.empty?

  collapse_wheres(arel, (@where_values - ['']).uniq)

  arel.having(*@having_values.uniq.reject{|h| h.blank?}) unless @having_values.empty?

  arel.take(connection.sanitize_limit(@limit_value)) if @limit_value
  arel.skip(@offset_value.to_i) if @offset_value

  arel.group(*@group_values.uniq.reject{|g| g.blank?}) unless @group_values.empty?

  order = @order_values
  order = reverse_sql_order(order) if @reverse_order_value
  arel.order(*order.uniq.reject{|o| o.blank?}) unless order.empty?

  build_select(arel, @select_values.uniq)

  arel.distinct(@uniq_value)
  arel.from(@from_value) if @from_value
  arel.lock(@lock_value) if @lock_value

  arel
end

In fact, most of this code has no effect for this query. The one step that matters, build_select, simply selects every column of the table:

def build_select(arel, selects)
  unless selects.empty?
    @implicit_readonly = false
    arel.project(*selects)
  else
    arel.project(@klass.arel_table[Arel.star])
  end
end

Most of this code is easy to follow, being simple calls into the Arel library, so it is not analyzed line by line.

Next, find_by_sql is executed:

def find_by_sql(sql, binds = [])
  logging_query_plan do
    connection.select_all(sanitize_sql(sql), "#{name} Load", binds).collect! { |record| instantiate(record) }
  end
end

Look first at sanitize_sql, defined in the ActiveRecord::Sanitization module in activerecord-3.2.13/lib/active_record/sanitization.rb. Within that module sanitize_sql is an alias of sanitize_sql_for_conditions, so that is the method to read:

# Accepts an array, hash, or string of SQL conditions and sanitizes
# them into a valid SQL fragment for a WHERE clause.
#   ["name='%s' and group_id='%s'", "foo'bar", 4]  returns  "name='foo''bar' and group_id='4'"
#   { :name => "foo'bar", :group_id => 4 }  returns "name='foo''bar' and group_id='4'"
#   "name='foo''bar' and group_id='4'" returns "name='foo''bar' and group_id='4'"
def sanitize_sql_for_conditions(condition, table_name = self.table_name)
  return nil if condition.blank?

  case condition
  when Array; sanitize_sql_array(condition)
  when Hash;  sanitize_sql_hash_for_conditions(condition, table_name)
  else        condition
  end
end

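ActiveRecord::Sanitization handles SQL statements that need preprocessing, in which case the argument is an Array or a Hash. Here the argument is an Arel::SelectManager object, so it is returned unchanged.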
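The dispatch on the argument type can be sketched as follows. The helpers quote_string, sanitize_array, and sanitize_conditions are hypothetical and handle only the %s style from the doc comment; real quoting is adapter-specific and escapes more than single quotes:

```ruby
# Doubles single quotes, which is the only escaping this sketch performs.
def quote_string(value)
  value.to_s.gsub("'", "''")
end

# The first element is the statement, the rest fill its %s placeholders.
def sanitize_array(condition)
  statement, *values = condition
  statement % values.map { |value| quote_string(value) }
end

def sanitize_conditions(condition)
  return nil if condition.nil? || (condition.respond_to?(:empty?) && condition.empty?)

  case condition
  when Array then sanitize_array(condition)
  else condition # strings and any other object pass through unchanged
  end
end

fragment = sanitize_conditions(["name='%s' and group_id='%s'", "foo'bar", 4])
```

Anything that is neither blank nor an Array falls through the case statement untouched, which is the branch an object such as a query builder would take.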

Next is connection.select_all, which has two layers. The outer layer is implemented by ActiveRecord::ConnectionAdapters::QueryCache, defined in activerecord-3.2.13/lib/active_record/connection_adapters/abstract/query_cache.rb. It generates the SQL and caches the result of executing it. The layer beneath is implemented by ActiveRecord::ConnectionAdapters::DatabaseStatements, defined in activerecord-3.2.13/lib/active_record/connection_adapters/abstract/database_statements.rb. Start with the QueryCache implementation:

def select_all(arel, name = nil, binds = [])
  if @query_cache_enabled && !locked?(arel)
    sql = to_sql(arel, binds)
    cache_sql(sql, binds) { super(sql, name, binds) }
  else
    super
  end
end

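If the query cache is enabled and the query does not involve a lock (a locking query may return results different from the same SQL run without a lock), the SQL statement is generated first, then executed, and its result is cached. Here is to_sql: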

# Converts an arel AST to SQL
def to_sql(arel, binds = [])
  if arel.respond_to?(:ast)
    visitor.accept(arel.ast) do
      quote(*binds.shift.reverse)
    end
  else
    arel
  end
end

Calling accept on the visitor with the AST obtained earlier yields the final SQL statement. cache_sql then runs:

def cache_sql(sql, binds)
  result =
    if @query_cache[sql].key?(binds)
      ActiveSupport::Notifications.instrument("sql.active_record",
        :sql => sql, :binds => binds, :name => "CACHE", :connection_id => object_id)
      @query_cache[sql][binds]
    else
      @query_cache[sql][binds] = yield
    end

  result.collect { |row| row.dup }
end
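The two-level cache structure can be tried in isolation. MiniQueryCache is a hypothetical class, the blocks return hand-written rows instead of running SQL, and the instrumentation call is left out:

```ruby
# Toy cache keyed by SQL text first and by bind values second.
class MiniQueryCache
  attr_reader :misses

  def initialize
    @query_cache = Hash.new { |hash, sql| hash[sql] = {} }
    @misses = 0
  end

  def cache_sql(sql, binds)
    result =
      if @query_cache[sql].key?(binds)
        @query_cache[sql][binds]
      else
        @misses += 1
        @query_cache[sql][binds] = yield
      end

    # Hand out copies so callers cannot mutate the cached rows.
    result.collect { |row| row.dup }
  end
end

cache = MiniQueryCache.new
sql   = "SELECT * FROM users WHERE id = ?"

first = cache.cache_sql(sql, [1]) { [{ "id" => 1, "name" => "alice" }] }
first[0]["name"] = "changed"

second = cache.cache_sql(sql, [1]) { raise "should not run on a cache hit" }
third  = cache.cache_sql(sql, [2]) { [{ "id" => 2, "name" => "bob" }] }
```

The same SQL with different bind values is stored separately, and modifying a returned row leaves the cached copy intact.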

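As cache_sql shows, all SQL results are cached in @query_cache. On a cache hit the cached result is returned directly; otherwise the block runs and, through super, calls the method of the same name in DatabaseStatements: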

# Returns an array of record hashes with the column names as keys and
# column values as values.
def select_all(arel, name = nil, binds = [])
  select(to_sql(arel, binds), name, binds)
end

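The to_sql call here actually receives SQL that has already been computed, so no further conversion takes place, and select executes the statement. This select method is defined in the SQLite Adapter class: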

def select(sql, name = nil, binds = [])
  exec_query(sql, name, binds).to_a
end

def exec_query(sql, name = nil, binds = [])
  log(sql, name, binds) do

    # Don't cache statements without bind values
    if binds.empty?
      stmt    = @connection.prepare(sql)
      cols    = stmt.columns
      records = stmt.to_a
      stmt.close
      stmt = records
    else
      cache = @statements[sql] ||= {
        :stmt => @connection.prepare(sql)
      }
      stmt = cache[:stmt]
      cols = cache[:cols] ||= stmt.columns
      stmt.reset!
      stmt.bind_params binds.map { |col, val|
        type_cast(val, col)
      }
    end

    ActiveRecord::Result.new(cols, stmt.to_a)
  end
end

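This code consists entirely of calls into the SQLite library and is not examined here. The point of interest is how the fetched data is turned into Active Record objects. An ActiveRecord::Result object is created first: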

def initialize(columns, rows)
  @columns   = columns
  @rows      = rows
  @hash_rows = nil
end

Once the ActiveRecord::Result object is returned, control goes back to select, which calls to_a on it. Because ActiveRecord::Result includes the Enumerable module, to_a produces its result by calling each:

def each
  hash_rows.each { |row| yield row }
end

The core method here is hash_rows, which combines the columns and the result rows into hashes:

def hash_rows
  @hash_rows ||=
    begin
      # We freeze the strings to prevent them getting duped when
      # used as keys in ActiveRecord::Model's @attributes hash
      columns = @columns.map { |c| c.dup.freeze }
      @rows.map { |row|
        Hash[columns.zip(row)]
      }
    end
end
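The effect of zipping columns with rows is easy to check with made-up data. The column names and rows below are sample values, not output from a real query:

```ruby
# Column names are frozen once and reused as keys for every row.
columns = %w[id name].map { |column| column.dup.freeze }
rows    = [[1, "alice"], [2, "bob"]]

# Pair each column with the value at the same position, then build a Hash.
hash_rows = rows.map { |row| Hash[columns.zip(row)] }
```

Each row array becomes one Hash keyed by column name, which is the form the instantiation step consumes.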

Finally, converting each hash into an Active Record object is done by the instantiate method that find_by_sql calls:

# Finder methods must instantiate through this method to work with the
# single-table inheritance model that makes it possible to create
# objects of different types from the same table.
def instantiate(record)
  sti_class = find_sti_class(record[inheritance_column])
  record_id = sti_class.primary_key && record[sti_class.primary_key]

  if ActiveRecord::IdentityMap.enabled? && record_id
    instance = use_identity_map(sti_class, record_id, record)
  else
    instance = sti_class.allocate.init_with('attributes' => record)
  end

  instance
end

The first step is to find the class to instantiate, so the record's value for inheritance_column (usually type) is passed to find_sti_class, which is defined in the ActiveRecord::Inheritance module:

def find_sti_class(type_name)
  if type_name.blank? || !columns_hash.include?(inheritance_column)
    self
  else
    begin
      if store_full_sti_class
        ActiveSupport::Dependencies.constantize(type_name)
      else
        compute_type(type_name)
      end
    rescue NameError
      raise SubclassNotFound,
        "The single-table inheritance mechanism failed to locate the subclass: '#{type_name}'. " +
        "This error is raised because the column '#{inheritance_column}' is reserved for storing the class in case of inheritance. " +
        "Please rename this column if you didn't intend it to be used for storing the inheritance class " +
        "or overwrite #{name}.inheritance_column to use another column for that information."
    end
  end
end

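If the type value is blank or the table has no inheritance_column, the class to instantiate is the current class itself; otherwise it is the class named by the column's value. Then, if ActiveRecord::IdentityMap is enabled and the row includes a primary key, the IdentityMap is searched, and if an entry is found it is taken and reinitialized. If nothing is found or ActiveRecord::IdentityMap is not enabled, an instance is allocated and init_with is called to initialize it: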

# Initialize an empty model object from +coder+. +coder+ must contain
# the attributes necessary for initializing an empty model object. For
# example:
#
#   class Post < ActiveRecord::Base
#   end
#
#   post = Post.allocate
#   post.init_with('attributes' => { 'title' => 'hello world' })
#   post.title # => 'hello world'
def init_with(coder)
  @attributes = self.class.initialize_attributes(coder['attributes'])
  @relation = nil

  @attributes_cache, @previously_changed, @changed_attributes = {}, {}, {}
  @association_cache = {}
  @aggregation_cache = {}
  @readonly = @destroyed = @marked_for_destruction = false
  @new_record = false
  run_callbacks :find
  run_callbacks :initialize

  self
end

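init_with calls initialize_attributes to initialize the attributes. This happens in two layers: one is implemented by ActiveRecord::AttributeMethods::Serialization and handles serialized attributes, and the other is implemented by ActiveRecord::Locking::Optimistic and handles the version column used for optimistic locking.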

First, the Serialization implementation:

def initialize_attributes(attributes, options = {})
  serialized = (options.delete(:serialized) { true }) ? :serialized : :unserialized
  super(attributes, options)

  serialized_attributes.each do |key, coder|
    if attributes.key?(key)
      attributes[key] = Attribute.new(coder, attributes[key], serialized)
    end
  end

  attributes
end

The module also contains this piece of code:

included do
  # Returns a hash of all the attributes that have been specified for serialization as
  # keys and their class restriction as values.
  class_attribute :serialized_attributes
  self.serialized_attributes = {}
end

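serialized_attributes is a Hash that maps each attribute to be serialized to its coder, and it is empty by default. During initialization, every attribute listed in serialized_attributes is wrapped in an ActiveRecord::AttributeMethods::Serialization::Attribute object, and unless the :serialized option is passed as false or nil, the wrapped value is marked as serialized. Serialization can then be performed through that object. The details of serialization are analyzed more thoroughly later in this text.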
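The idea of a wrapper that remembers whether its value is still in serialized form can be sketched like this. LazyAttribute and JsonCoder are hypothetical names, and JSON is used as the coder purely for illustration; this is a simplified model rather than the Rails class:

```ruby
require "json"

# A minimal coder with the load/dump pair that a serialization wrapper needs.
class JsonCoder
  def load(text)
    JSON.parse(text)
  end

  def dump(object)
    JSON.generate(object)
  end
end

# Wraps a raw value together with its coder and a state flag.
LazyAttribute = Struct.new(:coder, :value, :state) do
  # Decode on first use; later calls return the already decoded value.
  def unserialized_value
    state == :serialized ? unserialize : value
  end

  def unserialize
    self.state = :unserialized
    self.value = coder.load(value)
  end
end

prefs   = LazyAttribute.new(JsonCoder.new, '{"theme":"dark"}', :serialized)
before  = prefs.state
decoded = prefs.unserialized_value
```

Decoding is deferred until the value is actually read, so rows whose serialized columns are never touched pay no parsing cost.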

Next, the Locking::Optimistic part:

# If the locking column has no default value set,
# start the lock version at zero. Note we can't use
# <tt>locking_enabled?</tt> at this point as
# <tt>@attributes</tt> may not have been initialized yet.
def initialize_attributes(attributes, options = {})
  if attributes.key?(locking_column) && lock_optimistically
    attributes[locking_column] ||= 0
  end

  attributes
end

This simply initializes the locking_column attribute (lock_version by default) to 0 when it has no value.

With initialization complete, control returns to exec_queries, which goes on to handle @preload_values and @readonly_value. Both are empty here, so it returns directly.

At this point, a simple User.all has finished executing.

Next we will try more complex query conditions, more complex Model relationships, and more complex features to study Active Record in greater depth.

Find by Id

Now we strengthen the search condition slightly. This time the code is:

User.find_by_id 1

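The find family is among the most commonly used query methods in Rails. Although the find_by_xxxx family was reduced to the find_by method after Rails 4, it is still well worth studying. Unsurprisingly, the call first lands in method_missing, defined in ActiveRecord::DynamicMatchers in activerecord-3.2.13/lib/active_record/dynamic_matchers.rb: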

# Enables dynamic finders like <tt>User.find_by_user_name(user_name)</tt> and
# <tt>User.scoped_by_user_name(user_name). Refer to Dynamic attribute-based finders
# section at the top of this file for more detailed information.
#
# It's even possible to use all the additional parameters to +find+. For example, the
# full interface for +find_all_by_amount+ is actually <tt>find_all_by_amount(amount, options)</tt>.
#
# Each dynamic finder using <tt>scoped_by_*</tt> is also defined in the class after it
# is first invoked, so that future attempts to use it do not run through method_missing.
def method_missing(method_id, *arguments, &block)
  if match = (DynamicFinderMatch.match(method_id) || DynamicScopeMatch.match(method_id))
    attribute_names = match.attribute_names
    super unless all_attributes_exists?(attribute_names)
    if !(match.is_a?(DynamicFinderMatch) && match.instantiator? && arguments.first.is_a?(Hash)) && arguments.size < attribute_names.size
      method_trace = "#{__FILE__}:#{__LINE__}:in `#{method_id}'"
      backtrace = [method_trace] + caller
      raise ArgumentError, "wrong number of arguments (#{arguments.size} for #{attribute_names.size})", backtrace
    end
    if match.respond_to?(:scope?) && match.scope?
      self.class_eval <<-METHOD, __FILE__, __LINE__ + 1
        def self.#{method_id}(*args)                                    # def self.scoped_by_user_name_and_password(*args)
          attributes = Hash[[:#{attribute_names.join(',:')}].zip(args)] #   attributes = Hash[[:user_name, :password].zip(args)]
                                                                          #
          scoped(:conditions => attributes)                             #   scoped(:conditions => attributes)
        end                                                             # end
      METHOD
      send(method_id, *arguments)
    elsif match.finder?
      options = if arguments.length > attribute_names.size
                  arguments.extract_options!
                else
                  {}
                end

      relation = options.any? ? scoped(options) : scoped
      relation.send :find_by_attributes, match, attribute_names, *arguments, &block
    elsif match.instantiator?
      scoped.send :find_or_instantiator_by_attributes, match, attribute_names, *arguments, &block
    end
  else
    super
  end
end

Two matchers are involved here, DynamicFinderMatch and DynamicScopeMatch. We focus on DynamicFinderMatch:

module ActiveRecord

  # = Active Record Dynamic Finder Match
  #
  # Refer to ActiveRecord::Base documentation for Dynamic attribute-based finders for detailed info
  #
  class DynamicFinderMatch
    def self.match(method)
      finder       = :first
      bang         = false
      instantiator = nil

      case method.to_s
      when /^find_(all_|last_)?by_([_a-zA-Z]\w*)$/
        finder = :last if $1 == 'last_'
        finder = :all if $1 == 'all_'
        names = $2
      when /^find_by_([_a-zA-Z]\w*)\!$/
        bang = true
        names = $1
      when /^find_or_create_by_([_a-zA-Z]\w*)\!$/
        bang = true
        instantiator = :create
        names = $1
      when /^find_or_(initialize|create)_by_([_a-zA-Z]\w*)$/
        instantiator = $1 == 'initialize' ? :new : :create
        names = $2
      else
        return nil
      end

      new(finder, instantiator, bang, names.split('_and_'))
    end

    def initialize(finder, instantiator, bang, attribute_names)
      @finder          = finder
      @instantiator    = instantiator
      @bang            = bang
      @attribute_names = attribute_names
    end

    attr_reader :finder, :attribute_names, :instantiator

    def finder?
      @finder && !@instantiator
    end

    def instantiator?
      @finder == :first && @instantiator
    end

    def creator?
      @finder == :first && @instantiator == :create
    end

    def bang?
      @bang
    end

    def save_record?
      @instantiator == :create
    end

    def save_method
      bang? ? :save! : :save
    end
  end
end

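Here find_by_id matches the first when clause, and finder keeps its default value :first, meaning that only the first result is fetched. An instance of DynamicFinderMatch is then returned.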
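The name-parsing idea can be tried out in isolation. The sketch below is a trimmed-down imitation, not the Rails class: parse_dynamic_finder is a hypothetical helper that covers only the first two when clauses and returns a plain Hash instead of a DynamicFinderMatch.

```ruby
# Hypothetical, trimmed-down parser: returns nil for names that are not finders.
def parse_dynamic_finder(method_name)
  case method_name.to_s
  when /\Afind_(all_|last_)?by_([_a-zA-Z]\w*)\z/
    # $1 is nil when neither prefix is present, so fetch falls back to :first.
    finder = { 'all_' => :all, 'last_' => :last }.fetch($1, :first)
    { finder: finder, bang: false, attributes: $2.split('_and_') }
  when /\Afind_by_([_a-zA-Z]\w*)!\z/
    { finder: :first, bang: true, attributes: $1.split('_and_') }
  end
end

p parse_dynamic_finder(:find_by_id)
p parse_dynamic_finder(:find_all_by_name_and_age)
```

Splitting the captured name on _and_ is what lets a single method name carry several attribute names.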

Next, all_attributes_exists? checks whether the attributes named in the find_by call really exist:

def all_attributes_exists?(attribute_names)
  (expand_attribute_names_for_aggregates(attribute_names) -
   column_methods_hash.keys).empty?
end

# Similar in purpose to +expand_hash_conditions_for_aggregates+.
def expand_attribute_names_for_aggregates(attribute_names)
  attribute_names.map { |attribute_name|
    unless (aggregation = reflect_on_aggregation(attribute_name.to_sym)).nil?
      aggregate_mapping(aggregation).map do |field_attr, _|
        field_attr.to_sym
      end
    else
      attribute_name.to_sym
    end
  }.flatten
end

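This method exists mainly for AggregateReflection: it expands an aggregate name into all of its underlying attributes through the :mapping option of composed_of. It is outside the scope of this chapter.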

column_methods_hash, for its part, returns as many candidate method names as possible:

# Returns a hash of all the methods added to query each of the columns in the table with the name of the method as the key
# and true as the value. This makes it possible to do O(1) lookups in respond_to? to check if a given method for attribute
# is available.
def column_methods_hash
  @dynamic_methods_hash ||= column_names.inject(Hash.new(false)) do |methods, attr|
    attr_name = attr.to_s
    methods[attr.to_sym]       = attr_name
    methods["#{attr}=".to_sym] = attr_name
    methods["#{attr}?".to_sym] = attr_name
    methods["#{attr}_before_type_cast".to_sym] = attr_name
    methods
  end
end

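If the difference between the two is not empty, the arguments contain an attribute that does not exist, and the call falls through to super, which raises an error. Back in method_missing, the next step checks whether too few arguments were passed. After that comes a branch. For a DynamicScopeMatch (one that responds to scope?), a method with the same name is defined on the class and then called; that generated method mainly calls scoped. For the DynamicFinderMatch case we care about, the options are extracted first, scoped is called with them to obtain the right scope, and finally find_by_attributes is called on the Relation object: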

def find_by_attributes(match, attributes, *args)
  conditions = Hash[attributes.map {|a| [a, args[attributes.index(a)]]}]
  result = where(conditions).send(match.finder)

  if match.bang? && result.nil?
    raise RecordNotFound, "Couldn't find #{@klass.name} with #{conditions.to_a.collect {|p| p.join(' = ')}.join(', ')}"
  else
    yield(result) if block_given?
    result
  end
end

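It first builds the attribute key/value pairs and passes them to where, then calls match.finder on the result. Here match.finder is :first; the other possible values are :last and :all. The code for where is as follows: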

def where(opts, *rest)
  return self if opts.blank?

  relation = clone
  relation.where_values += build_where(opts, rest)
  relation
end

def build_where(opts, other = [])
  case opts
  when String, Array
    [@klass.send(:sanitize_sql, other.empty? ? opts : ([opts] + other))]
  when Hash
    attributes = @klass.send(:expand_hash_conditions_for_aggregates, opts)
    PredicateBuilder.build_from_hash(table.engine, attributes, table)
  else
    [opts]
  end
end

The ActiveRecord::PredicateBuilder class used here is defined in activerecord-3.2.13/lib/active_record/relation/predicate_builder.rb and encapsulates the construction of such predicates:

module ActiveRecord
  class PredicateBuilder # :nodoc:
    def self.build_from_hash(engine, attributes, default_table, allow_table_name = true)
      predicates = attributes.map do |column, value|
        table = default_table

        if allow_table_name && value.is_a?(Hash)
          table = Arel::Table.new(column, engine)

          if value.empty?
            '1 = 2'
          else
            build_from_hash(engine, value, table, false)
          end
        else
          column = column.to_s

          if allow_table_name && column.include?('.')
            table_name, column = column.split('.', 2)
            table = Arel::Table.new(table_name, engine)
          end

          attribute = table[column]

          case value
          when ActiveRecord::Relation
            value = value.select(value.klass.arel_table[value.klass.primary_key]) if value.select_values.empty?
            attribute.in(value.arel.ast)
          when Array, ActiveRecord::Associations::CollectionProxy
            values = value.to_a.map {|x| x.is_a?(ActiveRecord::Base) ? x.id : x}
            ranges, values = values.partition {|v| v.is_a?(Range) || v.is_a?(Arel::Relation)}

            array_predicates = ranges.map {|range| attribute.in(range)}

            if values.include?(nil)
              values = values.compact
              if values.empty?
                array_predicates << attribute.eq(nil)
              else
                array_predicates << attribute.in(values.compact).or(attribute.eq(nil))
              end
            else
              array_predicates << attribute.in(values)
            end

            array_predicates.inject {|composite, predicate| composite.or(predicate)}
          when Range, Arel::Relation
            attribute.in(value)
          when ActiveRecord::Base
            attribute.eq(value.id)
          when Class
            # FIXME: I think we need to deprecate this behavior
            attribute.eq(value.name)
          else
            attribute.eq(value)
          end
        end
      end

      predicates.flatten
    end
  end
end

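The code looks complicated because nearly everything a hash-based where clause can express is handled here. All our example needs, however, is a call to eq on the attribute with the corresponding value. The result is an array whose single element is an Arel::Nodes::Equality object.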
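To make the type-based dispatch concrete, here is a minimal sketch that emits SQL strings directly instead of Arel nodes. Everything in it is an assumption for illustration: build_predicates and sql_literal are hypothetical helpers, and only nil, Array, inclusive Range, and scalar values are covered.

```ruby
# Hypothetical helper: render one Ruby value as a SQL literal (strings get
# single quotes, with embedded single quotes doubled).
def sql_literal(value)
  case value
  when nil     then 'NULL'
  when Numeric then value.to_s
  else "'#{value.to_s.gsub("'", "''")}'"
  end
end

# Turn a conditions hash into SQL fragments, choosing the operator from the
# value's type, in the spirit of PredicateBuilder.build_from_hash.
# Ranges are assumed to be inclusive.
def build_predicates(table, conditions)
  conditions.map do |column, value|
    field = "#{table}.#{column}"
    case value
    when nil
      "#{field} IS NULL"
    when Array
      "#{field} IN (#{value.map { |v| sql_literal(v) }.join(', ')})"
    when Range
      "#{field} BETWEEN #{sql_literal(value.begin)} AND #{sql_literal(value.end)}"
    else
      "#{field} = #{sql_literal(value)}"
    end
  end
end

puts build_predicates('users', { id: 1, age: 18..30 })
```

The real builder returns Arel nodes so that the adapter can render them later; returning strings here just keeps the sketch self-contained.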

The implementation of first is also very simple:

# A convenience wrapper for <tt>find(:first, *args)</tt>. You can pass in all the
# same arguments to this method as you can to <tt>find(:first)</tt>.
def first(*args)
  if args.any?
    if args.first.kind_of?(Integer) || (loaded? && !args.first.kind_of?(Hash))
      limit(*args).to_a
    else
      apply_finder_options(args.first).first
    end
  else
    find_first
  end
end

def find_first
  if loaded?
    @records.first
  else
    @first ||= limit(1).to_a[0]
  end
end

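first can take a number and return that many leading records; otherwise it calls find_first. When the relation is not yet loaded, that method boils down to limit(1).to_a[0], and limit itself is trivial: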

def limit(value)
  relation = clone
  relation.limit_value = value
  relation
end

The implementation of to_a was explained earlier and is not repeated here.
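The whole path for a dynamic finder (method_missing, the attribute check, where, then first) can be imitated on an in-memory table. This is a sketch under stated assumptions: SketchRelation and SketchModel are hypothetical classes, rows are plain hashes, and only the find_by_ form is handled.

```ruby
# Hypothetical immutable query object: every refinement returns a copy, the
# same clone-then-modify pattern that where and limit use.
class SketchRelation
  attr_reader :conditions, :limit_value

  def initialize(rows, conditions = {}, limit_value = nil)
    @rows = rows
    @conditions = conditions
    @limit_value = limit_value
  end

  def where(extra)
    SketchRelation.new(@rows, @conditions.merge(extra), @limit_value)
  end

  def limit(count)
    SketchRelation.new(@rows, @conditions, count)
  end

  def to_a
    matches = @rows.select { |row| @conditions.all? { |key, value| row[key] == value } }
    @limit_value ? matches.first(@limit_value) : matches
  end

  def first
    limit(1).to_a[0]
  end
end

# Hypothetical model backed by an array of hashes instead of a database.
class SketchModel
  COLUMNS = [:id, :name].freeze
  ROWS = [{ id: 1, name: 'alice' }, { id: 2, name: 'bob' }].freeze

  def self.scoped
    SketchRelation.new(ROWS)
  end

  def self.method_missing(method_name, *arguments, &block)
    match = /\Afind_by_([_a-zA-Z]\w*)\z/.match(method_name.to_s)
    return super unless match

    names = match[1].split('_and_').map(&:to_sym)
    # Unknown attribute names fall through to the default NoMethodError.
    return super unless (names - COLUMNS).empty?
    if arguments.size < names.size
      raise ArgumentError, "wrong number of arguments (#{arguments.size} for #{names.size})"
    end

    scoped.where(Hash[names.zip(arguments)]).first
  end

  def self.respond_to_missing?(method_name, include_private = false)
    method_name.to_s.start_with?('find_by_') || super
  end
end

p SketchModel.find_by_id(2)
p SketchModel.find_by_id_and_name(1, 'alice')
```

Because where and limit return new objects, a base relation can be shared and refined in different directions without one caller affecting another.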

##### Find by Parameters #####

Now that we understand the internals of the find_by family, we move on to a more complex query: one with named bind parameters. The query code is:

User.where 'id = :id and name = :name and age = :age and admin = :admin', :id => id,
                                                                          :name => name,
                                                                          :age => age,
                                                                          :admin => admin

First we enter where:

def where(opts, *rest)
  return self if opts.blank?

  relation = clone
  relation.where_values += build_where(opts, rest)
  relation
end

def build_where(opts, other = [])
  case opts
  when String, Array
    [@klass.send(:sanitize_sql, other.empty? ? opts : ([opts] + other))]
  when Hash
    attributes = @klass.send(:expand_hash_conditions_for_aggregates, opts)
    PredicateBuilder.build_from_hash(table.engine, attributes, table)
  else
    [opts]
  end
end

We have read both methods before, but this time we explore the first branch of build_where, the sanitize_sql method:

# Accepts an array, hash, or string of SQL conditions and sanitizes
# them into a valid SQL fragment for a WHERE clause.
#   ["name='%s' and group_id='%s'", "foo'bar", 4]  returns  "name='foo''bar' and group_id='4'"
#   { :name => "foo'bar", :group_id => 4 }  returns "name='foo''bar' and group_id='4'"
#   "name='foo''bar' and group_id='4'" returns "name='foo''bar' and group_id='4'"
def sanitize_sql_for_conditions(condition, table_name = self.table_name)
  return nil if condition.blank?

  case condition
  when Array; sanitize_sql_array(condition)
  when Hash;  sanitize_sql_hash_for_conditions(condition, table_name)
  else        condition
  end
end
alias_method :sanitize_sql, :sanitize_sql_for_conditions

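We have met this method before as well, but without examining it closely, so we focus on it here. Because a query with bind parameters always passes condition as an array, execution enters sanitize_sql_array: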

# Accepts an array of conditions. The array has each value
# sanitized and interpolated into the SQL statement.
#   ["name='%s' and group_id='%s'", "foo'bar", 4]  returns  "name='foo''bar' and group_id='4'"
def sanitize_sql_array(ary)
  statement, *values = ary
  if values.first.is_a?(Hash) && statement =~ /:\w+/
    replace_named_bind_variables(statement, values.first)
  elsif statement.include?('?')
    replace_bind_variables(statement, values)
  elsif statement.blank?
    statement
  else
    statement % values.collect { |value| connection.quote_string(value.to_s) }
  end
end

Since this example uses named parameters, the first condition matches and replace_named_bind_variables is called:

def replace_named_bind_variables(statement, bind_vars)
  statement.gsub(/(:?):([a-zA-Z]\w*)/) do
    if $1 == ':' # skip postgresql casts
      $& # return the whole match
    elsif bind_vars.include?(match = $2.to_sym)
      quote_bound_value(bind_vars[match])
    else
      raise PreparedStatementInvalid, "missing value for :#{match} in #{statement}"
    end
  end
end

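Because PostgreSQL has syntax that uses two consecutive colons (type casts), such matches have to be skipped. The value for each named parameter is then passed to quote_bound_value: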

def quote_bound_value(value, c = connection)
  if value.respond_to?(:map) && !value.acts_like?(:string)
    if value.respond_to?(:empty?) && value.empty?
      c.quote(nil)
    else
      value.map { |v| c.quote(v) }.join(',')
    end
  else
    c.quote(value)
  end
end

This mainly quotes the value by calling connection.quote. Since SQLite largely follows the standard in this respect, execution enters ActiveRecord::ConnectionAdapters::Quoting, a module of quoting-related utilities defined in activerecord-3.2.13/lib/active_record/connection_adapters/abstract/quoting.rb:

# Quotes the column value to help prevent
# {SQL injection attacks}[http://en.wikipedia.org/wiki/SQL_injection].
def quote(value, column = nil)
  # records are quoted as their primary key
  return value.quoted_id if value.respond_to?(:quoted_id)

  case value
  when String, ActiveSupport::Multibyte::Chars
    value = value.to_s
    return "'#{quote_string(value)}'" unless column

    case column.type
    when :binary then "'#{quote_string(column.string_to_binary(value))}'"
    when :integer then value.to_i.to_s
    when :float then value.to_f.to_s
    else
      "'#{quote_string(value)}'"
    end

  when true, false
    if column && column.type == :integer
      value ? '1' : '0'
    else
      value ? quoted_true : quoted_false
    end
    # BigDecimals need to be put in a non-normalized form and quoted.
  when nil        then "NULL"
  when BigDecimal then value.to_s('F')
  when Numeric    then value.to_s
  when Date, Time then "'#{quoted_date(value)}'"
  when Symbol     then "'#{quote_string(value.to_s)}'"
  else
    "'#{quote_string(YAML.dump(value))}'"
  end
end

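As we can see, the quoting strategy depends on the data type. Every string-like value ends up wrapped in single quotes, but how quotes inside the string are escaped varies from case to case; we will not go into that here.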
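A reduced version of this type dispatch fits in a few lines. The sketch below is an assumption-laden simplification: sketch_quote is a hypothetical function, the 't' and 'f' boolean literals and the date format are chosen for illustration, and column-aware quoting is left out entirely.

```ruby
require 'date'

# Hypothetical, simplified quoting: pick a SQL literal form from the Ruby type.
def sketch_quote(value)
  case value
  when nil     then 'NULL'
  when true    then "'t'"
  when false   then "'f'"
  when Numeric then value.to_s
  when Date    then "'#{value.strftime('%Y-%m-%d')}'"
  when String, Symbol
    # Double embedded single quotes so the literal cannot be terminated early.
    "'#{value.to_s.gsub("'", "''")}'"
  else
    raise ArgumentError, "cannot quote #{value.class}"
  end
end

puts sketch_quote("foo'bar")
puts sketch_quote(4)
```

Numbers are emitted bare while strings are always wrapped and escaped, which is what keeps user-supplied text from changing the structure of the statement.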

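After several iterations, every parameter in the SQL fragment has been replaced by its actual value. The resulting fragment is added to the Relation object and later merged into the final SQL statement.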
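The substitution loop can be reproduced outside Rails with the same regular expression. In this sketch bind_named and quote_for_bind are hypothetical helpers, the quoting is deliberately minimal, and a plain ArgumentError stands in for PreparedStatementInvalid.

```ruby
# Hypothetical minimal quoting used only by this sketch.
def quote_for_bind(value)
  case value
  when nil     then 'NULL'
  when Numeric then value.to_s
  when Array   then value.map { |v| quote_for_bind(v) }.join(',')
  else "'#{value.to_s.gsub("'", "''")}'"
  end
end

# Substitute :name placeholders, leaving PostgreSQL-style ::casts untouched.
def bind_named(statement, bind_vars)
  statement.gsub(/(:?):([a-zA-Z]\w*)/) do
    name = $2.to_sym
    if $1 == ':'
      $& # part of a ::cast, so return the whole match unchanged
    elsif bind_vars.key?(name)
      quote_for_bind(bind_vars[name])
    else
      raise ArgumentError, "missing value for :#{name} in #{statement}"
    end
  end
end

puts bind_named('id = :id and name = :name', { id: 1, name: "O'Brien" })
puts bind_named('created_at::date = :day', { day: '2013-05-01' })
```

The optional leading colon in the pattern is what distinguishes a cast from a placeholder: when it is captured, the match is put back as it was.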

#### Relations ####

Has many

Next, let us look at how Active Record handles relations between models (associations). The code is:

class User < ActiveRecord::Base
  has_many :blogs
end

user.blogs.to_a

This code simply defines a has_many relation on the User class and then queries through it. We start from the code that defines the relation:

def has_many(name, options = {}, &extension)
  Builder::HasMany.build(self, name, options, &extension)
end

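This code is defined in ActiveRecord::Associations::ClassMethods. It refers to the Builder::HasMany class which, as the name suggests, is responsible for building has_many relations; its ancestors are the CollectionAssociation and Association classes defined in the same module. The build class method called here is defined in CollectionAssociation: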

def self.build(model, name, options, &extension)
  new(model, name, options, &extension).build
end

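This instantiates Builder::HasMany and immediately calls build on the instance. build is layered across several classes, and the innermost layer is the definition in Builder::Association: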

def build
  validate_options
  reflection = model.create_reflection(self.class.macro, name, options, model)
  define_accessors
  reflection
end

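validate_options checks that every key passed in is valid; the code is simple and needs no explanation. Next, create_reflection is called on model (here the User class). It is defined in the ActiveRecord::Reflection module and creates the various Reflection objects: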

def create_reflection(macro, name, options, active_record)
  case macro
    when :has_many, :belongs_to, :has_one, :has_and_belongs_to_many
      klass = options[:through] ? ThroughReflection : AssociationReflection
      reflection = klass.new(macro, name, options, active_record)
    when :composed_of
      reflection = AggregateReflection.new(macro, name, options, active_record)
  end

  self.reflections = self.reflections.merge(name => reflection)
  reflection
end

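Three Reflection classes appear here: ThroughReflection, used when the :through option is given; the more general AssociationReflection; and AggregateReflection, used for the composed_of macro. All of them descend from MacroReflection (the definitions are fairly short, so they live in the same file as the Reflection module). Our example uses AssociationReflection and creates an instance of it. The relation name and the Reflection instance are then stored in the self.reflections hash for later lookup.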
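The registry idea behind create_reflection can be sketched with a class-level hash. The names here are hypothetical (SketchReflection, SketchAssociations, SketchUser), and the sketch stores one reflection type only, ignoring the :through and composed_of cases.

```ruby
# Hypothetical value object standing in for AssociationReflection.
SketchReflection = Struct.new(:macro, :name, :options, :owner_class)

module SketchAssociations
  # Class-level registry of declared associations, keyed by name.
  def reflections
    @reflections ||= {}
  end

  def has_many(name, options = {})
    create_reflection(:has_many, name, options)
  end

  def belongs_to(name, options = {})
    create_reflection(:belongs_to, name, options)
  end

  def reflect_on_association(name)
    reflections[name]
  end

  private

  # Build a new hash rather than mutating in place, as the original does
  # with reflections.merge(name => reflection).
  def create_reflection(macro, name, options)
    reflection = SketchReflection.new(macro, name, options, self)
    @reflections = reflections.merge(name => reflection)
    reflection
  end
end

class SketchUser
  extend SketchAssociations
  has_many :blogs, order: 'created_at'
end

p SketchUser.reflect_on_association(:blogs).macro
```

Recording the declaration as data, instead of only generating methods, is what later allows code such as the association loader to ask a class what its relations are.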

The define_accessors call that follows defines the reader and writer methods for the relation, both for the associated objects themselves and for their array of ids.

Then comes the build method of Builder::CollectionAssociation:

def build
  wrap_block_extension
  reflection = super
  CALLBACKS.each { |callback_name| define_callback(callback_name) }
  reflection
end

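wrap_block_extension takes care of the extension block, storing it as a module under the :extend option. The main job of this build method is to define the callbacks, of which there are four: :before_add, :after_add, :before_remove, and :after_remove.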

Next is the ActiveRecord::AutosaveAssociation module, which is mixed into the HasMany builder. Its purpose is to add automatic saving to associated Active Record objects:

def build
  reflection = super
  model.send(:add_autosave_association_callbacks, reflection)
  reflection
end

# Adds validation and save callbacks for the association as specified by
# the +reflection+.
#
# For performance reasons, we don't check whether to validate at runtime.
# However the validation and callback methods are lazy and those methods
# get created when they are invoked for the very first time. However,
# this can change, for instance, when using nested attributes, which is
# called _after_ the association has been defined. Since we don't want
# the callbacks to get defined multiple times, there are guards that
# check if the save or validation methods have already been defined
# before actually defining them.
def add_autosave_association_callbacks(reflection)
  save_method = :"autosave_associated_records_for_#{reflection.name}"
  validation_method = :"validate_associated_records_for_#{reflection.name}"
  collection = reflection.collection?

  unless method_defined?(save_method)
    if collection
      before_save :before_save_collection_association

      define_non_cyclic_method(save_method, reflection) { save_collection_association(reflection) }
      # Doesn't use after_save as that would save associations added in after_create/after_update twice
      after_create save_method
      after_update save_method
    else
      if reflection.macro == :has_one
        define_method(save_method) { save_has_one_association(reflection) }
        # Configures two callbacks instead of a single after_save so that
        # the model may rely on their execution order relative to its
        # own callbacks.
        #
        # For example, given that after_creates run before after_saves, if
        # we configured instead an after_save there would be no way to fire
        # a custom after_create callback after the child association gets
        # created.
        after_create save_method
        after_update save_method
      else
        define_non_cyclic_method(save_method, reflection) { save_belongs_to_association(reflection) }
        before_save save_method
      end
    end
  end

  if reflection.validate? && !method_defined?(validation_method)
    method = (collection ? :validate_collection_association : :validate_single_association)
    define_non_cyclic_method(validation_method, reflection) { send(method, reflection) }
    validate validation_method
  end
end

This method mainly registers callbacks. For collection associations, that includes the before-save callback before_save_collection_association:

# Is used as a before_save callback to check while saving a collection
# association whether or not the parent was a new record before saving.
def before_save_collection_association
  @new_record_before_save = new_record?
  true
end

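Depending on the association type, save_collection_association or save_has_one_association runs after create and update, while save_belongs_to_association runs before save. The code for these is examined later.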

Only then do we reach the build method defined by Builder::HasMany:

def build
  reflection = super
  configure_dependency
  reflection
end

def configure_dependency
  if options[:dependent]
    unless options[:dependent].in?([:destroy, :delete_all, :nullify, :restrict])
      raise ArgumentError, "The :dependent option expects either :destroy, :delete_all, " \
                           ":nullify or :restrict (#{options[:dependent].inspect})"
    end

    send("define_#{options[:dependent]}_dependency_method")
    model.before_destroy dependency_method_name
  end
end

When the :dependent option is given, this method adds a before_destroy callback. The code is simple, so we skip it.

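Now let us execute user.blogs and see how it works. What we have not yet shown is the code of define_accessors, which is in fact the entry point for reading and writing associated objects: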

def define_accessors
  define_readers
  define_writers
end

def define_readers
  name = self.name
  mixin.redefine_method(name) do |*params|
    association(name).reader(*params)
  end
end

def define_writers
  name = self.name
  mixin.redefine_method("#{name}=") do |value|
    association(name).writer(value)
  end
end

As the code shows, build defines two methods, a reader and a writer for the relation. So when we call user.blogs, the previously defined blogs reader runs. Let us first step into the association method:

# Returns the association instance for the given name, instantiating it if it doesn't already exist
def association(name)
  association = association_instance_get(name)

  if association.nil?
    reflection  = self.class.reflect_on_association(name)
    association = reflection.association_class.new(self, reflection)
    association_instance_set(name, association)
  end

  association
end

association_instance_get and association_instance_set act as a kind of cache; they are trivial, so we skip them and go straight to reflect_on_association:

# Returns the AssociationReflection object for the +association+ (use the symbol).
#
#   Account.reflect_on_association(:owner)             # returns the owner AssociationReflection
#   Invoice.reflect_on_association(:line_items).macro  # returns :has_many
#
def reflect_on_association(association)
  reflections[association].is_a?(AssociationReflection) ? reflections[association] : nil
end

In effect it just fetches the Reflection object from the self.reflections hash populated earlier.
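How the generated accessors, the per-instance association cache, and lazy loading fit together can be shown with a small stand-alone sketch. SketchAssociation, SketchOwner, and fetch_associated are hypothetical, and the "database" is a method that returns fixed strings.

```ruby
# Hypothetical association object: it loads its target lazily and counts loads.
class SketchAssociation
  attr_reader :load_count

  def initialize(owner, name)
    @owner = owner
    @name = name
    @loaded = false
    @target = []
    @load_count = 0
  end

  def reader
    unless @loaded
      @target = @owner.fetch_associated(@name)
      @loaded = true
      @load_count += 1
    end
    @target
  end

  def writer(records)
    @target = records
    @loaded = true
  end
end

class SketchOwner
  # Define name and name= so that both delegate to a cached association object.
  def self.define_accessors(name)
    define_method(name) { association(name).reader }
    define_method("#{name}=") { |value| association(name).writer(value) }
  end

  define_accessors :blogs

  # Return the cached association for name, creating it on first use.
  def association(name)
    @association_cache ||= {}
    @association_cache[name] ||= SketchAssociation.new(self, name)
  end

  # Hypothetical data source standing in for the database query.
  def fetch_associated(name)
    ["#{name}-1", "#{name}-2"]
  end
end

owner = SketchOwner.new
p owner.blogs
p owner.association(:blogs).load_count
```

Caching the association object on the owner is what makes repeated calls to the reader cheap: the target is fetched once and reused until it is reset or reloaded.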

From that object, association_class returns the corresponding Association class:

def association_class
  case macro
  when :belongs_to
    if options[:polymorphic]
      Associations::BelongsToPolymorphicAssociation
    else
      Associations::BelongsToAssociation
    end
  when :has_and_belongs_to_many
    Associations::HasAndBelongsToManyAssociation
  when :has_many
    if options[:through]
      Associations::HasManyThroughAssociation
    else
      Associations::HasManyAssociation
    end
  when :has_one
    if options[:through]
      Associations::HasOneThroughAssociation
    else
      Associations::HasOneAssociation
    end
  end
end

Here we get the Associations::HasManyAssociation class, and an instance of it is created:

# CollectionAssociation initialize:
def initialize(owner, reflection)
  super
  @proxy = CollectionProxy.new(self)
end

# Association initialize:
def initialize(owner, reflection)
  reflection.check_validity!

  @target = nil
  @owner, @reflection = owner, reflection
  @updated = false

  reset
  reset_scope
end

Note that a CollectionProxy object is created during initialization.

The validity of the reflection is checked first:

def check_validity!
  check_validity_of_inverse!
end

def check_validity_of_inverse!
  unless options[:polymorphic]
    if has_inverse? && inverse_of.nil?
      raise InverseOfAssociationNotFoundError.new(self)
    end
  end
end

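For non-polymorphic associations, this simply requires that any declared :inverse_of refers to an association that actually exists. Some variables are then initialized: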

def reset
  @loaded = false
  @target = []
end

def reset_scope
  @association_scope = nil
end

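Next we call reader on the HasManyAssociation. Note that this method is defined in ActiveRecord::Associations::CollectionAssociation; take care not to confuse the classes under ActiveRecord::Associations with their Builder counterparts: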

def reader(force_reload = false)
  if force_reload
    klass.uncached { reload }
  elsif stale_target?
    reload
  end

  proxy
end

Execution then enters the proxy's to_ary method, defined in the CollectionProxy mentioned earlier:

def to_ary
  load_target.dup
end
alias_method :to_a, :to_ary

load_target here is a delegated method; the call actually goes to the method of the same name on the ActiveRecord::Associations::HasManyAssociation object:

def load_target
  if find_target?
    @target = merge_target_lists(find_target, target)
  end

  loaded!
  target
end

It first determines whether @target still needs to be loaded. The condition is:

def find_target?
  !loaded? && (!owner.new_record? || foreign_key_present?) && klass
end

Then find_target does the actual work:

def find_target
  records =
    if options[:finder_sql]
      reflection.klass.find_by_sql(custom_finder_sql)
    else
      scoped.all
    end

  records = options[:uniq] ? uniq(records) : records
  records.each { |record| set_inverse_instance(record) }
  records
end

We are now close to the actual database lookup. Since :finder_sql is not specified, scoped.all is executed to fetch every object that satisfies the conditions:

def scoped
  target_scope.merge(association_scope)
end

This scoped differs from the one in ActiveRecord::Scoping::Named seen earlier: it is the merge of two scopes. The first is target_scope:

def target_scope
  klass.scoped
end

target_scope is the scope of the class itself, that is, the ActiveRecord::Scoping::Named implementation covered earlier, so we do not repeat it. Then comes association_scope:

# The scope for this association.
#
# Note that the association_scope is merged into the target_scope only when the
# scoped method is called. This is because at that point the call may be surrounded
# by scope.scoping { ... } or with_scope { ... } etc, which affects the scope which
# actually gets built.
def association_scope
  if klass
    @association_scope ||= AssociationScope.new(self).scope
  end
end

This introduces another new class, AssociationScope, which mainly implements the foreign-key part of the query. An instance of it is created here:

def initialize(association)
  @association   = association
  @alias_tracker = AliasTracker.new klass.connection
end

Its scope method is then called:

def scope
  scope = klass.unscoped
  scope = scope.extending(*Array.wrap(options[:extend]))

  # It's okay to just apply all these like this. The options will only be present if the
  # association supports that option; this is enforced by the association builder.
  scope = scope.apply_finder_options(options.slice(
    :readonly, :include, :order, :limit, :joins, :group, :having, :offset, :select))

  if options[:through] && !options[:include]
    scope = scope.includes(source_options[:include])
  end

  scope = scope.uniq if options[:uniq]

  add_constraints(scope)
end

unscoped here temporarily drops any default scope set earlier and returns a clean scope. It is defined in ActiveRecord::Scoping::Default:

# Returns a scope for the model without the default_scope.
def unscoped
  block_given? ? relation.scoping { yield } : relation
end

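As we can see, this calls relation to create a fresh Relation object. The rest of scope adds various query conditions to it. In our example the earlier conditions can be ignored; only the last method matters: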

def add_constraints(scope)
  tables = construct_tables

  chain.each_with_index do |reflection, i|
    table, foreign_table = tables.shift, tables.first

    if reflection.source_macro == :has_and_belongs_to_many
      join_table = tables.shift

      scope = scope.joins(join(
        join_table,
        table[reflection.association_primary_key].
          eq(join_table[reflection.association_foreign_key])
      ))

      table, foreign_table = join_table, tables.first
    end

    if reflection.source_macro == :belongs_to
      if reflection.options[:polymorphic]
        key = reflection.association_primary_key(klass)
      else
        key = reflection.association_primary_key
      end

      foreign_key = reflection.foreign_key
    else
      key         = reflection.foreign_key
      foreign_key = reflection.active_record_primary_key
    end

    conditions = self.conditions[i]

    if reflection == chain.last
      scope = scope.where(table[key].eq(owner[foreign_key]))

      if reflection.type
        scope = scope.where(table[reflection.type].eq(owner.class.base_class.name))
      end

      conditions.each do |condition|
        if options[:through] && condition.is_a?(Hash)
          condition = disambiguate_condition(table, condition)
        end

        scope = scope.where(interpolate(condition))
      end
    else
      constraint = table[key].eq(foreign_table[foreign_key])

      if reflection.type
        type = chain[i + 1].klass.base_class.name
        constraint = constraint.and(table[reflection.type].eq(type))
      end

      scope = scope.joins(join(foreign_table, constraint))

      unless conditions.empty?
        scope = scope.where(sanitize(conditions, table))
      end
    end
  end

  scope
end

First, construct_tables is called to create the Arel::Table objects. It is defined in ActiveRecord::Associations::JoinHelper:

def construct_tables
  tables = []
  chain.each do |reflection|
    tables << alias_tracker.aliased_table_for(
      table_name_for(reflection),
      table_alias_for(reflection, reflection != self.reflection)
    )

    if reflection.source_macro == :has_and_belongs_to_many
      tables << alias_tracker.aliased_table_for(
        (reflection.source_reflection || reflection).options[:join_table],
        table_alias_for(reflection, true)
      )
    end
  end
  tables
end

The table name and table alias are computed first, as follows:

def table_name_for(reflection)
  reflection.table_name
end

def table_alias_for(reflection, join = false)
  name = "#{reflection.plural_name}_#{alias_suffix}"
  name << "_join" if join
  name
end

Then the AliasTracker object is used to create the Arel::Table. This class is used to avoid duplicate aliases in joins. AliasTracker#aliased_table_for is implemented as follows:

def aliased_table_for(table_name, aliased_name = nil)
  table_alias = aliased_name_for(table_name, aliased_name)

  if table_alias == table_name
    Arel::Table.new(table_name)
  else
    Arel::Table.new(table_name).alias(table_alias)
  end
end

def aliased_name_for(table_name, aliased_name = nil)
  aliased_name ||= table_name

  if aliases[table_name].zero?
    # If it's zero, we can have our table_name
    aliases[table_name] = 1
    table_name
  else
    # Otherwise, we need to use an alias
    aliased_name = connection.table_alias_for(aliased_name)

    # Update the count
    aliases[aliased_name] += 1

    if aliases[aliased_name] > 1
      "#{truncate(aliased_name)}_#{aliases[aliased_name]}"
    else
      aliased_name
    end
  end
end

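If the table name is already recorded in aliases, a different alias is generated instead. A :has_and_belongs_to_many relation also needs an extra entry for the join table, but we do not need that here.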
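The counting scheme is easier to follow with concrete calls. The sketch below keeps only the counter logic: SketchAliasTracker is a hypothetical class, it returns plain strings instead of Arel::Table objects, and it skips the adapter-specific truncation of long aliases.

```ruby
# Hypothetical tracker: the first use of a table keeps its real name, later
# uses receive the suggested alias, numbered when that alias repeats.
class SketchAliasTracker
  def initialize
    @aliases = Hash.new(0)
  end

  def aliased_name_for(table_name, suggested_alias = nil)
    suggested_alias ||= table_name

    if @aliases[table_name].zero?
      @aliases[table_name] = 1
      table_name
    else
      @aliases[suggested_alias] += 1
      count = @aliases[suggested_alias]
      count > 1 ? "#{suggested_alias}_#{count}" : suggested_alias
    end
  end
end

tracker = SketchAliasTracker.new
puts tracker.aliased_name_for('users', 'followers_users')
puts tracker.aliased_name_for('users', 'followers_users')
puts tracker.aliased_name_for('users', 'followers_users')
```

This matters for self-referential joins such as a users table joined to itself, where the same table has to appear under more than one name in a single statement.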

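Back in add_constraints, the foreign key name and primary key name needed by the query are determined. They come from the reflection's foreign_key and active_record_primary_key methods respectively: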

def foreign_key
  @foreign_key ||= options[:foreign_key] || derive_foreign_key
end

def active_record_primary_key
  @active_record_primary_key ||= options[:primary_key] || primary_key(active_record)
end

Both methods prefer the value given in the options. The fallback for the primary key is the primary_key of the Active Record class, while the fallback for foreign_key is:

def derive_foreign_key
  if belongs_to?
    "#{name}_id"
  elsif options[:as]
    "#{options[:as]}_id"
  else
    active_record.name.foreign_key
  end
end

As we can see, it is generated entirely according to Rails conventions.
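The three branches of the convention can be checked with a stand-alone sketch. Both functions are hypothetical, and sketch_foreign_key is only a rough approximation of ActiveSupport's String#foreign_key (it drops the namespace, underscores the class name, and appends _id).

```ruby
# Hypothetical, simplified stand-in for ActiveSupport's String#foreign_key.
def sketch_foreign_key(class_name)
  base = class_name.split('::').last
  underscored = base.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  "#{underscored}_id"
end

# Mirror the three branches of derive_foreign_key.
def sketch_derive_foreign_key(macro, name, owner_class_name, options = {})
  if macro == :belongs_to
    "#{name}_id"
  elsif options[:as]
    "#{options[:as]}_id"
  else
    sketch_foreign_key(owner_class_name)
  end
end

puts sketch_derive_foreign_key(:has_many, :blogs, 'User')
puts sketch_derive_foreign_key(:belongs_to, :author, 'Blog')
puts sketch_derive_foreign_key(:has_many, :comments, 'Post', { as: :commentable })
```

For has_many the key lives on the other table and is named after the owning class, whereas for belongs_to it lives on the owner's own table and is named after the association.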

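Then where is called on the scope to add the condition. If there is a type column, a condition on type must be added too, along with any other conditions. The scope method thus returns a scope that already contains all the query conditions.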

Next, merge combines the two scopes. The merge method used here is defined in ActiveRecord::SpawnMethods:

def merge(r)
  return self unless r
  return to_a & r if r.is_a?(Array)

  merged_relation = clone

  r = r.with_default_scope if r.default_scoped? && r.klass != klass

  Relation::ASSOCIATION_METHODS.each do |method|
    value = r.send(:"#{method}_values")

    unless value.empty?
      if method == :includes
        merged_relation = merged_relation.includes(value)
      else
        merged_relation.send(:"#{method}_values=", value)
      end
    end
  end

  (Relation::MULTI_VALUE_METHODS - [:joins, :where, :order]).each do |method|
    value = r.send(:"#{method}_values")
    merged_relation.send(:"#{method}_values=", merged_relation.send(:"#{method}_values") + value) if value.present?
  end

  merged_relation.joins_values += r.joins_values

  merged_wheres = @where_values + r.where_values

  unless @where_values.empty?
    # Remove duplicates, last one wins.
    seen = Hash.new { |h,table| h[table] = {} }
    merged_wheres = merged_wheres.reverse.reject { |w|
      nuke = false
      if w.respond_to?(:operator) && w.operator == :==
        name              = w.left.name
        table             = w.left.relation.name
        nuke              = seen[table][name]
        seen[table][name] = true
      end
      nuke
    }.reverse
  end

  merged_relation.where_values = merged_wheres

  (Relation::SINGLE_VALUE_METHODS - [:lock, :create_with, :reordering]).each do |method|
    value = r.send(:"#{method}_value")
    merged_relation.send(:"#{method}_value=", value) unless value.nil?
  end

  merged_relation.lock_value = r.lock_value unless merged_relation.lock_value

  merged_relation = merged_relation.create_with(r.create_with_value) unless r.create_with_value.empty?

  if (r.reordering_value)
    # override any order specified in the original relation
    merged_relation.reordering_value = true
    merged_relation.order_values = r.order_values
  else
    # merge in order_values from r
    merged_relation.order_values += r.order_values
  end

  # Apply scope extension modules
  merged_relation.send :apply_modules, r.extensions

  merged_relation
end

The method is long, but it mostly just copies values over one by one, so we do not analyze it in detail.
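One part of merge that is more than plain copying is the "last one wins" handling of equality conditions. A minimal sketch of that step follows; SketchEquality and merge_wheres are hypothetical, and unlike the original this version deduplicates even when the left-hand list is empty.

```ruby
# Hypothetical equality condition, standing in for Arel::Nodes::Equality.
SketchEquality = Struct.new(:table, :column, :value)

# Combine two lists of conditions. When both sides constrain the same
# table and column by equality, only the last condition is kept.
def merge_wheres(left, right)
  seen = Hash.new { |hash, table| hash[table] = {} }

  # Walk from the end so the first occurrence encountered is the latest one.
  (left + right).reverse.reject { |condition|
    next false unless condition.is_a?(SketchEquality)

    duplicate = seen[condition.table].key?(condition.column)
    seen[condition.table][condition.column] = true
    duplicate
  }.reverse
end

left  = [SketchEquality.new('users', 'id', 1), 'age > 18']
right = [SketchEquality.new('users', 'id', 2)]
p merge_wheres(left, right)
```

Reversing before and after the reject preserves the original ordering of the surviving conditions while still letting the later condition override the earlier one.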

Back in find_target, all is called on scoped to run the actual database query. For the query process, see the exec_queries method discussed earlier.

Once the actual Active Record objects have been fetched, merge_target_lists is called to merge the result of find_target with the existing target:

# We have some records loaded from the database (persisted) and some that are
# in-memory (memory). The same record may be represented in the persisted array
# and in the memory array.
#
# So the task of this method is to merge them according to the following rules:
#
#   * The final array must not have duplicates
#   * The order of the persisted array is to be preserved
#   * Any changes made to attributes on objects in the memory array are to be preserved
#   * Otherwise, attributes should have the value found in the database
def merge_target_lists(persisted, memory)
  return persisted if memory.empty?
  return memory    if persisted.empty?

  persisted.map! do |record|
    # Unfortunately we cannot simply do memory.delete(record) since on 1.8 this returns
    # record rather than memory.at(memory.index(record)). The behavior is fixed in 1.9.
    mem_index = memory.index(record)

    if mem_index
      mem_record = memory.delete_at(mem_index)

      ((record.attribute_names & mem_record.attribute_names) - mem_record.changes.keys).each do |name|
        mem_record[name] = record[name]
      end

      mem_record
    else
      record
    end
  end

  persisted + memory
end

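The merge removes duplicates and, for a record that appears in both lists, keeps the in-memory object: its unchanged attributes are refreshed from the database while its unsaved changes are preserved. Afterwards loaded! is called to mark the HasManyAssociation as loaded, and the queried records are returned.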
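The merge rules can be tried outside Rails. The following is a minimal sketch, not the Rails implementation: Rec is a hypothetical stand-in for an Active Record object (equality by id, plus a changes hash for dirty attributes), and merge_lists is an illustrative name.

```ruby
# Hypothetical stand-in for an Active Record object: equal when ids match,
# with a changes hash listing attributes modified in memory.
Rec = Struct.new(:id, :attrs, :changes) do
  def ==(other)
    other.is_a?(Rec) && id == other.id
  end
end

def merge_lists(persisted, memory)
  return persisted if memory.empty?
  return memory    if persisted.empty?

  memory = memory.dup
  merged = persisted.map do |record|
    index = memory.index(record)
    next record unless index

    mem_record = memory.delete_at(index)
    # Refresh only the attributes that were not changed in memory.
    unchanged = (record.attrs.keys & mem_record.attrs.keys) - mem_record.changes.keys
    unchanged.each { |name| mem_record.attrs[name] = record.attrs[name] }
    mem_record
  end

  merged + memory
end

db_alice  = Rec.new(1, { name: "alice", age: 30 }, {})
db_bob    = Rec.new(2, { name: "bob", age: 41 }, {})
mem_alice = Rec.new(1, { name: "ALICE", age: 29 }, { name: ["alice", "ALICE"] })
mem_new   = Rec.new(nil, { name: "carol", age: 25 }, {})

result = merge_lists([db_alice, db_bob], [mem_alice, mem_new])
```

In this run the in-memory copy of record 1 keeps its unsaved name but picks up the age stored in the database, and the unsaved record is appended after the persisted ones.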

Has And Belongs to many and :through

This is a more complex relation, and working through it will give us a deeper understanding of how Rails relations work.

class User < ActiveRecord::Base
  has_and_belongs_to_many :followers, class_name: 'User', foreign_key: 'follow_id', association_foreign_key: 'follower_id', :join_table => 'follows'
  has_many :followers_comments, through: :followers, :source => :comments
  has_many :comments
end

user1.followers_comments

First, let us step into the has_and_belongs_to_many method:

def has_and_belongs_to_many(name, options = {}, &extension)
  Builder::HasAndBelongsToMany.build(self, name, options, &extension)
end

It looks quite similar to the implementation of has_many. Here build is implemented as:

def build
  reflection = super
  check_validity(reflection)
  define_destroy_hook
  reflection
end

Because HasAndBelongsToMany has the same parent class as HasMany, namely CollectionAssociation, the super call is not analyzed again here.

check_validity is implemented as:

def check_validity(reflection)
  if reflection.association_foreign_key == reflection.foreign_key
    raise ActiveRecord::HasAndBelongsToManyAssociationForeignKeyNeeded.new(reflection)
  end

  reflection.options[:join_table] ||= join_table_name(
    model.send(:undecorated_table_name, model.to_s),
    model.send(:undecorated_table_name, reflection.class_name)
  )
end

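First, :association_foreign_key and :foreign_key must not be identical, otherwise the association would be meaningless. Next, unless :join_table was set explicitly, the join table name is generated by combining the two table names. The combining is done by join_table_name: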

# Generates a join table name from two provided table names.
# The names in the join table names end up in lexicographic order.
#
#   join_table_name("members", "clubs")         # => "clubs_members"
#   join_table_name("members", "special_clubs") # => "members_special_clubs"
def join_table_name(first_table_name, second_table_name)
  if first_table_name < second_table_name
    join_table = "#{first_table_name}_#{second_table_name}"
  else
    join_table = "#{second_table_name}_#{first_table_name}"
  end

  model.table_name_prefix + join_table + model.table_name_suffix
end

The table name that compares lower as a string is placed first.

Next, a hook that runs on destroy is defined by define_destroy_hook:

def define_destroy_hook
  name = self.name
  model.send(:include, Module.new {
    class_eval <<-RUBY, __FILE__, __LINE__ + 1
      def destroy_associations
        association(#{name.to_sym.inspect}).delete_all_on_destroy
        super
      end
    RUBY
  })
end
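The technique used by define_destroy_hook, including an anonymous module whose method calls super, can be seen in isolation. This is a sketch under stated assumptions: Base, Model, and LOG are hypothetical names, and the logged tuple stands in for the real delete_all_on_destroy call.

```ruby
# LOG records the call order; Base plays the role of an ancestor that
# already defines destroy_associations. All names here are illustrative.
LOG = []

class Base
  def destroy_associations
    LOG << :base
  end
end

class Model < Base
  def self.define_destroy_hook(name)
    include(Module.new {
      class_eval <<-RUBY, __FILE__, __LINE__ + 1
        def destroy_associations
          LOG << [:clear_join_rows, #{name.to_sym.inspect}]
          super
        end
      RUBY
    })
  end

  define_destroy_hook :followers
  define_destroy_hook :groups
end

Model.new.destroy_associations
```

Each hook lives in its own module, so several associations can stack their cleanup without overwriting one another; the most recently included module runs first and super walks down to the original method.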

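Next we enter user1.followers_comments. This relation is still a :has_many; adding the :through option only means that the ThroughReflection class is used and a HasManyThroughAssociation object is created (both points were mentioned in the earlier analysis). Since HasManyThroughAssociation is itself a subclass of HasManyAssociation, the initialization code is exactly the same and is not repeated here. We start directly from find_target, which is defined in HasManyThroughAssociation: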

def find_target
  return [] unless target_reflection_has_associated_record?
  scoped.all
end

An extra check has been added here, target_reflection_has_associated_record?:

def target_reflection_has_associated_record?
  if through_reflection.macro == :belongs_to && owner[through_reflection.foreign_key].blank?
    false
  else
    true
  end
end

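When the through relation is a :belongs_to but the owner's foreign key value is blank, no result can exist, so false is returned directly. In our case the relation is :has_and_belongs_to_many, so the method always returns true.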

The scoped code is still a merge of two scopes:

def scoped
  target_scope.merge(association_scope)
end

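But the definitions are now quite different. Here target_scope is defined in the ThroughAssociation module and overrides the original definition in Association, so it applies to every relation that has the :through option: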

# We merge in these scopes for two reasons:
#
#   1. To get the default_scope conditions for any of the other reflections in the chain
#   2. To get the type conditions for any STI models in the chain
def target_scope
  scope = super
  chain[1..-1].each do |reflection|
    scope = scope.merge(
      reflection.klass.scoped.with_default_scope.
        except(:select, :create_with, :includes, :preload, :joins, :eager_load)
    )
  end
  scope
end

The chain method has also been reworked here, in the ThroughReflection class:

# Returns an array of reflections which are involved in this association. Each item in the
# array corresponds to a table which will be part of the query for this association.
#
# The chain is built by recursively calling #chain on the source reflection and the through
# reflection. The base case for the recursion is a normal association, which just returns
# [self] as its #chain.
def chain
  @chain ||= begin
    chain = source_reflection.chain + through_reflection.chain
    chain[0] = self # Use self so we don't lose the information from :source_type
    chain
  end
end

Here source_reflection and through_reflection are the two reflections behind the :has_many relation, implemented respectively as:

def source_reflection
  @source_reflection ||= source_reflection_names.collect { |name| through_reflection.klass.reflect_on_association(name) }.compact.first
end

def through_reflection
  @through_reflection ||= active_record.reflect_on_association(options[:through])
end

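The first element of chain, that is, the first result of source_reflection.chain, is then replaced with the current object, the ActiveRecord::Reflection::ThroughReflection itself, so that information such as :source_type is not lost.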
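The recursive construction of chain can be sketched with two tiny classes. PlainReflection and ThroughReflectionSketch are hypothetical names used only for illustration; they model nothing beyond the chain method itself.

```ruby
# A plain association is the base case: its chain is just [self].
class PlainReflection
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def chain
    [self]
  end
end

# A through association concatenates the source chain and the through
# chain, then puts itself in the first slot.
class ThroughReflectionSketch
  attr_reader :name

  def initialize(name, source, through)
    @name, @source, @through = name, source, through
  end

  def chain
    @chain ||= begin
      chain = @source.chain + @through.chain
      chain[0] = self
      chain
    end
  end
end

comments  = PlainReflection.new(:comments)
followers = PlainReflection.new(:followers)
followers_comments = ThroughReflectionSketch.new(:followers_comments, comments, followers)

names = followers_comments.chain.map(&:name)  # [:followers_comments, :followers]
```

For the example in this section the chain therefore has two entries: the through reflection itself (standing in for comments) followed by the followers reflection.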

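Then the scope obtained earlier is merged one by one with the scopes of the later reflections in chain, the ones connected through :through. The merge method has already been analyzed and is not the focus here, so it is not analyzed again.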

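The implementation of association_scope is the real focus. Most of its code has already been analyzed, but the handling of has_and_belongs_to_many in add_constraints still deserves a closer look: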

def add_constraints(scope)
  tables = construct_tables

  chain.each_with_index do |reflection, i|
    table, foreign_table = tables.shift, tables.first

    if reflection.source_macro == :has_and_belongs_to_many
      join_table = tables.shift

      scope = scope.joins(join(
        join_table,
        table[reflection.association_primary_key].
          eq(join_table[reflection.association_foreign_key])
      ))

      table, foreign_table = join_table, tables.first
    end

    if reflection.source_macro == :belongs_to
      if reflection.options[:polymorphic]
        key = reflection.association_primary_key(klass)
      else
        key = reflection.association_primary_key
      end

      foreign_key = reflection.foreign_key
    else
      key         = reflection.foreign_key
      foreign_key = reflection.active_record_primary_key
    end

    conditions = self.conditions[i]

    if reflection == chain.last
      scope = scope.where(table[key].eq(owner[foreign_key]))

      if reflection.type
        scope = scope.where(table[reflection.type].eq(owner.class.base_class.name))
      end

      conditions.each do |condition|
        if options[:through] && condition.is_a?(Hash)
          condition = disambiguate_condition(table, condition)
        end

        scope = scope.where(interpolate(condition))
      end
    else
      constraint = table[key].eq(foreign_table[foreign_key])

      if reflection.type
        type = chain[i + 1].klass.base_class.name
        constraint = constraint.and(table[reflection.type].eq(type))
      end

      scope = scope.joins(join(foreign_table, constraint))

      unless conditions.empty?
        scope = scope.where(sanitize(conditions, table))
      end
    end
  end

  scope
end

First, let us look at the code of construct_tables again:

def construct_tables
  tables = []
  chain.each do |reflection|
    tables << alias_tracker.aliased_table_for(
      table_name_for(reflection),
      table_alias_for(reflection, reflection != self.reflection)
    )

    if reflection.source_macro == :has_and_belongs_to_many
      tables << alias_tracker.aliased_table_for(
        (reflection.source_reflection || reflection).options[:join_table],
        table_alias_for(reflection, true)
      )
    end
  end
  tables
end

From the code we can see that when a reflection is :has_and_belongs_to_many, two table objects are created, one of which is the join table.

When the first reflection in the chain, the :has_many one, is processed, the two tables of the :has_many relation are joined and the join condition is added:

constraint = table[key].eq(foreign_table[foreign_key])

if reflection.type
  type = chain[i + 1].klass.base_class.name
  constraint = constraint.and(table[reflection.type].eq(type))
end

scope = scope.joins(join(foreign_table, constraint))

unless conditions.empty?
  scope = scope.where(sanitize(conditions, table))
end

Many Arel APIs are called here, but their purpose can be understood from the method names alone.

When the :has_and_belongs_to_many reflection is processed, the following code is executed in addition:

join_table = tables.shift

scope = scope.joins(join(
  join_table,
  table[reflection.association_primary_key].
    eq(join_table[reflection.association_foreign_key])
))

table, foreign_table = join_table, tables.first

This builds the JOIN between the current table and the join table. Then, as the last reflection in the chain, it executes:

scope = scope.where(table[key].eq(owner[foreign_key]))

if reflection.type
  scope = scope.where(table[reflection.type].eq(owner.class.base_class.name))
end

conditions.each do |condition|
  if options[:through] && condition.is_a?(Hash)
    condition = disambiguate_condition(table, condition)
  end

  scope = scope.where(interpolate(condition))
end

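No further JOIN is needed here: the key used for joining is already available in the join table, so it is written directly into the SQL as a condition, which can be optimized better.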

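Finally, the resulting association scope is merged with the earlier scope (in fact only the FROM and SELECT parts of target_scope are used; all JOIN and WHERE clauses are supplied by association_scope), which yields the complete query.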

Build Association, Polymorphic Associations & Scope Querying

To close the discussion of Active Record relations, this part covers a few remaining interesting features. Let us start with the example code:

class Picture < ActiveRecord::Base
  belongs_to :imageable, :polymorphic => true
  attr_accessible :name

  default_scope order('created_at desc')

  scope :of_employees, where(:imageable_type => 'Employee')
  scope :of_products, -> { where(:imageable_type => 'Product') }
end

class Employee < ActiveRecord::Base
  attr_accessible :name
  has_many :pictures, :as => :imageable
end

class Product < ActiveRecord::Base
  attr_accessible :name
  has_many :pictures, :as => :imageable
end

employee.pictures.build :name => 'my avatar.png'
employee.pictures.create :name => 'my avatar 2.png'

Picture.of_employees
Picture.of_products

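The code looks a little long, but it only involves three features: Build Association, Polymorphic Associations, and Scope Querying. Build Association and Scope Querying are each written in two similar ways, and we will point out the differences between them.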

We start with the build method of the :has_many relation. It is a method delegated to @association, and the delegation is declared in ActiveRecord::Associations::CollectionProxy:

delegate :select, :find, :first, :last,
         :build, :create, :create!,
         :concat, :replace, :delete_all, :destroy_all, :delete, :destroy, :uniq,
         :sum, :count, :size, :length, :empty?,
         :any?, :many?, :include?,
         :to => :@association

So the implementation of build lives in ActiveRecord::Associations::CollectionAssociation:

def build(attributes = {}, options = {}, &block)
  if attributes.is_a?(Array)
    attributes.collect { |attr| build(attr, options, &block) }
  else
    add_to_target(build_record(attributes, options)) do |record|
      yield(record) if block_given?
    end
  end
end

As we can see, there are two main steps, build_record and add_to_target. build_record is implemented in the base class ActiveRecord::Associations::Association:

def build_record(attributes, options)
  reflection.build_association(attributes, options) do |record|
    skip_assign = [reflection.foreign_key, reflection.type].compact
    attributes = create_scope.except(*(record.changed - skip_assign))
    record.assign_attributes(attributes, :without_protection => true)
  end
end

This method calls build_association on the Reflection object to create the associated object:

def build_association(*options, &block)
  klass.new(*options, &block)
end

As we can see, the method simply creates an Active Record object and assigns its attributes. The block passed at creation time runs before the initialize callback:

skip_assign = [reflection.foreign_key, reflection.type].compact
attributes = create_scope.except(*(record.changed - skip_assign))
record.assign_attributes(attributes, :without_protection => true)

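Here we can see that, among all attributes already changed on the record, only the reflection's foreign_key and type remain in the list of attributes to assign. This type is not the STI type column but the column used by Polymorphic Associations: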

def type
  @type ||= options[:as] && "#{options[:as]}_type"
end

The implementation of foreign_key was explained earlier; its default value, derive_foreign_key, also supports Polymorphic Associations.

create_scope first builds the Arel structure needed for Polymorphic Associations, then extracts its where clauses and converts them into a Hash:

def create_scope
  scoped.scope_for_create.stringify_keys
end

The where clauses of scoped are produced by the add_constraints method mentioned earlier, which likewise supports generating the conditions for Polymorphic Associations.

Then its scope_for_create method is called:

def scope_for_create
  @scope_for_create ||= where_values_hash.merge(create_with_value)
end

Here where_values_hash returns a Hash version of the where clauses in the Arel structure:

def where_values_hash
  equalities = with_default_scope.where_values.grep(Arel::Nodes::Equality).find_all { |node|
    node.left.relation.name == table_name
  }

  Hash[equalities.map { |where| [where.left.name, where.right] }].with_indifferent_access
end

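The implementation is not difficult, and it also shows that this method only handles conditions of the form "column equals value", which is enough for Polymorphic Associations.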
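The filtering done by where_values_hash can be imitated with plain structs. This is only a sketch: Column, Equality, and GreaterThan are hypothetical stand-ins for Arel nodes, not the Arel classes themselves.

```ruby
# Hypothetical stand-ins for Arel attribute and predicate nodes.
Column      = Struct.new(:table, :name)
Equality    = Struct.new(:left, :right)
GreaterThan = Struct.new(:left, :right)

# Keep only equality nodes on the given table and turn them into a Hash.
def where_values_hash(where_values, table_name)
  equalities = where_values.grep(Equality).select { |node| node.left.table == table_name }
  Hash[equalities.map { |node| [node.left.name, node.right] }]
end

wheres = [
  Equality.new(Column.new("pictures", "imageable_id"), 7),
  Equality.new(Column.new("pictures", "imageable_type"), "Employee"),
  GreaterThan.new(Column.new("pictures", "size"), 1024),
  Equality.new(Column.new("employees", "name"), "bachue")
]

attrs = where_values_hash(wheres, "pictures")
```

Only the two equality conditions on the pictures table survive, which is exactly what a newly built picture needs in order to point back at its owner.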

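Back in build_record: the reason only the foreign key and the polymorphic type are kept out of the record's changed attributes is that the other values were already assigned through assign_attributes when the instance was created. The Hash just generated is now assigned with another call to assign_attributes; the :without_protection option ensures that the assignment does not fail because an attribute is on the blacklist or missing from the whitelist. That feature is explained in detail later.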

Next, the new record is added to the association's target, but only in memory; the actual insert requires calling save. This is done by add_to_target:

def add_to_target(record)
  callback(:before_add, record)
  yield(record) if block_given?

  if options[:uniq] && index = @target.index(record)
    @target[index] = record
  else
    @target << record
  end

  callback(:after_add, record)
  set_inverse_instance(record)

  record
end

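The before_add and after_add callbacks are invoked before and after the insertion respectively. callbacks_for returns the list of callbacks defined on the class, and each one is invoked differently depending on its type.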

def callback(method, record)
  callbacks_for(method).each do |callback|
    case callback
    when Symbol
      owner.send(callback, record)
    when Proc
      callback.call(owner, record)
    else
      callback.send(method, owner, record)
    end
  end
end

def callbacks_for(callback_name)
  full_callback_name = "#{callback_name}_for_#{reflection.name}"
  owner.class.send(full_callback_name.to_sym) || []
end
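The three dispatch branches of callback can be exercised with a small stand-alone sketch. Owner, Auditor, and run_callbacks are hypothetical names; only the Symbol, Proc, and "object responding to the callback name" cases mirror the code above.

```ruby
# Hypothetical owner that records which callbacks ran.
class Owner
  attr_reader :log

  def initialize
    @log = []
  end

  def note_added(record)
    @log << [:symbol, record]
  end
end

# Hypothetical callback object: it responds to the callback name itself.
class Auditor
  def before_add(owner, record)
    owner.log << [:object, record]
  end
end

def run_callbacks(method, owner, record, callbacks)
  callbacks.each do |callback|
    case callback
    when Symbol then owner.send(callback, record)          # method on the owner
    when Proc   then callback.call(owner, record)          # lambda or proc
    else             callback.send(method, owner, record)  # callback object
    end
  end
end

owner = Owner.new
callbacks = [:note_added, ->(o, r) { o.log << [:proc, r] }, Auditor.new]
run_callbacks(:before_add, owner, "avatar.png", callbacks)
```

After the call, owner.log holds one entry per callback, in declaration order.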

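Inserting the new record in memory is simple: the record is appended to @target. If the :uniq option is set, the record is first looked up in @target; it is appended only if absent, and otherwise replaces the existing element in place (the id is the same, but the existing data may be stale, so replacing it at its original position is enough).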

Finally, the newly created record is returned and build is done.

Similar to build, create is implemented as follows:

def create(attributes = {}, options = {}, &block)
  create_record(attributes, options, &block)
end

def create_record(attributes, options, raise = false, &block)
  unless owner.persisted?
    raise ActiveRecord::RecordNotSaved, "You cannot call create unless the parent is saved"
  end

  if attributes.is_a?(Array)
    attributes.collect { |attr| create_record(attr, options, raise, &block) }
  else
    transaction do
      add_to_target(build_record(attributes, options)) do |record|
        yield(record) if block_given?
        insert_record(record, true, raise)
      end
    end
  end
end

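As we can see, create_record has a structure similar to build. The only differences are:

  1. create_record wraps the code block in a transaction created by the transaction method.
  2. The block passed to add_to_target calls insert_record.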

The transaction method is analyzed in detail later; here we focus only on insert_record, which is defined in ActiveRecord::Associations::HasManyAssociation:

def insert_record(record, validate = true, raise = false)
  set_owner_attributes(record)

  if raise
    record.save!(:validate => validate)
  else
    record.save(:validate => validate)
  end
end

Here set_owner_attributes is called to assign the attributes once more:

# Sets the owner attributes on the given record
def set_owner_attributes(record)
  creation_attributes.each { |key, value| record[key] = value }
end

def creation_attributes
  attributes = {}

  if reflection.macro.in?([:has_one, :has_many]) && !options[:through]
    attributes[reflection.foreign_key] = owner[reflection.active_record_primary_key]

    if reflection.options[:as]
      attributes[reflection.type] = owner.class.base_class.name
    end
  end

  attributes
end

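It is not clear why the assignment is done again here, since it appears to be unnecessary at this point. Next, save or save! is called depending on the argument, and the newly created object is again returned.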

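Next, let us look at Scope Querying, starting with the declaration. There are two ways to declare a scope on an Active Record class: write the query conditions directly, or wrap them in a lambda. The lambda is evaluated every time the scope is used, whereas the direct form is evaluated only once, at declaration. The direct form is therefore cheaper, but only the lambda form can accept arguments at call time. It is worth noting that Rails 4 officially removed the direct form; one well-known problem with it is that a relation built once at class load time freezes anything time-dependent inside it.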
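The difference between the two declaration styles comes down to eager versus deferred evaluation, which can be shown without Rails. The names below (counter, next_value, by_type) are purely illustrative.

```ruby
# A value captured eagerly is computed once; a lambda is re-evaluated on
# every call and can also take arguments.
counter = 0
next_value = -> { counter += 1 }

eager = next_value.call          # evaluated once, at "declaration" time
lazy  = -> { next_value.call }   # evaluated on each use

first  = lazy.call
second = lazy.call

# Only the lambda form can be parameterized at call time.
by_type = ->(type) { { imageable_type: type } }
product_conditions = by_type.call("Product")
```

Here eager stays at 1 no matter how often it is read, while each lazy.call produces a fresh value.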

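Since this article discusses Rails 3.2, both ways of defining a scope are analyzed. First is the declaration in the first style; the example code is scope :of_employees, where(:imageable_type => 'Employee'). Calling where directly in the body of an ActiveRecord::Base subclass is delegated to scoped, and the delegation is declared in ActiveRecord::Querying: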

delegate :select, :group, :order, :except, :reorder, :limit, :offset, :joins,
         :where, :preload, :eager_load, :includes, :from, :lock, :readonly,
         :having, :create_with, :uniq, :to => :scoped

So calling where here returns a Relation object based on scoped.

Then we enter the scope method, which is defined in ActiveRecord:

def scope(name, scope_options = {})
  name = name.to_sym
  valid_scope_name?(name)
  extension = Module.new(&Proc.new) if block_given?

  scope_proc = lambda do |*args|
    options = scope_options.respond_to?(:call) ? unscoped { scope_options.call(*args) } : scope_options
    options = scoped.apply_finder_options(options) if options.is_a?(Hash)

    relation = scoped.merge(options)

    extension ? relation.extending(extension) : relation
  end

  singleton_class.send(:redefine_method, name, &scope_proc)
end

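As we can see, the notable feature of this method is that it supports both a Relation object and a Proc object, as long as the name satisfies valid_scope_name?. In fact valid_scope_name? does not block anything; its source is: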

def valid_scope_name?(name)
  if logger && respond_to?(name, true)
    logger.warn "Creating scope :#{name}. " \
                "Overwriting existing method #{self.name}.#{name}."
  end
end

As we can see, it merely logs a warning. Next, the lambda that performs the scope is created and defined as a class method.

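When the argument is a Relation object, that Relation is merged directly into scoped. Note that if the class has a default_scope, it is only at this point, when the scope is called, that default_scope is merged with the Relation. This guarantees that declaring default_scope first does not cause every scope declared afterwards to carry the contents of default_scope, which would introduce bugs.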

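Next, consider the lambda case. The example code is scope :of_products, -> { where(:imageable_type => 'Product') }. The method is still scope; the main difference is that unscoped is called first with a block. We have seen unscoped before, but only in passing, and the block form has not been covered, so it is analyzed in detail here: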

def unscoped
  block_given? ? relation.scoping { yield } : relation
end

As we can see, a fresh relation object is created first (this step already gives an unscoped Relation), and then its scoping method is called:

def scoping
  @klass.with_scope(self, :overwrite) { yield }
end

This method accepts a block, replaces the class's current scope with this relation (because :overwrite is passed), and then calls the block:

def with_scope(scope = {}, action = :merge, &block)
  # If another Active Record class has been passed in, get its current scope
  scope = scope.current_scope if !scope.is_a?(Relation) && scope.respond_to?(:current_scope)

  previous_scope = self.current_scope

  if scope.is_a?(Hash)
    # Dup first and second level of hash (method and params).
    scope = scope.dup
    scope.each do |method, params|
      scope[method] = params.dup unless params == true
    end

    scope.assert_valid_keys([ :find, :create ])
    relation = construct_finder_arel(scope[:find] || {})
    relation.default_scoped = true unless action == :overwrite

    if previous_scope && previous_scope.create_with_value && scope[:create]
      scope_for_create = if action == :merge
        previous_scope.create_with_value.merge(scope[:create])
      else
        scope[:create]
      end

      relation = relation.create_with(scope_for_create)
    else
      scope_for_create = scope[:create]
      scope_for_create ||= previous_scope.create_with_value if previous_scope
      relation = relation.create_with(scope_for_create) if scope_for_create
    end

    scope = relation
  end

  scope = previous_scope.merge(scope) if previous_scope && action == :merge

  self.current_scope = scope
  begin
    yield
  ensure
    self.current_scope = previous_scope
  end
end

As we can see, the second parameter takes one of two values, :merge or :overwrite. :merge temporarily merges the two scopes, while :overwrite uses only the scope passed in.
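The save, swap, and restore pattern at the heart of with_scope can be sketched with plain hashes standing in for Relation objects. ScopeHolder is a hypothetical class, and Hash#merge here is an assumption made for illustration rather than the Relation merge.

```ruby
# Minimal sketch: the current scope is swapped for the duration of the
# block and restored afterwards, even if the block raises.
class ScopeHolder
  attr_accessor :current_scope

  def with_scope(scope, action = :merge)
    previous = current_scope
    scope = previous.merge(scope) if previous && action == :merge

    self.current_scope = scope
    begin
      yield
    ensure
      self.current_scope = previous
    end
  end
end

holder = ScopeHolder.new
holder.current_scope = { order: "created_at desc" }

merged      = holder.with_scope({ where: "a = 1" }) { holder.current_scope }
overwritten = holder.with_scope({ where: "a = 1" }, :overwrite) { holder.current_scope }

begin
  holder.with_scope({ limit: 1 }) { raise "boom" }
rescue RuntimeError
  # the ensure clause has already restored the previous scope
end
```

With :merge the block sees both the order and the where entries; with :overwrite it sees only the where entry. In both cases, and after the raised error, current_scope is back to its original value.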

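So when the query inside the lambda actually runs, the current scope is the freshly created Relation object, which guarantees that a predefined default_scope does not affect the result of the lambda. As for where the merge happens, it is exactly the same as in the non-lambda case described above.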

Declaring and retrieving default_scope is also fairly simple. The default_scope method is declared in the ActiveRecord::Scoping::Default module:

def default_scope(scope = {})
  scope = Proc.new if block_given?
  self.default_scopes = default_scopes + [scope]
end

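As we can see, default_scope accepts a Hash, a Relation object, or a block. default_scopes is an array defined in the same module that can hold multiple default_scope declarations for a class.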

Retrieving default_scope happens in the with_default_scope method introduced earlier. It is defined in ActiveRecord::Relation, and its source is:

def with_default_scope #:nodoc:
  if default_scoped? && default_scope = klass.send(:build_default_scope)
    default_scope = default_scope.merge(self)
    default_scope.default_scoped = false
    default_scope
  else
    self
  end
end

If the current relation is default-scoped, build_default_scope is called to see whether a default scope was declared:

def build_default_scope #:nodoc:
  if method(:default_scope).owner != ActiveRecord::Scoping::Default::ClassMethods
    evaluate_default_scope { default_scope }
  elsif default_scopes.any?
    evaluate_default_scope do
      default_scopes.inject(relation) do |default_scope, scope|
        if scope.is_a?(Hash)
          default_scope.apply_finder_options(scope)
        elsif !scope.is_a?(Relation) && scope.respond_to?(:call)
          default_scope.merge(scope.call)
        else
          default_scope.merge(scope)
        end
      end
    end
  end
end

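From the code, if the owner of default_scope is not the ClassMethods of the Default module, the method has been overridden and is simply called. Otherwise, if default scopes were declared, evaluate_default_scope is called first: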

def evaluate_default_scope
  return if ignore_default_scope?

  begin
    self.ignore_default_scope = true
    yield
  ensure
    self.ignore_default_scope = false
  end
end

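This method only sets ignore_default_scope to true temporarily. (The variable has no other concrete role in Rails; it may exist for extension purposes, or simply to keep the method from being re-entered, as the early return on its first line suggests.)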

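Then a brand-new Relation object is created and each declared default scope is merged into it in turn: a Hash goes through apply_finder_options, a block is called first and its result merged, and a Relation object is merged directly. The result is the final default scope.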
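The inject-based accumulation can be sketched with hashes in place of Relation objects. This is a simplification under stated assumptions: combine_default_scopes is a hypothetical name, and Hash#merge stands in for both apply_finder_options and the Relation merge.

```ruby
# Each declared default scope is folded into an initially empty scope.
# Callables are evaluated first; everything else is merged as-is.
def combine_default_scopes(default_scopes)
  default_scopes.inject({}) do |combined, scope|
    scope = scope.call if scope.respond_to?(:call)
    combined.merge(scope)
  end
end

scopes = [
  { order: "created_at desc" },           # declared as a Hash
  -> { { limit: 10 } },                   # declared as a block
  { where: "imageable_type = 'Product'" } # stands in for a Relation
]

result = combine_default_scopes(scopes)
```

The declarations are applied in order, so a class with several default_scope calls ends up with the combination of all of them.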

Base

Read and Write attribute, STI & Serialize

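From this part on, we leave the basic query and relation features of Active Record. This part briefly analyzes some simple features of Active Record objects: attribute initialization, reading, writing, STI, and serialization. The example is: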

class User < ActiveRecord::Base
  attr_accessible :contact, :type, :username
  serialize :contact
end

class Student < User; end

s.username = 'bachue'
p s.username

s.contact = {:phone => '123456', :city => 'Shanghai', :address => 'NanJing RD'}
p s.contact

First, let us start with the declaration of serialize. In Active Record, serialization-related code generally lives in the ActiveRecord::AttributeMethods::Serialization module:

def serialize(attr_name, class_name = Object)
  coder = if [:load, :dump].all? { |x| class_name.respond_to?(x) }
            class_name
          else
            Coders::YAMLColumn.new(class_name)
          end

  # merge new serialized attribute and create new hash to ensure that each class in inheritance hierarchy
  # has its own hash of own serialized attributes
  self.serialized_attributes = serialized_attributes.merge(attr_name.to_s => coder)
end

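The second parameter of serialize specifies how to serialize and accepts a class that implements :load and :dump. If the class passed in does not meet this requirement, or no argument is given, an instance of Coders::YAMLColumn is used by default, which serializes the data with the YAML library. Finally, the attribute name and the coder are stored in serialized_attributes.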
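The coder selection rule is a simple duck-typing check and can be sketched with the standard library alone. JSONCoder, YAMLCoder, and pick_coder are hypothetical names for illustration; YAMLCoder only imitates the idea of the default YAML coder and is not Coders::YAMLColumn.

```ruby
require "yaml"
require "json"

# Any object responding to load and dump can act as a coder.
class JSONCoder
  def self.dump(value)
    JSON.generate(value)
  end

  def self.load(text)
    text.nil? ? nil : JSON.parse(text)
  end
end

# Hypothetical fallback used when the candidate is not a coder.
class YAMLCoder
  def dump(value)
    YAML.dump(value)
  end

  def load(text)
    text.nil? ? nil : YAML.load(text)
  end
end

def pick_coder(candidate = Object)
  if [:load, :dump].all? { |m| candidate.respond_to?(m) }
    candidate
  else
    YAMLCoder.new
  end
end

contact = { "phone" => "123456", "city" => "Shanghai" }

yaml_coder = pick_coder
json_coder = pick_coder(JSONCoder)

yaml_round_trip = yaml_coder.load(yaml_coder.dump(contact))
json_round_trip = json_coder.load(json_coder.dump(contact))
```

String keys are used in the sample data so that the round trip behaves the same under the stricter YAML loading defaults of recent Ruby versions.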

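Next, let us look at how an Active Record object initializes its attributes. There are several possible entry points: respond_to?, read_attribute or write_attribute, and method_missing. In every case, though, the work is done by define_attribute_methods, which is defined in ActiveRecord::AttributeMethods: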

def define_attribute_methods
  unless defined?(@attribute_methods_mutex)
    msg = "It looks like something (probably a gem/plugin) is overriding the " \
          "ActiveRecord::Base.inherited method. It is important that this hook executes so " \
          "that your models are set up correctly. A workaround has been added to stop this " \
          "causing an error in 3.2, but future versions will simply not work if the hook is " \
          "overridden. If you are using Kaminari, please upgrade as it is known to have had " \
          "this problem.\n\n"
    msg << "The following may help track down the problem:"

    meth = method(:inherited)
    if meth.respond_to?(:source_location)
      msg << " #{meth.source_location.inspect}"
    else
      msg << " #{meth.inspect}"
    end
    msg << "\n\n"

    ActiveSupport::Deprecation.warn(msg)

    @attribute_methods_mutex = Mutex.new
  end

  # Use a mutex; we don't want two thread simaltaneously trying to define
  # attribute methods.
  @attribute_methods_mutex.synchronize do
    return if attribute_methods_generated?
    superclass.define_attribute_methods unless self == base_class
    super(column_names)
    column_names.each { |name| define_external_attribute_method(name) }
    @attribute_methods_generated = true
  end
end

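First, defining attribute methods requires a lock to avoid thread-safety problems. If the current class is not the base class, the superclass's method of the same name is called first to define its attribute methods. Then the method of the same name in ActiveModel::AttributeMethods is called with all column names as the argument: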

def define_attribute_methods(attr_names)
  attr_names.each { |attr_name| define_attribute_method(attr_name) }
end

As we can see, define_attribute_method is called for each attribute name:

def define_attribute_method(attr_name)
  attribute_method_matchers.each do |matcher|
    method_name = matcher.method_name(attr_name)

    unless instance_method_already_implemented?(method_name)
      generate_method = "define_method_#{matcher.method_missing_target}"

      if respond_to?(generate_method, true)
        send(generate_method, attr_name)
      else
        define_optimized_call generated_attribute_methods, method_name, matcher.method_missing_target, attr_name.to_s
      end
    end
  end
  attribute_method_matchers_cache.clear
end

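First, the attribute_method_matchers array is iterated. An element is added to this array every time the model calls attribute_method_prefix, attribute_method_suffix, or attribute_method_affix; each element holds a regular expression and a format string, so it can map a method name to an attribute and an attribute to a method name. Here the matcher's method_name is called to get the method name, and then instance_method_already_implemented? determines whether that method already exists. That method is overridden in ActiveRecord::AttributeMethods: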

def instance_method_already_implemented?(method_name)
  if dangerous_attribute_method?(method_name)
    raise DangerousAttributeError, "#{method_name} is defined by ActiveRecord"
  end

  if superclass == Base
    super
  else
    # If B < A and A defines its own attribute method, then we don't want to overwrite that.
    defined = method_defined_within?(method_name, superclass, superclass.generated_attribute_methods)
    defined && !ActiveRecord::Base.method_defined?(method_name) || super
  end
end

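First it checks whether the method is id or shares a name with a method of ActiveRecord::Base (sharing a name with a method of Object is allowed). Then, if the superclass of the current model class is ActiveRecord::Base itself, the parent implementation is called directly: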

def instance_method_already_implemented?(method_name)
  generated_attribute_methods.method_defined?(method_name)
end

The parent implementation simply checks whether the method is defined in the generated_attribute_methods module.
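The two-way mapping held by each matcher, a regular expression in one direction and a format string in the other, can be sketched as follows. Matcher here is a hypothetical miniature, not the Active Model class; the target naming follows the "replace the attribute part with attribute" convention described in this section.

```ruby
# One regexp maps a method name back to an attribute, one format string
# maps an attribute to a method name.
class Matcher
  attr_reader :target

  def initialize(prefix: "", suffix: "")
    @regex  = /\A#{Regexp.escape(prefix)}(.*)#{Regexp.escape(suffix)}\z/
    @format = "#{prefix}%s#{suffix}"
    @target = "#{prefix}attribute#{suffix}"
  end

  def method_name(attr_name)
    @format % attr_name
  end

  def attribute_for(method_name)
    match = @regex.match(method_name)
    match && match[1]
  end
end

writer = Matcher.new(suffix: "=")
query  = Matcher.new(suffix: "?")

writer_method = writer.method_name("username")      # "username="
writer_attr   = writer.attribute_for("username=")   # "username"
no_match      = query.attribute_for("username=")    # nil
```

The target string ("attribute=" for the writer) is what the generated method eventually dispatches to.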

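Back in define_attribute_method: assuming the method has not been defined yet, the name of a generator method is built by convention, and the code checks whether that generator exists. If it does not, define_optimized_call is used to generate the method body: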

# Define a method `name` in `mod` that dispatches to `send`
# using the given `extra` args. This fallbacks `define_method`
# and `send` if the given names cannot be compiled.
def define_optimized_call(mod, name, send, *extra)
  if name =~ NAME_COMPILABLE_REGEXP
    defn = "def #{name}(*args)"
  else
    defn = "define_method(:'#{name}') do |*args|"
  end

  extra = (extra.map(&:inspect) << "*args").join(", ")

  if send =~ CALL_COMPILABLE_REGEXP
    target = "#{send}(#{extra})"
  else
    target = "send(:'#{send}', #{extra})"
  end

  mod.module_eval <<-RUBY, __FILE__, __LINE__ + 1
    #{defn}
      #{target}
    end
  RUBY
end

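By convention, the generated method calls a substitute method whose name is the original method name with the attribute part replaced by attribute, passing the attribute name as the first argument; for example, the generated name= method forwards to attribute= with "name" as its first argument. If the generator method does exist, it is called instead. define_method_attribute, used for read methods, and define_method_attribute=, used for write methods, are invoked at this point; both are explained in detail shortly.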
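The forwarding convention can be reproduced in a few lines. This is a simplified sketch, not define_optimized_call itself: Person, define_dispatch, and the hash-backed attribute storage are hypothetical, and the generated body always goes through send.

```ruby
# Every generated method forwards to a generic attribute-style target,
# with the attribute name as the first argument.
class Person
  def initialize
    @attributes = {}
  end

  def attribute(name)
    @attributes[name]
  end

  def attribute=(name, value)
    @attributes[name] = value
  end

  def self.define_dispatch(method_name, target, attr_name)
    module_eval <<-RUBY, __FILE__, __LINE__ + 1
      def #{method_name}(*args)
        send(:'#{target}', #{attr_name.inspect}, *args)
      end
    RUBY
  end

  define_dispatch "name",  "attribute",  "name"
  define_dispatch "name=", "attribute=", "name"
end

person = Person.new
person.name = "bachue"
```

Compiling the method from a string means the attribute name is baked into the method body, so no closure has to be kept alive per attribute.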

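Finally, attribute_method_matchers_cache is cleared. This cache stores the results of searching attribute_method_matchers by method name with the regular expressions, and every update to the attribute methods invalidates it.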

Next, define_external_attribute_method is called for each attribute name:

def define_external_attribute_method(attr_name)
  generated_external_attribute_methods.module_eval <<-STR, __FILE__, __LINE__ + 1
    def __temp__(v, attributes, attributes_cache, attr_name)
      #{external_attribute_access_code(attr_name, attribute_cast_code(attr_name))}
    end
    alias_method '#{attr_name}', :__temp__
    undef_method :__temp__
  STR
end

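The way the method is defined here looks a little odd. The reason is stated in the source comments: define_method has to create a closure, which can be slower and use more memory, while the plain def syntax cannot create methods whose names are not valid Ruby identifiers, so a __temp__ method is created first and then aliased. The method adds to generated_external_attribute_methods a method named after the attribute, whose body is the result of external_attribute_access_code(attr_name, attribute_cast_code(attr_name)). attribute_cast_code is implemented in several layers; the first layer is in ActiveRecord::AttributeMethods::Serialization: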

def attribute_cast_code(attr_name)
  if serialized_attributes.include?(attr_name)
    "v.unserialized_value"
  else
    super
  end
end

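If the attribute is serialized, its unserialized_value method is called (it deserializes the data and is explained in detail shortly). Otherwise the parent implementation is called, which is defined in ActiveRecord::AttributeMethods::TimeZoneConversion: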

# The enhanced read method automatically converts the UTC time stored in the database to the time
# zone stored in Time.zone.
def attribute_cast_code(attr_name)
  column = columns_hash[attr_name]

  if create_time_zone_conversion_attribute?(attr_name, column)
    typecast             = "v = #{super}"
    time_zone_conversion = "v.acts_like?(:time) ? v.in_time_zone : v"

    "((#{typecast}) && (#{time_zone_conversion}))"
  else
    super
  end
end

create_time_zone_conversion_attribute? is implemented as:

def create_time_zone_conversion_attribute?(name, column)
  time_zone_aware_attributes && !self.skip_time_zone_conversion_for_attributes.include?(name.to_sym) && column.type.in?([:datetime, :timestamp])
end

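Rails keeps track of the current time zone. If the column is a time type, a time read from the database is converted to the time zone configured in Rails, using the in_time_zone method defined by ActiveSupport::TimeWithZone: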

If not, the next layer up is called, defined in ActiveRecord::AttributeMethods::Read:

def attribute_cast_code(attr_name)
  columns_hash[attr_name].type_cast_code('v')
end

Data obtained through a database adapter is usually a string, and Rails converts it to the Ruby type that matches the column type. This is done by the type_cast_code method implemented in ActiveRecord::ConnectionAdapters::Column:

def type_cast_code(var_name)
  klass = self.class.name

  case type
  when :string, :text        then var_name
  when :integer              then "#{klass}.value_to_integer(#{var_name})"
  when :float                then "#{var_name}.to_f"
  when :decimal              then "#{klass}.value_to_decimal(#{var_name})"
  when :datetime, :timestamp then "#{klass}.string_to_time(#{var_name})"
  when :time                 then "#{klass}.string_to_dummy_time(#{var_name})"
  when :date                 then "#{klass}.string_to_date(#{var_name})"
  when :binary               then "#{klass}.binary_to_string(#{var_name})"
  when :boolean              then "#{klass}.value_to_boolean(#{var_name})"
  else var_name
  end
end

Next comes the implementation of external_attribute_access_code:

def external_attribute_access_code(attr_name, cast_code)
  access_code = "v && #{cast_code}"

  if cache_attribute?(attr_name)
    access_code = "attributes_cache[attr_name] ||= (#{access_code})"
  end

  access_code
end

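As we can see, there is also a caching feature here, provided by ActiveRecord::AttributeMethods::Read itself: when the column type is one of [:datetime, :timestamp, :time, :date], its value is cached. All the generated strings are then returned as code.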
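The idea of assembling a cast expression as a string, optionally wrapping it in a cache lookup, and compiling it once can be sketched as below. CAST_CODE, COLUMNS, access_code, and Readers are hypothetical names, and the cast snippets are simplified stand-ins for the ones type_cast_code produces.

```ruby
# Each entry turns a variable name into a Ruby expression (as a string).
CAST_CODE = {
  string:  ->(var) { var },
  integer: ->(var) { "Integer(#{var})" },
  float:   ->(var) { "#{var}.to_f" }
}

# Build the access expression; nil passes through, cached columns memoize.
def access_code(type, cached)
  code = "v && #{CAST_CODE.fetch(type).call('v')}"
  code = "cache[attr_name] ||= (#{code})" if cached
  code
end

module Readers
end

# Hypothetical columns: name => [type, cached?]
COLUMNS = { "age" => [:integer, false], "score" => [:float, true] }

COLUMNS.each do |attr, (type, cached)|
  Readers.module_eval <<-RUBY, __FILE__, __LINE__ + 1
    def self.#{attr}(v, cache, attr_name)
      #{access_code(type, cached)}
    end
  RUBY
end

cache = {}
age    = Readers.age("42", cache, "age")
score  = Readers.score("1.5", cache, "score")
cached = Readers.score("9.9", cache, "score")  # served from the cache
```

Because the cast is decided when the method is generated, reading an attribute does not have to branch on the column type at call time.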

Finally, back in define_attribute_methods, @attribute_methods_generated is set to true, and all attribute methods have been generated.

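Next we analyze define_method_attribute and define_method_attribute=, in preparation for the analysis of attribute reads and writes. First is define_method_attribute, which is also implemented in several layers. The first layer is in ActiveRecord::AttributeMethods::PrimaryKey: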

def define_method_attribute(attr_name)
  super

  if attr_name == primary_key && attr_name != 'id'
    generated_attribute_methods.send(:alias_method, :id, primary_key)
    generated_external_attribute_methods.module_eval <<-CODE, __FILE__, __LINE__
      def id(v, attributes, attributes_cache, attr_name)
        attr_name = '#{primary_key}'
        send(attr_name, attributes[attr_name], attributes, attributes_cache, attr_name)
      end
    CODE
  end
end

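This layer ensures that when the primary key is not named id, an id method is still created and aliased to the corresponding primary key method. Then we move up to the implementation in ActiveRecord::AttributeMethods::Read: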

def define_method_attribute(attr_name)
  generated_attribute_methods.module_eval <<-STR, __FILE__, __LINE__ + 1
    def __temp__
      #{internal_attribute_access_code(attr_name, attribute_cast_code(attr_name))}
    end
    alias_method '#{attr_name}', :__temp__
    undef_method :__temp__
  STR
end

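As we can see, this implementation is very similar to define_external_attribute_method. The main differences are that the latter defines the method on the generated_external_attribute_methods module, its __temp__ takes its data as parameters instead of relying on instance variables, and its code is generated by external_attribute_access_code.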
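The __temp__ then alias pattern shared by both definitions can be tried on its own. Row and define_reader are hypothetical names; the point of the sketch is that a method compiled with def can end up under a name that def itself could never spell.

```ruby
# Compile the body under a throwaway name, alias it to the real (possibly
# non-identifier) name, then remove the throwaway method.
class Row
  def initialize(attributes)
    @attributes = attributes
  end

  def self.define_reader(attr_name)
    class_eval <<-RUBY, __FILE__, __LINE__ + 1
      def __temp__
        @attributes[#{attr_name.inspect}]
      end
      alias_method #{attr_name.inspect}, :__temp__
      undef_method :__temp__
    RUBY
  end

  define_reader "first name"
  define_reader "age"
end

row = Row.new({ "first name" => "Ada", "age" => 36 })

first_name = row.public_send("first name")
age        = row.age
```

A method named "first name" can only be reached through send or public_send, but it is an ordinary compiled method rather than a closure.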

Now let us see how internal_attribute_access_code generates code:

def internal_attribute_access_code(attr_name, cast_code)
  access_code = "(v=@attributes[attr_name]) && #{cast_code}"

  unless attr_name == primary_key
    access_code.insert(0, "missing_attribute(attr_name, caller) unless @attributes.has_key?(attr_name); ")
  end

  if cache_attribute?(attr_name)
    access_code = "@attributes_cache[attr_name] ||= (#{access_code})"
  end

  "attr_name = '#{attr_name}'; #{access_code}"
end

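The generated code is simple and resembles that of `external_attribute_access_code`, except that it reads the data from `@attributes` before type casting it. If the attribute cannot be found (and it is not the primary key), `missing_attribute` is called, which raises an `ActiveModel::MissingAttributeError`.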
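
To make the string-based code generation more concrete, here is a minimal, Rails-free sketch of the same idea. `TinyRecord`, `CASTS` and the locally defined `MissingAttributeError` are hypothetical names invented for this illustration; only the shape of the generated code follows `internal_attribute_access_code` above.

```ruby
# Hypothetical stand-in for ActiveModel::MissingAttributeError.
class MissingAttributeError < StandardError; end

class TinyRecord
  # Assumed cast snippets per attribute; `v` is the raw value,
  # mirroring the variable name used in the generated Rails code.
  CASTS = { 'id' => 'v.to_i', 'age' => 'v.to_i', 'name' => 'v' }.freeze
  PRIMARY_KEY = 'id'

  # Builds the reader body as a string, like internal_attribute_access_code.
  def self.access_code(attr_name)
    code = "(v = @attributes[attr_name]) && #{CASTS.fetch(attr_name)}"
    unless attr_name == PRIMARY_KEY
      code = "raise MissingAttributeError, attr_name unless @attributes.key?(attr_name); " + code
    end
    "attr_name = '#{attr_name}'; #{code}"
  end

  # Evaluates one generated reader per attribute.
  def self.define_readers
    CASTS.each_key do |attr_name|
      class_eval <<-RUBY, __FILE__, __LINE__ + 1
        def #{attr_name}
          #{access_code(attr_name)}
        end
      RUBY
    end
  end

  def initialize(attributes)
    @attributes = attributes
  end

  define_readers
end

record = TinyRecord.new('id' => '7', 'name' => 'alice')
record.id    # the raw string '7' is cast to the integer 7
record.name  # the string is returned unchanged
# record.age would raise MissingAttributeError, because 'age' was never loaded
```

Generating the method body as a string means the attribute name and cast expression are baked into each reader, so no lookup of the cast logic is needed at call time.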

Next is `define_method_attribute=`, which also has several layers. The first is defined in `ActiveRecord::AttributeMethods::TimeZoneConversion`:

# Defined for all +datetime+ and +timestamp+ attributes when +time_zone_aware_attributes+ are enabled.
# This enhanced write method will automatically convert the time passed to it to the zone stored in Time.zone.
def define_method_attribute=(attr_name)
  if create_time_zone_conversion_attribute?(attr_name, columns_hash[attr_name])
    method_body, line = <<-EOV, __LINE__ + 1
      def #{attr_name}=(original_time)
        original_time = nil if original_time.blank?
        time = original_time
        unless time.acts_like?(:time)
          time = time.is_a?(String) ? Time.zone.parse(time) : time.to_time rescue time
        end
        time = time.in_time_zone rescue nil if time
        previous_time = attribute_changed?("#{attr_name}") ? changed_attributes["#{attr_name}"] : read_attribute(:#{attr_name})
        write_attribute(:#{attr_name}, original_time)
        #{attr_name}_will_change! if previous_time != time
        @attributes_cache["#{attr_name}"] = time
      end
    EOV
    generated_attribute_methods.module_eval(method_body, __FILE__, line)
  else
    super
  end
end

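As mentioned before, this writer takes the time passed in from outside and converts it to the time zone configured in Rails. The original value is written through `write_attribute`, the converted time is stored in `@attributes_cache`, and the attribute is explicitly marked as changed when the converted time differs from the previous one (the original text links to the reason here). `read_attribute` and `write_attribute` are analyzed in detail shortly. When no conversion applies, `super` continues to the next layer, in `ActiveRecord::AttributeMethods::Write`: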

def define_method_attribute=(attr_name)
  if attr_name =~ ActiveModel::AttributeMethods::NAME_COMPILABLE_REGEXP
    generated_attribute_methods.module_eval("def #{attr_name}=(new_value); write_attribute('#{attr_name}', new_value); end", __FILE__, __LINE__)
  else
    generated_attribute_methods.send(:define_method, "#{attr_name}=") do |new_value|
      write_attribute(attr_name, new_value)
    end
  end
end

The method is likewise defined on `generated_attribute_methods`, and all it does is call `write_attribute` to write the new value.

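This reveals a detail: reading an attribute from an Active Record object does not have to go through `read_attribute`, because the generated readers read directly from `@attributes` and cast the value. Writing an attribute, however, must go through `write_attribute`. In fact, the only extra benefit of calling `read_attribute` is that it defines the attribute methods if they have not been defined yet. Here is the source of `read_attribute`: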

# Returns the value of the attribute identified by <tt>attr_name</tt> after it has been typecast (for example,
# "2004-12-12" in a data column is cast to a date object, like Date.new(2004, 12, 12)).
def read_attribute(attr_name)
  self.class.type_cast_attribute(attr_name, @attributes, @attributes_cache)
end

def type_cast_attribute(attr_name, attributes, cache = {})
  return unless attr_name
  attr_name = attr_name.to_s

  if generated_external_attribute_methods.method_defined?(attr_name)
    if attributes.has_key?(attr_name) || attr_name == 'id'
      generated_external_attribute_methods.send(attr_name, attributes[attr_name], attributes, cache, attr_name)
    end
  elsif !attribute_methods_generated?
    # If we haven't generated the caster methods yet, do that and
    # then try again
    define_attribute_methods
    type_cast_attribute(attr_name, attributes, cache)
  else
    # If we get here, the attribute has no associated DB column, so
    # just return it verbatim.
    attributes[attr_name]
  end
end

`read_attribute` actually relies on the methods in `generated_external_attribute_methods` to perform the read, passing in its own `@attributes` and `@attributes_cache`. This is probably done simply to share code.

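Before analyzing `write_attribute`, let's take a short detour into STI (single table inheritance). STI-related code is spread throughout Active Record and cannot really be analyzed in one place, and earlier code has already touched on it several times, so here we focus only on STI initialization and assignment.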

The implementation of `initialize` in `ActiveRecord::Base` calls a method named `ensure_proper_type`:

def initialize(attributes = nil, options = {})
  defaults = Hash[self.class.column_defaults.map { |k, v| [k, v.duplicable? ? v.dup : v] }]
  @attributes = self.class.initialize_attributes(defaults)
  @association_cache = {}
  @aggregation_cache = {}
  @attributes_cache = {}
  @new_record = true
  @readonly = false
  @destroyed = false
  @marked_for_destruction = false
  @previously_changed = {}
  @changed_attributes = {}

  ensure_proper_type

  populate_with_current_scope_attributes

  assign_attributes(attributes, options) if attributes

  yield self if block_given?
  run_callbacks :initialize
end

This method is implemented in `ActiveRecord::Inheritance`:

# Sets the attribute used for single table inheritance to this class name if this is not the
# ActiveRecord::Base descendant.
# Considering the hierarchy Reply < Message < ActiveRecord::Base, this makes it possible to
# do Reply.new without having to set <tt>Reply[Reply.inheritance_column] = "Reply"</tt> yourself.
# No such attribute would be set for objects of the Message class in that example.
def ensure_proper_type
  klass = self.class
  if klass.finder_needs_type_condition?
    write_attribute(klass.inheritance_column, klass.sti_name)
  end
end

The source of `klass.finder_needs_type_condition?` was covered earlier; it simply checks whether the columns include `inheritance_column`. `write_attribute` then writes `klass.sti_name` into that column. `sti_name` is implemented as follows:

def sti_name
  store_full_sti_class ? name : name.demodulize
end

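`store_full_sti_class` normally defaults to `true`, so the full class name is written to `inheritance_column`. Now we turn to `write_attribute`, which also has several layers. The first is the implementation in `ActiveRecord::AttributeMethods::Dirty`: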

# Wrap write_attribute to remember original attribute value.
def write_attribute(attr, value)
  attr = attr.to_s

  # The attribute already has an unsaved change.
  if attribute_changed?(attr)
    old = @changed_attributes[attr]
    @changed_attributes.delete(attr) unless _field_changed?(attr, old, value)
  else
    old = clone_attribute_value(:read_attribute, attr)
    # Save Time objects as TimeWithZone if time_zone_aware_attributes == true
    old = old.in_time_zone if clone_with_time_zone_conversion_attribute?(attr, old)
    @changed_attributes[attr] = old if _field_changed?(attr, old, value)
  end

  # Carry on.
  super(attr, value)
end

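If the attribute has already been modified and the newly written value equals its original value (that is, the change is being reverted), `write_attribute` deletes the entry from `@changed_attributes`. If the attribute has not been modified before, `clone_attribute_value` is called, which uses `read_attribute` to obtain a copy of the current value: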

def clone_attribute_value(reader_method, attribute_name)
  value = send(reader_method, attribute_name)
  value.duplicable? ? value.clone : value
rescue TypeError, NoMethodError
  value
end
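
The dirty-tracking rule described above can be sketched without Rails. `TrackedRecord` is a hypothetical class written for this illustration; it leaves out type casting and time zone handling and only keeps the bookkeeping of `@changed_attributes`.

```ruby
# Hypothetical, simplified model of the Dirty layer of write_attribute.
class TrackedRecord
  attr_reader :changed_attributes

  def initialize(attributes)
    @attributes = attributes
    @changed_attributes = {}
  end

  def read_attribute(name)
    @attributes[name.to_s]
  end

  def write_attribute(name, value)
    name = name.to_s
    if @changed_attributes.key?(name)
      # Already dirty: forget the change when the original value is written back.
      @changed_attributes.delete(name) if @changed_attributes[name] == value
    else
      old = read_attribute(name)
      old = old.dup if old.is_a?(String) # keep an independent copy of the original
      @changed_attributes[name] = old if old != value
    end
    @attributes[name] = value
  end

  def changed?
    !@changed_attributes.empty?
  end
end

record = TrackedRecord.new('login' => 'alice')
record.write_attribute(:login, 'bob')
first_change = record.changed_attributes.dup # {"login"=>"alice"}
record.write_attribute(:login, 'alice')
reverted = !record.changed?                  # true: the change was reverted
```

Storing the original value rather than the new one is what allows a later write of the original value to be recognized as a revert.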

Then it checks whether a time zone conversion is needed, using `clone_with_time_zone_conversion_attribute?`:

def clone_with_time_zone_conversion_attribute?(attr, old)
  old.class.name == "Time" && time_zone_aware_attributes && !self.skip_time_zone_conversion_for_attributes.include?(attr.to_sym)
end

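The old value is then stored in `@changed_attributes` (only if the value actually changed), and `super` continues to the next layer, defined in `ActiveRecord::AttributeMethods::Write`: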

# Updates the attribute identified by <tt>attr_name</tt> with the specified +value+. Empty strings
# for fixnum and float columns are turned into +nil+.
def write_attribute(attr_name, value)
  attr_name = attr_name.to_s
  attr_name = self.class.primary_key if attr_name == 'id' && self.class.primary_key
  @attributes_cache.delete(attr_name)
  column = column_for_attribute(attr_name)

  unless column || @attributes.has_key?(attr_name)
    ActiveSupport::Deprecation.warn(
      "You're trying to create an attribute `#{attr_name}'. Writing arbitrary " \
      "attributes on a model is deprecated. Please just use `attr_writer` etc."
    )
  end

  @attributes[attr_name] = type_cast_attribute_for_write(column, value)
end

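Even if the primary key of the object is not `id`, assigning to `id` is equivalent to assigning to the primary key. The assignment itself stores the value converted by `type_cast_attribute_for_write` in `@attributes`. That method is implemented as follows: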

def type_cast_attribute_for_write(column, value)
  if column && coder = self.class.serialized_attributes[column.name]
    Attribute.new(coder, value, :unserialized)
  else
    super
  end
end

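This layer is implemented in `ActiveRecord::AttributeMethods::Serialization`. When the column is serialized, an `Attribute` instance is created and assigned instead. The class maintains three things: the coder, the data, and the current serialization state: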

class Attribute < Struct.new(:coder, :value, :state)
  def unserialized_value
    state == :serialized ? unserialize : value
  end

  def serialized_value
    state == :unserialized ? serialize : value
  end

  def unserialize
    self.state = :unserialized
    self.value = coder.load(value)
  end

  def serialize
    self.state = :serialized
    self.value = coder.dump(value)
  end
end
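
As a usage illustration of this lazy design, the sketch below rebuilds the same struct under the hypothetical name `LazyAttribute` and pairs it with a small JSON-based coder. `JsonCoder` is an assumption made for this example; any object responding to `load` and `dump` would do.

```ruby
require 'json'

# Hypothetical coder: anything with load/dump can play this role.
module JsonCoder
  def self.dump(object)
    JSON.generate(object)
  end

  def self.load(string)
    JSON.parse(string)
  end
end

# Same shape as the Attribute struct above, renamed to avoid implying
# that this is the Rails class itself.
LazyAttribute = Struct.new(:coder, :value, :state) do
  def unserialized_value
    state == :serialized ? unserialize : value
  end

  def serialized_value
    state == :unserialized ? serialize : value
  end

  def unserialize
    self.state = :unserialized
    self.value = coder.load(value)
  end

  def serialize
    self.state = :serialized
    self.value = coder.dump(value)
  end
end

settings = LazyAttribute.new(JsonCoder, { 'theme' => 'dark' }, :unserialized)
dumped   = settings.serialized_value    # dumps once and flips the state
again    = settings.serialized_value    # already serialized, no second dump
restored = settings.unserialized_value  # loads back into a Hash
```

The value is only converted when the other representation is actually requested, which is why assignment itself stays cheap.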

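The class is quite flexible and also efficient, since serialization and deserialization only happen on demand. If the column is not serialized, control continues to the next layer, in `ActiveRecord::AttributeMethods::Write`: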

def type_cast_attribute_for_write(column, value)
  if column && column.number?
    convert_number_column_value(value)
  else
    value
  end
end

`column.number?` tells whether the column type is any kind of number, such as Integer, Float or Decimal. If it is, the value is converted by `convert_number_column_value`:

def convert_number_column_value(value)
  case value
  when FalseClass
    0
  when TrueClass
    1
  when String
    value.presence
  else
    value
  end
end
Mass Assignment Security, Validation, Transaction, Active Record Callback and Save

This is the last example of this chapter. It covers quite a lot and analyzes a number of smaller features of Active Record.

class User < ActiveRecord::Base
  attr_accessible :email, :location, :login, :zip
  validates :login, :email, presence: true
  validates_format_of :email, :with => /\A([^@\s]+)@((?:[-a-z0-9]+\.)+[a-z]{2,})\z/i

  before_validation :ensure_login_has_a_value
  before_save :set_location, if: 'zip.present?'

  protected
  def ensure_login_has_a_value
    if login.nil?
      self.login = email unless email.blank?
    end
  end

  def set_location
    self.location = LocationService.query self
  end
end

User.transaction do
  user.save
end

Let's first look at `attr_accessible` and `attr_protected`. Both are defined in `ActiveModel::MassAssignmentSecurity`, located in `activemodel-3.2.13/lib/active_model/mass_assignment_security.rb`:

def attr_accessible(*args)
  options = args.extract_options!
  role = options[:as] || :default

  self._accessible_attributes = accessible_attributes_configs.dup

  Array.wrap(role).each do |name|
    self._accessible_attributes[name] = self.accessible_attributes(name) + args
  end

  self._active_authorizer = self._accessible_attributes
end

In `ActiveModel`, `_accessible_attributes` follows whitelist rules, as the implementation of `accessible_attributes_configs` shows:

def accessible_attributes_configs
  self._accessible_attributes ||= begin
    Hash.new { |h,k| h[k] = WhiteList.new }
  end
end

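Each model class can have several sets of `_accessible_attributes`, managed by a Hash keyed by role; by default the `:default` key is used. Once `_accessible_attributes` has been assigned, it is assigned to `_active_authorizer`, which means the whitelists defined so far take effect.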

Now the implementation of `attr_protected`:

def attr_protected(*args)
  options = args.extract_options!
  role = options[:as] || :default

  self._protected_attributes = protected_attributes_configs.dup

  Array.wrap(role).each do |name|
    self._protected_attributes[name] = self.protected_attributes(name) + args
  end

  self._active_authorizer = self._protected_attributes
end

`attr_protected` is implemented almost exactly like `attr_accessible`, except that it uses blacklist rules, as `protected_attributes_configs` shows:

def protected_attributes_configs
  self._protected_attributes ||= begin
    Hash.new { |h,k| h[k] = BlackList.new(attributes_protected_by_default) }
  end
end

The blacklist can contain some attributes by default. In `ActiveModel` the default is empty, but `ActiveRecord` defines some rules:

# The primary key and inheritance column can never be set by mass-assignment for security reasons.
def attributes_protected_by_default
  default = [ primary_key, inheritance_column ]
  default << 'id' unless primary_key.eql? 'id'
  default
end

By default, the primary key (plus `id`, when the primary key has a different name) and the STI column are protected.

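Finally, `_protected_attributes` is likewise assigned to `_active_authorizer`. Because both methods overwrite the same `_active_authorizer`, `attr_protected` and `attr_accessible` cannot be combined: whichever is called last wins.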

When an ActiveRecord object is initialized, `assign_attributes` is called to assign the attributes:

def assign_attributes(new_attributes, options = {})
  return if new_attributes.blank?

  attributes = new_attributes.stringify_keys
  multi_parameter_attributes = []
  nested_parameter_attributes = []
  @mass_assignment_options = options

  unless options[:without_protection]
    attributes = sanitize_for_mass_assignment(attributes, mass_assignment_role)
  end

  attributes.each do |k, v|
    if k.include?("(")
      multi_parameter_attributes << [ k, v ]
    elsif respond_to?("#{k}=")
      if v.is_a?(Hash)
        nested_parameter_attributes << [ k, v ]
      else
        send("#{k}=", v)
      end
    else
      raise(UnknownAttributeError, "unknown attribute: #{k}")
    end
  end

  # assign any deferred nested attributes after the base attributes have been set
  nested_parameter_attributes.each do |k,v|
    send("#{k}=", v)
  end

  @mass_assignment_options = nil
  assign_multiparameter_attributes(multi_parameter_attributes)
end

The method responsible for filtering attributes is `sanitize_for_mass_assignment`:

def sanitize_for_mass_assignment(attributes, role = nil)
  _mass_assignment_sanitizer.sanitize(attributes, mass_assignment_authorizer(role))
end

`_mass_assignment_sanitizer` is one of two kinds of object: an `ActiveModel::MassAssignmentSecurity::StrictSanitizer` or an `ActiveModel::MassAssignmentSecurity::LoggerSanitizer`. By default, the former is used in the `development` and `test` environments and the latter in `production`.

`mass_assignment_authorizer` fetches the whitelist or blacklist object for the given role from the `active_authorizer` set up earlier:

def mass_assignment_authorizer(role)
  self.class.active_authorizer[role || :default]
end

Then we enter the `sanitize` method:

# Returns all attributes not denied by the authorizer.
def sanitize(attributes, authorizer)
  sanitized_attributes = attributes.reject { |key, value| authorizer.deny?(key) }
  debug_protected_attribute_removal(attributes, sanitized_attributes)
  sanitized_attributes
end

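It first checks every incoming key against the configured authorizer: with a whitelist, every key not on the list is removed; with a blacklist, every key on the list is removed. `debug_protected_attribute_removal` is then called to handle the keys that were dropped: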

def debug_protected_attribute_removal(attributes, sanitized_attributes)
  removed_keys = attributes.keys - sanitized_attributes.keys
  process_removed_attributes(removed_keys) if removed_keys.any?
end

The implementation of `process_removed_attributes` is where `StrictSanitizer` and `LoggerSanitizer` differ: the former raises `ActiveModel::MassAssignmentSecurity::Error`, while the latter writes a log entry.
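
The interplay of authorizers and sanitizers can be sketched in plain Ruby. All classes below are hypothetical miniatures written for this illustration (the real ones live under `ActiveModel::MassAssignmentSecurity`); the logger variant simply collects messages in an array instead of writing to a Rails logger.

```ruby
# Hypothetical authorizers: a whitelist denies unknown keys,
# a blacklist denies listed keys.
class WhiteList
  def initialize(*names)
    @names = names.map(&:to_s)
  end

  def deny?(key)
    !@names.include?(key.to_s)
  end
end

class BlackList
  def initialize(*names)
    @names = names.map(&:to_s)
  end

  def deny?(key)
    @names.include?(key.to_s)
  end
end

class MassAssignmentError < StandardError; end

# Shared filtering logic; subclasses decide what to do with removed keys.
class Sanitizer
  def sanitize(attributes, authorizer)
    kept = attributes.reject { |key, _value| authorizer.deny?(key) }
    removed = attributes.keys - kept.keys
    process_removed_attributes(removed) if removed.any?
    kept
  end
end

class StrictSanitizer < Sanitizer
  def process_removed_attributes(keys)
    raise MassAssignmentError, "Can't mass-assign protected attributes: #{keys.join(', ')}"
  end
end

class LoggerSanitizer < Sanitizer
  attr_reader :messages

  def initialize
    @messages = []
  end

  def process_removed_attributes(keys)
    @messages << "Can't mass-assign protected attributes: #{keys.join(', ')}"
  end
end

params = { 'email' => 'a@example.com', 'admin' => true }
logger = LoggerSanitizer.new
whitelisted = logger.sanitize(params, WhiteList.new(:email, :login)) # drops 'admin'
blacklisted = logger.sanitize(params, BlackList.new(:id))            # keeps both keys
```

With `StrictSanitizer`, the first call would raise instead of silently dropping `'admin'`, which is why it suits development and test environments.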

Next, the declaration of validations. There are several ways to declare a validation; we start with the simplest one, `validates :login, :email, presence: true`. This method is defined in the `ActiveModel::Validations` module:

def validates(*attributes)
  defaults = attributes.extract_options!
  validations = defaults.slice!(*_validates_default_keys)

  raise ArgumentError, "You need to supply at least one attribute" if attributes.empty?
  raise ArgumentError, "You need to supply at least one validation" if validations.empty?

  defaults.merge!(:attributes => attributes)

  validations.each do |key, options|
    key = "#{key.to_s.camelize}Validator"

    begin
      validator = key.include?('::') ? key.constantize : const_get(key)
    rescue NameError
      raise ArgumentError, "Unknown validator: '#{key}'"
    end

    validates_with(validator, defaults.merge(_parse_validates_options(options)))
  end
end

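First, the method splits its arguments in two. The trailing Hash is extracted, and `slice!` keeps the common options in `defaults` while returning the remaining pairs, which name the validators to use and their options, as `validations`. Here `_validates_default_keys` is `[:if, :unless, :on, :allow_blank, :allow_nil, :strict]`. The remaining arguments are the attribute names, which are merged into the options passed to `validates_with`. `_parse_validates_options` wraps non-Hash option values in a Hash so that they can be merged with those options: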

def _parse_validates_options(options)
  case options
  when TrueClass
    {}
  when Hash
    options
  when Range, Array
    { :in => options }
  else
    { :with => options }
  end
end
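
The option splitting performed by `validates` can be sketched as a plain function. `expand_validates` and `DEFAULT_KEYS` are hypothetical names for this illustration; instead of instantiating validators, the function returns the class name and the options that `validates_with` would receive. The camelizing step is a simplified stand-in for ActiveSupport's `camelize`.

```ruby
# Assumed copy of the common option keys listed in the text.
DEFAULT_KEYS = [:if, :unless, :on, :allow_blank, :allow_nil, :strict].freeze

def parse_validates_options(options)
  case options
  when TrueClass    then {}
  when Hash         then options
  when Range, Array then { :in => options }
  else                   { :with => options }
  end
end

# Returns [validator_class_name, options] pairs for each requested validation.
def expand_validates(*attributes)
  options = attributes.last.is_a?(Hash) ? attributes.pop : {}
  defaults, validations =
    options.partition { |key, _value| DEFAULT_KEYS.include?(key) }.map(&:to_h)

  raise ArgumentError, 'You need to supply at least one attribute' if attributes.empty?
  raise ArgumentError, 'You need to supply at least one validation' if validations.empty?

  defaults[:attributes] = attributes

  validations.map do |key, validator_options|
    # Simplified camelize: "presence" becomes "PresenceValidator".
    class_name = key.to_s.split('_').map(&:capitalize).join + 'Validator'
    [class_name, defaults.merge(parse_validates_options(validator_options))]
  end
end

expanded = expand_validates(:login, :email, presence: true, length: 3..20, on: :create)
# expanded.first is ["PresenceValidator", { on: :create, attributes: [:login, :email] }]
```

Each validator therefore receives the shared options (`:on`, `:if`, and so on) plus its own, which is why one `validates` line can configure several validators at once.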

Then we enter `validates_with`:

def validates_with(*args, &block)
  options = args.extract_options!
  args.each do |klass|
    validator = klass.new(options, &block)
    validator.setup(self) if validator.respond_to?(:setup)

    if validator.respond_to?(:attributes) && !validator.attributes.empty?
      validator.attributes.each do |attribute|
        _validators[attribute.to_sym] << validator
      end
    else
      _validators[nil] << validator
    end

    validate(validator, options)
  end
end

It first instantiates the validator class passed in. Most validators in Rails are subclasses of `EachValidator`, which is implemented in `activemodel-3.2.13/lib/active_model/validator.rb`:

class EachValidator < Validator
  attr_reader :attributes

  # Returns a new validator instance. All options will be available via the
  # +options+ reader, however the <tt>:attributes</tt> option will be removed
  # and instead be made available through the +attributes+ reader.
  def initialize(options)
    @attributes = Array.wrap(options.delete(:attributes))
    raise ":attributes cannot be blank" if @attributes.empty?
    super
    check_validity!
  end

  # Performs validation on the supplied record. By default this will call
  # +validates_each+ to determine validity therefore subclasses should
  # override +validates_each+ with validation logic.
  def validate(record)
    attributes.each do |attribute|
      value = record.read_attribute_for_validation(attribute)
      next if (value.nil? && options[:allow_nil]) || (value.blank? && options[:allow_blank])
      validate_each(record, attribute, value)
    end
  end
end

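It takes the attributes on initialization and, when `validate` is called to do the actual validation, calls `validate_each` for every attribute. Back in `validates_with`, the validator is added to `_validators` for each attribute, and finally `validate` is called, which is defined in `ActiveModel::Validations`: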

def validate(*args, &block)
  options = args.extract_options!
  if options.key?(:on)
    options = options.dup
    options[:if] = Array.wrap(options[:if])
    options[:if].unshift("validation_context == :#{options[:on]}")
  end
  args << options
  set_callback(:validate, *args, &block)
end

This simply handles the `:on` option and then registers a `:validate` callback. That completes the declaration.

Now the implementation of transactions, which is defined in `ActiveRecord::Transactions`:

def transaction(options = {}, &block)
  # See the ConnectionAdapters::DatabaseStatements#transaction API docs.
  connection.transaction(options, &block)
end

`connection.transaction` is implemented in `ActiveRecord::ConnectionAdapters::DatabaseStatements`. It mainly handles the options and calls the database adapter's methods to begin, commit and roll back the transaction:

def transaction(options = {})
  options.assert_valid_keys :requires_new, :joinable

  last_transaction_joinable = defined?(@transaction_joinable) ? @transaction_joinable : nil
  if options.has_key?(:joinable)
    @transaction_joinable = options[:joinable]
  else
    @transaction_joinable = true
  end
  requires_new = options[:requires_new] || !last_transaction_joinable

  transaction_open = false
  @_current_transaction_records ||= []

  begin
    if block_given?
      if requires_new || open_transactions == 0
        if open_transactions == 0
          begin_db_transaction
        elsif requires_new
          create_savepoint
        end
        increment_open_transactions
        transaction_open = true
        @_current_transaction_records.push([])
      end
      yield
    end
  rescue Exception => database_transaction_rollback
    if transaction_open && !outside_transaction?
      transaction_open = false
      decrement_open_transactions
      if open_transactions == 0
        rollback_db_transaction
        rollback_transaction_records(true)
      else
        rollback_to_savepoint
        rollback_transaction_records(false)
      end
    end
    raise unless database_transaction_rollback.is_a?(ActiveRecord::Rollback)
  end
ensure
  @transaction_joinable = last_transaction_joinable

  if outside_transaction?
    @open_transactions = 0
  elsif transaction_open
    decrement_open_transactions
    begin
      if open_transactions == 0
        commit_db_transaction
        commit_transaction_records
      else
        release_savepoint
        save_point_records = @_current_transaction_records.pop
        unless save_point_records.blank?
          @_current_transaction_records.push([]) if @_current_transaction_records.empty?
          @_current_transaction_records.last.concat(save_point_records)
        end
      end
    rescue Exception => database_transaction_rollback
      if open_transactions == 0
        rollback_db_transaction
        rollback_transaction_records(true)
      else
        rollback_to_savepoint
        rollback_transaction_records(false)
      end
      raise
    end
  end
end

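`begin_db_transaction`, `create_savepoint`, `rollback_db_transaction`, `commit_db_transaction`, `release_savepoint` and `rollback_to_savepoint` are provided by the database adapter and create, roll back or commit a transaction or savepoint. Rails does provide a default `current_savepoint_name` method for adapters to name savepoints with; it lives in `ActiveRecord::ConnectionAdapters::AbstractAdapter` (which is also the base class of all database adapters) and is very simple: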

def current_savepoint_name
  "active_record_#{open_transactions}"
end

`commit_transaction_records` and `rollback_transaction_records` are still implemented by ActiveRecord itself. First, `commit_transaction_records`:

# Send a commit message to all records after they have been committed.
def commit_transaction_records
  records = @_current_transaction_records.flatten
  @_current_transaction_records.clear
  unless records.blank?
    records.uniq.each do |record|
      begin
        record.committed!
      rescue Exception => e
        record.logger.error(e) if record.respond_to?(:logger) && record.logger
      end
    end
  end
end

This collects all ActiveRecord objects registered in the current transaction and calls `committed!` on each of them:

# Call the after_commit callbacks
def committed!
  run_callbacks :commit
ensure
  clear_transaction_record_state
end

This mainly runs the `:commit` callback and then calls `clear_transaction_record_state`:

# Clear the new record state and id of a record.
def clear_transaction_record_state
  if defined?(@_start_transaction_state)
    @_start_transaction_state[:level] = (@_start_transaction_state[:level] || 0) - 1
    remove_instance_variable(:@_start_transaction_state) if @_start_transaction_state[:level] < 1
  end
end

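`@_start_transaction_state` is created the first time `save` or `destroy` is executed and records the state of the object within the current transaction. Here the method decrements the level stored in `@_start_transaction_state` and removes the instance variable once it is no longer needed.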

This completes a full transaction.

Of course, `rollback_transaction_records` may be called instead:

# Send a rollback message to all records after they have been rolled back. If rollback
# is false, only rollback records since the last save point.
def rollback_transaction_records(rollback)
  if rollback
    records = @_current_transaction_records.flatten
    @_current_transaction_records.clear
  else
    records = @_current_transaction_records.pop
  end

  unless records.blank?
    records.uniq.each do |record|
      begin
        record.rolledback!(rollback)
      rescue Exception => e
        record.logger.error(e) if record.respond_to?(:logger) && record.logger
      end
    end
  end
end

This implementation is very similar to `commit_transaction_records`; it mainly calls `rolledback!` on each ActiveRecord object:

def rolledback!(force_restore_state = false)
  run_callbacks :rollback
ensure
  IdentityMap.remove(self) if IdentityMap.enabled?
  restore_transaction_record_state(force_restore_state)
end

Now the implementation of `restore_transaction_record_state`:

# Restore the new record state and id of a record that was previously saved by a call to save_record_state.
def restore_transaction_record_state(force = false)
  if defined?(@_start_transaction_state)
    @_start_transaction_state[:level] = (@_start_transaction_state[:level] || 0) - 1
    if @_start_transaction_state[:level] < 1 || force
      restore_state = remove_instance_variable(:@_start_transaction_state)
      was_frozen = restore_state[:frozen?]
      @attributes = @attributes.dup if @attributes.frozen?
      @new_record = restore_state[:new_record]
      @destroyed  = restore_state[:destroyed]
      if restore_state.has_key?(:id)
        self.id = restore_state[:id]
      else
        @attributes.delete(self.class.primary_key)
        @attributes_cache.delete(self.class.primary_key)
      end
      @attributes.freeze if was_frozen
    end
  end
end

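When an exception occurs during saving, this method restores the previously remembered state of the ActiveRecord object. `restore_transaction_record_state` additionally accepts a `force` argument. As the calls in `transaction` show, it is `true` when the outermost transaction is rolled back, so the state is restored unconditionally; when only a savepoint is rolled back (a nested transaction opened with the `:requires_new` option), it is `false`, and the state is restored only once the level drops below 1.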

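Incidentally, this also shows that as long as the models are stored in the same database, the `transaction` methods of different ActiveRecord classes can be used interchangeably; the transaction code works with objects of any of those classes when saving.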
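
The nesting behaviour of `transaction` can be illustrated with a fake connection that only records the statements it would send. `FakeConnection` and the locally defined `Rollback` are hypothetical; record bookkeeping and the `:joinable` option are left out, so this is a sketch of the control flow under those assumptions, not the adapter code.

```ruby
# Hypothetical stand-in for ActiveRecord::Rollback.
class Rollback < StandardError; end

class FakeConnection
  attr_reader :log

  def initialize
    @log = []
    @open_transactions = 0
  end

  def transaction(requires_new: false)
    opened = false
    if @open_transactions.zero?
      @log << 'BEGIN'
      opened = true
    elsif requires_new
      @log << "SAVEPOINT #{savepoint_name}"
      opened = true
    end
    @open_transactions += 1 if opened

    begin
      yield
    rescue Exception => error
      if opened
        @open_transactions -= 1
        @log << (@open_transactions.zero? ? 'ROLLBACK' : "ROLLBACK TO SAVEPOINT #{savepoint_name}")
        opened = false
      end
      # Rollback is swallowed; every other exception propagates.
      raise unless error.is_a?(Rollback)
    end
  ensure
    if opened
      @open_transactions -= 1
      @log << (@open_transactions.zero? ? 'COMMIT' : "RELEASE SAVEPOINT #{savepoint_name}")
    end
  end

  private

  # Same naming scheme as current_savepoint_name above.
  def savepoint_name
    "active_record_#{@open_transactions}"
  end
end

conn = FakeConnection.new
conn.transaction do
  conn.transaction { }                     # joins the outer transaction, no SQL
  conn.transaction(requires_new: true) do
    raise Rollback                         # rolls back only the savepoint
  end
end
statements = conn.log
```

In this run the log contains `BEGIN`, a savepoint, a rollback to that savepoint, and finally `COMMIT`: the inner rollback does not abort the outer transaction.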

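Finally, we analyze `save`, which shows how ActiveRecord persists an object to the database and, along the way, reveals more details of transactions and how the validators declared earlier actually run. `save` has several layers; the first is defined in `ActiveRecord::Transactions`: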

def save(*)
  rollback_active_record_state! do
    with_transaction_returning_status { super }
  end
end

Two blocks wrap `super` here. The outermost method is `rollback_active_record_state!`:

# Reset id and @new_record if the transaction rolls back.
def rollback_active_record_state!
  remember_transaction_record_state
  yield
rescue Exception
  IdentityMap.remove(self) if IdentityMap.enabled?
  restore_transaction_record_state
  raise
ensure
  clear_transaction_record_state
end

This method calls `remember_transaction_record_state`, `restore_transaction_record_state` and `clear_transaction_record_state`. We have already covered `clear_transaction_record_state` and `restore_transaction_record_state`, which clean up or restore `@_start_transaction_state`. Now for `remember_transaction_record_state`:

# Save the new record state and id of a record so it can be restored later if a transaction fails.
def remember_transaction_record_state
  @_start_transaction_state ||= {}
  @_start_transaction_state[:id] = id if has_attribute?(self.class.primary_key)
  unless @_start_transaction_state.include?(:new_record)
    @_start_transaction_state[:new_record] = @new_record
  end
  unless @_start_transaction_state.include?(:destroyed)
    @_start_transaction_state[:destroyed] = @destroyed
  end
  @_start_transaction_state[:level] = (@_start_transaction_state[:level] || 0) + 1
  @_start_transaction_state[:frozen?] = @attributes.frozen?
end

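This method records the current state of the object and its attributes in `@_start_transaction_state`, so that the information can be restored if an exception causes the ActiveRecord object to lose part of it.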

Next is the inner layer, `with_transaction_returning_status`:

def with_transaction_returning_status
  status = nil
  self.class.transaction do
    add_to_transaction
    status = yield
    raise ActiveRecord::Rollback unless status
  end
  status
end

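Here we can see that `save` itself calls `transaction` again. It follows that when a transaction contains nothing but the saving of a single object, there is no need to call `transaction` explicitly.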
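
How a falsy status turns into a silent rollback can be shown with a small stand-alone sketch. The `transaction` helper below is a hypothetical stub that only logs what would happen, and `Rollback` stands in for `ActiveRecord::Rollback`.

```ruby
# Hypothetical stand-in for ActiveRecord::Rollback.
class Rollback < StandardError; end

# Stub transaction: logs begin/commit, swallows Rollback,
# and re-raises any other error after logging the rollback.
def transaction(log)
  log << :begin
  yield
  log << :commit
rescue Rollback
  log << :rollback
rescue StandardError
  log << :rollback
  raise
end

# Same shape as with_transaction_returning_status above.
def with_transaction_returning_status(log)
  status = nil
  transaction(log) do
    status = yield
    raise Rollback unless status
  end
  status
end

log = []
saved  = with_transaction_returning_status(log) { true }   # commits
failed = with_transaction_returning_status(log) { false }  # rolls back quietly
```

The caller only ever sees the boolean status; the `Rollback` exception exists purely to unwind the transaction block.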

Inside this transaction, `add_to_transaction` is called first:

# Add the record to the current transaction so that the :after_rollback and :after_commit callbacks
# can be called.
def add_to_transaction
  if self.class.connection.add_transaction_record(self)
    remember_transaction_record_state
  end
end

It consists of two calls. First, `add_transaction_record`:

# Register a record with the current transaction so that its after_commit and after_rollback callbacks
# can be called.
def add_transaction_record(record)
  last_batch = @_current_transaction_records.last
  last_batch << record if last_batch
end

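This method tries to append the current object to the last array in `@_current_transaction_records`, so that it can later be committed or rolled back. Because a transaction has been opened again, `remember_transaction_record_state` must run again to increase the level by one; we do not need to go through its code a second time.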

Next is the following layer of `save`, implemented in the `ActiveRecord::AttributeMethods::Dirty` module. As the name suggests, it deals with dirty data:

# Attempts to +save+ the record and clears changed attributes if successful.
def save(*)
  if status = super
    @previously_changed = changes
    @changed_attributes.clear
  elsif IdentityMap.enabled?
    IdentityMap.remove(self)
  end
  status
end

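This method stores the changes made to the attributes in `@previously_changed` and then clears `@changed_attributes`. The next layer, defined in the `ActiveRecord::Validations` module, is where all previously declared validators actually run: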

# The validation process on save can be skipped by passing <tt>:validate => false</tt>. The regular Base#save method is
# replaced with this when the validations module is mixed in, which it is by default.
def save(options={})
  perform_validations(options) ? super : false
end

It mainly calls `perform_validations`:

def perform_validations(options={})
  perform_validation = options[:validate] != false
  perform_validation ? valid?(options[:context]) : true
end

This method first determines whether validation is required at all and then calls `valid?` to perform it. `valid?` has two layers; the first is in `ActiveRecord::Validations` itself:

def valid?(context = nil)
  context ||= (new_record? ? :create : :update)
  output = super(context)
  errors.empty? && output
end

`valid?` accepts a `context` argument that determines whether the `:create` or the `:update` validators run. It then calls the next layer:

# Runs all the specified validations and returns true if no errors were added
# otherwise false. Context can optionally be supplied to define which callbacks
# to test against (the context is defined on the validations using :on).
def valid?(context = nil)
  current_context, self.validation_context = validation_context, context
  errors.clear
  run_validations!
ensure
  self.validation_context = current_context
end

This mainly assigns `validation_context` and clears any existing errors, then calls `run_validations!` to actually run the validators:

# Overwrite run validations to include callbacks.
def run_validations!
  run_callbacks(:validation) { super }
end

Running the validators actually means running the `:validation` callbacks; this method is defined in `ActiveModel::Validations::Callbacks`. Here are the two methods that register `:validation` callbacks:

def before_validation(*args, &block)
  options = args.last
  if options.is_a?(Hash) && options[:on]
    options[:if] = Array.wrap(options[:if])
    options[:if].unshift("self.validation_context == :#{options[:on]}")
  end
  set_callback(:validation, :before, *args, &block)
end

def after_validation(*args, &block)
  options = args.extract_options!
  options[:prepend] = true
  options[:if] = Array.wrap(options[:if])
  options[:if] << "!halted"
  options[:if].unshift("self.validation_context == :#{options[:on]}") if options[:on]
  set_callback(:validation, :after, *(args << options), &block)
end

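Callbacks registered with `before_validation` and `after_validation` therefore run before and after `:validation` at this point. Because `super` is passed in the block, the parent implementation of `run_validations!` is called in between: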

def run_validations!
  run_callbacks :validate
  errors.empty?
end

This runs the `:validate` callbacks defined earlier. At this point all validators have run.
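
The layering used throughout this chapter, where each module wraps the next one through `super`, can be reduced to a few lines of plain Ruby. The module and class names below are hypothetical; `validate` stands in for the whole `:validate` callback chain, and callback halting is not modelled.

```ruby
# Inner layer: runs the validations themselves.
module CoreValidations
  def run_validations!
    trail << :validate
    validate # hypothetical hook standing in for the :validate callbacks
    errors.empty?
  end
end

# Outer layer: wraps the inner one, like run_callbacks(:validation) { super }.
module ValidationCallbacks
  def run_validations!
    trail << :before_validation
    result = super
    trail << :after_validation
    result
  end
end

class Account
  include CoreValidations
  include ValidationCallbacks # included last, so it is found first in method lookup

  attr_reader :trail, :errors

  def initialize(login)
    @login = login
    @trail = []
    @errors = []
  end

  def valid?
    errors.clear
    run_validations!
  end

  def validate
    errors << "login can't be blank" if @login.to_s.strip.empty?
  end
end

good = Account.new('alice')
bad  = Account.new('')
good_result = good.valid? # true
bad_result  = bad.valid?  # false
```

Because the module included last sits earlier in the ancestor chain, adding a new layer never requires touching the layers beneath it.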

Assuming all validations pass, `super` takes us to the next layer, which is implemented in `ActiveRecord::Persistence`:

def save(*)
  begin
    create_or_update
  rescue ActiveRecord::RecordInvalid
    false
  end
end

The call to `save` becomes a call to `create_or_update`, whose first layer is in `ActiveRecord::Callbacks`:

def create_or_update
  run_callbacks(:save) { super }
end

This runs the `:save` callbacks; `before_save`, `around_save` and `after_save` all hook into this callback. Going deeper brings us back to the implementation in `ActiveRecord::Persistence`:

def create_or_update
  raise ReadOnlyRecord if readonly?
  result = new_record? ? create : update
  result != false
end

Here the call branches into `create` or `update`. We follow `create`; `update` is very similar. The first layer of `create` is again a callback invocation in `ActiveRecord::Callbacks`, this time for `:create`:

def create
  run_callbacks(:create) { super }
end

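The next layer is `ActiveRecord::Timestamp`, which assigns the time-related attributes of the ActiveRecord object. Rails has four such attributes, `:created_at`, `:created_on`, `:updated_at` and `:updated_on`, which are set to the current time on creation if they exist and are still `nil`: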

def create
  if self.record_timestamps
    current_time = current_time_from_proper_timezone

    all_timestamp_attributes.each do |column|
      if respond_to?(column) && respond_to?("#{column}=") && self.send(column).nil?
        write_attribute(column.to_s, current_time)
      end
    end
  end

  super
end

Then comes the next layer, in `ActiveRecord::Persistence`:

# Creates a record with values matching those of the instance attributes
# and returns its id.
def create
  attributes_values = arel_attributes_values(!id.nil?)

  new_id = self.class.unscoped.insert attributes_values

  self.id ||= new_id if self.class.primary_key

  IdentityMap.add(self) if IdentityMap.enabled?
  @new_record = false
  id
end

It first calls `arel_attributes_values` to get a Hash that maps Arel attributes to their values:

# Returns a copy of the attributes hash where all the values have been safely quoted for use in
# an Arel insert/update method.
def arel_attributes_values(include_primary_key = true, include_readonly_attributes = true, attribute_names = @attributes.keys)
  attrs      = {}
  klass      = self.class
  arel_table = klass.arel_table

  attribute_names.each do |name|
    if (column = column_for_attribute(name)) && (include_primary_key || !column.primary)

      if include_readonly_attributes || !self.class.readonly_attributes.include?(name)

        value = if klass.serialized_attributes.include?(name)
                  @attributes[name].serialized_value
                else
                  # FIXME: we need @attributes to be used consistently.
                  # If the values stored in @attributes were already type
                  # casted, this code could be simplified
                  read_attribute(name)
                end

        attrs[arel_table[name]] = value
      end
    end
  end

  attrs
end

This Hash is then passed to the `insert` method of the class's `unscoped` relation:

def insert(values)
  primary_key_value = nil

  if primary_key && Hash === values
    primary_key_value = values[values.keys.find { |k|
      k.name == primary_key
    }]

    if !primary_key_value && connection.prefetch_primary_key?(klass.table_name)
      primary_key_value = connection.next_sequence_value(klass.sequence_name)
      values[klass.arel_table[klass.primary_key]] = primary_key_value
    end
  end

  im = arel.create_insert
  im.into @table

  conn = @klass.connection

  substitutes = values.sort_by { |arel_attr,_| arel_attr.name }
  binds       = substitutes.map do |arel_attr, value|
    [@klass.columns_hash[arel_attr.name], value]
  end

  substitutes.each_with_index do |tuple, i|
    tuple[1] = conn.substitute_at(binds[i][0], i)
  end

  if values.empty? # empty insert
    im.values = Arel.sql(connection.empty_insert_statement_value)
  else
    im.insert substitutes
  end

  conn.insert(
    im,
    'SQL',
    primary_key,
    primary_key_value,
    nil,
    binds)
end
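The sort, bind, and placeholder steps of the `insert` method above can be sketched in plain Ruby. This is a simplified illustration under stated assumptions, not the Arel code path: `prepare_insert` and `insert_sql` are hypothetical helpers, column names are plain strings, the placeholder is always `?`, and the `DEFAULT VALUES` form used for an empty hash is an assumption of the sketch (the real clause is adapter-specific).

```ruby
# Hypothetical helper: mirrors only the sort / bind / placeholder steps.
# `values` maps column names to values; `columns_hash` maps column names
# to column objects (any object works for this sketch).
def prepare_insert(values, columns_hash)
  substitutes  = values.sort_by { |name, _| name }             # stable column order
  binds        = substitutes.map { |name, value| [columns_hash[name], value] }
  placeholders = substitutes.map { |name, _| [name, "?"] }     # values replaced by "?"
  [placeholders, binds]
end

# Hypothetical helper: renders the statement text for the sketch.
# Table and column names are assumed to be trusted identifiers.
def insert_sql(table, placeholders)
  return "INSERT INTO \"#{table}\" DEFAULT VALUES" if placeholders.empty?

  names = placeholders.map { |name, _| "\"#{name}\"" }.join(", ")
  marks = placeholders.map { |_, mark| mark }.join(", ")
  "INSERT INTO \"#{table}\" (#{names}) VALUES (#{marks})"
end

columns_hash = { "name" => :name_column, "age" => :age_column }
placeholders, binds = prepare_insert({ "name" => "alice", "age" => 30 }, columns_hash)

puts insert_sql("users", placeholders)
p binds
```

Sorting first keeps the placeholder order and the bind order aligned, because both are derived from the same sorted array.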

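The relation's `insert` method shown above builds the INSERT statement and then executes it, so it can be considered the most important method when saving a new record. It first checks whether the values already contain a primary key. If they do not, it asks the database adapter whether the primary key can be prefetched and, if so, fetches the next sequence value and adds it to the values hash. It then creates the Arel insert object, sorts the hash entries by attribute name, and maps them into an array of column and value pairs that serve as the bind data. Next, the value in each sorted pair is replaced with the bind placeholder returned by `substitute_at` (a question mark here). Finally, these pairs are assigned to the insert object as its data (for an empty hash, the adapter's empty-insert clause is used instead), and calling `conn.insert` executes the statement.

Moving on to `conn.insert`: calling `insert` first clears the query cache. The code responsible is defined in `ActiveRecord::ConnectionAdapters::QueryCache`: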

def included(base)
  dirties_query_cache base, :insert, :update, :delete
end

def dirties_query_cache(base, *method_names)
  method_names.each do |method_name|
    base.class_eval <<-end_code, __FILE__, __LINE__ + 1
      def #{method_name}(*)                         # def update_with_query_dirty(*args)
        clear_query_cache if @query_cache_enabled   #   clear_query_cache if @query_cache_enabled
        super                                       #   update_without_query_dirty(*args)
      end                                           # end
    end_code
  end
end

As shown, `insert`, `update`, and `delete` each clear the query cache entirely whenever it is enabled. The code of `clear_query_cache` is very simple:

# Clears the query cache.
#
# One reason you may wish to call this method explicitly is between queries
# that ask the database to randomize results. Otherwise the cache would see
# the same SQL query and repeatedly return the same result each time, silently
# undermining the randomness you were expecting.
def clear_query_cache
  @query_cache.clear
end
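The wrapping technique used by `dirties_query_cache` can be reproduced in a few lines of plain Ruby. This is a minimal sketch under assumptions: `Statements`, `CacheInvalidation`, and `FakeAdapter` are hypothetical names, and `define_method` is used in place of the string-based `class_eval` from the original.

```ruby
# Stand-in for the module that holds the real write methods.
module Statements
  def insert(sql)
    "executed #{sql}"
  end

  def update(sql)
    "executed #{sql}"
  end

  def delete(sql)
    "executed #{sql}"
  end
end

# Hypothetical module: when included, it defines wrappers on the including
# class that clear the cache and then fall through to Statements via super.
module CacheInvalidation
  def self.included(base)
    [:insert, :update, :delete].each do |method_name|
      base.send(:define_method, method_name) do |*args|
        @query_cache.clear if @query_cache_enabled
        super(*args) # explicit arguments are required inside define_method
      end
    end
  end
end

class FakeAdapter
  include Statements
  include CacheInvalidation

  attr_reader :query_cache
  attr_writer :query_cache_enabled

  def initialize
    @query_cache = { "SELECT * FROM users" => [["alice"]] }
    @query_cache_enabled = true
  end
end

adapter = FakeAdapter.new
puts adapter.insert("INSERT INTO users DEFAULT VALUES")
p adapter.query_cache
```

Because the wrappers are defined on the including class itself, `super` resolves to the method of the same name in a module included earlier, which is how the wrapper reaches the real implementation.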

Next comes the actual implementation of `insert`, defined in `ActiveRecord::ConnectionAdapters::DatabaseStatements`:

def insert(arel, name = nil, pk = nil, id_value = nil, sequence_name = nil, binds = [])
  sql, binds = sql_for_insert(to_sql(arel, binds), pk, id_value, sequence_name, binds)
  value      = exec_insert(sql, name, binds)
  id_value || last_inserted_id(value)
end

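For SQLite 3, `exec_insert` is equivalent to `exec_query`. It executes the SQL statement generated by `to_sql` and returns a result, from which `last_inserted_id` obtains the id of the new record. That id is then assigned to the object, which completes the insert.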
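The way the returned id is chosen can be sketched with a fake connection. This is an illustration only: `FakeConnection` is hypothetical, it stores rows in an array rather than talking to a database, and its simplified `insert` signature is an assumption of the sketch rather than the adapter's API.

```ruby
# Hypothetical in-memory connection; the row count doubles as the row id.
class FakeConnection
  def initialize
    @rows = []
  end

  # Pretends to execute the statement and returns the new row's id.
  def exec_insert(_sql, binds)
    @rows << binds
    @rows.size
  end

  # In this sketch the result of exec_insert already is the id.
  def last_inserted_id(result)
    result
  end

  # Same decision as the method above: a prefetched id wins,
  # otherwise the id reported after the insert is used.
  def insert(sql, id_value = nil, binds = [])
    result = exec_insert(sql, binds)
    id_value || last_inserted_id(result)
  end
end

conn = FakeConnection.new
p conn.insert("INSERT INTO users (name) VALUES (?)", nil, ["alice"])
p conn.insert("INSERT INTO users (id, name) VALUES (?, ?)", 42, [42, "bob"])
```

The first call has no prefetched id and falls back to the value reported after the insert, while the second returns the id that was supplied up front.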