Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add expire methods by key values #495

Closed
wants to merge 2 commits into from

Conversation

Hectorhammett
Copy link

@Hectorhammett Hectorhammett commented Jun 7, 2021

There are situations where performance is crucial. For those reasons it can be desired to utilize methods that do not instantiate active record objects, for example pluck for reading data and upsert_all for writing data. Utilizing these gives the developer the flexibility to read or update the needed information without needing to take the time to instantiate a new Active Record instance for each of the objects as this instantiation can really add up on processing time.

Following the pluck example utilizing the IDC feature of cache_attribute it is super useful in the case that we need just one value of one record. In the case where we need to update these records it might be preferred to utilize upsert_all which does not instantiate any ActiveRecords and we don't have a way to expire the cache. A possible way to accomplish this is to fetch the data before doing the update or update them through the ActiveRecord object, although the performance would be very poorly in which upsert_all is preferred.

We are proposing on this PR a method to expire the cache when you don't have a direct access to an ActiveRecord instance. The method is attached to the Attribute class and we can access it like this:

# item.rb
cache_attribute :name, by: :id
# item_management.rb in the project
NameOfScope::Item.cache_indexes.each do |index|
    index.expire_by_key_value({ id: 1 })
end

We decided to pass a hash for the case of when we have an attribute cached by multiple keys:

# item.rb
cache_attribute :name, by: [:id, :item_id]
# item_management.rb in the project
NameOfScope::Item.cache_indexes.each do |index|
    index.expire_by_key_value({ id: 1, item_id:10 })
end

This also lets us have this case:

# item.rb
cache_attribute :name, by: :id
cache_attribute :name, by: :other_property
cache_attribute :name, by: [:id, :item_id]
# item_management.rb in the project
NameOfScope::Item.cache_indexes.each do |index|
    index.expire_by_key_value({ id: 1, item_id: 10, other_property: 11 })
end

We made this the most flexible possible to be able to handle these different possibilities as well as to avoid expiring the cache and missing one key, the attribute class will raise an exception letting you know that there is a missing field.

If the update changes the value of the key, then pluck should be used to get the current values, and the expiration call should include both the new and old values of the keys.

There is also some discussion about these changes, the main questions that comes out from these changes are:

  • Should we also add methods to expire expire_<attribute>_by_<key>? In my opinion it could add a possible problem where you only expire one key of the object and not the other one, but we believe it is a good discussion to have.
  • Should the gem have a method that does the NameOfScope::Item.cache_indexes.each

@Shopify/delivery-platform

@Hectorhammett Hectorhammett force-pushed the add-attribute-cache-expire branch 2 times, most recently from 0ebf37b to d4851b8 Compare June 7, 2021 18:49
@Hectorhammett Hectorhammett force-pushed the add-attribute-cache-expire branch 6 times, most recently from 19e3685 to a2f57ea Compare June 8, 2021 18:42
@Hectorhammett Hectorhammett marked this pull request as ready for review June 8, 2021 18:47
Copy link
Contributor

@dylanahsmith dylanahsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the cache_attribute method can be complicated on the case that you expire it when you utilize an upsert_all or update_all method on ActiveRecord as they do not return an instance of ActiveRecord.

It isn't complicated because you don't have an ActiveRecord instance. The API is coupled to Active Record instances and Active Record callbacks because it provides the required data to both invalidate the old and new key if the cache index key columns change.

Even if everyone you tagged on the PR understands the problem, it is still important to make it clear what the scope of the problem that the PR actually addresses and it can be helpful context to anyone coming across the PR (e.g. through git archeology or issue/PR searching).

upsert_all and update_all are really just doing underlying database statements that don't load the rows. That doesn't mean we couldn't load the rows before those statements to get the Active Record objects. Was this considered as a way to get the old values?

Does your specific problem have a more limited scope, such as knowing that the cache indexes keys won't be modified?

We decided to add methods for this use case using expire_by_key_value and adding dynamic methods for expire_<attribute>_by_<key> to be able to expire the cache.

This doesn't really explain the solution. What are the thoughts behind that decision? Why did you add more than one way to expire these caches? How do you intend for these methods to be used to actually solve the problem?

The meta-programming defined methods to invalidate individual cache indexes seem like they could end up with code implicitly coupled to the set of cache indexes, such that a new one being added could result in it not being invalidated by existing code doing these manual cache invalidations. I like how the approach taken with expire_by_key_value seems like it helps avoid this problem, although I still don't like how it hides the old key problem.

Feel free to answer questions in the form of documentation. I'm also very concerned about how this could be mis-used, so documentation that helps guide the library user would be helpful.

@@ -19,6 +20,13 @@ def all_cached_associations # :nodoc:
cached_has_manys.merge(cached_has_ones).merge(cached_belongs_tos)
end

# Expire the cache by key values
def expire_by_key_values(key_values) # :nodoc:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# :nodoc generally implies the code is internal (although, I think we should use # @api private to make that clearer). However, in this case you actually intend for this to be public.

It doesn't seem like the callsite for this method will be clear enough for readers of these method calls. key_values is a term used in the method name, yet no where else in the usage of Identity Cache, so it will be hard for the reader of the method call to know what this is referring to. The object it is called on is the Active Record model, so it isn't even in the context of Identity Cache, let alone in the context of a cache attribute or index.

@@ -19,6 +20,13 @@ def all_cached_associations # :nodoc:
cached_has_manys.merge(cached_has_ones).merge(cached_belongs_tos)
end

# Expire the cache by key values
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment seems quite useless, considering that it mostly just repeats the method name, with the addition of the cache. It is quite clear that this library is all about caches.

Documentation could provide some very helpful context to these methods.

For instance, it isn't clear outside of the PR description that these PRs are for expiring the cache when doing an upsert_all or update_all. The code itself should answer questions about why the method exist without having to do git archeology to get this information.

Even the PR description doesn't explain when to use this method compared to the meta-programming defined ones, which would be helpful context on why this specific method actually exists. Also, how the method be used to actually solve your problem is not explained anywhere and would be important to document for developers that encounter the same problem to discover or be referred to.

@Larochelle Larochelle removed the request for review from camilo June 9, 2021 19:56
@Hectorhammett Hectorhammett force-pushed the add-attribute-cache-expire branch 3 times, most recently from a095e0f to 94bb7ce Compare June 10, 2021 15:47
def expire_by_key_value(key_values)
missing_keys = missing_keys(key_values)
unless missing_keys.empty?
raise MissingKeyName,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't just need the name of the key, so MissingKey seems like a more appropriate name, but then why not just use KeyError?

assert_equal("foo", AssociatedRecord.fetch_name_by_id_and_item_id(1, 1))
end

key_values = {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it be this hash that crashes the cop?

undefined method `loc' for nil:NilClass
/opt/hostedtoolcache/Ruby/3.0.1/x64/lib/ruby/gems/3.0.0/gems/rubocop-1.16.1/lib/rubocop/cop/layout/hash_alignment.rb:221:in `autocorrect_incompatible_with_other_cops?'
/opt/hostedtoolcache/Ruby/3.0.1/x64/lib/ruby/gems/3.0.0/gems/rubocop-1.16.1/lib/rubocop/cop/layout/hash_alignment.rb:206:in `on_hash'

Or maybe because ({ item_id: 1 }) does not need the {}?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rubocop failure was a bug and the fix hasn't been released yet. I opened #496 to fix that.

# frozen_string_literal: true
require "test_helper"

class ExpireByKeyValuesTest < IdentityCache::TestCase
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I believe that these feature based unit test files are an anti-pattern (#498). I just hadn't gotten around to actually refactoring the tests around the code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants