Memoization strategy #539

jmeinerz · 2023-05-10T09:09:30Z

Memoization is useful if you want to run a command once but be able to access its value many times.

However, the popular `||=` operator is a poor fit when your command might return a falsey value, because the command runs again on every call. By guarding memoized instance methods with `if defined?`, we make sure the command only runs once.

This is particularly important to stop expensive queries that may return nil or false from running needlessly.
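
As a minimal sketch of the difference described above: the `Settings` class, its method names, and the `lookups` counter are hypothetical and exist only to count how often the "expensive" work runs when it returns `nil`.

```ruby
# Hypothetical class; `lookups` counts how often the "expensive" work runs.
class Settings
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  # `||=` re-runs the lookup on every call, because the stored nil is falsey.
  def theme_with_or_equals
    @theme_a ||= expensive_lookup
  end

  # The guard checks whether the ivar was ever assigned, not whether it is truthy.
  def theme_with_guard
    return @theme_b if defined?(@theme_b)

    @theme_b = expensive_lookup
  end

  private

  def expensive_lookup
    @lookups += 1
    nil # stands in for a query that finds nothing
  end
end

or_equals_settings = Settings.new
3.times { or_equals_settings.theme_with_or_equals }
or_equals_settings.lookups # 3, one lookup per call

guarded_settings = Settings.new
3.times { guarded_settings.theme_with_guard }
guarded_settings.lookups # 1, only the first call does the work
```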

nunosilva800

I think this is a great addition to the guide.

cbothner

Can we add a little nuance to the recommendation? Most of the time the terser rose memoization is appropriate.

cbothner · 2023-05-10T10:42:33Z

README.md

+
+## Memoization
+
+* Prefer `return @x if defined?(@x)` over a simple `||=`


Suggested change

-* Prefer `return @x if defined?(@x)` over a simple `||=`
+* Prefer `return @x if defined?(@x)` over `||=` if the statement being memoized can evaluate to `nil` or `false`.

jmeinerz · 2023-05-10

Honestly, I didn't specify this nuance on purpose. I think having a single way of doing things simplifies the process of coding, and can protect the code from certain changes.

To illustrate my point, let's say we have the following method:

    def merchants
      @merchant ||= Merchant.where(some_condition)
    end

Now, even when no merchants are found, the result would be `[]`, which according to your suggestion would not warrant a guard statement because it would never be falsey. If for some reason, though, the method changes to just check whether merchants exist, the person changing it would have to remember this distinction in behaviour and then apply the guard.

Without having the guard in the original method, I find it very likely the method would evolve to something like this:

    def merchants?
      @merchants_exist ||= Merchant.exists?(some_condition)
    end

Which could in fact result in N queries.
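
To make the scenario above concrete, here is a rough sketch. `FakeMerchant`, `where_matching`, and `Dashboard` are hypothetical stand-ins (not ActiveRecord) that only count simulated queries. It shows that an empty array memoizes fine with `||=`, while a `false` result triggers a new query on every call.

```ruby
# Hypothetical stand-in for a model; it only counts simulated queries.
class FakeMerchant
  class << self
    attr_accessor :query_count

    def where_matching
      self.query_count += 1
      [] # an empty result is still truthy in Ruby
    end

    def exists?
      self.query_count += 1
      false # no matching merchants
    end
  end
  self.query_count = 0
end

class Dashboard
  def merchants
    @merchants ||= FakeMerchant.where_matching
  end

  def merchants?
    @merchants_exist ||= FakeMerchant.exists?
  end
end

dashboard = Dashboard.new

5.times { dashboard.merchants }
list_queries = FakeMerchant.query_count # 1, the empty array was memoized

FakeMerchant.query_count = 0
5.times { dashboard.merchants? }
predicate_queries = FakeMerchant.query_count # 5, false is never memoized
```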

Thoughts?

cbothner · 2023-05-10

Frankly, everything around memoization requires nuance, because memoization is caching and cache invalidation is one of the Two Hard Problems¹. I think your addition is a good detail to include in our style guide to reduce the chance someone mindlessly repeats the rose memoization pattern. But it would also be suboptimal for someone to mindlessly repeat `return @x if defined?(@x)`.

I don't support forbidding rose memoization, so I'll advocate for the nuance in our discussion of it.

On a similar note, a section on Memoization might also include a note about memoizing a method that takes arguments.

    # Bad. A stale value can be returned when a different argument is passed for subsequent calls
    def expensive_result(input)
      @expensive_result ||= ExpensiveResult.calculate(input)
    end

    # Better. Recalculates when a different argument is passed.
    # Consider using an LRU or not memoizing at all if the argument cardinality
    # is high or the object is long-lived.
    def expensive_result(input)
      @expensive_results ||= {}
      @expensive_results[input] ||= ExpensiveResult.calculate(input)
    end

    # If the expensive result might be falsey
    def expensive_result(input)
      @expensive_results ||= {}
      return @expensive_results[input] if @expensive_results.key?(input)
      @expensive_results[input] = ExpensiveResult.calculate(input)
    end
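
The LRU mentioned in the comments above could look roughly like the sketch below. `LruMemo` is a hypothetical helper, not an existing library class. It relies on Ruby hashes preserving insertion order, and it uses `key?` so falsey results are cached too.

```ruby
# Hypothetical helper, not part of any library: a small bounded memo table.
class LruMemo
  def initialize(max_size)
    @max_size = max_size
    @store = {} # Ruby hashes keep insertion order, oldest key first
  end

  # Returns the cached value for `key`, computing it with the block on a miss.
  def fetch(key)
    if @store.key?(key)
      value = @store.delete(key) # re-insert to mark as most recently used
      @store[key] = value
      return value
    end

    @store.shift if @store.size >= @max_size # evict the least recently used entry
    @store[key] = yield
  end

  def size
    @store.size
  end
end

calls = []
memo = LruMemo.new(2)
# The computed value is false for even inputs, so falsey caching is exercised.
compute = ->(n) { memo.fetch(n) { calls << n; n.odd? } }

compute.call(1) # miss
compute.call(2) # miss
compute.call(1) # hit, 1 becomes the most recently used key
compute.call(3) # miss, evicts 2
compute.call(2) # miss again, recomputed

calls # [1, 2, 3, 2]
```

Bounding the table trades some recomputation for a cap on memory, which matters most when the object holding the memo is long-lived.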

Footnotes

1. Along with naming things and off-by-one errors, famously.

jmeinerz · 2023-05-10

> I don't support forbidding rose memoization, so I'll advocate for the nuance in our discussion of it.

I completely agree we shouldn't forbid anything. I think of these style guides not as rules, but as suggestions that apply to at least 90% of cases. As I see it, in the vast majority of cases there's no harm in "mindlessly repeating" the guard statement, but I wouldn't advocate for using it in more complex caching situations.

Given my understanding of what these guides are for, I do think adding too much nuance to the instruction makes it sort of void, though.

What is your view on how these guides should be used?

sambostock · 2023-05-10T14:12:34Z

There was a previous discussion about this in #195.

rafaelfranca

There is overhead in using `defined?`. Since it is a keyword, it is hard for the VM to optimize. In reality, memoization itself is hard to optimize in the VM right now.
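
One way to check this overhead on a particular Ruby version is a quick timing loop like the sketch below. The `Memo` class and `elapsed_seconds` helper are hypothetical, the iteration count is arbitrary, and the numbers will vary by interpreter and machine, so treat it as a measurement aid rather than evidence for either side.

```ruby
# Hypothetical class with the two memoization styles side by side.
class Memo
  def with_or_equals
    @a ||= 42
  end

  def with_defined_guard
    return @b if defined?(@b)

    @b = 42
  end
end

# Times the given block with a monotonic clock and returns seconds as a Float.
def elapsed_seconds
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

iterations = 1_000_000 # arbitrary; raise it for steadier numbers
memo = Memo.new

or_equals_time = elapsed_seconds { iterations.times { memo.with_or_equals } }
guard_time = elapsed_seconds { iterations.times { memo.with_defined_guard } }

puts format("||=      : %.4fs", or_equals_time)
puts format("defined? : %.4fs", guard_time)
```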

If I were to write anything in the styleguide about memoization, it would be along the lines of telling people to avoid memoization. Judging by the code I see in our applications, most of the time memoization is done at the wrong time and in the wrong place.

But regardless, `defined?` over `||=` isn't a matter of style: they have very different behavior, and both can and should be used, so I don't think a style guide is the right place to explain when to use one over the other. This is more content for a Ruby book or a Ruby tutorial. People need to learn when to use one over the other, or when not to use either of them. This isn't the job of this styleguide.

jmeinerz · 2023-05-11T08:38:32Z

> There is overhead in using `defined?`. Since it is a keyword, it is hard for the VM to optimize. In reality, memoization itself is hard to optimize in the VM right now.

If both are hard to optimise, is there really a drawback in suggesting one over the other?

> If I were to write anything in the styleguide about memoization, it would be along the lines of telling people to avoid memoization. Judging by the code I see in our applications, most of the time memoization is done at the wrong time and in the wrong place.

This is an interesting angle. I'm not opposed to it at all, but I think there's still value in nudging people to a way with fewer side effects.

> People need to learn when to use one over the other, or when not to use either of them. This isn't the job of this styleguide.

In my view, it does fit the styleguide. The way I think about it is that there's more harm to come from leaving this completely open than to make a loose suggestion for defined?. And I do agree with you that "people need to learn" but I also think we should try and support those who are earlier in their journey to write good code without having to know everything. For a junior developer, a day at Shopify is already a lot to take, and I would like to help them write code with fewer side effects.

In your opinion, what is the goal of the styleguide?

jmeinerz · 2023-05-11T12:01:57Z

I just realised this rule is already kind of in the styleguide:

> Use `||=` to initialize variables only if they're not already initialized.

With that I'm happy to close this, unless we want to rephrase the existing rule

rafaelfranca · 2023-05-11T17:47:03Z

I'm ok to expand that text to explain when ||= isn't good and provide examples

jmeinerz requested a review from a team as a code owner May 10, 2023 09:09

jmeinerz self-assigned this May 10, 2023

nunosilva800 reviewed May 10, 2023

jmeinerz and others added 2 commits May 10, 2023 10:22

Update README.md

7d55778

Co-authored-by: Nuno Silva <nunosilva800@gmail.com>

Update README.md

f670ae2

Co-authored-by: Nuno Silva <nunosilva800@gmail.com>

tjwp reviewed May 10, 2023

cbothner reviewed May 10, 2023

Update README.md

69807fe

Co-authored-by: Tim Perkins <tjwp@users.noreply.github.com>

rafaelfranca requested changes May 10, 2023

jmeinerz closed this Jan 30, 2024
