Performance: Cache translations to reduce file read and parse operations #2939

alextaujenis · 2024-04-19T07:45:11Z

Motivation / Background

This Pull Request has been created because the Faker library is reading from disk and parsing the yml translation files on nearly every method call. These redundant "file read and YAML parse" operations can be efficiently cached in-memory to increase the performance of the entire library by 10.6% (faster), full modules are up to 18% (faster), and individual methods are up to 30% (faster).

Additional Information

The translate cache in this PR is stored in a class variable so it can be shared during the program operation.
The lookup key is a combination of the args string with a deterministically serialized (sorted) opts hash. (This allows the cache to generate the correct lookup key regardless of the order of the provided opts keys.)
The call to I18n.translate is then either retrieved from memory, or cached for future lookup and returned.

@@translate_cache = {}

def translate(*args, **opts)
  translate_key = args.to_s + opts.sort.join
  @@translate_cache[translate_key] ||= I18n.translate(*args, **opts)
end

This cache speeds up the operation of the entire Faker library by 10.6%, but it comes with the slight tradeoff of increasing memory size as the cache is warmed (as you use Faker methods within your program). Fortunately enough; the ENTIRE Faker English translation directory is only 2.2MB, while all Faker translations combined are 7.1MB. Allocating anywhere from 2MB - 7MB of extra memory for the Faker library to run 10.6% faster is a great tradeoff today with system memory typically measured in thousands of megabytes.

Performance Benchmark

You can see from the benchmark below that after caching the redundant "file read and YAML parse" operations that the Faker modules perform up to 18% faster. Even the popular Faker::Lorem receives a 14.8% performance increase (averaged across all methods within that module). When combined; the entire Faker library benefits from a 10.6% speed improvement. Here is a list of the top 50 improved module times from caching the translations:

Checklist

Before submitting the PR make sure the following are checked:

This Pull Request is related to one change. Changes that are unrelated should be opened in separate PRs.
Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex: [Fix #issue-number]
Tests are added or updated if you fix a bug, refactor something, or add a feature.
Tests and Rubocop are passing before submitting your proposed changes.
Double-check the existing generators documentation to make sure the new generator you want to add doesn't already exist.
You've reviewed and followed the Contributing guidelines.

thdaraujo · 2024-04-28T18:17:37Z

That's an interesting idea, but we'll have to think about the implications - increased memory size and extra complexity.

@salochara has been investigating another big opportunity for improving the performance of i18n translations by moving away from yaml to json files. The jump in performance would be much larger than 10% without any cost to memory usage.

I think we should look at the json vs yaml thing first, as I believe it will have a higher impact. Then we could come back to your idea to explore further improvements.

alextaujenis · 2024-04-29T16:20:52Z

@thdaraujo I agree that investigating the json vs yaml performance improvement is a great idea, but this translation cache is at a higher level than i18n. We can implement both of these performance improvements independent from each other and this PR is compatible with whatever future changes you would like to make to the translation files or i18n.

but we'll have to think about the implications - increased memory size and extra complexity

This cache does not increase memory size by any significant amount. The 2.2MB number mentioned above is an extreme case of someone using every single available Faker method within their application, and the 7.1MB number mentioned above is the even more extreme case of someone using every single available Faker method across all locales within their application. These numbers have nothing to do with real-world usage.

Just to be clear, this PR does not load the entire translation directory into memory. It is a cache that warms as the application makes calls to the Faker library.

If you were to use all Faker methods of the top 10 modules in the performance benchmark (graph above) within your application it would only increase the memory footprint by 19KB. That number is much closer to what you would see with real-world usage from this performance improvement.

thdaraujo · 2024-05-06T20:51:38Z

If you were to use all Faker methods of the top 10 modules in the performance benchmark (graph above) within your application it would only increase the memory footprint by 19KB. That number is much closer to what you would see with real-world usage from this performance improvement.

Ah okay, that makes more sense then! I will think about it, thanks for adding more context!

Interestingly, it looks like i18n already allows a cache backend to be plugged in for caching translations:
https://github.com/ruby-i18n/i18n/blob/master/lib/i18n/backend/cache.rb

thdaraujo

I'm thinking about the pros as cons of faker keeping a translation cache versus just utilizing i18n's built-in cache backend. According to their docs, we would need to pass a cache store that only needs to respond to #fetch and #write.

https://github.com/ruby-i18n/i18n/blob/7842cab498a9b1da0ddd27a986caf61c38ad776b/lib/i18n/backend/cache.rb#L16

We should also be careful with thread-safety given that faker could be running in a threaded web server. Do you think it's worth it to introduce a mutex or store the cache in a thread-local variable?

Probably not a big deal for this specific feature, as I think the only potential issue would be cache misses.

thdaraujo · 2024-05-06T20:25:09Z

lib/faker.rb

@@ -161,8 +163,9 @@ def parse(key)
      # locale is specified
      def translate(*args, **opts)
        opts[:locale] ||= Faker::Config.locale
+        translate_key = args.to_s + opts.sort.join


maybe translate_key should be hashed?

It does sound reasonable to hash the translate_key, but there is no reason to compute this digest because the translate_key (as provided in this PR) is a valid string and can be used for lookups. The extra time taken on every method call to compute even an md5 digest erases all performance gains for this PR.

Another interesting fact here is that i18n computes the sha256 digest for all cache keys if you enable their default translation cache. This suggests that implementing the i18n default cache will also erase the performance gains for this PR.

lib/faker.rb

alextaujenis added 4 commits April 19, 2024 00:47

Cache translations to reduce file read and parse operations

756767a

Disable rubocop Style/ClassVars

379b0b3

Enable rubocop Style/ClassVars

7233dfa

Merge branch 'faker-ruby:main' into translate-performance

dac7cff

thdaraujo added ⏲️ PR: In Queue for Review ⭐ PR: Feature ⚒️ Improvement labels Apr 28, 2024

thdaraujo added the 🤔 Needs Decision label Apr 28, 2024

thdaraujo reviewed May 6, 2024

View reviewed changes

thdaraujo removed the ⏲️ PR: In Queue for Review label May 6, 2024

alextaujenis added 2 commits May 8, 2024 10:32

Merge branch 'main' into translate-performance

aa82207

Convert the translate cache to a thread safe variable

559859a

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance: Cache translations to reduce file read and parse operations #2939

Performance: Cache translations to reduce file read and parse operations #2939

alextaujenis commented Apr 19, 2024 •

edited

thdaraujo commented Apr 28, 2024

alextaujenis commented Apr 29, 2024 •

edited

thdaraujo commented May 6, 2024

thdaraujo left a comment

thdaraujo May 6, 2024

alextaujenis May 8, 2024

Performance: Cache translations to reduce file read and parse operations #2939

Are you sure you want to change the base?

Performance: Cache translations to reduce file read and parse operations #2939

Conversation

alextaujenis commented Apr 19, 2024 • edited

Motivation / Background

Additional Information

Performance Benchmark

Checklist

thdaraujo commented Apr 28, 2024

alextaujenis commented Apr 29, 2024 • edited

thdaraujo commented May 6, 2024

thdaraujo left a comment

Choose a reason for hiding this comment

thdaraujo May 6, 2024

Choose a reason for hiding this comment

alextaujenis May 8, 2024

Choose a reason for hiding this comment

alextaujenis commented Apr 19, 2024 •

edited

alextaujenis commented Apr 29, 2024 •

edited