Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace hiredis gem with hiredis-client #11708

Merged
merged 3 commits into from Oct 31, 2023

Conversation

abdellani
Copy link
Member

What? Why?

In this PR, the dependabot is trying to upgrade the redis library to 5.0.8. The tests are failing due to what looks like a compatibility issue between redis and hiredis (the latest release was in 2018).

I found that hiredis-client is used in the example given on redis gem documentation.
In this PR, I want to check if replace hiredis with hiredis-client will solve the issue.

What should we test?

  • run some operations that require using Redis, for example, bulk printing of orders.

Release notes

Changelog Category (reviewers may add a label for the release notes):

  • User facing changes
  • API changes (V0, V1, DFC or Webhook)
  • Technical changes only
  • Feature toggled

The title of the pull request will be included in the release notes.

Dependencies

Documentation updates

@abdellani
Copy link
Member Author

abdellani commented Oct 25, 2023

The tests on 18035175968 are failing now due to a change in the behavior of the Redis client (reference)

   Failure/Error: Rails.cache.delete(@payment_method.calculator.preference_cache_key(key))
     
     TypeError:
       Unsupported command argument type: NilClass
     

redis-client-0.17.0/lib/redis_client/command_builder.rb

        command.map! do |element|
          case element
          when String
            element
          when Integer, Float, Symbol
            element.to_s
          else
            raise TypeError, "Unsupported command argument type: #{element.class}"
          end
        end

Do we sill want to upgrade?
we'll need to check the types before calling Rails.cache.delete
we'll also need to make sure that the upgrade will not introduce bugs with jobs running by Sidekiq

Tests on 18036538555 are passing on my local environment, I don't know why they not passing on the pipeline.

@abdellani abdellani marked this pull request as ready for review October 25, 2023 10:22
@dacook
Copy link
Member

dacook commented Oct 26, 2023

Thanks for investigating this. hiredis-client seems to be a good choice. It's a shame there's not a clear migration path.
I looked a bit on Ruby Toolbox: https://www.ruby-toolbox.com/compare/hiredis,hiredis-client,redis?display=table&order=score (it's confusing the gem names don't match the github names)

I wonder why we don't use the redis gem directly. @mkllnk do you have any insights you can share?

Otherwise, I agree with your approach, it makes sense to update our code to prevent sending nil parameters.

@mkllnk
Copy link
Member

mkllnk commented Oct 30, 2023

Things may have changed but we tried this before and it caused connection problems in production:

We need to investigate this further.

Copy link
Member

@mkllnk mkllnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so confusing. 😵

I just went through the different gems to find out what is doing what and my findings are:

  • We depend on the redis gem. That's all we actually need.
  • The hiredis gem was added for better performance but could be removed if it causes problems.
  • The redis gem used to include a hiredis driver. So we would use redis which would know how to leverage hiredis for better performance.
  • Version 5 of redis removed the hiredis driver and that causes the LoadError.
  • At the same time, it started using the redis-client gem which does have a hiredis driver in its "companion gem" hiredis-client.

From the readme of the redis-client gem, the upgrade path would just be to add the hiredis-client gem while keeping the hiredis gem which is the thing that does the actual work. It doesn't mention any additional require statements. I would think that bundler requires it already and the default driver is set automatically. I was able to confirm that by calling RedisClient.default_driver in the console.

I will share where I got to. But there's another issue still open:

We may need to test this on a newer staging server for a bit to see if we get that error again. Last messages on that issue suggest that even after switching to hiredis-client that error was still there. So if we don't have a particular reason to upgrade redis then maybe we can keep the current version or dump hiredis altogether hoping that the default redis is catching up on the performance.

Gemfile Show resolved Hide resolved
Gemfile Outdated Show resolved Hide resolved
@mkllnk
Copy link
Member

mkllnk commented Oct 30, 2023

I think that I'm still confused. I tried this branch now and everything seems to work even after uninstalling hiredis. And it's using the hiredis-client. Maybe the hiredis interface comes with the redis server and redis-client has everything it needs to use it? Only a performance benchmark could tell us if it actually works.

@mkllnk
Copy link
Member

mkllnk commented Oct 30, 2023

I did a quick benchmark. Without hiredis-client:

r = RedisClient.config.new_client
t0=Time.now; 100_000.times { r.call("SET", "foo", "bar") }; puts Time.now - t0;
# 4.572544891

With hiredis-client:

t0=Time.now; 100_000.times { r.call("SET", "foo", "bar") }; puts Time.now - t0;
# 3.840093155

In this simple test, we reduce execution time by 16%. Let's go with this. I'll just add a commit for cleaning up.

Copy link
Member

@dacook dacook left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks for looking into all of this Maikel.

Based on that info, this seems to be the right approach, and specs pass!
So we can try deploying to staging. I'd suggest all three servers given the issues had before.

In addition, it seems worth doing some testing. I didn't see any mention on that old PR about what actions caused the issues, but Maikel previously suggested bulk invoice printing which covers cable_ready, so I suggest we try that.

@dacook dacook added pr-staged-uk staging.openfoodnetwork.org.uk pr-staged-au staging.openfoodnetwork.org.au pr-staged-fr staging.coopcircuits.fr labels Oct 31, 2023
@dacook
Copy link
Member

dacook commented Oct 31, 2023

Deploy to Staging:

  • au-staging (Ubuntu 16)
  • uk-staging (Ubuntu 16)
  • fr-staging (Ubuntu 18)
TASK [deploy : precompile assets] **********************************************
fatal: [staging.coopcircuits.fr]: FAILED! 
...
Compiling...
Compilation failed:
Browserslist: caniuse-lite is outdated. Please run:
  npx update-browserslist-db@latest
  Why you should do it regularly: https://github.com/browserslist/update-db#readme

DEPRECATION WARNING: Using / for division is deprecated and will be removed in Dart Sass 2.0.0.
...
  1. Retrying..
TASK [deploy : clear transient data] *******************************************
changed: [staging.coopcircuits.fr]
Tuesday 31 October 2023  02:32:44 +0000 (0:00:14.761)       0:01:29.812 ******* 
client_loop: send disconnect: Broken pipe
Error: Process completed with exit code 255. [2023-10-31T02:37:17Z]

Died after waiting 5 mins for task to complete. Why does that "transient data" step take so long anyway?

  1. Retrying...
PLAY RECAP *********************************************************************
staging.coopcircuits.fr    : ok=31   changed=16   unreachable=0    failed=0    skipped=6    rescued=0    ignored=0   

Tuesday 31 October 2023  02:53:35 +0000 (0:00:00.738)       0:06:31.243 ******* 

Success! 🤷

@dacook
Copy link
Member

dacook commented Oct 31, 2023

Testing [au-staging]

https://staging.openfoodnetwork.org.au/admin/orders
I tried send invoices, and cancel orders, all seems to work as expected. Order state column was updated on-page as expected. Email notifications probably went to @drummer83 (sorry!).
There was one error raised in Bugsnag, but it seems unrelated, maybe due to staging data or a bug in new tax logic.

Any other testing necessary?

If not, and if deploy to all staging servers succeeds, shall we merge?

@dacook dacook removed pr-staged-uk staging.openfoodnetwork.org.uk pr-staged-au staging.openfoodnetwork.org.au pr-staged-fr staging.coopcircuits.fr labels Oct 31, 2023
@mkllnk
Copy link
Member

mkllnk commented Oct 31, 2023

The previous pull request was merged and then we noticed a flaky spec. The use of cable_ready has been increased since and we ironed out a few specs to work more reliably with reflexes. So I'm more certain now that the error would show up more consistently instead of just being flaky. Plus your testing, it all seems good. Let's merge.

@mkllnk mkllnk merged commit e85a1ce into openfoodfoundation:master Oct 31, 2023
56 checks passed
@mkllnk
Copy link
Member

mkllnk commented Nov 1, 2023

We need to revert this again. Lots of test failures.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

3 participants