fix: make autopipelining group commands by slot instead of by node #1693

Open · wants to merge 1 commit into main
Conversation

gustavotoyota

This change solved my problems with auto-pipelining + clusters.
ioredis itself states that pipeline commands should target keys in the same slot:
https://github.com/luin/ioredis/blob/807cfc0e211f72885e00228edc5e72878916a938/lib/Pipeline.ts#L309
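A rough sketch of the grouping idea (illustrative only, not the actual patch — `PendingCommand` and `groupBySlot` are made-up names): bucket the commands queued during a tick by the key's hash slot instead of by the node that currently serves the key, so every auto-built pipeline targets a single slot.

    import calculateSlot from "cluster-key-slot";

    interface PendingCommand {
      name: string;
      args: (string | number | Buffer)[];
      key: string; // the key the command operates on
    }

    // Group commands by hash slot (0..16383) rather than by node, so a MOVED
    // redirection only affects the group for that one slot.
    function groupBySlot(commands: PendingCommand[]): Map<number, PendingCommand[]> {
      const groups = new Map<number, PendingCommand[]>();
      for (const cmd of commands) {
        const slot = calculateSlot(cmd.key);
        const bucket = groups.get(slot) ?? [];
        bucket.push(cmd);
        groups.set(slot, bucket);
      }
      return groups;
    }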

@gustavotoyota
Author

This is the issue I opened, where I explored the problem and found the solution: #1692

@luin luin requested a review from TysonAndre December 16, 2022 07:09
@TysonAndre
Collaborator

TysonAndre commented Dec 16, 2022

This looks correct, but it may be a throughput regression (especially if there is a long backlog of commands and the TCP buffers fill up, though I'm unsure about the specifics of Node's tcp/stream internals with noDelay).

    - `retryDelayOnMoved`: By default, this value is `0` (in ms), which means that when a `MOVED` error is received, the client will resend
      the command instantly to the node returned with the `MOVED` error. However, it sometimes takes time for the cluster state to stabilize
      after a failover, so adding a delay before resending can prevent a ping-pong effect.
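For reference, a minimal sketch of turning that knob on (placeholder endpoint; only `retryDelayOnMoved` is taken from the docs quoted above):

    import { Cluster } from "ioredis";

    const cluster = new Cluster([{ host: "127.0.0.1", port: 7000 }], {
      // Wait 100 ms before resending a command that was answered with MOVED,
      // giving the cluster time to stabilize after a failover.
      retryDelayOnMoved: 100,
    });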

I wonder if the lib/cluster/index.ts handling of "MOVED" is wrong for pipelines and it retries the entire pipeline? I'm not familiar with how that works.

From the code of lib/Pipeline.ts, it looks like the entire pipeline is re-executed on MOVED, which would make this fix appropriate:

        const cluster = this.redis as Cluster;
        cluster.handleError(commonError, this.leftRedirections, {
          moved: function (_slot: string, key: string) {
            _this.preferKey = key;
            cluster.slots[errv[1]] = [key];
            cluster._groupsBySlot[errv[1]] =
              cluster._groupsIds[cluster.slots[errv[1]].join(";")];
            cluster.refreshSlotsCache();
            _this.exec();
          },

Note that there are 16384 slots, so this could have some impact on throughput (e.g. a hundred commands might still mean a hundred stream writes). I wonder if it would be better to group the writes from the pipelines that go to the same client belonging to the cluster into a single write() rather than issuing multiple write()s, to avoid head-of-line blocking and keep throughput high.
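Something like the following is what that batching could look like (a sketch only — `flushToNode` and `serializedPipelines` are invented names, not ioredis internals):

    import { Socket } from "net";

    // Concatenate the already-serialized RESP payloads of all slot-level pipelines
    // destined for the same node and flush them with a single write(), so many
    // small slot groups don't turn into many small socket writes.
    function flushToNode(socket: Socket, serializedPipelines: Buffer[]): void {
      if (serializedPipelines.length === 0) return;
      socket.write(Buffer.concat(serializedPipelines));
    }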

@TysonAndre
Collaborator

#1377 (comment)

Never mind, I've been misunderstanding what autopipelining does. Protocol pipelining (https://en.wikipedia.org/wiki/Protocol_pipelining) is used whether or not enableAutoPipelining is enabled; enableAutoPipelining just reduces the TCP overhead by combining smaller writes into a larger one and therefore sends fewer packets.
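In other words (a minimal sketch with placeholder connection details — only `enableAutoPipelining` is the real option):

    import Redis from "ioredis";

    // Commands are pipelined over the connection either way; with the flag on,
    // commands issued in the same event-loop tick are coalesced into fewer,
    // larger socket writes.
    const redis = new Redis({ host: "127.0.0.1", port: 6379, enableAutoPipelining: true });

    async function readBoth() {
      // Both GETs are queued in the same tick and flushed together.
      const [foo, bar] = await Promise.all([redis.get("foo"), redis.get("bar")]);
      return { foo, bar };
    }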

@TysonAndre
Collaborator

From the code of lib/Pipeline.ts, it looks like the entire pipeline is re-executed on MOVED, which would make this fix appropriate

This PR definitely looks appropriate, especially since the error handling is only done when all errors are exactly the same (e.g. the old way wouldn't handle multiple slots being moved).

Boohi added a commit to Boohi/ioredis that referenced this pull request Feb 13, 2023
@JTaylor-myenergi

@TysonAndre @luin Is there any reason this wasn't merged? We have applied this as a local patch to fix the same autopipelining issues with our Redis Clusters.

@AVVS
Collaborator

AVVS commented Mar 21, 2024

My 2 cents: you shouldn't use autopipelining if slots are frequently moved. The idea is to combine requests that happen during the same tick into as few large pipelines as possible. One node hosts many slots (by default there are 16k slots in total), so grouping by slot largely negates the benefit of autopipelining, since you'd end up with many small groups instead of a few large ones.

-1 to this PR
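A rough illustration of that trade-off (made-up keys; not ioredis code):

    import calculateSlot from "cluster-key-slot";

    // With a 3-node cluster, grouping by node yields at most 3 pipelines per tick,
    // while grouping by slot can yield up to one pipeline per distinct slot touched.
    const keys = ["user:1", "user:2", "cart:7", "session:abc", "session:def"];
    const distinctSlots = new Set(keys.map((k) => calculateSlot(k)));

    console.log(`node-level groups: at most 3; slot-level groups: ${distinctSlots.size}`);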
