Ensure command details are populated for Cluster clients #1026
I recently stumbled over a corner case in the clustered client which is sort of interesting:
Background
If the Redis nodes that the client is connecting to are degraded when `CommandLoader` is called, the command details hash returned will be `{}` - the client essentially doesn't understand what any of the commands' syntax is.
redis-rb/lib/redis/cluster/command_loader.rb
Lines 13 to 20 in 688ac95
(This is pretty rare in practice. The queries made by `SlotLoader` and `NodeLoader` immediately prior to this need to have worked - otherwise those will raise - but then all nodes must be unreachable when we're running `CommandLoader`.)
These command details are later used by `Command` to process a command into a key, so that the command can be routed to the right node. If the command details are empty and we don't understand the command syntax, `Command.determine_first_key_position` doesn't know which command argument to get the hash from, so `Command.extract_first_key` returns `''`:
redis-rb/lib/redis/cluster/command.rb
Lines 50 to 61 in 688ac95
redis-rb/lib/redis/cluster/command.rb
Lines 14 to 16 in 688ac95
...and that means that `Cluster.find_node_key` returns `nil`, because we don't know where the command needs to be routed. This is handled relatively gracefully - `find_node` routes to a random node if we don't have a specific node to route to.
redis-rb/lib/redis/cluster.rb
Lines 258 to 260 in 688ac95
redis-rb/lib/redis/cluster.rb
Lines 272 to 273 in 688ac95
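
To make the chain above concrete, here is a minimal, self-contained sketch of it. It is not the redis-rb implementation: `SketchCommand`, `SketchRouter`, the `:first_key_position` field and the toy slot function are all hypothetical stand-ins (real Redis Cluster hashes keys with CRC16 mod 16384). It only illustrates how an empty details hash turns into an empty first key, then a `nil` node key, then a random node.

```ruby
# Hypothetical, simplified stand-ins; NOT the actual redis-rb classes.
class SketchCommand
  # details: e.g. { 'get' => { first_key_position: 1 } }, or {} when loading failed
  def initialize(details)
    @details = details
  end

  # Returns the first key of the command, or '' when the command is unknown.
  def extract_first_key(command)
    name = command.first.to_s.downcase
    position = @details.dig(name, :first_key_position).to_i
    return '' if position.zero?

    command[position].to_s
  end
end

class SketchRouter
  def initialize(command, node_keys)
    @command = command
    @node_keys = node_keys
  end

  # Returns the node key that owns the command's first key, or nil if unknown.
  def find_node_key(command)
    key = @command.extract_first_key(command)
    return nil if key.empty?

    # Toy placement function, assumed for illustration only.
    @node_keys[key.bytes.sum % @node_keys.size]
  end

  # Falls back to a random node when no specific node can be determined.
  def find_node(command)
    find_node_key(command) || @node_keys.sample
  end
end
```

With a populated details hash, `find_node_key(%w[GET foo])` returns the same node every time; with `{}` it returns `nil`, and `find_node` picks any node at random.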
However, routing commands randomly means most commands incur an extra hop, because the randomly selected node is most likely the wrong one and has to redirect the client to the correct node. This works, but is a bit slower, and unnecessarily takes a dependency on an extra node (i.e. if either the random node or the node our data is on is degraded, the command will fail). The client stays wedged in this wonky state for its entire lifetime, because there's no mechanism to later update the command details. Because clients are intended to be long-lived, for most use cases this means until the process is restarted.
Change Summary
This patch changes the client to require that the initial fetch of command details succeeds, and raises a `CannotConnectError` if that's not possible. Users should already handle this error, and retry creating the client later, or use some other suitable failure handling mechanism. Since `SlotLoader` and `NodeLoader` can already raise `CannotConnectError`, this shouldn't be a breaking change to the gem's external interface.
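
As a rough illustration of the behaviour described above (not the code in this patch), the sketch below tries each node in turn and raises if none of them returns command details. Everything here is assumed for the example: the local `CannotConnectError` class stands in for `Redis::CannotConnectError`, nodes are modelled as plain lambdas, and `SketchCommandLoader` and `load_with_retry` are hypothetical names.

```ruby
# Local stand-in for Redis::CannotConnectError so the sketch runs without the gem.
class CannotConnectError < StandardError; end

module SketchCommandLoader
  module_function

  # nodes: callables that return a details hash or raise CannotConnectError.
  # Returns the first non-empty details hash; raises if every node fails.
  def load(nodes)
    nodes.each do |node|
      begin
        details = node.call
        return details unless details.empty?
      rescue CannotConnectError
        next # this node is unreachable, try the next one
      end
    end

    raise CannotConnectError, 'could not load command details from any node'
  end
end

# Hypothetical caller-side handling: retry a few times, then give up.
def load_with_retry(nodes, attempts: 3, wait: 0)
  tries = 0
  begin
    SketchCommandLoader.load(nodes)
  rescue CannotConnectError
    tries += 1
    raise if tries >= attempts

    sleep(wait)
    retry
  end
end
```

A caller that already rescues `CannotConnectError` around client creation would not need any new handling under this design; the retry helper only shows one possible policy.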