Closed
Description
Bug Report / Help wanted
We have a simple 6-node redis5 cluster: no SSL, no password, bare-metal deployment, can be connected via machine IP: port directly.
And the application(running in Kubernetes environment) used to be normal until I upgrade the lettuce-core to 6.2.X.
Seems the application is stuck at connecting redis-cluster when starting up, until it is killed by Kubernetes liveness probe.
Below are the logs:
application logs under 6.2.X
After the last error='ERR unknown command HELLO
output, the timeout error stack was thrown after about 1 minute.
2023-02-01 16:32:24,741 DEBUG io.lettuce.core.protocol.CommandHandler [channel=0x3e3653f5, /10.233.75.132:55582 -> /192.168.99.171:7004, epid=0x5, chid=0x6] Completing command LatencyMeteredCommand [type=HELLO, output=GenericMapOutput [output=null, error='ERR unknown command `HELLO`, with args beginning with: `3`, `AUTH`, `default`, ``, '], commandType=io.lettuce.core.protocol.AsyncCommand]
2023-02-01 16:32:24,741 DEBUG io.lettuce.core.protocol.RedisStateMachine Decode done, empty stack: true
2023-02-01 16:32:24,741 DEBUG io.lettuce.core.protocol.CommandHandler [channel=0x892d4fa0, /10.233.75.132:42120 -> /192.168.99.171:7005, epid=0x6, chid=0x4] Completing command LatencyMeteredCommand [type=HELLO, output=GenericMapOutput [output=null, error='ERR unknown command `HELLO`, with args beginning with: `3`, `AUTH`, `default`, ``, '], commandType=io.lettuce.core.protocol.AsyncCommand]
2023-02-01 16:33:24,606 DEBUG io.lettuce.core.AbstractRedisClient Connecting to Redis at 192.168.99.171/<unresolved>:7000, initialization: 192.168.99.171/<unresolved>:7000
io.lettuce.core.RedisCommandTimeoutException: Connection initialization timed out after 1 minute(s)
at io.lettuce.core.protocol.RedisHandshakeHandler.lambda$channelRegistered$0(RedisHandshakeHandler.java:67)
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
at io.netty.util.concurrent.PromiseTask.run(PromiseTask.java:106)
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:66)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:1589)
2023-02-01 16:33:24,606 DEBUG io.lettuce.core.AbstractRedisClient Connecting to Redis at 192.168.99.171/<unresolved>:7001, initialization: 192.168.99.171/<unresolved>:7001
io.lettuce.core.RedisCommandTimeoutException: Connection initialization timed out after 1 minute(s)
at io.lettuce.core.protocol.RedisHandshakeHandler.lambda$channelRegistered$0(RedisHandshakeHandler.java:67)
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
at io.netty.util.concurrent.PromiseTask.run(PromiseTask.java:106)
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
at io.netty.util.concurrent.DefaultEventExecutor.run(DefaultEventExecutor.java:66)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:1589)
2023-02-01 16:33:24,608 DEBUG io.lettuce.core.RedisChannelHandler closeAsync()
Metadata
Metadata
Assignees
Labels
Type
Projects
Relationships
Development
No branches or pull requests
Activity
liyuntao commentedon Feb 1, 2023
liyuntao commentedon Feb 14, 2023
Hi @mp911de I feel like I've found the problem. There may exist some reason that hides the actual exception in 6.2.X.
Since we were using redis5, I tried to change the client option to protoVer=RESP2 explicitly, and the error message becomes:
and then I tried to pass an empty string to RedisURI.withPassword(), same error again.
Finally, the application can start nicely only when
null
value being explicitly passed to builder method. e.g.So a redis5 cluster without a password and such code below can reproduce the problem.
Summary:
Auto-discovery
+no password protected redis5
[-]cannot connect to Redis5 cluster under 6.2.X[/-][+]Fallback to RESP2 hides potential authentication configuration problems[/+]mp911de commentedon Feb 14, 2023
You're right, this is a problem on our side where we do not properly surface configuration issues with the password. Since Redis passwords can be empty (although not recommended), we should accept empty passwords.
Propagate exceptions during connection handshake after protocol downg…
Propagate exceptions during connection handshake after protocol downg…