XDS Ring Hash inconsistent after client and server restarts #6927

Open
k-raval opened this issue Jan 18, 2024 · 30 comments · May be fixed by #7156

k-raval commented Jan 18, 2024

GRPC Version - 1.60.1
Go Version - 1.19.1
OS - Linux 5.4.0-81-generic #91-Ubuntu SMP aarch64 GNU/Linux

Setup:

  • Kubernetes cluster with Istio (1.20.2) installed
  • 3 GRPC servers (S1, S2, S3) run as one kubernetes deployment with 3 replicas
  • 3 GRPC clients (C1, C2, C3) run as one kubernetes deployment with 3 replicas
  • Istio virtual service and destination rule pointing to server-service, with the load-balancer policy set to consistent hashing on a gRPC header (a sketch of such a DestinationRule appears just after this list)
  • Servers and Clients using GRPC XDS feature
  • Clients send a Hello message to the server over gRPC with a metadata header whose value is "XYZ"
  • The gRPC xDS ring hash is supposed to hash on this header and route the request to one of the servers
  • All clients repeat this in a loop with the same header value, so all messages reach just one server (the one chosen based on the hash of the value "XYZ"); a minimal client sketch follows this list
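For reference, the consistent-hash part of the Istio configuration described above would look roughly like the DestinationRule below. This is a sketch, not the reporter's actual manifest; the host and the header name (ULID, which only appears later in the thread) are assumptions.

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: server-service
spec:
  host: server-service.default.svc.cluster.local   # assumed service host
  trafficPolicy:
    loadBalancer:
      consistentHash:
        # Hash requests on this gRPC metadata header; istiod translates this
        # into a RING_HASH cluster plus a header hash_policy on the route.
        httpHeaderName: ULID
```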

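To make the client side concrete, here is a minimal sketch of what such a proxyless client loop might look like. It is not the attached repro code: the target URI, the ULID header name, the "XYZ" value, and the use of the standard helloworld proto are all assumptions for illustration.

```go
package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	xdscreds "google.golang.org/grpc/credentials/xds"
	"google.golang.org/grpc/metadata"
	_ "google.golang.org/grpc/xds" // registers the xds:/// resolver and LB policies

	pb "google.golang.org/grpc/examples/helloworld/helloworld" // stand-in proto for illustration
)

func main() {
	// xDS credentials fall back to plaintext if the control plane sends no
	// security config. A bootstrap file (GRPC_XDS_BOOTSTRAP) is required; in
	// an Istio proxyless setup the Istio agent typically provides it.
	creds, err := xdscreds.NewClientCredentials(xdscreds.ClientOptions{FallbackCreds: insecure.NewCredentials()})
	if err != nil {
		log.Fatalf("failed to create xDS credentials: %v", err)
	}

	// Hypothetical target; the real deployment would point at the server service.
	conn, err := grpc.Dial("xds:///server-service.default.svc.cluster.local:50051",
		grpc.WithTransportCredentials(creds))
	if err != nil {
		log.Fatalf("failed to create channel: %v", err)
	}
	defer conn.Close()
	client := pb.NewGreeterClient(conn)

	for {
		// The ring hash policy hashes this header, so every request carrying
		// the same value should land on the same server.
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		ctx = metadata.AppendToOutgoingContext(ctx, "ULID", "XYZ")
		if _, err := client.SayHello(ctx, &pb.HelloRequest{Name: "hello"}); err != nil {
			log.Printf("RPC failed: %v", err)
		}
		cancel()
		time.Sleep(time.Second)
	}
}
```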
Problem:
When one server and one client are each killed (or crash at the same time) and restart, the hashing pattern changes. Some clients send the message to one server, while other clients send the same message to S2. The header field value is still the same, yet different clients now hash it to different destinations.

Expectation:
No matter in what order client or server go down and come back up, the final hashing logic should be such that a given field value should be hashed to exactly same destination.

Other Information:
Instead of using gRPC xDS hashing, if we use an Envoy sidecar, which does its own hashing, we always see the expected behaviour. However, we want to use the gRPC xDS features, since they avoid the extra hop through the Envoy proxy and give us better performance.

@easwars - we discussed something similar in question/answer format some time back in #5933. Appreciate your help here; thanks in advance.


k-raval commented Jan 18, 2024

A point to add: since this setup is on Kubernetes, when the clients or servers crash and come back up, they come up with different IP addresses/names. Just for info, in case that makes any difference.


k-raval commented Jan 30, 2024

Hi @ginayeh, any update? Let me know if more info is needed. Thanks.


k-raval commented Feb 6, 2024

Hi @zasweq, @ginayeh - any update? Awaiting your inputs. Thanks.


ginayeh commented Feb 6, 2024

Hi @k-raval, thanks for your patience. Are you only seeing this inconsistent behavior when both a client AND a server are killed? Would you mind sharing your example code so we can reproduce the inconsistent ring hash behavior?


k-raval commented Feb 7, 2024

To answer your question: when I kill either one server or one client independently, this behaviour is not observed; but when I kill a server and a client simultaneously, the problem appears.

Attaching the codebase here. Directory contains:

  1. Client code. <go build -o client cmd/client/main.go>
  2. Server code <go build -o server cmd/server/main.go>
  3. server.yaml file - replace the path of the local dir from which the server binary is mounted; search for the path /home/kraval/test/grpctest
  4. client.yaml file - replace the path of the local dir from which the client binary is mounted; search for the path /home/kraval/test/grpctest

Environment:
K3s based Kubernetes cluster
Istio installed using Helm chart (just the Istio-base and Istiod, Ingress Gateway not required)

Steps:

  1. Start server - kubectl apply -f server.yaml
  2. Wait for servers to start, it will start 3 replicas using a deployment
  3. Start client - kubectl apply -f client.yaml
  4. Again 3 replicas of client will start.
  5. Client has 4 different unique strings, 1-AAAA...., 2-BBBB...., 3-CCCC...., 4-DDDD....
  6. Each of the above unique strings is attached to the outgoing message as a gRPC metadata pair.
  7. Since the Istio rules are set up to hash on this string, which in turn maps to the gRPC xDS ring hash, the outgoing messages from the clients are hashed on it and reach the appropriate server.
  8. Note the logs of the 3 servers. Each unique string reaches exactly 1 server; e.g. 1-AAAA reaches some server, say server-1, irrespective of which client (1/2/3) sent it. Hence you see 3 messages of 1-AAAA reaching that server (one from each client). A sketch of a server handler that logs this header follows these steps.
  9. kubectl delete pod (pick any one server and one client to kill)
  10. The client and server come back up. Again observe all 3 server logs.
  11. It is observed that one particular unique key gets hashed to different destinations by different clients: you start seeing 2-BBBB messages reaching server-1 as well as server-2. So, different clients are now hashing differently.
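For step 8, a server only needs to log the incoming metadata to show which keys the ring hash routed to it. Below is a minimal sketch of such a handler. It is not the attached repro code: it assumes the standard helloworld example proto and the ULID header name (taken from the hash_policy shown later in this thread), and it omits the xDS-enabled server setup.

```go
package main

import (
	"context"
	"log"
	"net"

	"google.golang.org/grpc"
	"google.golang.org/grpc/metadata"

	pb "google.golang.org/grpc/examples/helloworld/helloworld" // stand-in proto for illustration
)

type server struct {
	pb.UnimplementedGreeterServer
}

func (s *server) SayHello(ctx context.Context, req *pb.HelloRequest) (*pb.HelloReply, error) {
	// Log the hashed header so each server's log shows which unique strings
	// the ring hash routed to this replica.
	if md, ok := metadata.FromIncomingContext(ctx); ok {
		log.Printf("received %q with ULID=%v", req.GetName(), md.Get("ULID"))
	}
	return &pb.HelloReply{Message: "hello " + req.GetName()}, nil
}

func main() {
	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}
	s := grpc.NewServer()
	pb.RegisterGreeterServer(s, &server{})
	if err := s.Serve(lis); err != nil {
		log.Fatalf("failed to serve: %v", err)
	}
}
```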

Thanks. Let me know if you want to try something.
hashtest.tgz

k-raval removed their assignment on Feb 12, 2024

k-raval commented Feb 12, 2024

Hi @ginayeh, let me know if you need any help in reproducing the issue. Some more info: I used Calico as the CNI on the K3s cluster, and my Linux was Ubuntu 22.04 with uname: Linux kartik-ubuntu 5.15.0-92-generic #102-Ubuntu SMP Wed Jan 10 09:37:39 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux


k-raval commented Feb 19, 2024

Hi @ginayeh, any update so far? Let me know. Thanks.

arvindbr8 assigned themselves and unassigned ginayeh on Feb 28, 2024

k-raval commented Mar 4, 2024

Hi @arvindbr8, let me know if you need any help in simulating this. Awaiting your reply. Thanks.

arvindbr8 commented:

@k-raval -- We appreciate your patience here. It's just taking a bit of time to get up to context on xDS ring hash consistency. Will get back to you ASAP.

arvindbr8 commented:

Hi @k-raval -- apologies. We are a bit thin on the ground with multiple teammates on vacation, hence the delayed turnaround time.

I would lean on you here for understanding the behavior of istiod. Looking at your setup, it seems like you are expecting header-based ring hash load balancing. As per the spec:

header: gRPC will support this type of hash policy. This allows hashing on a request header. The header matching logic will behave the same way as specified for route matching, as per gRFC A28 (i.e., it will ignore headers with a -bin suffix and may need special handling for content-type).

And this is the code that actually generates the hash for each RPC (the case xdsresource.HashPolicyTypeHeader: branch).

Based on the reading above, the header to use for hashing needs to come through the xDS response as a RouteAction in the hash_policy field. I'm not sure what the xDS response in an Istio environment would look like, and I would like your help here. Could you paste the RouteAction part of the xDS response so that we can debug better?

PS: The particular scenario where the client and server restart is one of those cases where the definition of persistence differs from person to person, especially because in gRPC the ring changes whenever backends are scaled down/up (or there is any churn of backends). It's also important to note that a new gRPC channel could potentially create a different ring for the same backends, depending on the ordering of the endpoints in the response from the xDS resolver.
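To illustrate why restarts matter here, the following is a deliberately simplified sketch of a ring-hash lookup. It is not the grpc-go implementation (per gRFC A42 the real policy uses xxHash64 and places multiple weighted entries per endpoint); it uses FNV and one entry per address purely to show that the ring is built from the backend addresses, so when pods restart with new IPs the key-to-backend mapping can change even though the header value does not.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// hash64 is a stand-in hash function; the real policy uses xxHash64.
func hash64(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

type entry struct {
	hash uint64
	addr string
}

// buildRing places one entry per backend address on the ring (the real
// policy adds many weighted entries per endpoint).
func buildRing(addrs []string) []entry {
	ring := make([]entry, 0, len(addrs))
	for _, a := range addrs {
		ring = append(ring, entry{hash: hash64(a), addr: a})
	}
	sort.Slice(ring, func(i, j int) bool { return ring[i].hash < ring[j].hash })
	return ring
}

// pick returns the first ring entry at or after the request hash, wrapping around.
func pick(ring []entry, requestHash uint64) string {
	i := sort.Search(len(ring), func(i int) bool { return ring[i].hash >= requestHash })
	if i == len(ring) {
		i = 0
	}
	return ring[i].addr
}

func main() {
	// Pod IPs change across restarts, so the ring entries (and therefore the
	// key-to-backend mapping) can change even if the header value does not.
	before := buildRing([]string{"10.0.0.11:50051", "10.0.0.12:50051", "10.0.0.13:50051"})
	after := buildRing([]string{"10.0.0.11:50051", "10.0.0.13:50051", "10.0.0.27:50051"}) // one pod restarted with a new IP
	key := hash64("1-AAAA")
	fmt.Println(pick(before, key), pick(after, key))
}
```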


k-raval commented Mar 15, 2024

Thanks @arvindbr8.
I started 3 clients and 3 servers and collected the gRPC logs of one particular client to capture the xDS response. Attaching the log file here.

For your comment on ring formation: irrespective of the order in which the xDS resolver sends the list of server endpoints, the client could order them (alphabetically, say) before constructing the ring, so that all clients follow the same behaviour; hashing on a given parameter would then choose the same destination server across clients. Let me know if this makes sense.

xds.txt

arvindbr8 commented:

Thanks for the logs. The RouteAction looks correct to me with the hash_policy providing the header to use.

"hash_policy":{"header":{"header_name":"ULID"}}

Also, I should rephrase my previous comment about ring formation.

the ring changes

What I meant here is that it's hard to guarantee that the ring will look alike across all the clients. The current hashing algorithm tries to ensure that, on an address update, only a small portion of the keys in the ring are remapped within the same binary. But when the client restarts, the shape of the ring might look totally different.

For your comment on ring formation: irrespective of the order in which the xDS resolver sends the list of server endpoints, the client could order them (alphabetically, say) before constructing the ring, so that all clients follow the same behaviour; hashing on a given parameter would then choose the same destination server across clients. Let me know if this makes sense.

The ring hash LB policy is implemented as per the xDS spec. Keeping consistency across different runs of the client doesn't seem to be in the scope of this LB policy. It seems like the ring hash LB policy might be too barebones for your use case here.

A55-xds-stateful-session-affinity.md might be something you are interested in, with the caveat that it has its own limitations in client/server shutdown cases. However, this feature is pending implementation in Go and has not been prioritized yet.


k-raval commented Mar 20, 2024

Hi @arvindbr8 , thanks for explanation.

My requirement is simple. For a given parameter (e.g. the ULID string above), I want all the clients, started independently and stateless, to hash it to the SAME destination server endpoint. If this costs me unequal load distribution, I am fine with it. If the server endpoints change (add/delete/crash-restart), the same ULID may get hashed to a different server endpoint; that is fine, but I need all the clients to follow the same behaviour independently, without exchanging any state.

And I see this working with an Envoy proxy doing the load balancing using the same configuration. However, to avoid the cost of the proxy, I wanted to use gRPC proxyless mode using xDS, and hence expected the same behaviour here as well.

Let me know if there is any solution you can see, since you are much more familiar with the gRPC codebase.

Thanks again and awaiting your reply.

arvindbr8 commented:

Let me follow up with the team to get you better guidance for this use case. Meanwhile, could you confirm that in the Envoy proxy case you are also using the same ring hash LB policy?


k-raval commented Apr 3, 2024

Thanks. The logs were as-is; nothing was stripped from them. I can test from your branch with logs added. Let me know the branch.

arvindbr8 commented:

Apologies for the delay. Could you try with this branch?

github-actions bot commented:

This issue is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed.

github-actions bot added the stale label on Apr 15, 2024
arvindbr8 removed the stale label on Apr 15, 2024
arvindbr8 commented:

Removing the stale label to give more time for the reproduction, since there might be a potential bug here.


k-raval commented Apr 16, 2024

Hi @arvindbr8, I was able to reproduce the issue with your branch with logs enabled.

Sequence of events:

  • 3 clients, 3 servers running
  • Client-2 and Server-2 killed at the same time.
  • Somehow, no issue was observed
  • Client-1 and Server-1 killed at the same time
  • Issue now observed.

Attaching logs:
client2.txt - client2 logs before restart
client2-restarted.txt - client2 logs after restart
client1.txt - client1 logs before restart
client1-restarted.txt - client1 logs after restart (issue is seen here in these logs)
client3.txt - client3 was never restarted; its logs contain all events, including both the client1 and client2 restarts.

testlogs.tgz


arvindbr8 commented Apr 17, 2024

Thanks.

Somehow, no issue was observed

This shows evidence that the behavior is not deterministic. I'm now somewhat inclined to believe there is an interaction with istiod's ADS responses that makes gRPC's behavior non-deterministic. Anyway, I'll take a look.

arvindbr8 commented:

I feel like I have found what the issue could be. I have pushed some more commits to https://github.com/arvindbr8/grpc-go/tree/dbug_6927. This branch contains a potential fix and additional debug messages (which would be useful in case the fix does not completely resolve the issue).

Do you mind running the test again with the HEAD of https://github.com/arvindbr8/grpc-go/tree/dbug_6927?


k-raval commented Apr 18, 2024

Hi @arvindbr8, happy to report that with your new commit the issue seems to be fixed. Attaching the logs again for your review. Thanks a lot for putting in so much effort.

I tried the same procedure twice, and each time I did two sets of restarts (each set consisting of one client and one server).
Attaching logs from both runs.

logsnew2.tgz
logsnew.tgz

k-raval removed their assignment on Apr 18, 2024
github-actions bot commented:

This issue is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed.
