
Introduce grpc health check in etcd client #16276

Open

chaochn47 opened this issue Jul 20, 2023 · 17 comments

Comments

@chaochn47
Member

chaochn47 commented Jul 20, 2023

What would you like to be added?

Background

gRPC Health checks are used to probe whether the server is able to handle RPCs. A server may choose to reply “unhealthy” because it is not ready to take requests, is shutting down, or for some other reason. The client can act accordingly if no response is received within some time window or if the response reports an unhealthy status.

ref.

  1. https://github.com/grpc/grpc/blob/master/doc/health-checking.md
  2. https://github.com/grpc/proposal/blob/master/A17-client-side-health-checking.md

#8121 added a basic gRPC health service on the server side only, starting in etcd v3.3.

// server should register all the services manually
// use empty service name for all etcd services' health status,
// see https://github.com/grpc/grpc/blob/master/doc/health-checking.md for more
hsrv := health.NewServer()
hsrv.SetServingStatus("", healthpb.HealthCheckResponse_SERVING)
healthpb.RegisterHealthServer(grpcServer, hsrv)
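
For contrast, the client-side health checking described in gRFC A17 is enabled in grpc-go roughly as shown below. This is only a sketch, not etcd client code: the dial helper and target are illustrative, and the empty service name matches the server registration above.

import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	// Importing the health package registers the client-side health
	// checking function with grpc-go.
	_ "google.golang.org/grpc/health"
)

// dialWithHealthCheck is a hypothetical helper: per gRFC A17, client-side
// health checking only takes effect together with a load-balancing policy
// such as round_robin.
func dialWithHealthCheck(target string) (*grpc.ClientConn, error) {
	serviceConfig := `{
		"loadBalancingConfig": [{"round_robin":{}}],
		"healthCheckConfig": {"serviceName": ""}
	}`
	return grpc.Dial(target,
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithDefaultServiceConfig(serviceConfig),
	)
}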

Problem

With multiple etcd server endpoints, the etcd client only fails over to another endpoint when the existing connection/channel is not in the Ready state. However, the etcd client does not know whether the etcd server can actually handle RPCs.

For example

  1. defrag stops the process completely for noticeable duration #9222
  2. Removed etcd member failed to stop on stuck disk #14338

This needs a comprehensive design and testing.

Placeholder Google doc “etcd client grpc health check”, copied from the KEP template.

Why is this needed?

Improve etcd availability by failing over to other healthy etcd endpoints

@chaochn47
Member Author

@serathius @ahrtr @wenjiaswe @ptabor @etcd-io/members for discussion if this feature is wanted.

@fuweid
Contributor

fuweid commented Jul 23, 2023

Hi @chaochn47 , based on your PR #16278, it seems that the PR marks the not_serving status because of defrag. Should it be applied to all maintenance APIs, is that correct? My only concern is that, based on https://github.com/grpc/grpc-go/blob/d7f45cdf9ae720256d328fcdb6356ae58122a7f6/health/client.go#L104, it might impact on-going requests (still transferring data) and force the connection to be recreated. (I haven't verified it yet, will update later)

@serathius
Member

For me this duplicates #16007. We need a single comprehensive design for etcd probing. As part of this work we can update the gRPC probes.

@chaochn47
Member Author

chaochn47 commented Jul 24, 2023

Thanks @fuweid for the review!

based on your PR #16278, it seems that the PR marks the not_serving status because of defrag. Should it be applied to all maintenance APIs, is that correct?

Online defrag is one of the use cases where the server reports “unhealthy” to the client so that requests fail over. I do not plan to support any other maintenance APIs for now. #16278 is just a POC; ideally the health server should not be exposed to the whole maintenance server, only to the defrag API.
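
To illustrate the idea only (this helper is hypothetical, not what #16278 does verbatim), the server would flip the shared health server around defrag roughly like this:

// Hypothetical sketch: mark the member NOT_SERVING for the duration of an
// online defrag so that health-checking clients fail over, then restore
// SERVING once defrag completes.
func defragWithHealthStatus(hsrv *health.Server, defrag func() error) error {
	hsrv.SetServingStatus("", healthpb.HealthCheckResponse_NOT_SERVING)
	defer hsrv.SetServingStatus("", healthpb.HealthCheckResponse_SERVING)
	return defrag()
}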

My only concern is that, based on https://github.com/grpc/grpc-go/blob/d7f45cdf9ae720256d328fcdb6356ae58122a7f6/health/client.go#L104, it might impact on-going requests (still transferring data) and force the connection to be recreated.

I am not too concerned about this failure mode since it is not called out in https://github.com/grpc/proposal/blob/master/A17-client-side-health-checking.md or grpc/proposal#97. But I will validate it by adding some test cases to https://github.com/grpc/grpc-go/blob/master/test/healthcheck_test.go in case it is an unexpected Go-specific implementation issue.

@chaochn47
Member Author

chaochn47 commented Jul 24, 2023

Thanks @serathius for taking a look!

For me this duplicates #16007. We need a single comprehensive design for etcd probing. As part of this work we can update the gRPC probes.

I don't agree that it's a duplicate of #16007. I believe the /livez and /readyz concepts are borrowed from https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/ and form a polling health check mechanism. It is designed for etcd administration: determining when to restart/replace an etcd process/container, or when to route client traffic to an etcd server.

However, gRPC client-side health checking relies on the etcd (gRPC) server pushing its health status to the client via a server-side streaming API so that requests can fail over. It is designed only for the round-robin policy and will improve etcd availability.
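
For context, the push mechanism is the standard Health.Watch server-streaming RPC from grpc_health_v1. A client consumes it roughly as follows (sketch only; the gRPC balancer performs the equivalent internally when client-side health checking is enabled):

// Sketch: subscribe to health updates for the empty ("all services") name
// on an existing *grpc.ClientConn.
func watchHealth(ctx context.Context, conn *grpc.ClientConn) error {
	stream, err := healthpb.NewHealthClient(conn).Watch(ctx,
		&healthpb.HealthCheckRequest{Service: ""})
	if err != nil {
		return err
	}
	for {
		resp, err := stream.Recv()
		if err != nil {
			return err
		}
		if resp.Status != healthpb.HealthCheckResponse_SERVING {
			// Treat this endpoint as unhealthy and let new requests
			// fail over to other endpoints.
		}
	}
}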

I do agree that some of the server health status determination logic can be re-used, but the two proposals don't share the same motivation. For example, we definitely do not want to terminate an etcd server while online defrag is active, but we do expect the client to route requests to other etcd servers.

@wenjiaswe @logicalhan please drop a comment in case I misunderstood the proposal #16007, thx~

@serathius
Member

serathius commented Jul 25, 2023

However, gRPC client-side health checking relies on the etcd (gRPC) server pushing its health status to the client via a server-side streaming API so that requests can fail over. It is designed only for the round-robin policy and will improve etcd availability.

Pull vs. push just changes who controls the validation period: server or client.

What's important here is the definition of etcd health. I expect both types of health checking will end up with a pretty similar definition.

@chaochn47
Member Author

Thanks. I think we are on the same page now.

Do you still think it's a duplicate of #16007? Will it cover how to consume the health status change?

I would like the etcd client to fail over to other healthy endpoints based on etcd health, not just on whether the connection is up or down.

@chaochn47
Member Author

/cc @jyotima

@wenjiaswe
Contributor

gRPC Health checks are used to probe whether the server is able to handle RPCs.

That sounds like one case of “not ready” for /readyz; is my understanding correct?

@chaochn47
Member Author

Yeah, I think so, but I want this thread to focus on etcd client failover via gRPC health checks.

I suggest we move the conversation about classifying etcd failure modes into probes to #16007.

@serathius
Member

This issue can be left open as a follow-up to introduce gRPC client health checks once we figure out proper health definitions.

@ahrtr
Member

ahrtr commented Jul 26, 2023

It's a valid feature to me, and I agree we should design it together with the Livez/Readyz feature, because the key is how to define health (e.g. being stuck on a write for longer than a threshold).

Also two detailed comments:

@chaochn47
Member Author

Thanks @ahrtr for the review,

We definitely want any feature request to be future-proof and not thrown away. However, unless we have a foreseeable approach to resolving the above two items in the upcoming year, this issue can be left on the table.

@chaochn47
Member Author

chaochn47 commented Aug 1, 2023

My only concern is that, based on https://github.com/grpc/grpc-go/blob/d7f45cdf9ae720256d328fcdb6356ae58122a7f6/health/client.go#L104, it might impact on-going requests (still transferring data) and force the connection to be recreated. (I haven't verified it yet, will update later)

Hi @fuweid, I think the on-going RPCs can still proceed; please check the simulated test case chaochn47/grpc-go#1. Thank you ~

I will polish it up and try to merge it into the grpc-go tests.

One caveat: if the watch stream is not interrupted, a client that builds a cache on top of the watch may serve stale data. To terminate the watch stream, I guess the etcd server needs to be stopped.

@chaochn47
Member Author

One caveat: if the watch stream is not interrupted, a client that builds a cache on top of the watch may serve stale data. To terminate the watch stream, I guess the etcd server needs to be stopped.

FWIW, when disk I/O is blocked/stalled, the partitioned node stays in the state s.Leader() != types.ID(raft.None). The watch is not cancelled because monitorLeader depends on s.Leader() == types.ID(raft.None). This is explained in #13527, with reproduction steps provided.

The kube-apiserver watch cache built on top of the etcd watch will not receive updates and could potentially serve stale data. Instead, the watch should be cancelled by the server, requiring the client to re-create the watch on a different etcd node that is connected to the quorum.

So after etcd health status is properly defined, it should be consumed by monitorLeader as well.
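
To illustrate that last point (purely hypothetical; isUnhealthy does not exist today), the monitorLeader-style check could be extended roughly along these lines:

// Hypothetical: today long-running streams are only torn down when the member
// reports no leader; the suggestion is to also tear them down when a broader
// health definition marks the member unhealthy.
if s.Leader() == types.ID(raft.None) || isUnhealthy(s) {
	// Cancel long-running watch streams so clients reconnect to a healthy member.
}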

@fuweid
Contributor

fuweid commented Aug 2, 2023

I think the on-going RPCs can still proceed; please check the simulated test case chaochn47/grpc-go#1. Thank you ~

@chaochn47 Thanks for the confirmation. I also verified it locally. It won't impact streaming or ongoing unary requests. Side note: if the client, like etcdctl, doesn't enable the health check, it can still get a connection. I think the existing pod exec liveness probe won't be impacted.

@lavacat

lavacat commented Aug 5, 2023

I think this is very useful, specifically for defrag. In clusters with a large db size and high traffic this is a real problem. When maxGapBetweenApplyAndCommitIndex is reached and defrag is still blocking applies, etcd will fail requests.
