Fix Load Balancer Health Checks #724
Force-pushed from 1f3810c to a6e61bd.
This looks good to me in general.
A few quick points:
- We should probably do a bit of extra QA/testing to make sure this behaves as expected end-to-end for both the `Local` and `Cluster` policies. I can help with that.
- Should we add a quick section to our implementation details documentation describing how health checking is configured for each policy (and perhaps the reasoning, so that customers don't wonder why they can't configure it on their own anymore)?
- Could you please add a changelog item highlighting the change, especially as far as it concerns the "interface changes" (annotations being dropped in favor of a static and better LB health check configuration)?
- We should check in with the corresponding team internally to make sure the changes here will be reflected in our production documentation page as well.

Forgot one point: can you please update the annotations documentation accordingly as well?
Overall this looks awesome. Thank you Braden.
The only thing I have slight concerns with is the removal of the annotations without a deprecation period.
PR needs a rebase now.
```diff
@@ -2184,327 +2184,63 @@ func Test_buildHealthCheck(t *testing.T) {
 		errMsgPrefix string
 	}{
 		{
 			name: "tcp health check",
```
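For reference, the table-driven cases being discussed follow this general shape. This is an illustrative sketch with placeholder fields and a stand-in `buildHealthCheckSketch` helper, not the repo's actual `buildHealthCheck` signature:

```go
package main

import "fmt"

// healthCheck is a placeholder result type; the real test exercises the
// controller's buildHealthCheck, whose exact types are not shown here.
type healthCheck struct {
	Protocol string
	Port     int32
	Path     string
}

// buildHealthCheckSketch is a hypothetical helper standing in for the
// function under test.
func buildHealthCheckSketch(protocol string, port int32, path string) healthCheck {
	hc := healthCheck{Protocol: protocol, Port: port}
	if protocol == "http" {
		hc.Path = path // only HTTP checks carry a request path
	}
	return hc
}

func main() {
	// Table-driven cases, mirroring the "tcp health check" entry above.
	cases := []struct {
		name     string
		protocol string
		port     int32
		path     string
		want     healthCheck
	}{
		{name: "tcp health check", protocol: "tcp", port: 10256,
			want: healthCheck{Protocol: "tcp", Port: 10256}},
		{name: "http health check", protocol: "http", port: 10256, path: "/healthz",
			want: healthCheck{Protocol: "http", Port: 10256, Path: "/healthz"}},
	}
	for _, c := range cases {
		got := buildHealthCheckSketch(c.protocol, c.port, c.path)
		fmt.Printf("%s: ok=%v\n", c.name, got == c.want)
	}
}
```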
Should we bring some/all of the tests back now that we still support it (through the fallback annotation)?
I have added them back
When `ExternalTrafficPolicy=Local`, the configured health check node port will be used, which indicates whether the node has active pods.
In both scenarios, the change will have a positive effect on load balancing behaviour. It ensures that during lifecycle changes within the cluster, such as node autoscaling, node taints, and pods going up and down, we don't send traffic to components that are not in a state to serve traffic.
* A new annotation, `service.beta.kubernetes.io/do-loadbalancer-revert-to-old-health-check`, has been added to
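The annotation key above comes from the changelog entry; the lookup helper below is a hypothetical sketch of how such a fallback flag might be read from a Service's annotations, not the controller's actual code:

```go
package main

import "fmt"

// Key taken from the changelog entry above.
const revertAnnotation = "service.beta.kubernetes.io/do-loadbalancer-revert-to-old-health-check"

// useOldHealthCheck reports whether a Service opts back into the previous
// health check behaviour via the fallback annotation. (Hypothetical helper.)
func useOldHealthCheck(annotations map[string]string) bool {
	return annotations[revertAnnotation] == "true"
}

func main() {
	fmt.Println(useOldHealthCheck(map[string]string{revertAnnotation: "true"})) // true
	fmt.Println(useOldHealthCheck(map[string]string{}))                         // false
}
```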
Could you please add the new annotation to the annotations documentation?
added!
Our current health check implementation is flawed. The external load balancer should not be responsible for health checking the pods, since Kubernetes is already doing that. It should instead be health checking the worker nodes themselves. The behaviour also depends on the `externalTrafficPolicy` set on the Service itself.

If `externalTrafficPolicy=Local`, a health check node port is created to serve health checks for the node. If there are one or more active pods on the node, the health check returns a 200 and indicates how many healthy pods are running. If there are no active pods, it returns a 503 and fails the health checks. It is necessary for the LB to health check this dedicated endpoint to ensure we respect the lifecycle of the worker node and the pods running on it.

If `externalTrafficPolicy=Cluster`, we should health check kube-proxy running on the node. This is controlled by `--healthz-bind-address`, which defaults to `0.0.0.0:10256`. Health checking kube-proxy allows us to follow the lifecycle of the node: whether it is up and ready to serve traffic, or whether we should stop routing traffic because it is tainted, is being removed due to scaling down, or is being removed for maintenance. In this case, kube-proxy is responsible for understanding the health of each individual pod and managing which pods should receive traffic. In our current behaviour, an unhealthy pod can incorrectly mark an entire worker node as unhealthy, which isn't the case. If/when we make the switch to full Cilium, we will need to make sure we set `--kube-proxy-replacement-healthz-bind-address`.

This change delegates pod healthiness to Kubernetes, which it should have always done. The LB instead starts health checking the Kubernetes components themselves to ensure they are up and ready to process traffic, and we can stop sending traffic to them when they are no longer healthy or are being drained for any number of reasons. Pod unhealthiness will no longer mark nodes as unhealthy. This should also fix a number of edge cases customers are experiencing when using cluster autoscaling or pod autoscalers.
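The policy switch described above can be sketched as follows. The field names mirror the Kubernetes Service API, but this is an illustrative sketch with stand-in types, not the controller's implementation:

```go
package main

import "fmt"

// Minimal stand-ins for the relevant Service fields.
type serviceSpec struct {
	ExternalTrafficPolicy string
	HealthCheckNodePort   int32
}

// kube-proxy's default --healthz-bind-address is 0.0.0.0:10256.
const kubeProxyHealthzPort int32 = 10256

// healthCheckPort picks the node port the load balancer should probe.
func healthCheckPort(spec serviceSpec) int32 {
	if spec.ExternalTrafficPolicy == "Local" {
		// Local: the per-Service health check node port answers 200 while
		// the node has ready endpoints for the Service, and 503 otherwise.
		return spec.HealthCheckNodePort
	}
	// Cluster: probe kube-proxy's healthz endpoint so the LB follows the
	// node lifecycle rather than individual pod health.
	return kubeProxyHealthzPort
}

func main() {
	fmt.Println(healthCheckPort(serviceSpec{ExternalTrafficPolicy: "Local", HealthCheckNodePort: 30999})) // 30999
	fmt.Println(healthCheckPort(serviceSpec{ExternalTrafficPolicy: "Cluster"}))                           // 10256
}
```

Either way the LB probes a node-level endpoint, so a single failing pod can no longer take a whole node out of rotation.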
Force-pushed from 2d86330 to 3d31011, then from 3d31011 to eb49f1f.
LGTM. Please squash-merge once we're ready.