KEP-4631: LoadBalancer Service Status Improvements, initial proposal #4632

Draft

danwinship wants to merge 1 commit into kubernetes:master from danwinship:loadbalancerstatus

Contributor

danwinship commented May 13, 2024

One-line PR description: Initial proposal for KEP-4631

Issue link: LoadBalancer Service Status Improvements #4631

Other comments:

While updating the e2e load balancer tests after the final removal of
in-tree cloud providers, we have run into three problems:

  1. The tests have hard-coded timeouts (that sometimes differ per
     cloud provider) for deciding how long to wait for the cloud
     provider to update the service. It would make much more sense for
     the cloud provider to just provide information about its status
     on the Service object, so the tests could just monitor that.

  2. The tests recognize that not all cloud providers can implement
     all load balancer features, but in the past this was handled by
     hard-coding the information into the individual tests. (e.g.,
     `e2eskipper.SkipUnlessProviderIs("gce", "gke", "aws")`) These
     skip rules no longer work in the providerless tree, and this
     approach doesn't scale anyway. OTOH, we don't want to have to
     provide a separate `Feature:` tag for each load balancer
     subfeature, or have each cloud provider have to maintain their
     own set of `-ginkgo.skip` rules. It would be better if the e2e
     tests themselves could just figure out, somehow, whether they
     were running under a cloud provider that intends to implement the
     feature they are testing, or a cloud provider that doesn't.

  3. In some cases, because the existing tests were only run on
     certain clouds, it is not clear what the expected semantics are
     on other clouds. For example, since `IPMode: Proxy` load
     balancers can't preserve the client source IP in the way that
     `ExternalTrafficPolicy: Local` expects, should they refuse to
     provision a load balancer at all, or should they provision a load
     balancer that fails to preserve the source IP?

This KEP proposes new additions to `service.Status.LoadBalancer` and
`service.Status.Conditions` to allow cloud providers to better
communicate the status of load balancer support and provisioning, and
new guidelines on how cloud providers should handle load balancers for
services that they cannot fully support.
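
As a rough illustration of problem 1, a test could poll the proposed conditions instead of relying on a per-provider timeout. The sketch below is not part of the KEP: the `Condition` struct is a pared-down stand-in for `metav1.Condition`, `nextStep` and `waitForServing` are hypothetical helper names, and treating `LoadBalancerProvisioning=False` without `LoadBalancerServing=True` as a reason to stop waiting is an assumption about how the conditions would be used.

```go
package main

import (
	"context"
	"errors"
	"time"
)

// Condition is a pared-down, hypothetical stand-in for metav1.Condition.
type Condition struct {
	Type   string
	Status string // "True", "False", or "Unknown"
}

// Condition type names as they appear in the KEP discussion.
const (
	condProvisioning = "LoadBalancerProvisioning"
	condServing      = "LoadBalancerServing"
)

var errGaveUp = errors.New("cloud provider is not provisioning and the load balancer is not serving")

// statusOf returns the status of the named condition, or "" if the
// cloud provider has not reported that condition yet.
func statusOf(conds []Condition, condType string) string {
	for _, c := range conds {
		if c.Type == condType {
			return c.Status
		}
	}
	return ""
}

type step int

const (
	stepWait step = iota
	stepDone
	stepGiveUp
)

// nextStep decides what a test should do after one observation.
// Assumption: an explicit Provisioning=False without Serving=True
// means there is nothing left to wait for.
func nextStep(conds []Condition) step {
	switch {
	case statusOf(conds, condServing) == "True":
		return stepDone
	case statusOf(conds, condProvisioning) == "False":
		return stepGiveUp
	default:
		return stepWait
	}
}

// waitForServing polls getConds every interval (which must be positive)
// until the load balancer is serving, the provider stops provisioning,
// or ctx is cancelled. The context deadline is only a safety net.
func waitForServing(ctx context.Context, getConds func() []Condition, interval time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		switch nextStep(getConds()) {
		case stepDone:
			return nil
		case stepGiveUp:
			return errGaveUp
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}
```

In a real test, `getConds` would read `service.Status.Conditions` from the API server; here it is left as a callback so the decision logic stays independent of any client library.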

/assign @aojea @thockin

Contributor

k8s-ci-robot commented May 13, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

k8s-ci-robot added the do-not-merge/work-in-progress label

k8s-ci-robot assigned aojea and thockin

k8s-ci-robot added cncf-cla: yes kind/kep labels

Contributor

k8s-ci-robot commented May 13, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~keps/sig-network/OWNERS~~ [danwinship]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot requested review from aojea and MikeZappa87

May 13, 2024 12:25

k8s-ci-robot added sig/network approved size/XXL labels

Contributor Author

danwinship commented May 13, 2024

/sig cloud-provider
/sig testing

k8s-ci-robot added sig/cloud-provider sig/testing labels

danwinship changed the title ~~KEP-4631: LoadBalancer Service Static Improvements, initial proposal~~ KEP-4631: LoadBalancer Service Status Improvements, initial proposal

aojea reviewed

keps/sig-network/4631-loadbalancerstatus/README.md

+                   approach doesn't scale anyway. OTOH, we don't want to have to
+                   provide a separate `Feature:` tag for each load balancer
+                   subfeature, or have each cloud provider have to maintain their
+                   own set of `-ginkgo.skip` rules. It would be better if the e2e

Member

aojea May 13, 2024

I don't like "smart tests" that have different execution flows depending on what... The current e2e tests for load balancers are not good tests, as they try to test a lot of different things in one execution. A test should assert on a feature and a behavior; we may need to break the existing tests down to see if we still need to do this.

aojea reviewed

keps/sig-network/4631-loadbalancerstatus/README.md

+. In some cases, because the existing tests were only run on
+                   certain clouds, it is not clear what the expected semantics are
+                   on other clouds. For example, since `IPMode: Proxy` load

Member

aojea May 13, 2024

@thockin this was an interesting finding during the implementation of this mode in cloud-provider-kind; we need to try to flesh out more details before going to GA

Contributor Author

danwinship May 13, 2024

The problem isn't with the IPMode KEP; the distinction between "proxy mode" and "VIP mode" already existed before that; it's just that before IPMode made it explicit, it was controlled implicitly by whether the LB set Hostname or IP.

(Which is to say, even if we dropped IPMode, the problem would still exist. For example, with the default cloud-provider-aws load balancers.)

Member

thockin May 15, 2024

To this specific feature: IMO, eTP=Local means what it says. It's not an awesomely designed feature because it presumes too much, but it says "if you are forwarding external traffic, only choose a local endpoint". Proxy-ish LB impls can't really retain the client IP, regardless of eTP; that doesn't change the meaning.

Looking at API docs for it, I think we can clarify:

     // externalTrafficPolicy describes how nodes distribute service traffic they
     // receive on one of the Service's "externally-facing" addresses (NodePorts,
     // ExternalIPs, and LoadBalancer IPs). If set to "Local", the proxy will configure
     // the service in a way that assumes that external load balancers will take care
     // of balancing the service traffic between nodes, and so each node will deliver
     // traffic only to the node-local endpoints of the service, without masquerading
-    // the client source IP. (Traffic mistakenly sent to a node with no endpoints will
+    // the source IP. (Traffic mistakenly sent to a node with no endpoints will
     // be dropped.) The default value, "Cluster", uses the standard behavior of
     // routing to all endpoints evenly (possibly modified by topology and other
     // features). Note that traffic sent to an External IP or LoadBalancer IP from
     // within the cluster will always get "Cluster" semantics, but clients sending to
     // a NodePort from within the cluster may need to take traffic policy into account
     // when picking a node.

For a Proxy-ish LB the source IP is the proxy itself. eTP=Local should still preserve that. Whether it is useful or not is a question for end users.
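
The reading above can be sketched in a few lines of Go. This is a deliberately simplified model, not kube-proxy's implementation: `Endpoint`, `eligibleEndpoints`, and `sourceSeenByPod` are hypothetical names, and the "Cluster" branch assumes external traffic is always masqueraded to the node IP, which real proxies may do differently.

```go
package main

// Endpoint is a hypothetical, pared-down view of a Service endpoint.
type Endpoint struct {
	IP   string
	Node string
}

// eligibleEndpoints returns the endpoints that thisNode may deliver
// external traffic to. With policy "Local" only endpoints on thisNode
// qualify, and an empty result means the traffic is dropped. Any other
// policy is treated as "Cluster", where every endpoint qualifies.
func eligibleEndpoints(policy, thisNode string, all []Endpoint) []Endpoint {
	if policy != "Local" {
		return all
	}
	var local []Endpoint
	for _, ep := range all {
		if ep.Node == thisNode {
			local = append(local, ep)
		}
	}
	return local
}

// sourceSeenByPod returns the source address the pod observes.
// With "Local" the node does not masquerade, so the pod sees whatever
// source the packet arrived with: the client for a VIP-style load
// balancer, or the proxy itself for a proxy-style one.
// Simplifying assumption: with "Cluster" the node always masquerades.
func sourceSeenByPod(policy, arrivingSrc, nodeIP string) string {
	if policy == "Local" {
		return arrivingSrc
	}
	return nodeIP
}
```

Under this model a proxy-style load balancer passes the proxy's address as `arrivingSrc`, so eTP=Local preserves the proxy's address rather than the original client's.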

aojea reviewed

keps/sig-network/4631-loadbalancerstatus/README.md

Comment on lines +139 to +150

+              - Allow cloud providers to indicate that they are working on
+                provisioning load balancer infrastructure, so that
+                users/operators/tests can distinguish the case of "it is taking a
+                while for the cloud to provision the load balancer" from "the cloud
+                has failed to provision a load balancer" and "there is no cloud
+                provider so load balancers don't work".

Member

aojea May 13, 2024

❤️

aojea reviewed

keps/sig-network/4631-loadbalancerstatus/README.md

+                provision a particular `LoadBalancer` service, and why.
+              - Allow cloud providers to indicate when they have provided an
+                "imperfect" load balancer that the user may or may not consider to

Member

aojea May 13, 2024

kind of degraded mode?

aojea reviewed

keps/sig-network/4631-loadbalancerstatus/README.md

+              so this feels "un-Kubernetes-like", though at the same time, it's not
+              like the current Kubernetes networking configuration situation is
+              really great, and there has been some discussion of trying to provide
+              more explicit and well-defined cluster networking configuration in the

Member

aojea May 13, 2024

yeah, this is in the "train has left the station" section :)

aojea reviewed

keps/sig-network/4631-loadbalancerstatus/README.md Outdated

+              Would it be better to add a `Conditions` field to
+              `v1.LoadBalancerIngress` so that we can specify conditions per element
+              of `.Status.LoadBalancer.Ingress`? IOW, should it be possible for a
+              load balancer to express that it has multiple IPs in different states?

Member

aojea May 13, 2024

I think we should only consider final states, and not partial ones

danwinship mentioned this pull request

LoadBalancer Service Status Improvements #4631

Open

4 tasks

thockin reviewed

keps/sig-network/4631-loadbalancerstatus/README.md Outdated

+                - The Service has `AllocateLoadBalancerNodePorts: false`, but the
+                  cloud only supports NodePort-based load balancing.
+                - The Service is `ExternalTrafficPolicy: Local` but the cloud cannot

Member

thockin May 15, 2024

I think this is not really about IP preservation but has everything to do with not implementing hCNP or some equivalent mechanism. Minor wording change requested

Contributor Author

danwinship May 16, 2024

eTP=Local means what it says

lol, I thought that in the whole "what things should be traffic policy vs what things should be topology" debate (around PreferLocal) we had agreed that eTP=Local doesn't mean what it says, because the actual intent of the feature is "preserve client source IP", not "route traffic in a particular way"; the routing is purely a side effect of making it possible to implement "preserve client source IP". (IOW an implementation of eTP:Local that preserves client IP while doing routing in an unexpected way would be compliant, but an implementation that routes "correctly" but loses client IP is not.)

I think this is not really about IP preservation but has everything to do with not implementing hCNP or some equivalent mechanism.

What I was saying in that example was, if client IP preservation is considered a mandatory-to-implement aspect of eTP:Local, then there's no point in proxy-ish LBs implementing eTP:Local at all, and thus no reason for them to implement HCNP. (Whereas for VIP-ish LBs, there is really no good argument for not implementing HCNP.)

(If we don't think that client IP preservation is required for eTP:Local, then we should just assume all LBs will implement HCNP, and they're just buggy if they don't.)

keps/sig-network/4631-loadbalancerstatus/README.md

+                  supports single-stack load balancers, so it would only be able to
+                  serve clients of one IP family.
+                - The Service is `ExternalTrafficPolicy: Local` but the cloud cannot

Member

thockin May 15, 2024

I think this one is not a warning - IPMode already indicates this, right?

Contributor Author

danwinship May 16, 2024

I think actually no. Azure has this trick where the entire cloud network is aware of the load balancer NAT state, so the LB can DNAT packets to a NodePort without masquerading them, and then the reply packet will get un-DNAT-ed correctly even though logically speaking it doesn't pass through the LB.

keps/sig-network/4631-loadbalancerstatus/README.md

+              <<[/UNRESOLVED]>>
+              ```
+              #### The `LoadBalancerServing` Condition

Member

thockin May 15, 2024

Can we arrange it so that, as much as possible, the behavior and names are the same between Services and Gateways?

keps/sig-network/4631-loadbalancerstatus/README.md

+              update, even if the value of `LoadBalancerProvisioning` remains
+              `False`.
+              #### Terminating Condition

Member

thockin May 15, 2024

I'm not convinced this is useful - what does a user do with this information?

keps/sig-network/4631-loadbalancerstatus/README.md

+                   indicating that the load balancer for the service is already
+                   available.
+. The cloud provider sets `LoadBalancerProvisioning=True`,

Member

thockin May 15, 2024

What about (true, true)? Conditions are best when they are orthogonal. These are not. We have to spec it, since both fields COULD be set at the same time.

IOW, don't let this become a state-machine API. It's a collection of roughly independent observations.

I see it more clearly below, so now this feels duplicative

Contributor Author

danwinship May 16, 2024

This is actually specified below; (true, true) means it is both serving and provisioning, i.e., it is continuing to serve while reprovisioning for an update.

IOW, don't let this become a state-machine API. It's a collection of roughly independent observations.

I feel like we definitely want the "provisioning" observation. I feel like the "serving" observation is also useful? Did you have some other idea for conditions?
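
To make the "independent observations" reading concrete, here is a small Go sketch that enumerates the four combinations of the two conditions. Only the (true, true) case is stated in this thread; the wording of the other three cases, and the helper name `describeLB`, are assumptions made for illustration.

```go
package main

// describeLB interprets LoadBalancerProvisioning and LoadBalancerServing
// as two independent observations rather than as states of one machine.
func describeLB(provisioning, serving bool) string {
	switch {
	case provisioning && serving:
		// Stated in the thread: still serving while an update is applied.
		return "serving while reprovisioning for an update"
	case provisioning:
		// Assumed reading: initial provisioning is in progress.
		return "provisioning, not yet serving"
	case serving:
		// Assumed reading: steady state.
		return "serving, no provisioning in progress"
	default:
		// Assumed reading: nothing is being done and nothing is served.
		return "neither provisioning nor serving"
	}
}
```

Because every combination has a defined meaning, a consumer never has to infer one condition from the other.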

danwinship force-pushed the loadbalancerstatus branch from d5813da to a02030a Compare

May 23, 2024 14:18

k8s-ci-robot removed the cncf-cla: yes label

k8s-ci-robot added the cncf-cla: no label


          Initial proposal for KEP-4631 LoadBalancer Service Static Improvements

795cdd5

danwinship force-pushed the loadbalancerstatus branch from a02030a to 795cdd5 Compare

May 24, 2024 14:14

k8s-ci-robot added cncf-cla: yes and removed cncf-cla: no labels

Contributor Author

danwinship commented May 24, 2024

Updates:

- Updated for review comments, started filling in the rest of the template ("Test Plan", "Graduation", "Version Skew Strategy", PRR, etc)
- Removed UNRESOLVED auto-skipping: Antonio agrees it's reasonable to have "tri-state" tests (pass/fail/skip) as long as the semantics are explicit. In some cases this will require rewriting the tests to put the "objectionable" bits at the start, so we can always skip immediately after the initial LB provisioning if the cloud doesn't support the feature.
- The version skew section made me think about the behavior when a cloud provider is updated and finds itself in a cluster with pre-existing "bad" load balancers, and whether we should maybe add an explicit "unsupported" or "broken" condition.
- Added a summary section to the top of the "Expected Behavior When the Cloud Provider Doesn't Know That It Can't Implement a Load Balancer" section, which I think is the big open question before this becomes implementable.
- Added detailed notes to the e2e Test Plan section clarifying what skips will be needed for which tests. Also, while writing out the Test Plan section, I realized that to avoid regressing coverage, we're basically going to have to fork the LB tests, so we can have one hacky GCE-and-kind-only set, and one KEP-4631 set, which will initially be Alpha, but which will eventually replace the hacky ones.
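
The "tri-state" test shape mentioned above could look roughly like the following Go sketch. It is illustrative only: `Outcome` and `runLBTest` are hypothetical names, and how a test would learn whether the cloud supports a feature is exactly what the KEP is still defining, so it is passed in as a plain boolean here.

```go
package main

// Outcome is the explicit result of a tri-state load balancer test.
type Outcome string

const (
	Pass Outcome = "pass"
	Fail Outcome = "fail"
	Skip Outcome = "skip"
)

// runLBTest puts the support check first, right after the initial
// provisioning, so an unsupported feature leads to an immediate skip
// and the feature assertions in check never run.
func runLBTest(provisioned, supported bool, check func() bool) Outcome {
	if !provisioned {
		return Fail // the load balancer itself never came up
	}
	if !supported {
		return Skip // the cloud does not intend to implement the feature
	}
	if check() {
		return Pass
	}
	return Fail
}
```

Keeping the skip decision ahead of `check` is what makes the semantics explicit: a skip can only happen before any feature-specific assertion has run.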

Labels

approved cncf-cla: yes do-not-merge/work-in-progress kind/kep sig/cloud-provider sig/network sig/testing size/XXL