KEP-4631: LoadBalancer Service Status Improvements, initial proposal #4632
Skipping CI for Draft Pull Request.

[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: danwinship. The full list of commands accepted by this bot can be found here. The pull request process is described here.
/sig cloud-provider
> approach doesn't scale anyway. OTOH, we don't want to have to
> provide a separate `Feature:` tag for each load balancer
> subfeature, or have each cloud provider have to maintain their
> own set of `-ginkgo.skip` rules. It would be better if the e2e
I don't like "smart tests" that have different execution flows depending on what... The current e2e tests for load balancers are not good tests, as they try to test a lot of different things in one execution. A test should assert on one feature and one behavior; we may need to break the existing tests down to see if we still need to do this.
> 3. In some cases, because the existing tests were only run on
>    certain clouds, it is not clear what the expected semantics are
>    on other clouds. For example, since `IPMode: Proxy` load
@thockin this was an interesting finding during the implementation of this mode in cloud-provider-kind, we need to try to flesh out more details before going to GA
The problem isn't with the `IPMode` KEP; the distinction between "proxy mode" and "VIP mode" already existed before that; it's just that before `IPMode` made it explicit, it was controlled implicitly by whether the LB set `Hostname` or `IP`.

(Which is to say, even if we dropped `IPMode`, the problem would still exist. For example, with the default cloud-provider-aws load balancers.)
To this specific feature: IMO, `eTP=Local` means what it says. It's not an awesomely designed feature because it presumes too much, but it says "if you are forwarding external traffic, only choose a local endpoint". Proxy-ish LB impls can't really retain the client IP, regardless of `eTP`; that doesn't change the meaning.
Looking at API docs for it, I think we can clarify:
```diff
 // externalTrafficPolicy describes how nodes distribute service traffic they
 // receive on one of the Service's "externally-facing" addresses (NodePorts,
 // ExternalIPs, and LoadBalancer IPs). If set to "Local", the proxy will configure
 // the service in a way that assumes that external load balancers will take care
 // of balancing the service traffic between nodes, and so each node will deliver
 // traffic only to the node-local endpoints of the service, without masquerading
-// the client source IP. (Traffic mistakenly sent to a node with no endpoints will
+// the source IP. (Traffic mistakenly sent to a node with no endpoints will
 // be dropped.) The default value, "Cluster", uses the standard behavior of
 // routing to all endpoints evenly (possibly modified by topology and other
 // features). Note that traffic sent to an External IP or LoadBalancer IP from
 // within the cluster will always get "Cluster" semantics, but clients sending to
 // a NodePort from within the cluster may need to take traffic policy into account
 // when picking a node.
```
For a Proxy-ish LB the source IP is the proxy itself. `eTP=Local` should still preserve that. Whether it is useful or not is a question for end users.
> - Allow cloud providers to indicate that they are working on
>   provisioning load balancer infrastructure, so that
>   users/operators/tests can distinguish the case of "it is taking a
>   while for the cloud to provision the load balancer" from "the cloud
>   has failed to provision a load balancer" and "there is no cloud
>   provider so load balancers don't work".
❤️
> provision a particular `LoadBalancer` service, and why.
>
> - Allow cloud providers to indicate when they have provided an
>   "imperfect" load balancer that the user may or may not consider to
kind of degraded mode?
cc: @bowei
> so this feels "un-Kubernetes-like", though at the same time, it's not
> like the current Kubernetes networking configuration situation is
> really great, and there has been some discussion of trying to provide
> more explicit and well-defined cluster networking configuration in the
yeah, this is in the "train has left the station" section :)
> Would it be better to add a `Conditions` field to
> `v1.LoadBalancerIngress` so that we can specify conditions per element
> of `.Status.LoadBalancer.Ingress`? IOW, should it be possible for a
> load balancer to express that it has multiple IPs in different states?
I think we should only consider final states, and not partial ones
> - The Service has `AllocateLoadBalancerNodePorts: false`, but the
>   cloud only supports NodePort-based load balancing.
>
> - The Service is `ExternalTrafficPolicy: Local` but the cloud cannot
I think this is not really about IP preservation but has everything to do with not implementing `hCNP` or some equivalent mechanism. Minor wording change requested.
> eTP=Local means what it says

lol, I thought that in the whole "what things should be traffic policy vs what things should be topology" debate (around PreferLocal) we had agreed that eTP=Local doesn't mean what it says, because the actual intent of the feature is "preserve client source IP", not "route traffic in a particular way"; the routing is purely a side effect of making it possible to implement "preserve client source IP". (IOW an implementation of eTP:Local that preserves client IP while doing routing in an unexpected way would be compliant, but an implementation that routes "correctly" but loses client IP is not.)

> I think this is not really about IP preservation but has everything to do with not implementing `hCNP` or some equivalent mechanism.
What I was saying in that example was, if client IP preservation is considered a mandatory-to-implement aspect of eTP:Local, then there's no point in proxy-ish LBs implementing eTP:Local at all, and thus no reason for them to implement HCNP. (Whereas for VIP-ish LBs, there is really no good argument for not implementing HCNP.)
(If we don't think that client IP preservation is required for eTP:Local, then we should just assume all LBs will implement HCNP, and they're just buggy if they don't.)
> supports single-stack load balancers, so it would only be able to
> serve clients of one IP family.
>
> - The Service is `ExternalTrafficPolicy: Local` but the cloud cannot
I think this one is not a warning - IPMode already indicates this, right?
I think actually no. Azure has this trick where the entire cloud network is aware of the load balancer NAT state, so the LB can DNAT packets to a NodePort without masquerading them, and then the reply packet will get un-DNAT-ed correctly even though logically speaking it doesn't pass through the LB.
> <<[/UNRESOLVED]>>
> ```
>
> #### The `LoadBalancerServing` Condition
Can we arrange it so that, as much as possible, the behavior and names are the same between Services and Gateways?
> update, even if the value of `LoadBalancerProvisioning` remains
> `False`.
>
> #### Terminating Condition
I'm not convinced this is useful - what does a user do with this information?
> indicating that the load balancer for the service is already
> available.
>
> 3. The cloud provider sets `LoadBalancerProvisioning=True`,
What about `(true, true)`? Conditions are best when they are orthogonal. These are not. We have to spec it, since both fields COULD be set at the same time.

IOW, don't let this become a state-machine API. It's a collection of roughly independent observations.

I see it more clearly below, so now this feels duplicative.
This is actually specified below; `(true, true)` means it is both serving and provisioning, i.e., it is continuing to serve while reprovisioning for an update.

> IOW, don't let this become a state-machine API. It's a collection of roughly independent observations.

I feel like we definitely want the "provisioning" observation. I feel like the "serving" observation is also useful? Did you have some other idea for conditions?
/assign @aojea @thockin