KEP-3085: promote to beta in 1.29 #4139
Conversation
kannon92
commented
Jul 28, 2023
- One-line PR description: Promote KEP 3085 to beta in 1.29.
- Issue link: [KEP-3085] Add condition for sandbox creation (xposted from original issue) #4138
- Other comments:
/lgtm
/assign @ddebroy
/assign
The PRR for this needs to be updated a little bit for beta. There is one section defined under scalability that needs to be added to the KEP (https://github.com/kubernetes/enhancements/blob/master/keps/NNNN-kep-template/README.md#can-enabling--using-this-feature-result-in-resource-exhaustion-of-some-node-resources-pids-sockets-inodes-etc). While reviewing the previous PRR template, I had a few questions. I can't comment since you didn't update the PRR template, so I'll ask below:
What happens if someone doesn't follow the "should" statements above? If controllers that consume the condition receive data in the middle of a rollout or rollback, is there a way to detect that, or anything that helps determine it?
Could we better define a "sharp increase"? That seems like a poor metric for those other environments. For people operating clusters with those special runtime environments, is there any way to know that this feature is functioning properly?
The check of |
Ah. I will add that.
I'll add an update for this. Basically, new pods will get this condition if the feature gate is on. If the feature gate is turned off, the condition will not be updated. If a service/controller relies on this condition and it doesn't exist or is not being updated, then it will most likely not report anything related to this condition.
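A consuming controller can handle all three cases (gate on, gate off, stale data) by treating an absent condition as "unknown" rather than "false". A minimal sketch, assuming the condition type name `PodReadyToStartContainers` and a plain-dict pod status for illustration (not the real client-go/API objects):

```python
# Hypothetical sketch: defensively read the sandbox-creation condition
# from a pod's status. The condition type name and dict shape are
# assumptions for illustration, not the exact Kubernetes API types.
SANDBOX_CONDITION = "PodReadyToStartContainers"  # assumed condition type


def sandbox_ready(pod_status: dict):
    """Return True/False if the condition is present; return None when
    the condition is absent (e.g. the feature gate is off), so the
    caller can fall back to other signals instead of assuming False."""
    for cond in pod_status.get("conditions", []):
        if cond.get("type") == SANDBOX_CONDITION:
            return cond.get("status") == "True"
    return None  # condition absent: feature likely disabled


# Gate enabled: condition is present and set.
status = {"conditions": [{"type": SANDBOX_CONDITION, "status": "True"}]}
print(sandbox_ready(status))              # True
# Gate disabled: condition never added; caller sees "unknown".
print(sandbox_ready({"conditions": []}))  # None
```

The key design point is the tri-state return: a controller that maps "missing" to `False` would misbehave mid-rollout, when some kubelets have the gate on and others don't.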
I don't know the answer to this one. I think it depends on the baseline for this metric in their environment. The main idea of this question/metric is to surface that there could be many more patches than expected. On a chatty cluster, I could see this number being large, so I'd expect problems in cases where the sandbox condition flips to true and then back to false repeatedly.
Yes. We didn't see any other failures in the alpha state.
/lgtm
/lgtm
/approve
@jeremyrickard You bring up an interesting point about mid-rollouts. If a user disables the feature, should we remove the condition entirely from the pod? I can add that as something we address for beta if need be. I hope this doesn't block the promotion.
So, you have already documented that the conditions will be "frozen" after disablement. We obviously do have a timestamp on conditions, and we can also patch them manually. What I suggest is that you document how to "clear" the conditions after disablement. We should not have special cleanup code that runs automatically on disablement (because it might be broken too!). Instead, just document how to do the cleanup. If it seems really necessary, a script or small client-side binary could be provided to do the cleanup. Please do this in a follow-up PR (I won't hold up merge for this). /approve
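The manual cleanup described above amounts to rewriting the pod's `status.conditions` list without the frozen entry and applying it as a status patch. A minimal sketch of building such a patch body, assuming the condition type name `PodReadyToStartContainers` and a plain-dict status (applying it would need something like `kubectl patch pod <name> --subresource=status`, which is an assumption about the operator's tooling, not part of this KEP):

```python
import json

# Hypothetical sketch of the suggested manual cleanup: construct a
# status patch that drops the frozen sandbox condition after the
# feature gate is disabled. The condition type name is assumed.
SANDBOX_CONDITION = "PodReadyToStartContainers"  # assumed


def cleanup_patch(pod_status: dict) -> str:
    """Return a JSON patch body whose conditions list omits the
    frozen sandbox-creation condition, leaving all others intact."""
    remaining = [
        c for c in pod_status.get("conditions", [])
        if c.get("type") != SANDBOX_CONDITION
    ]
    return json.dumps({"status": {"conditions": remaining}})


status = {"conditions": [
    {"type": "Ready", "status": "True"},
    {"type": SANDBOX_CONDITION, "status": "True"},  # frozen after disablement
]}
print(cleanup_patch(status))
```

Keeping this as documented, operator-driven cleanup (rather than automatic code paths) matches the reasoning above: cleanup logic that only runs on disablement is rarely exercised and could itself be broken.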
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: johnbelamaric, kannon92, mrunalp, SergeyKanzhelev. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.