Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Webhook with a timeoutSeconds greater than 29 seconds will block upgrades? #167

Open
kyrofa opened this issue Apr 15, 2024 · 3 comments
Open

Comments

@kyrofa
Copy link

kyrofa commented Apr 15, 2024

I'm trying to upgrade from dok8s v1.28 to v1.29, but clusterlint is telling me that cert-manager's webhook, which uses a timeoutSeconds of 30, will block the upgrade. Would someone mind filling me in on why that's the case suddenly? I've been using cert-manager for years, across several k8s versions, without issue. I see in the upstream docs where it says:

The timeout value must be between 1 and 30 seconds.

But it's unclear whether that's inclusive or not. Clusterlint has seemingly decided that it's 1-inclusive and 30-exclusive. Is that actually based in reality? Is there another document I've missed? 29 seems so random, haha!

For now I will customize my deployment and set this to 29 to make clusterlint happy, but I'd like to get to the bottom of this. I've logged an issue against cert-manager as well. If clusterlint is correct, cert-manager should probably make the default value 29. On the other hand, if 30 is indeed a valid value, clusterlint probably shouldn't complain about it.

@kyrofa
Copy link
Author

kyrofa commented Apr 15, 2024

I just tried changing timeoutSeconds to 31 and got this:

Error: cannot patch "cert-manager-webhook" with kind MutatingWebhookConfiguration: MutatingWebhookConfiguration.admissionregistration.k8s.io "cert-manager-webhook" is invalid: webhooks[0].timeoutSeconds: Invalid value: 31: the timeout value must be between 1 and 30 seconds && cannot patch "cert-manager-webhook" with kind ValidatingWebhookConfiguration: ValidatingWebhookConfiguration.admissionregistration.k8s.io "cert-manager-webhook" is invalid: webhooks[0].timeoutSeconds: Invalid value: 31: the timeout value must be between 1 and 30 seconds

This is leading me to believe 30 should really be a valid value. Happy to make a PR if folks agree.

@kyrofa
Copy link
Author

kyrofa commented Apr 18, 2024

@maelvls provided some additional details on the issue I logged against cert-manager:

There is this comment in kubelint's code:

Webhooks with TimeoutSeconds set: less than 1 or greater than or equal to 30 is bad.

and

TimeoutSeconds value should be set to a non-nil value (greater than or equal to 1 and less than 30). If the TimeoutSeconds value is set to nil and the cluster version is 1.13.*, users are unable to configure the TimeoutSeconds value and this value will stay at nil, breaking upgrades. It's only for versions >= 1.14 that the value will default to 30 seconds.

That combined with my experimentation above makes it pretty clear that 30 seconds is actually a valid value, and clusterlint should allow it.

@timoreimann
Copy link
Contributor

I vaguely remember there was a "conflict" between 30 seconds used by a webhook and the default 30 seconds timeout of kubectl: if a webhook used a 30 seconds timeout, the default timeout from kubectl could "override" the webhook one, rendering failure policies of Ignore ineffective.

@varshavaradarajan does that make sense, or am I completely off here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants