Enhance documentation for the Gardenlet's /healthz endpoint #3359
I'm generally not objecting, but I'm curious whether you experienced any particular problems leaving the Gardenlet up and running while the API server was down. So far I was quite confident that a component like Gardenlet can deal with such intermittent issues.
I'm still struggling a bit with this with respect to the advantages we get compared with the network I/O.
If network I/O is a big concern, we might wait for #3109 and use the metadata-only client. This should be enough for this scenario, as we don't need the spec, and it saves network traffic.
Yeah, maybe. I'm still in the phase of understanding the motivation for this change and whether we experienced any particular problem with it (or whether we foresee problems if we don't introduce this now).
Alternatively, we could also keep the watch but use some event handlers for delete events instead of using the lister.
No, I did not. Thinking about it again, we might be better off just documenting the behaviour - the
Yeah, @danielfoehrKn, maybe that's a good first step to clarify what can be expected.
I guess we should generally improve our health checks and do a clean separation between liveness and readiness (ref gardener-attic/gardener-resource-manager#102 (comment)).
Maybe you can open an issue for this, describing it in more detail?
/lgtm
How to categorize this PR?
/kind bug
/priority normal
/area documentation
What this PR does / why we need it:
Based on #2925 but uses direct clients instead of listers.
The health manager changes the gardenlet's /healthz endpoint to DOWN if it cannot renew the lease in the Garden cluster.
However, this does not cover the case when the Gardener Extended API Server is down.
The problem is that Seeds can still be queried from the lister (which answers from its local cache) even though the Gardener Extended API Server is down. This can be prevented by using a direct client.
Advantage:
Disadvantage:
UPDATE: This PR now only includes enhanced documentation for the Gardenlet's /healthz endpoint.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Release note: