Make termination grace seconds configurable #4681
Signed-off-by: Ben Ye <ben.ye@bytedance.com>
pkg/apis/monitoring/v1/types.go
Outdated
// Set this value longer than the expected cleanup time for your process.
// Defaults to 600 seconds.
// +optional
TerminationGracePeriodSeconds *int64 `json:"terminationGracePeriodSeconds,omitempty"`
Would it be better to use uint to avoid negative values?
Also, add a default value using a validation marker so we can remove the conditional from the statefulset code:
// +kubebuilder:default:="600"
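For context, the conditional the reviewer refers to is presumably a nil check on the optional pointer field. The following is a minimal sketch of that pattern, not the operator's actual code: the `spec` type, the constant name, and `gracePeriodOrDefault` are hypothetical, and only the 600-second default comes from the field comment above. If the API server applies a default through the CRD schema, the nil branch should in principle become unnecessary.

```go
package main

import "fmt"

// defaultTerminationGracePeriodSeconds mirrors the 600-second default
// stated in the field comment. The constant name is hypothetical.
const defaultTerminationGracePeriodSeconds int64 = 600

// spec is a hypothetical stand-in for the CRD spec type; it only carries
// the optional field under discussion.
type spec struct {
	TerminationGracePeriodSeconds *int64
}

// gracePeriodOrDefault returns the configured grace period, falling back
// to the default when the optional field is unset. This is the kind of
// conditional that a schema-level default would make redundant.
func gracePeriodOrDefault(s spec) int64 {
	if s.TerminationGracePeriodSeconds == nil {
		return defaultTerminationGracePeriodSeconds
	}
	return *s.TerminationGracePeriodSeconds
}

func main() {
	custom := int64(300)
	fmt.Println(gracePeriodOrDefault(spec{}))                                       // field unset
	fmt.Println(gracePeriodOrDefault(spec{TerminationGracePeriodSeconds: &custom})) // field set
}
```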
Done. Thanks for the review.
Can you also add tests?
pkg/apis/monitoring/v1/types.go
Outdated
// +optional
TerminationGracePeriodSeconds *int64 `json:"terminationGracePeriodSeconds,omitempty"`
// +kubebuilder:default:="600"
Since it's an int type, we don't need quotes:
// +kubebuilder:default:=600
Thanks. Updated
var minReadySeconds int32
var (
minReadySeconds int32
terminationGracePeriod int64
This should be uint too?
No, it is cast from uint64. The pod spec needs int64, not uint64.
I have added tests. Can you please help take a look? Thanks!
lgtm
👋 @yeya24 can you share more details about your use case?
As I mentioned in thanos-io/thanos#5255, the issue is on our CNI side. It enables graceful termination on the IPVS side, so the connection remains open for 10 minutes in the Prometheus operator case (the termination grace period is hardcoded to 600s). If Prometheus itself is down for some reason, the Thanos sidecar is still available, which causes a lot of partial query errors from our Thanos Query for 10 minutes.
@yeya24 sorry for the late follow-up, but I'm not sure I understand the scenario exactly. You want to configure a shorter termination grace period in case Prometheus is stuck?
Yes. If it is stuck because the backend storage (for example Ceph) is not responsive, then it doesn't make sense to wait 10 minutes to ensure the data is written successfully, because the storage is down.
If the prometheus itself is down somehow, then thanos sidecar is still available
That sounds like a Thanos sidecar issue, not a prometheus-operator or Prometheus one.
I would be against a configurable terminationGracePeriodSeconds. In the majority of cases, tweaking it can lead to unexpected data loss. I also understand the case that @yeya24 is making about CNI and CSI being unresponsive, and that in those cases fast termination is beneficial. However, those issues look like edge cases that occur in particular critical scenarios, which are most likely handled directly by users via kubectl. If that is the case, then kubectl delete pod <prometheus> --grace-period=0 is likely a better way.
If data loss is acceptable and users want a termination time shorter than 10m, I think this use case is still valid, since 10m might not fit all users. In our case we want a smaller duration, like 5m. What about other use cases where 10m is too short and users want a longer duration, like 1h? My point is that an operator should provide some way to allow users to configure k8s native fields like
Hi team, it's been a while since the last messages in this PR. I'm interested in the feature, although my use case is different. We want to increase the grace period beyond the default 10 minutes because, in some cases, that is not enough for a graceful shutdown if you've enabled the feature flag for snapshotting in-memory chunks. See prometheus/prometheus#7229 for details. Our setup required more than 10 minutes to complete the chunk snapshots successfully. /cc @yeya24 I'm happy to collaborate to get this PR in good shape again.
@rnaveiras more than 10 minutes for the chunk snapshots seems a bit extreme. Do you have an explanation for why it takes so long? Have you tried reporting the issue to prometheus/prometheus? Having said that, we have had many requests in the past to customize the termination grace period, and though the justification could sometimes be challenged, we also agreed in #4691 that we shouldn't block such customization if there was high demand from the community and no alternative existed.
It makes the pod.spec TerminationGracePeriodSeconds configurable via the CRDs for prometheus and prometheusagent
Fixes prometheus-operator#3433
Closes prometheus-operator#4681
Co-authored-by: Ben Ye <ben.ye@bytedance.com>
Signed-off-by: Raul Navieras <me@raulnaveiras.com>
Signed-off-by: Ben Ye ben.ye@bytedance.com
Description
We have a use case that requires configuring the termination grace period of the Prometheus statefulset. Currently the value is hardcoded to 10m.
Type of change
What type of changes does your code introduce to the Prometheus operator? Put an x in the box that applies.
CHANGE (fix or feature that would cause existing functionality to not work as expected)
FEATURE (non-breaking change which adds functionality)
BUGFIX (non-breaking change which fixes an issue)
ENHANCEMENT (non-breaking change which improves existing functionality)
NONE (if none of the other choices apply. Example, tooling, build system, CI, docs, etc.)
Changelog entry
Please put a one-line changelog entry below. This will be copied to the changelog file during the release process.