Make termination grace seconds configurable #4681

Open
wants to merge 4 commits into
base: main
Changes from 1 commit
2 changes: 1 addition & 1 deletion Documentation/api.md
@@ -810,7 +810,7 @@ PrometheusSpec is a specification of the desired behavior of the Prometheus clus
| enforcedLabelValueLengthLimit | Per-scrape limit on length of labels value that will be accepted for a sample. If a label value is longer than this number post metric-relabeling, the entire scrape will be treated as failed. 0 means no limit. Only valid in Prometheus versions 2.27.0 and newer. | *uint64 | false |
| enforcedBodySizeLimit | EnforcedBodySizeLimit defines the maximum size of uncompressed response body that will be accepted by Prometheus. Targets responding with a body larger than this many bytes will cause the scrape to fail. Example: 100MB. If defined, the limit will apply to all service/pod monitors and probes. This is an experimental feature, this behaviour could change or be removed in the future. Only valid in Prometheus versions 2.28.0 and newer. | ByteSize | false |
| minReadySeconds | Minimum number of seconds for which a newly created pod should be ready without any of its container crashing for it to be considered available. Defaults to 0 (pod will be considered available as soon as it is ready) This is an alpha field and requires enabling StatefulSetMinReadySeconds feature gate. | *uint32 | false |
| terminationGracePeriodSeconds | Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). If this value is nil, the default grace period will be used instead. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. Defaults to 600 seconds. | *int64 | false |
| terminationGracePeriodSeconds | Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). If this value is nil, the default grace period will be used instead. Default value is set to 10 min because Prometheus may take quite long to shutdown to checkpoint existing data. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. | *uint64 | false |
| retention | Time duration Prometheus shall retain data for. Default is '24h' if retentionSize is not set, and must match the regular expression `[0-9]+(ms\|s\|m\|h\|d\|w\|y)` (milliseconds seconds minutes hours days weeks years). | string | false |
| retentionSize | Maximum amount of disk space used by blocks. | ByteSize | false |
| disableCompaction | Disable prometheus compaction. | bool | false |
13 changes: 8 additions & 5 deletions bundle.yaml
@@ -17515,15 +17515,18 @@ spec:
the image URL.'
type: string
terminationGracePeriodSeconds:
default: "600"
description: Optional duration in seconds the pod needs to terminate
gracefully. May be decreased in delete request. Value must be non-negative
integer. The value zero indicates stop immediately via the kill
signal (no opportunity to shut down). If this value is nil, the
default grace period will be used instead. The grace period is the
duration in seconds after the processes running in the pod are sent
a termination signal and the time when the processes are forcibly
halted with a kill signal. Set this value longer than the expected
cleanup time for your process. Defaults to 600 seconds.
default grace period will be used instead. Default value is set
to 10 min because Prometheus may take quite long to shutdown to
checkpoint existing data. The grace period is the duration in seconds
after the processes running in the pod are sent a termination signal
and the time when the processes are forcibly halted with a kill
signal. Set this value longer than the expected cleanup time for
your process.
format: int64
type: integer
thanos:
@@ -6208,15 +6208,18 @@ spec:
the image URL.'
type: string
terminationGracePeriodSeconds:
default: "600"
description: Optional duration in seconds the pod needs to terminate
gracefully. May be decreased in delete request. Value must be non-negative
integer. The value zero indicates stop immediately via the kill
signal (no opportunity to shut down). If this value is nil, the
default grace period will be used instead. The grace period is the
duration in seconds after the processes running in the pod are sent
a termination signal and the time when the processes are forcibly
halted with a kill signal. Set this value longer than the expected
cleanup time for your process. Defaults to 600 seconds.
default grace period will be used instead. Default value is set
to 10 min because Prometheus may take quite long to shutdown to
checkpoint existing data. The grace period is the duration in seconds
after the processes running in the pod are sent a termination signal
and the time when the processes are forcibly halted with a kill
signal. Set this value longer than the expected cleanup time for
your process.
format: int64
type: integer
thanos:
3 changes: 2 additions & 1 deletion jsonnet/prometheus-operator/prometheuses-crd.json
@@ -5767,7 +5767,8 @@
"type": "string"
},
"terminationGracePeriodSeconds": {
"description": "Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). If this value is nil, the default grace period will be used instead. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process. Defaults to 600 seconds.",
"default": "600",
"description": "Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request. Value must be non-negative integer. The value zero indicates stop immediately via the kill signal (no opportunity to shut down). If this value is nil, the default grace period will be used instead. Default value is set to 10 min because Prometheus may take quite long to shutdown to checkpoint existing data. The grace period is the duration in seconds after the processes running in the pod are sent a termination signal and the time when the processes are forcibly halted with a kill signal. Set this value longer than the expected cleanup time for your process.",
"format": "int64",
"type": "integer"
},
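The schema above declares `"type": "integer"` but gives the default as the quoted value `"600"`, which is a JSON string rather than a number. A minimal sketch of that mismatch, using only the standard `encoding/json` package; `decodesAsInt64` is a hypothetical helper for illustration and not part of prometheus-operator:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodesAsInt64 reports whether a raw JSON value can be decoded into an int64.
// Hypothetical helper for illustration only.
func decodesAsInt64(raw string) bool {
	var v int64
	return json.Unmarshal([]byte(raw), &v) == nil
}

func main() {
	// Quoted default: a JSON string, which does not decode into an integer.
	fmt.Println(decodesAsInt64(`"600"`))
	// Unquoted default: a JSON number that fits an int64.
	fmt.Println(decodesAsInt64(`600`))
}
```

This is the same concern the review comment on `types.go` raises about the quoted kubebuilder default marker.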
7 changes: 4 additions & 3 deletions pkg/apis/monitoring/v1/types.go
@@ -338,13 +338,14 @@ type CommonPrometheusFields struct {
// Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request.
// Value must be non-negative integer. The value zero indicates stop immediately via
// the kill signal (no opportunity to shut down).
// If this value is nil, the default grace period will be used instead.
// If this value is nil, the default grace period will be used instead. Default value is set to
// 10 min because Prometheus may take quite long to shutdown to checkpoint existing data.
// The grace period is the duration in seconds after the processes running in the pod are sent
// a termination signal and the time when the processes are forcibly halted with a kill signal.
// Set this value longer than the expected cleanup time for your process.
// Defaults to 600 seconds.
// +optional
TerminationGracePeriodSeconds *int64 `json:"terminationGracePeriodSeconds,omitempty"`
// +kubebuilder:default:="600"
Contributor

Since it's an integer type, we don't need quotes:

// +kubebuilder:default:=600

Contributor Author

Thanks. Updated

TerminationGracePeriodSeconds *uint64 `json:"terminationGracePeriodSeconds,omitempty"`
}

// Prometheus defines a Prometheus deployment.
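The field comment distinguishes a nil value (use the default) from an explicit zero (stop immediately), which is why the field is a pointer. A minimal sketch of how that distinction survives JSON encoding with `omitempty`; the `spec` type and `encode` helper are hypothetical stand-ins, not the operator's own types:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// spec is a hypothetical stand-in for the relevant part of CommonPrometheusFields.
type spec struct {
	TerminationGracePeriodSeconds *uint64 `json:"terminationGracePeriodSeconds,omitempty"`
}

// encode marshals s to a JSON string and panics on error (illustration only).
func encode(s spec) string {
	b, err := json.Marshal(s)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	zero := uint64(0)
	// A nil pointer is omitted entirely, so a default can still be applied.
	fmt.Println(encode(spec{}))
	// A non-nil pointer to zero is kept, so an explicit 0 is preserved.
	fmt.Println(encode(spec{TerminationGracePeriodSeconds: &zero}))
}
```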
5 changes: 0 additions & 5 deletions pkg/apis/monitoring/v1/zz_generated.deepcopy.go

Some generated files are not rendered by default.

17 changes: 8 additions & 9 deletions pkg/prometheus/statefulset.go
@@ -325,13 +325,6 @@ func makeStatefulSetService(p *monitoringv1.Prometheus, config operator.Config)

func makeStatefulSetSpec(p monitoringv1.Prometheus, c *operator.Config, shard int32, ruleConfigMapNames []string,
tlsAssetSecrets []string, version semver.Version) (*appsv1.StatefulSetSpec, error) {
// Prometheus may take quite long to shut down to checkpoint existing data.
// Allow up to 10 minutes for clean termination if not specified.
if p.Spec.TerminationGracePeriodSeconds == nil {
terminationGracePeriod := int64(600)
p.Spec.TerminationGracePeriodSeconds = &terminationGracePeriod
}

prometheusImagePath, err := operator.BuildImagePath(
operator.StringPtrValOrDefault(p.Spec.Image, ""),
operator.StringValOrDefault(p.Spec.BaseImage, c.PrometheusDefaultBaseImage),
@@ -907,10 +900,16 @@ func makeStatefulSetSpec(p monitoringv1.Prometheus, c *operator.Config, shard in
}
}

var minReadySeconds int32
var (
minReadySeconds int32
terminationGracePeriod int64
Contributor

This should be uint too?

Contributor Author @yeya24, Mar 26, 2022


No, it is cast from uint64. The pod spec needs int64, not uint64.

)
if p.Spec.MinReadySeconds != nil {
minReadySeconds = int32(*p.Spec.MinReadySeconds)
}
if p.Spec.TerminationGracePeriodSeconds != nil {
terminationGracePeriod = int64(*p.Spec.TerminationGracePeriodSeconds)
}

operatorInitContainers = append(operatorInitContainers,
operator.CreateConfigReloader(
@@ -1004,7 +1003,7 @@ func makeStatefulSetSpec(p monitoringv1.Prometheus, c *operator.Config, shard in
AutomountServiceAccountToken: &boolTrue,
NodeSelector: p.Spec.NodeSelector,
PriorityClassName: p.Spec.PriorityClassName,
TerminationGracePeriodSeconds: p.Spec.TerminationGracePeriodSeconds,
TerminationGracePeriodSeconds: &terminationGracePeriod,
Volumes: volumes,
Tolerations: p.Spec.Tolerations,
Affinity: p.Spec.Affinity,
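In the diff above, `terminationGracePeriod` stays at its zero value when the spec field is nil, so the 600-second default depends entirely on the CRD-level default being applied. A minimal sketch of a more defensive conversion follows; the nil fallback and the overflow clamp are assumptions added for illustration (the PR does neither), and `podGracePeriod` is a hypothetical name:

```go
package main

import (
	"fmt"
	"math"
)

// defaultGracePeriodSeconds mirrors the 10-minute default described in the PR.
const defaultGracePeriodSeconds int64 = 600

// podGracePeriod converts the optional uint64 spec value to the int64 the pod spec needs.
// Assumptions for illustration: fall back to the default on nil, clamp values above MaxInt64.
func podGracePeriod(v *uint64) int64 {
	if v == nil {
		return defaultGracePeriodSeconds
	}
	if *v > math.MaxInt64 {
		// A plain int64(*v) cast would wrap to a negative number here.
		return math.MaxInt64
	}
	return int64(*v)
}

func main() {
	thirty := uint64(30)
	huge := uint64(math.MaxUint64)
	fmt.Println(podGracePeriod(nil))     // falls back to the default
	fmt.Println(podGracePeriod(&thirty)) // passes the value through
	fmt.Println(podGracePeriod(&huge))   // clamped instead of wrapping
}
```

Whether a fallback in code is wanted alongside the CRD default is a design choice for the maintainers; the sketch only shows the shape such a guard could take.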