WIP: OCPBUGS-18282: Reject reserved labels used as external labels #2097

slashpai · 2023-09-26T02:42:57Z

prometheus and prometheus_replica are reserved labels. If a user configures either of them as a label in externalLabels in cluster-monitoring-configmap, warn the user about it and set the operator status to Degraded.

I added CHANGELOG entry for this change.
No user facing changes, so no entry in CHANGELOG was needed.
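
For illustration, here is a minimal sketch of the kind of check described above. It is not the code from this PR: the helper name findReservedExternalLabels and the plain map input are assumptions made for the example. Only the two reserved label names come from the description.

```go
package main

import (
	"fmt"
	"sort"
)

// reservedExternalLabels holds the two label names the PR description
// calls reserved.
var reservedExternalLabels = []string{"prometheus", "prometheus_replica"}

// findReservedExternalLabels is a hypothetical helper (not from the PR).
// It returns, in sorted order, every reserved label name that appears as a
// key in the user-supplied external labels.
func findReservedExternalLabels(externalLabels map[string]string) []string {
	var found []string
	for _, name := range reservedExternalLabels {
		if _, ok := externalLabels[name]; ok {
			found = append(found, name)
		}
	}
	sort.Strings(found)
	return found
}

func main() {
	// Example user configuration; the values are made up.
	labels := map[string]string{"cluster": "prod", "prometheus_replica": "a"}
	if found := findReservedExternalLabels(labels); len(found) > 0 {
		fmt.Printf("reserved external labels configured: %v\n", found)
	}
}
```

Returning the offending names rather than a bare boolean would let the Degraded message tell the user exactly which label to remove.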

openshift-ci-robot · 2023-09-26T02:43:04Z

@slashpai: This pull request references Jira Issue OCPBUGS-18282, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target version (4.15.0) matches configured target version for branch (4.15.0)
bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @juzhao

The bug has been updated to refer to the pull request using the external bug tracker.

openshift-ci · 2023-09-26T02:43:52Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: slashpai

Needs approval from an approver in each of these files:

~~OWNERS~~ [slashpai]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

prometheus and prometheus_replica are reserved labels. If a user configures either of them as a label in externalLabels in cluster-monitoring-configmap, warn the user about it and set the operator status to Degraded.

Signed-off-by: Jayapriya Pai <janantha@redhat.com>

machine424 · 2023-10-05T07:31:32Z

pkg/operator/operator.go

@@ -736,6 +736,9 @@ func (o *Operator) sync(ctx context.Context, key string) error {
 	} else if config.HasInconsistentAlertmanagerConfigurations() {
 		degradedConditionMessage = client.UserAlermanagerConfigMisconfiguredMessage
 		degradedConditionReason = client.UserAlermanagerConfigMisconfiguredReason
+	} else if config.HasPrometheusReservedExternalLabelsConfigured() {


How about integrating this into the PrometheusTask and making the whole task fail if the "wrong" labels are set? This way we "never" apply the faulty config and the user has to step in.

Or maybe we can add such checks while loading the config (once we have a CRD, I expect a CR containing these labels to be rejected).

jan--f · 2024-02-06

Hmm, I'm not sure. IIUC, if we fail this in a task the visibility is the same, i.e. the operator reports this.
The only difference is that any pending config changes never get applied? Since this doesn't break the CMO setup entirely, I'm not sure what the better approach is.

One orthogonal issue though: with the condition reporting here being one big else-if chain, doesn't this potentially hide issues? E.g. if no storage is configured and the forbidden external labels are set, we only ever see the storageNotConfiguredMessage, and CannotUseReservedExternalLabelsMessage only pops up after storage is configured. Shouldn't we collect all conditions and report them all at once?

machine424 · 2024-02-06

Yes, but this won't prevent the prometheus task from applying the faulty labels (which may break the delivery of alerts in some setups).
Adding the check at the beginning of the prometheus task will prevent that.
Now that I think about it, doing so may also block legitimate/needed prometheus syncs...

By the way, the SetRollOutDone below sets Degraded=false, so no "default" alert will be triggered for this.

> One orthogonal issue though: with the condition reporting here being one big else-if chain, doesn't this potentially hide issues? E.g. if no storage is configured and the forbidden external labels are set, we only ever see the storageNotConfiguredMessage, and CannotUseReservedExternalLabelsMessage only pops up after storage is configured. Shouldn't we collect all conditions and report them all at once?

Good point. We could group them under some "umbrella" reason and just join the messages. It's worth a ticket so we can discuss this.
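
As a starting point for that discussion, here is a small sketch of collecting every failing condition instead of stopping at the first match in an else-if chain. The check type, the collectDegraded function and the umbrella reason name UserConfigurationInvalid are all assumptions made up for the example. None of them exist in the operator as far as this thread shows.

```go
package main

import (
	"fmt"
	"strings"
)

// check pairs a predicate with the message to report when it fires.
type check struct {
	failed  func() bool
	message string
}

// umbrellaReason is a made-up reason name used only in this sketch.
const umbrellaReason = "UserConfigurationInvalid"

// collectDegraded evaluates every check, so one misconfiguration cannot
// hide another, and joins the messages of all failing checks under a
// single reason. It returns empty strings when nothing failed.
func collectDegraded(checks []check) (reason, message string) {
	var msgs []string
	for _, c := range checks {
		if c.failed() {
			msgs = append(msgs, c.message)
		}
	}
	if len(msgs) == 0 {
		return "", ""
	}
	return umbrellaReason, strings.Join(msgs, "; ")
}

func main() {
	// Both problems from the example in this thread are present at once.
	storageMissing, reservedLabels := true, true
	reason, message := collectDegraded([]check{
		{func() bool { return storageMissing }, "storage is not configured"},
		{func() bool { return reservedLabels }, "reserved external labels are configured"},
	})
	fmt.Println(reason+":", message)
}
```

With the else-if chain only the first message would surface. Here both are reported together, at the cost of losing the per-issue reason.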

openshift-ci · 2023-11-07T12:59:07Z

@slashpai: all tests passed!

openshift-bot · 2024-02-06T01:00:28Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

slashpai · 2024-02-07T09:37:27Z

/remove-lifecycle stale

openshift-bot · 2024-05-08T01:00:23Z

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Sep 26, 2023

openshift-ci bot requested review from juzhao, jan--f and marioferh September 26, 2023 02:43

openshift-ci bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Sep 26, 2023

slashpai force-pushed the extlabel branch from 96fc764 to e43a681 September 26, 2023 06:04

slashpai added 2 commits October 4, 2023 17:55

test/e2e: Add test for reserved external label used

9c2597f

Signed-off-by: Jayapriya Pai <janantha@redhat.com>

slashpai force-pushed the extlabel branch from e43a681 to 9c2597f October 4, 2023 13:35

machine424 reviewed Oct 5, 2023

openshift-ci bot added the lifecycle/stale label (denotes an issue or PR has remained open with no activity and has become stale) Feb 6, 2024

openshift-ci bot removed the lifecycle/stale label (denotes an issue or PR has remained open with no activity and has become stale) Feb 7, 2024

openshift-ci bot added the lifecycle/stale label (denotes an issue or PR has remained open with no activity and has become stale) May 8, 2024

slashpai commented Sep 26, 2023

openshift-ci-robot commented Sep 26, 2023

openshift-ci bot commented Sep 26, 2023

machine424 Oct 5, 2023 • edited

jan--f Feb 6, 2024

machine424 Feb 6, 2024 • edited

openshift-ci bot commented Nov 7, 2023

openshift-bot commented Feb 6, 2024

slashpai commented Feb 7, 2024

openshift-bot commented May 8, 2024
