promtool check rules does not take into account expression #7301

gotjosh · 2020-05-27T14:18:49Z

What did you do?

Run promtool check rules example.yaml where example.yaml is:

groups:
  - name: example
    rules:
      - expr: |
          sum by (code, verb) (increase(apiserver_request_total{job="default/kubernetes",verb="LIST",code=~"2.."}[30d]))
        record: code_verb:apiserver_request_total:increase30d
      - expr: |
          sum by (code, verb) (increase(apiserver_request_total{job="default/kubernetes",verb="GET",code=~"2.."}[30d]))
        record: code_verb:apiserver_request_total:increase30d

What did you expect to see?

Rules are all good, with no duplicates.

What did you see instead? Under which circumstances?

Checking example.yaml
1 duplicate rules(s) found.
Metric: code_verb:apiserver_request_total:increase30d
Label(s):
Might cause inconsistency while recording expressions.
  SUCCESS: 2 rules found

The reason for this is that our duplicate check only takes into account metric names and added labels but not the actual expression. Is this correct? Happy to submit a fix if this is not the case.

prometheus/cmd/promtool/main.go

Lines 330 to 344 in 7a541bd

    
           for _, group := range groups { 
        
           	for index, rule := range group.Rules { 
        
           		inst := compareRuleType{ 
        
           			metric: ruleMetric(rule), 
        
           			label:  rule.Labels, 
        
           		} 
        
           		for i := 0; i < index; i++ { 
        
           			t := compareRuleType{ 
        
           				metric: ruleMetric(group.Rules[i]), 
        
           				label:  group.Rules[i].Labels, 
        
           			} 
        
           			if reflect.DeepEqual(t, inst) { 
        
           				duplicates = append(duplicates, t) 
        
           			} 
        
           		}

The text was updated successfully, but these errors were encountered:

brian-brazil · 2020-05-27T14:39:09Z

Hmm, I had thought this was meant to check for the rule being identical.

roidelapluie · 2020-05-27T15:11:38Z

I think that this is valid as long as it is a warning and it succeeds at the end.

The rule should be written as

groups:
  - name: example
    rules:
      - expr: |
          sum by (code, verb) (increase(apiserver_request_total{job="default/kubernetes",verb="LIST",code=~"2.."}[30d]))
          or
          sum by (code, verb) (increase(apiserver_request_total{job="default/kubernetes",verb="GET",code=~"2.."}[30d]))
        record: code_verb:apiserver_request_total:increase30d

In which case it would not be able to produce identical results.

gotjosh · 2020-05-27T15:22:17Z

I think that this is valid as long as it is a warning and it succeeds at the end.

A warning or not, the message seems... Deceiving? Given we're saying this is a rule/alert is a duplicate when it actually isn't. The name is a duplicate but the queries are different.

wbh1 · 2022-06-08T14:04:30Z

Since v2.35.0, this is reporting failures instead of just warning. Seems intentional based on #10435, but now causes misleading failures by default since the expressions are not taken into account.

v2.34.0

Checking /tmp/default.yml
1 duplicate rule(s) found.
Metric: VeryLowDiskSpace
Label(s):
        severity: critical
Might cause inconsistency while recording expressions.
  SUCCESS: 54 rules found

v2.35.0

Checking /tmp/default.yml
  FAILED:
lint error 1 duplicate rule(s) found.
Metric: VeryLowDiskSpace
Label(s):
        severity: critical
Might cause inconsistency while recording expressions

dgl · 2022-06-09T07:59:31Z

@wbh1 sorry, yes, it was a regression, already fixed by #10815 -- we should probably backport that to a 2.36.1.

Note for the general issue, in most cases where rules are split along a label value you can provide a label to avoid the warning:

groups:
  - name: example
    rules:
      - expr: |
          sum by (code, verb) (increase(apiserver_request_total{job="default/kubernetes",verb="LIST",code=~"2.."}[30d]))
        record: code_verb:apiserver_request_total:increase30d
        labels:
          verb: LIST
      - expr: |
          sum by (code, verb) (increase(apiserver_request_total{job="default/kubernetes",verb="GET",code=~"2.."}[30d]))
        record: code_verb:apiserver_request_total:increase30d
        labels:
          verb: GET

It might be nice for promtool to just not warn if it can work out the expression can't overlap, although PromQL semantics are such it's not guaranteed a queried label is in the returned set. However part of the reason for the --lint change is to make it possible for us to add more lint style warnings (on an opt-in basis) so we could do this while giving people with remote read setups (the main reason you'd find a "missing" label) a way to not get warnings.

beorn7 · 2024-04-30T11:25:32Z

Hello from the bug scrub.

We feel that the various additional flags address the issue sufficiently. If somebody wants to refine things even more, please follow up here and we'll re-open the issue.

gotjosh added the kind/bug label May 27, 2020

gotjosh mentioned this issue Jun 1, 2020

Add a flag to check whenever rules have colons (:) on their names grafana/cortex-tools#42

Closed

brian-brazil added component/promtool priority/P3 labels Jun 29, 2020

dgl mentioned this issue Jun 22, 2022

Add a --lint flag to the promtool check rules and check config commands #10435

Merged

beorn7 closed this as completed Apr 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

promtool check rules does not take into account expression #7301

promtool check rules does not take into account expression #7301

gotjosh commented May 27, 2020

brian-brazil commented May 27, 2020

roidelapluie commented May 27, 2020

gotjosh commented May 27, 2020

wbh1 commented Jun 8, 2022 •

edited

dgl commented Jun 9, 2022 •

edited

beorn7 commented Apr 30, 2024

promtool check rules does not take into account expression #7301

promtool check rules does not take into account expression #7301

Comments

gotjosh commented May 27, 2020

brian-brazil commented May 27, 2020

roidelapluie commented May 27, 2020

gotjosh commented May 27, 2020

wbh1 commented Jun 8, 2022 • edited

v2.34.0

v2.35.0

dgl commented Jun 9, 2022 • edited

beorn7 commented Apr 30, 2024

wbh1 commented Jun 8, 2022 •

edited

dgl commented Jun 9, 2022 •

edited