Prometheus does not recognize HELP and TYPE for OpenMetrics counters #13944

Open
link2xt opened this issue Apr 17, 2024 · 8 comments

link2xt commented Apr 17, 2024

What did you do?

I pointed Prometheus at an OpenMetrics endpoint that outputs data like this:

$ curl http://192.168.10.2:9001/metrics -v
*   Trying 192.168.10.2:9001...
* Connected to 192.168.10.2 (192.168.10.2) port 9001 (#0)
> GET /metrics HTTP/1.1
> Host: 192.168.10.2:9001
> User-Agent: curl/7.81.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< content-length: 561
< content-type: application/openmetrics-text;charset=utf-8;version=1.0.0
< date: Wed, 17 Apr 2024 10:56:28 GMT
<
# HELP direct_notifications Number of direct notifications.
# TYPE direct_notifications counter
direct_notifications_total 1
# HELP heartbeat_notifications Number of heartbeat notifications.
# TYPE heartbeat_notifications counter
heartbeat_notifications_total 123
# HELP heartbeat_registrations Number of heartbeat registrations.
# TYPE heartbeat_registrations counter
heartbeat_registrations_total 456
# HELP heartbeat_token_count Number of tokens registered for heartbeat notifications.
# TYPE heartbeat_token_count gauge
heartbeat_token_count 789
# EOF
* Connection #0 to host 192.168.10.2 left intact

The endpoint is provided by https://github.com/deltachat/notifiers, which uses https://github.com/prometheus/client_rust.

What did you expect to see?

I expected Prometheus to display the "counter" type hint and the help line "Number of direct notifications." However, Prometheus ignores HELP and TYPE for direct_notifications.

Apparently it expected the response to contain:

# HELP direct_notifications_total Number of direct notifications.
# TYPE direct_notifications_total counter
direct_notifications_total 1

but this is not correct according to the OpenMetrics specification. This difference is documented in https://opentelemetry.io/docs/specs/otel/compatibility/prometheus_and_openmetrics/

What did you see instead? Under which circumstances?

I expected the OpenMetrics parser to be used according to the content type and to recognize HELP and TYPE for the direct_notifications_total counter.

System information

Linux 5.15.0-102-generic x86_64

Prometheus version

prometheus, version 2.51.2 (branch: HEAD, revision: b4c0ab52c3e9b940ab803581ddae9b3d9a452337)
  build user:       root@b63f02a423d9
  build date:       20240410-14:05:54
  go version:       go1.22.2
  platform:         linux/amd64
  tags:             netgo,builtinassets,stringlabels

Prometheus configuration file

scrape_configs:
...
  - job_name: "notifications"
    scrape_interval: 30s
    static_configs:
      - targets: ['192.168.10.2:9001']
link2xt (Author) commented Apr 17, 2024

Related, another OpenMetrics counter incompatibility: #6541

beorn7 (Member) commented Apr 19, 2024

Could you verify what protocol is actually used in the Prometheus scrape? (I believe this particular aspect of OpenMetrics should be implemented correctly in Prometheus, and my suspicion here is that Prometheus, for some reason, thinks it is scraping the classic text format, or it might even negotiate protobuf, maybe because native histograms are enabled…)

link2xt (Author) commented Apr 20, 2024

How can I verify which protocol is used? It is not printed in the debug logs.
Note that the server only outputs content-type: application/openmetrics-text;charset=utf-8;version=1.0.0; this is the only format it supports:
https://github.com/deltachat/notifiers/blob/2bcedc685df371777779198e390eabb52ad89f12/src/metrics.rs#L84-L92
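
One quick way to double-check is to explicitly ask the server for the classic Prometheus text format and look at the Content-Type of the response (a sketch; text/plain;version=0.0.4 is the media type of the classic format):

$ curl -v -H 'Accept: text/plain;version=0.0.4' http://192.168.10.2:9001/metrics -o /dev/null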

I changed the scrape config as follows, ran systemctl reload prometheus, and checked in the log that the config was reloaded successfully, but it did not fix the problem:

  - job_name: "notifications"
    scrape_interval: 30s
    scrape_protocols: ["OpenMetricsText1.0.0"]
    static_configs:
      - targets: ['192.168.10.2:9001']
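
To confirm that the reloaded configuration really contains the scrape_protocols setting, the running config can be dumped from the status API (a sketch, assuming Prometheus listens on localhost:9090 and jq is installed):

$ curl -s http://localhost:9090/api/v1/status/config | jq -r .data.yaml | grep -A2 scrape_protocols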

beorn7 (Member) commented May 2, 2024

How can I verify which protocol is used?

For ground truth, I would use a packet sniffer like tcpdump or Wireshark.
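
For example, something like this prints the scrape traffic in plain text, so both the Accept header Prometheus sends and the Content-Type the target answers with are visible (assuming the exporter still listens on port 9001 and the capture runs on a host that sees the traffic):

$ tcpdump -i any -A -s 0 'tcp port 9001'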

beorn7 (Member) commented May 2, 2024

I have reproduced the issue now.

tl;dr: @link2xt is right. The UI isn't properly creating a type hint for counters ingested via OpenMetrics.

This is a UI issue. The UI queries the /api/v1/metadata endpoint, and for counters ingested via OpenMetrics it returns something like this for a counter that creates a metric request_processing_seconds within Prometheus:

    "request_processing_seconds": [
      {
        "type": "counter",
        "unit": "",
        "help": "Time spent processing request"
      }

The UI in this case is only looking for an entry request_processing_seconds_total, which doesn't exist in the metadata response.
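
This is easy to see by querying the metadata API directly (assuming a Prometheus instance on localhost:9090 and the metric names from the example above):

$ curl -s 'http://localhost:9090/api/v1/metadata?metric=request_processing_seconds'
$ curl -s 'http://localhost:9090/api/v1/metadata?metric=request_processing_seconds_total'

The first query returns the counter's metadata, while the second, the name the UI actually looks up, returns nothing.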

First question to OM experts @RichiH and @SuperQ: Is the metadata endpoint behaving as expected here?

And if it is, this would be an issue to be fixed by our frontend experts (@Nexucis and @juliusv, maybe).

SuperQ (Member) commented May 2, 2024

I suppose we should make the metadata endpoint match the expected Prometheus name when exposing OM metrics output for counters.

beorn7 (Member) commented May 2, 2024

Some musings about the question of whether the metadata API is behaving correctly: I think it's really problematic that the exposition format now determines how a counter is represented in the metadata API response. I get the rationale behind OM's decision to change the "true name" of a counter metric foo_total to just foo. This follows in the footsteps of a summary called just foo even though it also contains metrics called foo_sum and foo_count, and of a (classic) histogram called just foo that implies metrics foo_bucket, foo_sum, and foo_count, while not even containing a metric called just foo.

However, with native histograms as the histogram of choice, classic histograms might just become a thing of the past. Even summaries, if we still want to support them, could eventually be represented by "native summaries" (no concrete plans, just reasonable speculation). Then there would be no footsteps to follow anymore, and we could "go back" to calling a counter foo_total just foo_total everywhere. This would also remove the still unresolved collision between go_memstats_alloc_bytes and go_memstats_alloc_bytes_total, see prometheus/client_golang#829.

beorn7 (Member) commented May 2, 2024

I suppose we should make the metadata endpoint match the expected Prometheus name when exposing OM metrics output for counters.

So you would say we should change the example output above to say request_processing_seconds_total instead of request_processing_seconds?

This would work if we do it at the level of ingesting metadata. We cannot really do it after the fact in the metadata API itself because, at that point, we no longer know which format was used to ingest the metric.
