Add float histograms and gauge histograms to proto spec #58

Merged
merged 2 commits into from Jun 29, 2022


beorn7
Member

@beorn7 beorn7 commented Jun 14, 2022

This is only for the sparsehistogram branch!

@codesome as discussed before. I'm hereby starting the work to support float histograms and gauge histograms in the exposition format (and ultimately in TSDB and federation).

@bboreham to double check if my protobuf handling makes sense here. The idea is that the common case of a normal histogram doesn't look different at all on the wire.

@cstyan & @csmarchbanks Note that this is not yet an update of the remote-write protobuf but merely of the exposition format. However, we have to support float and gauge histograms there as well, so I thought we first complete the exposition proto spec and let it inform the remote-write proto spec. (But keep in mind that the more important source of inspiration is the respective Go types, i.e. https://github.com/prometheus/prometheus/blob/095b6c93dd5ab75f0c9f22f52b4fb5f45b33ff80/model/histogram/histogram.go#L37-L58 and https://github.com/prometheus/prometheus/blob/095b6c93dd5ab75f0c9f22f52b4fb5f45b33ff80/model/histogram/float_histogram.go#L30-L50 .)

Commit description follows:

Note that this is only an extension of the proto spec. Both generators
and consumers of the protobuf still need changes to make use of these
changes.

Gauge histograms measure current distributions. For one, they are
inspired by the GaugeHistogram type introduced by OpenMetrics, see
https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#gaugehistogram

They are also handled in the same way as OpenMetrics does it, by
using a new MetricType enum field GAUGE_HISTOGRAM, but not changing
anything else, i.e. for both regular and gauge histograms, the same
Histogram message type is used.
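
To illustrate the design choice described above, here is a minimal Go sketch in which the only thing that distinguishes a gauge histogram from a regular one is the type enum on the metric family, while the histogram payload type is shared. All type, field, and constant names below are hypothetical stand-ins, not the generated protobuf code, and the enum values are not meant to match the spec.

```go
package main

import "fmt"

// MetricType is a hypothetical stand-in for the MetricType enum of the
// exposition format. The numeric values are illustrative only.
type MetricType int

const (
	TypeCounter MetricType = iota
	TypeGauge
	TypeHistogram
	TypeGaugeHistogram // the newly added enum value
)

// HistogramData is a hypothetical, heavily reduced stand-in for the shared
// Histogram message. Regular and gauge histograms both use it unchanged.
type HistogramData struct {
	SampleCount uint64
	SampleSum   float64
}

// MetricFamily is a hypothetical, reduced stand-in for a metric family.
type MetricFamily struct {
	Name       string
	Type       MetricType
	Histograms []HistogramData
}

// IsGaugeHistogram reports whether the family carries gauge histograms.
// The decision depends on the enum only, never on the payload.
func IsGaugeHistogram(mf MetricFamily) bool {
	return mf.Type == TypeGaugeHistogram
}

func main() {
	payload := []HistogramData{{SampleCount: 42, SampleSum: 12.5}}
	families := []MetricFamily{
		{Name: "request_duration_seconds", Type: TypeHistogram, Histograms: payload},
		{Name: "queue_wait_seconds", Type: TypeGaugeHistogram, Histograms: payload},
	}
	for _, mf := range families {
		fmt.Printf("%s: gauge histogram = %t\n", mf.Name, IsGaugeHistogram(mf))
	}
}
```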

The other reason why we need gauge histograms comes from PromQL: If
you `rate` a histogram (which is possible with the new sparse
histograms as a first-class data type), the result is a gauge histogram. A
rate'd histogram can be created by a recording rule and then stored in
the TSDB. From there, it can be exposed by federation, so we need to
be able to represent it in the exposition format.

Float histograms are histograms where all counts (count of
observations, counts in each bucket, zero bucket count) are floating
point numbers rather than integer numbers. They are rarely needed for
direct instrumentation. Use cases are weighted histograms or timing
histograms, see kubernetes/kubernetes#109277
for a real-world example.
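
As a rough illustration of the weighted-histogram use case, the following Go sketch lets each observation contribute a fractional weight, so every count becomes a floating point number. It uses a simplified layout with explicit upper bounds rather than the sparse bucket layout, and the type `FloatHistogram` and the method `ObserveWeighted` are hypothetical names, not an existing client library API.

```go
package main

import "fmt"

// FloatHistogram is a hypothetical, simplified histogram in which all
// counts are float64. Buckets are non-cumulative; the last bucket is the
// overflow bucket for values above the highest upper bound.
type FloatHistogram struct {
	UpperBounds []float64 // sorted in ascending order
	Buckets     []float64 // len(UpperBounds)+1 entries
	Count       float64
	Sum         float64
}

// NewFloatHistogram creates an empty histogram for the given upper bounds.
func NewFloatHistogram(bounds []float64) *FloatHistogram {
	return &FloatHistogram{
		UpperBounds: bounds,
		Buckets:     make([]float64, len(bounds)+1),
	}
}

// ObserveWeighted adds an observation of value v that counts as `weight`
// observations. With weight 1 this behaves like a regular histogram.
func (h *FloatHistogram) ObserveWeighted(v, weight float64) {
	i := 0
	for i < len(h.UpperBounds) && v > h.UpperBounds[i] {
		i++
	}
	h.Buckets[i] += weight
	h.Count += weight
	h.Sum += v * weight
}

func main() {
	h := NewFloatHistogram([]float64{1, 5})
	h.ObserveWeighted(0.25, 0.5) // lands in the bucket with upper bound 1
	h.ObserveWeighted(3, 1.5)    // lands in the bucket with upper bound 5
	fmt.Println(h.Buckets, h.Count, h.Sum)
}
```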

However, float histograms happen all the time as results of PromQL
expressions. Following the same line of argument as above, those float
histograms can end up in the TSDB via recording rules, which means
they can be exposed via federation.
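
To make this more concrete, here is a hedged Go sketch of why a rate over an integer histogram yields float counts: the per-bucket increase between two snapshots is divided by the elapsed time. This is a deliberate simplification that assumes identical bucket layouts, no counter resets, and no extrapolation, so it is not how PromQL actually computes `rate`. The types `Snapshot` and `RateHistogram` and the function `Rate` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Snapshot is a hypothetical integer histogram sample (a counter-like
// histogram as scraped from a target).
type Snapshot struct {
	Buckets []uint64
	Count   uint64
}

// RateHistogram is the hypothetical result: per-second rates, which are
// floats and describe a current state (gauge semantics), not a counter.
type RateHistogram struct {
	Buckets []float64
	Count   float64
}

// Rate divides the increase between prev and cur by the elapsed seconds.
// It assumes both snapshots share one bucket layout and that no counter
// reset happened in between.
func Rate(prev, cur Snapshot, seconds float64) (RateHistogram, error) {
	if len(prev.Buckets) != len(cur.Buckets) {
		return RateHistogram{}, errors.New("bucket layouts differ")
	}
	if seconds <= 0 {
		return RateHistogram{}, errors.New("elapsed time must be positive")
	}
	if cur.Count < prev.Count {
		return RateHistogram{}, errors.New("counter reset not handled in this sketch")
	}
	out := RateHistogram{Buckets: make([]float64, len(cur.Buckets))}
	for i := range cur.Buckets {
		if cur.Buckets[i] < prev.Buckets[i] {
			return RateHistogram{}, errors.New("counter reset not handled in this sketch")
		}
		out.Buckets[i] = float64(cur.Buckets[i]-prev.Buckets[i]) / seconds
	}
	out.Count = float64(cur.Count-prev.Count) / seconds
	return out, nil
}

func main() {
	prev := Snapshot{Buckets: []uint64{10, 4}, Count: 14}
	cur := Snapshot{Buckets: []uint64{13, 9}, Count: 22}
	r, err := Rate(prev, cur, 4)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(r.Buckets, r.Count)
}
```

A result like this is both a gauge histogram and a float histogram, which is the combination a recording rule would write to the TSDB.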

Note that float histograms are implicitly supported by the original
Prometheus text format, as this format simply uses floating point
numbers for all sample values. OpenMetrics has avoided this ambiguity
and has specified integers for bucket counts and the count of
observations in a histogram, which means it needs to be extended to
support float histograms, similar to how this commit extends the
original Prometheus protobuf format.
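
The point about the text format can be sketched in Go as follows: if every sample value is parsed as a float, an integer-looking bucket count and a fractional one go through the same code path. The helper `sampleValue` is hypothetical and handles only a simplified sample line without a trailing timestamp; it is not the real text-format parser.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sampleValue extracts the value of a simplified text-format sample line.
// Assumption: the line has no trailing timestamp, so the value is the last
// whitespace-separated field.
func sampleValue(line string) (float64, error) {
	fields := strings.Fields(line)
	if len(fields) < 2 {
		return 0, fmt.Errorf("malformed sample line: %q", line)
	}
	return strconv.ParseFloat(fields[len(fields)-1], 64)
}

func main() {
	lines := []string{
		`rpc_duration_seconds_bucket{le="0.5"} 3`,   // integer count
		`rpc_duration_seconds_bucket{le="0.5"} 2.5`, // fractional count
	}
	for _, line := range lines {
		v, err := sampleValue(line)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(v)
	}
}
```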

Signed-off-by: beorn7 <beorn@grafana.com>

@beorn7 beorn7 requested a review from codesome June 14, 2022 18:12
@beorn7
Member Author

beorn7 commented Jun 14, 2022

What I forgot to mention, but maybe it is obvious anyway: In the way things are designed here, it is no problem at all to represent a histogram that is both a gauge histogram and a float histogram (the typical outcome of a recording rule that rates a histogram).

@bboreham
Member

"float histograms" is a new concept to me; it seems that much of the discussion is at prometheus/client_golang#796.

@beorn7
Member Author

beorn7 commented Jun 19, 2022

prometheus/client_golang#796 is more the start of a thought that, in the end, arrived at a "scaled" or float histogram, which can be seen in the aforementioned kubernetes/kubernetes#109277. While I see it as a valid use case, it's still fairly niche. I guess federation is a more pressing reason to allow a float histogram in the exposition format. At the end of the day, the reason doesn't matter, though. Float histograms are a thing, even if rare.

Signed-off-by: beorn7 <beorn@grafana.com>
@beorn7
Member Author

beorn7 commented Jun 29, 2022

/cc @marctc

@beorn7
Member Author

beorn7 commented Jun 29, 2022

Since this has been out for a while and it is only for the super experimental sparsehistogram branch, I will merge it now. @marctc plans to work on implementing ingestion (and ultimately storage) for this within Prometheus. Based on the experience, we can then iterate on the proto spec here before seeing it in main (if this ever goes to main).
