Add float histograms and gauge histograms to proto spec #58
What I forgot to mention, but maybe it is obvious anyway: In the way things are designed here, it is no problem at all to represent a histogram that is both a gauge histogram and a float histogram (the typical outcome of a recording rule that |
"float histograms" is a new concept to me; it seems that much of the discussion is at prometheus/client_golang#796.
prometheus/client_golang#796 is more the start of a thought that, in the end, arrived at a "scaled" or float histogram, which can be seen in the aforementioned kubernetes/kubernetes#109277. While I see it as a valid use case, it's still fairly niche. I guess federation is a more pressing reason to allow a float histogram in the exposition format. At the end of the day, the reason doesn't matter, though. Float histograms are a thing, even if rare.
Signed-off-by: beorn7 <beorn@grafana.com>
/cc @marctc
Since this has been out for a while and it is only for the super experimental sparsehistogram branch, I will merge it now. @marctc plans to work on implementing ingestion (and ultimately storage) for this within Prometheus. Based on the experience, we can then iterate on the proto spec here before seeing it in main (if this ever goes to main).
This is only for the sparsehistogram branch!
@codesome as discussed before. I'm hereby starting the work to support float histograms and gauge histograms in the exposition format (and ultimately in TSDB and federation).
@bboreham to double check if my protobuf handling makes sense here. The idea is that the common case of a normal histogram doesn't look different at all on the wire.
@cstyan & @csmarchbanks Note that this is not yet an update of the remote-write protobuf but merely of the exposition format. However, we have to support float and gauge histograms there as well, so I thought we first complete the exposition proto spec and let it inform the remote-write proto spec. (But keep in mind that the more important source of inspiration is the respective Go types, i.e. https://github.com/prometheus/prometheus/blob/095b6c93dd5ab75f0c9f22f52b4fb5f45b33ff80/model/histogram/histogram.go#L37-L58 and https://github.com/prometheus/prometheus/blob/095b6c93dd5ab75f0c9f22f52b4fb5f45b33ff80/model/histogram/float_histogram.go#L30-L50 .)
Commit description follows:
Note that this is only an extension of the proto spec. Both generators
and consumers of the protobuf still need changes to make use of these
changes.
Gauge histograms measure current distributions. For one, they are
inspired by the GaugeHistogram type introduced by OpenMetrics, see
https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#gaugehistogram
They are also handled in the same way as OpenMetrics does it, by
using a new MetricType enum value GAUGE_HISTOGRAM, but not changing
anything else, i.e. for both regular and gauge histograms, the same
Histogram message type is used.
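
The approach described above can be sketched roughly as follows. This is a minimal Python illustration, not the protobuf definition itself: `MetricType`, `Histogram`, and `MetricFamily` are simplified stand-ins, and the field names and metric names are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class MetricType(Enum):
    # Simplified stand-in for the MetricType enum; only the two
    # histogram-related values are modeled here.
    HISTOGRAM = "HISTOGRAM"
    GAUGE_HISTOGRAM = "GAUGE_HISTOGRAM"


@dataclass
class Histogram:
    # Hypothetical, reduced field set; the real message has more fields.
    sample_count: int = 0
    sample_sum: float = 0.0
    # Maps bucket upper bound -> cumulative count.
    buckets: dict = field(default_factory=dict)


@dataclass
class MetricFamily:
    # Hypothetical flattening: one histogram per family, for brevity.
    name: str
    type: MetricType
    histogram: Histogram


def is_gauge_histogram(family: MetricFamily) -> bool:
    # Only the type tag distinguishes the two kinds; the payload
    # message is the same in both cases.
    return family.type is MetricType.GAUGE_HISTOGRAM


payload = Histogram(sample_count=3, sample_sum=1.2, buckets={0.5: 2, 1.0: 3})
regular = MetricFamily("request_duration_seconds", MetricType.HISTOGRAM, payload)
gauge = MetricFamily("queue_item_age_seconds", MetricType.GAUGE_HISTOGRAM, payload)
```

A consumer written along these lines would branch on the type tag only and could reuse all histogram decoding code unchanged.
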
The other reason why we need gauge histograms comes from PromQL: If
you `rate` a histogram (which is possible with the new sparse
histograms as 1st class data type), the result is a gauge histogram. A
rate'd histogram can be created by a recording rule and then stored in
the TSDB. From there, it can be exposed by federation, so we need to
be able to represent it in the exposition format.
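
To make the point about `rate` concrete, here is a deliberately simplified sketch in Python. `HistogramSample` and `simple_rate` are hypothetical names, and the calculation ignores counter resets and extrapolation, so it only illustrates why the result has gauge semantics and fractional counts; it is not how PromQL implements `rate`.

```python
from dataclasses import dataclass


@dataclass
class HistogramSample:
    # Hypothetical, simplified sample: a timestamp in seconds plus
    # per-bucket (non-cumulative) counts keyed by bucket index.
    timestamp: float
    count: float
    buckets: dict
    is_gauge: bool = False


def simple_rate(first: HistogramSample, last: HistogramSample) -> HistogramSample:
    """Per-second increase between two counter-histogram samples.

    Deliberately ignores counter resets and extrapolation.
    """
    elapsed = last.timestamp - first.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    buckets = {
        idx: (n - first.buckets.get(idx, 0)) / elapsed
        for idx, n in last.buckets.items()
    }
    return HistogramSample(
        timestamp=last.timestamp,
        count=(last.count - first.count) / elapsed,
        buckets=buckets,
        is_gauge=True,  # a rate can go up or down, so it is a gauge
    )


a = HistogramSample(timestamp=0.0, count=10, buckets={0: 4, 1: 6})
b = HistogramSample(timestamp=60.0, count=25, buckets={0: 13, 1: 12})
r = simple_rate(a, b)
```

In this example the result is both a gauge histogram and a float histogram, since dividing by the elapsed time yields non-integer counts.
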
Float histograms are histograms where all counts (count of
observations, counts in each bucket, zero bucket count) are floating
point numbers rather than integer numbers. They are rarely needed for
direct instrumentation. Use cases are weighted histograms or timing
histograms, see kubernetes/kubernetes#109277 for a real-world example.
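
As an illustration of how direct instrumentation can end up with non-integer counts, the following Python sketch shows a weighted histogram. `WeightedHistogram` and its `observe` signature are hypothetical and not taken from any client library; weighting the sum by the same factor is also an assumption of this sketch.

```python
import bisect


class WeightedHistogram:
    """Hypothetical weighted histogram with classic upper-bound buckets."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds)
        # One extra slot for observations above the highest bound (+Inf).
        self.bucket_counts = [0.0] * (len(self.upper_bounds) + 1)
        self.count = 0.0
        self.sum = 0.0

    def observe(self, value, weight=1.0):
        # bisect_left finds the first bucket whose upper bound is >= value.
        idx = bisect.bisect_left(self.upper_bounds, value)
        self.bucket_counts[idx] += weight
        self.count += weight
        self.sum += value * weight


h = WeightedHistogram([0.1, 1.0, 10.0])
h.observe(0.05, weight=0.5)
h.observe(0.7, weight=2.25)
h.observe(42.0, weight=0.25)
```

Because each observation contributes its weight rather than 1, the bucket counts and the total count are floating point numbers from the start.
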
However, float histograms happen all the time as results of PromQL
expressions. Following the same line of argument as above, those float
histograms can end up in the TSDB via recording rules, which means
they can be exposed via federation.
Note that float histograms are implicitly supported by the original
Prometheus text format, as this format simply uses floating point
numbers for all sample values. OpenMetrics has avoided this ambiguity
and has specified integers for bucket counts and the count of
observations in a histogram, which means it needs to be extended to
support float histograms, similar to how this commit extends the
original Prometheus protobuf format.
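
The remark about the text format can be illustrated with a small Python sketch. `parse_sample_value` is a hypothetical helper, not a full text-format parser; it assumes a single sample line without a trailing timestamp, and the metric name is made up.

```python
def parse_sample_value(line: str) -> float:
    # Assumes a single sample line without a trailing timestamp,
    # e.g. 'name{label="x"} 12'. Not a full text-format parser.
    _, raw_value = line.rsplit(" ", 1)
    return float(raw_value)


integer_bucket = parse_sample_value('rpc_duration_seconds_bucket{le="0.5"} 12')
float_bucket = parse_sample_value('rpc_duration_seconds_bucket{le="0.5"} 1.75')
```

Both lines go through the same code path, since every sample value is read as a floating point number regardless of whether it happens to be integral.
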
Signed-off-by: beorn7 <beorn@grafana.com>