New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
prompb: Add histograms to remote write/read protobufs #10870
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With my limited knowledge on protos, LGTM
Thanks, @codesome . Still waiting for remote write folks to comment. Note that prometheus/client_model#58 has changes to the exposition format protobuf, adding support for float and gauge histograms, with some possibly enlightening comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No objections from me.
I do think you should link to the design (which I think is here.
Also mention / link to the explanation for "float histograms".
sint32 offset = 1; // Gap to previous span, or starting point for 1st span (which can be negative). | ||
uint32 length = 2; // Length of consecutive buckets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I note these are optional in client_model
, and I couldn't figure out why.
If the answer is other than "mistake", please explain both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that's mostly historical. The proto spec in prometheus/client_model is using the proto2 syntax. Back then, protobuf had optional and non-optional fields. There were different schools of thought. Some considered optional
fields harmful, some considered non-optional fields harmful. The Prometheus exposition protobuf was created following the latter school of thought, according to which you declare all fields optional
, even if they aren't really optional.
proto3 resolved the schism by (arguably) declaring the "make everything optional
" school as the winner. In proto3, all fields are implicitly optional. If they are missing, the default value is filled in, and (on the API level) you cannot actually distinguish between "unset" and "explicitly specified the value that happens to be the default value". If you really want an "unset" state, you have to declare a field as repeated
but only care about zero or one repeats.
If the above is inaccurate, please pardon me. My context about protobuf is quite incomplete.
repeated Span span = 1; | ||
// Only one of "delta" or "count" may be used, the former for regular | ||
// histograms with integer counts, the latter for float histograms. | ||
repeated sint64 delta = 2; // Count delta of each bucket compared to previous one (or to zero for 1st bucket). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, logically these values belong with the Span
, but as an optimisation to reduce Protobuf metadata (tag and length) you have pulled them up one level and concatenated them, so the encoder/decoder must do a little extra work.
Either way, I think it would be worth some more comments so readers can understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are quite right. I'll add more comments.
For the decoder, it's actually less work, because the protobuf layout happens to be the same as the layout in TSDB. (Which isn't a reason to do it that way on its own, of course, just a convenient coincidence.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For remote write, it's actually the TSDB layout on both sides, so it's even less work for the encoder… (Encoding is more costly for exposition, depending on the way the instrumentation library manages the buckets internally.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For more context: In TSDB, the layout of buckets (i.e. the set of Span
s) is even the same across a whole chunk (currently up to 120 histograms). That's where specifying the spans separately (and only once) makes a huge dent. I was considering doing the same for all samples in one batch of remote-write, but decided against it for now, mostly for the following reason.
- Batching multiple samples of the same series into one remote-write request isn't happening that often, and even if it does, it is more like a handful of samples, not ~100. Therefore, this optimization doesn't really buy us a lot, while it comes with the cost of a leaky abstraction and more complex remote-write code (see next item).
- While consecutive samples of the same series are likely to come from the same chunk (and therefore have the same bucket layout), it's not guaranteed (or in other words: remote-write doesn't know anything about TSDB chunks, and it also shouldn't know anything about it). The remote-write code needed to reconcile different bucket layouts (by inserting empty buckets where needed) if they happen to end up in the same batch. (TSDB is doing exactly that reconciliation, but that's because there is a huge return for this investment, see above.) Inversely, there isn't any guarantee, either, that a remote-write receiver will put samples from the same batch into the same chunk (if it is a chunk-based storage at all).
I think the already mentioned prometheus/client_model#58 is a good starting point to understand the need for "float histograms". I expect the most common use case for remote write will be to send the result of recording rules to the remote storage. (If you |
05f89d3
to
3dac538
Compare
I added two more things:
|
I have now also added a link to the design doc in the doc comments, including a pointer to the concept of float histograms. This is just for the sparsehistogram branch, and it got a few looks already. I'm merging it now, but we can still iterate on it. Whoever works on implementing the actual use of these proto messages (myself, @cstyan, @csmarchbanks, maybe even @LeviHarrison), feel free to come back to the proto spec and adjust it to needs and insights found later. |
Signed-off-by: beorn7 <beorn@grafana.com>
This is only for the sparsehistogram branch!
A first attempt to add histograms to remote write/read protobufs.
@bboreham again cc'ing you to double-check if this makes sense.