Native histograms custom buckets storage #13662
base: main
Conversation
* add custom buckets to native histogram model
* simple copy for custom bounds
* return errors for unsupported add/sub operations
* add test cases for string and update appendhistogram in scrape to account for new schema
* check fields which are supposed to be unused but may affect results in equals
* allow appending custom buckets histograms regardless of max schema

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
/prombench main

@krajorama is not an org member nor a collaborator and cannot execute benchmarks.

/prombench main

I don't see any difference. Also, prombench data doesn't have any native histograms.

Benchmark tests are running for 3 days! If this is intended, ignore this message; otherwise you can cancel it by commenting:

/prombench cancel

Benchmark cancel is in progress.

We made modifications to the engine, so I was wondering about any regression in the non-experimental float path.
* add custom bounds to chunks encoding
* change custom buckets schema number
* rename custom bounds to custom values

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
The size of histogram points is now bigger by 24 bytes due to the custom values slice.

When histograms are loaded into partial results in vector selectors, we use the HPoint type, where the size is calculated as (size of histogram + 8 for timestamp) / 16: https://github.com/prometheus/prometheus/blob/a3d1a46eda682590a80fb1f15959457dad1e5d91/promql/value.go#L176

When histograms are put into the Sample type in range evaluations, the Sample has more overhead and the size is calculated differently: (size of histogram / 16) + 1 for the timestamp: https://github.com/prometheus/prometheus/blob/a3d1a46eda682590a80fb1f15959457dad1e5d91/promql/engine.go#L1928

When the size of the histogram is 16k, the first calculation gives k but the second gives k+1 for the sample count. If the histogram size is 16k+8, then both give k+1.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
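The off-by-one described above is easy to reproduce. The sketch below encodes the two formulas from the comment; `hpointCost` and `sampleCost` are illustrative names, not Prometheus API:

```go
package main

import "fmt"

// hpointCost models the HPoint accounting in promql/value.go:
// (histogram size + 8 bytes for the timestamp) / 16.
func hpointCost(histSize int) int {
	return (histSize + 8) / 16
}

// sampleCost models the Sample accounting in promql/engine.go:
// histogram size / 16, plus 1 for the timestamp.
func sampleCost(histSize int) int {
	return histSize/16 + 1
}

func main() {
	// 160 = 16k (k=10): the formulas disagree by one.
	// 168 = 16k + 8:    both formulas agree.
	for _, size := range []int{160, 168} {
		fmt.Println(size, hpointCost(size), sampleCost(size))
	}
}
```

Running this prints `160 10 11` and `168 11 11`, matching the k vs. k+1 observation above.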
…s and floats (#13828)

* Use single bit to differentiate between optimized bounds and floats

Use one bit to decide what kind of data to read/write. This reduces the storage need of floats from 72 bits to 65 bits and makes the integers store in 5 to 32 bits instead of 16.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
Co-authored-by: Jeanette Tan <jeanette.tan@grafana.com>
…buckets converted from classic histograms (#13794)

* modify unit test framework to automatically generate native histograms with custom buckets from classic histogram series
* add very basic tests for classic histogram converted into native histogram with custom bounds
* fix histogram_quantile for native histograms with custom buckets
* make loading with nhcb explicit
* evaluate native histograms with custom buckets on queries with explicit keyword
* use regex replacer
* use temp histogram struct for automatically loading converted nhcb

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
[nhcb] Merge main at bd18787
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
[nhcb branch] Merge main
* process custom values in histogram unit test framework
* check for warnings when evaluating in unit test framework
* add test cases for custom buckets in test framework

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
[nhcb branch] Merge from main
I need a larger chunk of uninterrupted time to review this, and I haven't gotten around to carving that out yet. Will try ASAP.
@@ -356,26 +417,55 @@ func resize[T any](items []T, n int) []T {
	return items[:n]
}

func clearIfNotNil[T any](items []T) []T {
How about moving this function into generic.go, since it is called in both float_histogram.go and histogram.go?
And it's literally a generic function.
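A plausible body for the function in the diff above (the actual implementation may differ) shows why a shared generic.go is a natural home for it, since nothing here depends on the element type:

```go
package main

import "fmt"

// clearIfNotNil resets a slice to zero length while keeping its backing
// array for reuse, and preserves nil as nil so callers can still
// distinguish "never allocated" from "emptied".
func clearIfNotNil[T any](items []T) []T {
	if items == nil {
		return nil
	}
	return items[:0]
}

func main() {
	var a []int            // nil slice stays nil
	b := []int{1, 2, 3}    // non-nil slice is truncated, capacity kept
	fmt.Println(clearIfNotNil(a) == nil, len(clearIfNotNil(b)), cap(clearIfNotNil(b)))
}
```

Running this prints `true 0 3`: the nil slice is untouched, and the cleared slice keeps its capacity for reuse.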
FYI: I'm reviewing this now, but it will take me a while to get through all of it. Truly epic!
Epic work! And very nicely done. I have only one substantial note (about the implicit testing of NHCB in the testing framework, see detailed comments), otherwise just nitpicking.
Note: I haven't looked at most of the Go test code. I'll do that once the other code is in its final state.
// then each power of two is divided into 2^n logarithmic buckets. Or
// in other words, each bucket boundary is the previous boundary times
// 2^(2^-n).
// Currently valid schema numbers are -4 <= n <= 8 for exponential buckets,
Suggested change:
- // Currently valid schema numbers are -4 <= n <= 8 for exponential buckets,
+ // Currently valid schema numbers are -4 <= n <= 8 for exponential buckets.
// This slice is interned, to be treated as immutable and copied by reference.
// These numbers should be strictly increasing. This field is only used when the
// schema is for custom buckets, and the ZeroThreshold, ZeroCount, NegativeSpans
// and NegativeBuckets fields are not used.
Suggested change:
- // and NegativeBuckets fields are not used.
+ // and NegativeBuckets fields are not used in that case.
if h.UsesCustomBuckets() {
	panic(fmt.Errorf("cannot reduce resolution to %d when there are custom buckets", targetSchema))
}
if IsCustomBucketsSchema(targetSchema) {
	panic("cannot reduce resolution to custom buckets schema")
}
The caller should know that these panics happen upon certain inputs. I would add them to the doc comment above, along the lines: "This method panics if a custom buckets schema is used in the receiving FloatHistogram or as the provided targetSchema."
@@ -212,12 +255,16 @@ func (h *FloatHistogram) TestExpression() string {

// ZeroBucket returns the zero bucket.
As above, note here that this will panic for custom buckets.
// value of any upper bucket boundary, the iterator starts with the first bucket.
// Otherwise, it will skip all buckets with an absolute value of their upper boundary ≤
// absoluteStartValue. For custom bucket schemas, absoluteStartValue is ignored and
// no buckets are skipped.
//
// targetSchema must be ≤ the schema of FloatHistogram (and of course within the
This should also mention that custom buckets don't merge with other schemas.
# Test native histograms with custom buckets.
load 5m
    custom_buckets_histogram {{schema:-53 sum:5 count:4 custom_values:[5 10] buckets:[1 2 1]}}x10
A tabs-vs-spaces issue makes this look weird with the "wrong" tab settings. As above, I'd recommend just de-tabifying all the *.test files.
@@ -1400,7 +1400,7 @@ func TestNativeHistogramsInRecordingRules(t *testing.T) {

	expHist := hists[0].ToFloat(nil)
	for _, h := range hists[1:] {
-		expHist = expHist.Add(h.ToFloat(nil))
+		expHist, _ = expHist.Add(h.ToFloat(nil))
Let's add an easy assertion that the error is nil?
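The (result, error) shape of the new Add signature is what makes that assertion cheap to add. The sketch below uses a toy histogram type (`toyHist` is illustrative, not the real FloatHistogram) to show the pattern:

```go
package main

import (
	"errors"
	"fmt"
)

// toyHist stands in for FloatHistogram; only the (result, error)
// signature of Add matters for this illustration.
type toyHist struct {
	schema int
	sum    float64
}

// Add returns an error instead of panicking when the schemas are
// incompatible, mirroring the new error-returning signature in the diff.
func (h *toyHist) Add(other *toyHist) (*toyHist, error) {
	if h.schema != other.schema {
		return nil, errors.New("cannot add histograms with incompatible schemas")
	}
	h.sum += other.sum
	return h, nil
}

func main() {
	a := &toyHist{schema: -53, sum: 5}
	b := &toyHist{schema: -53, sum: 7}
	sum, err := a.Add(b)
	if err != nil { // in the test this would be: require.NoError(t, err)
		panic(err)
	}
	fmt.Println(sum.sum) // 12
}
```

Checking the error explicitly (rather than discarding it with `_`) means a schema mismatch fails the test immediately instead of silently producing a wrong expected histogram.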
@@ -366,15 +366,15 @@ type bucketLimitAppender struct {

func (app *bucketLimitAppender) AppendHistogram(ref storage.SeriesRef, lset labels.Labels, t int64, h *histogram.Histogram, fh *histogram.FloatHistogram) (storage.SeriesRef, error) {
	if h != nil {
		for len(h.PositiveBuckets)+len(h.NegativeBuckets) > app.limit {
-			if h.Schema == -4 {
+			if h.Schema <= histogram.ExponentialSchemaMin || h.Schema > histogram.ExponentialSchemaMax {
Does it really make sense to test for `h.Schema > histogram.ExponentialSchemaMax` in this loop? We only reduce the schema in the loop.
				return 0, errBucketLimit
			}
			h = h.ReduceResolution(h.Schema - 1)
		}
	}
	if fh != nil {
		for len(fh.PositiveBuckets)+len(fh.NegativeBuckets) > app.limit {
-			if fh.Schema == -4 {
+			if fh.Schema <= histogram.ExponentialSchemaMin || fh.Schema > histogram.ExponentialSchemaMax {
Same as above.
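One way to address the note above is to hoist the upper-bound check out of the loop, since reducing resolution only ever lowers the schema. The sketch below uses toy names (`limitBuckets`, `reduceResolution`, and the schema constants are assumptions, not the real Prometheus code):

```go
package main

import "fmt"

const (
	minSchema = -4 // assumed value of histogram.ExponentialSchemaMin
	maxSchema = 8  // assumed value of histogram.ExponentialSchemaMax
)

type toyHist struct {
	schema  int
	buckets int
}

// reduceResolution models one schema step: roughly halving the buckets.
func (h toyHist) reduceResolution() toyHist {
	return toyHist{schema: h.schema - 1, buckets: (h.buckets + 1) / 2}
}

// limitBuckets mirrors the loop under review, but checks the upper
// schema bound once before the loop: inside the loop the schema only
// decreases, so only the lower bound needs re-checking.
func limitBuckets(h toyHist, limit int) (toyHist, error) {
	if h.schema > maxSchema {
		return h, fmt.Errorf("invalid schema %d", h.schema)
	}
	for h.buckets > limit {
		if h.schema <= minSchema {
			return h, fmt.Errorf("bucket limit exceeded")
		}
		h = h.reduceResolution()
	}
	return h, nil
}

func main() {
	out, err := limitBuckets(toyHist{schema: 3, buckets: 40}, 10)
	fmt.Println(out.schema, out.buckets, err == nil)
}
```

This keeps the per-iteration work to the one condition that can actually change, which is the substance of the review comment.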
// we can convert the value to a float64 by subtracting 1 and dividing by 1000.
func putCustomBound(b *bstream, f float64) {
	tf := f * 1000
	// 33554431-1 comes from the maximum that can be stored in a varint in 4
varint → varbit
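The fixed-point idea in that snippet can be sketched as a round trip: scale by 1000, store the integer plus 1 when it fits, and decode by subtracting 1 and dividing by 1000. The bound constant is inferred from the 33554431 figure in the diff; treat it and the function names as assumptions, not the actual storage code:

```go
package main

import (
	"fmt"
	"math"
)

// maxFixedPoint is inferred from the 33554431-1 figure quoted in the
// diff; the exact limit in the real encoding may differ.
const maxFixedPoint = 33554430

// encodeBound returns the integer representation of f if it fits the
// fixed-point scheme (at most three decimal places, bounded range);
// otherwise the caller must fall back to storing the raw float bits.
func encodeBound(f float64) (uint64, bool) {
	tf := f * 1000
	if tf < 0 || tf > maxFixedPoint || tf != math.Trunc(tf) {
		return 0, false
	}
	return uint64(tf) + 1, true // +1 so the decoder can subtract 1
}

// decodeBound inverts encodeBound: subtract 1 and divide by 1000.
func decodeBound(v uint64) float64 {
	return float64(v-1) / 1000
}

func main() {
	v, ok := encodeBound(2.5)
	fmt.Println(ok, v, decodeBound(v)) // true 2501 2.5
}
```

A bound like 2.5 round-trips exactly through the integer path, while something like 0.0001 (more than three decimal places) falls back to the float path.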
Implements parts of #13485, #13486, #13487, #13508.
In short: implements the internal model, storage (TSDB) and basic unit tests for the storage and query.
No user-visible impact yet. No scraping, remote write, or configuration changes.