Native histograms custom buckets storage #13662
base: main
Conversation
* add custom buckets to native histogram model
* simple copy for custom bounds
* return errors for unsupported add/sub operations
* add test cases for string and update appendhistogram in scrape to account for new schema
* check fields which are supposed to be unused but may affect results in equals
* allow appending custom buckets histograms regardless of max schema

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
/prombench main

@krajorama is not an org member nor a collaborator and cannot execute benchmarks.

/prombench main

I don't see any difference. Also, prombench data doesn't have any native histograms.

Benchmark tests are running for 3 days! If this is intended, ignore this message; otherwise you can cancel it by commenting:

/prombench cancel

Benchmark cancel is in progress.

We made modifications to the engine, so I was wondering about any regression in the non-experimental float path.
* add custom bounds to chunks encoding
* change custom buckets schema number
* rename custom bounds to custom values

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
The size of histogram points is now bigger by 24 bytes due to the custom values slice.

When histograms are loaded into partial results in vector selectors, we use the HPoint type, where the size is calculated as (size of histogram + 8 for timestamp) / 16: https://github.com/prometheus/prometheus/blob/a3d1a46eda682590a80fb1f15959457dad1e5d91/promql/value.go#L176

When histograms are put into the Sample type in range evaluations, the Sample has more overhead and the size is calculated differently: (size of histogram / 16) + 1 for the timestamp: https://github.com/prometheus/prometheus/blob/a3d1a46eda682590a80fb1f15959457dad1e5d91/promql/engine.go#L1928

When the size of the histogram is 16k, the first calculation gives k but the second gives k+1 for the sample count. If the histogram size is 16k+8, then both give k+1.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
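The off-by-one described above is easy to reproduce. The sketch below encodes the two formulas from the comment; `hpointCost` and `sampleCost` are illustrative names, not Prometheus API:

```go
package main

import "fmt"

// hpointCost models the HPoint accounting in promql/value.go:
// (histogram size + 8 bytes for the timestamp) / 16.
func hpointCost(histSize int) int {
	return (histSize + 8) / 16
}

// sampleCost models the Sample accounting in promql/engine.go:
// histogram size / 16, plus 1 for the timestamp.
func sampleCost(histSize int) int {
	return histSize/16 + 1
}

func main() {
	// 160 = 16k (k=10): the formulas disagree by one.
	// 168 = 16k + 8:    both formulas agree.
	for _, size := range []int{160, 168} {
		fmt.Println(size, hpointCost(size), sampleCost(size))
	}
}
```

Running this prints `160 10 11` and `168 11 11`, matching the k vs. k+1 observation above.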
…s and floats (#13828)

* Use single bit to differentiate between optimized bounds and floats

Use one bit to decide what kind of data to read/write. This reduces the storage need of floats from 72 bits to 65 bits and makes the integers store in 5 to 32 bits instead of 16.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
Co-authored-by: Jeanette Tan <jeanette.tan@grafana.com>
…buckets converted from classic histograms (#13794)

* modify unit test framework to automatically generate native histograms with custom buckets from classic histogram series
* add very basic tests for classic histogram converted into native histogram with custom bounds
* fix histogram_quantile for native histograms with custom buckets
* make loading with nhcb explicit
* evaluate native histograms with custom buckets on queries with explicit keyword
* use regex replacer
* use temp histogram struct for automatically loading converted nhcb

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: George Krajcsovits <krajorama@users.noreply.github.com>
[nhcb] Merge main at bd18787
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
[nhcb branch] Merge main
* process custom values in histogram unit test framework
* check for warnings when evaluating in unit test framework
* add test cases for custom buckets in test framework

Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
Signed-off-by: Jeanette Tan <jeanette.tan@grafana.com>
[nhcb branch] Merge from main
I need a larger chunk of uninterrupted time to review this, and I haven't gotten around to carving that out yet. Will try ASAP.
@@ -356,26 +417,55 @@ func resize[T any](items []T, n int) []T {
	return items[:n]
}

func clearIfNotNil[T any](items []T) []T {
How about moving this function into generic.go, since it is called in both float_histogram.go and histogram.go?
And it's literally a generic function.
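A plausible body for the function in the diff above (the actual implementation may differ) shows why a shared generic.go is a natural home for it, since nothing here depends on the element type:

```go
package main

import "fmt"

// clearIfNotNil resets a slice to zero length while keeping its backing
// array for reuse, and preserves nil as nil so callers can still
// distinguish "never allocated" from "emptied".
func clearIfNotNil[T any](items []T) []T {
	if items == nil {
		return nil
	}
	return items[:0]
}

func main() {
	var a []int            // nil slice stays nil
	b := []int{1, 2, 3}    // non-nil slice is truncated, capacity kept
	fmt.Println(clearIfNotNil(a) == nil, len(clearIfNotNil(b)), cap(clearIfNotNil(b)))
}
```

Running this prints `true 0 3`: the nil slice is untouched, and the cleared slice keeps its capacity for reuse.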
FYI: I'm reviewing this now, but it will take me a while to get through all of it. Truly epic!
Epic work! And very nicely done. I have only one substantial note (about the implicit testing of NHCB in the testing framework, see detailed comments), otherwise just nitpicking.
Note: I haven't looked at most of the Go test code. I'll do that once the other code is in its final state.
// then each power of two is divided into 2^n logarithmic buckets. Or
// in other words, each bucket boundary is the previous boundary times
// 2^(2^-n).
// Currently valid schema numbers are -4 <= n <= 8 for exponential buckets,
Suggested change:
- // Currently valid schema numbers are -4 <= n <= 8 for exponential buckets,
+ // Currently valid schema numbers are -4 <= n <= 8 for exponential buckets.
// This slice is interned, to be treated as immutable and copied by reference.
// These numbers should be strictly increasing. This field is only used when the
// schema is for custom buckets, and the ZeroThreshold, ZeroCount, NegativeSpans
// and NegativeBuckets fields are not used.
Suggested change:
- // and NegativeBuckets fields are not used.
+ // and NegativeBuckets fields are not used in that case.
if h.UsesCustomBuckets() {
	panic(fmt.Errorf("cannot reduce resolution to %d when there are custom buckets", targetSchema))
}
if IsCustomBucketsSchema(targetSchema) {
	panic("cannot reduce resolution to custom buckets schema")
}
The caller should know that these panics happen upon certain inputs. I would add them to the doc comment above, along the lines: "This method panics if a custom buckets schema is used in the receiving FloatHistogram or as the provided targetSchema."
@@ -212,12 +255,16 @@ func (h *FloatHistogram) TestExpression() string {

// ZeroBucket returns the zero bucket.
As above, note here that this will panic for custom buckets.
// value of any upper bucket boundary, the iterator starts with the first bucket.
// Otherwise, it will skip all buckets with an absolute value of their upper boundary ≤
// absoluteStartValue. For custom bucket schemas, absoluteStartValue is ignored and
// no buckets are skipped.
//
// targetSchema must be ≤ the schema of FloatHistogram (and of course within the
This should also mention that custom buckets don't merge with other schemas.
# Test native histograms with custom buckets.
load 5m
    custom_buckets_histogram {{schema:-53 sum:5 count:4 custom_values:[5 10] buckets:[1 2 1]}}x10
A tabs-vs-spaces issue makes this look weird with the "wrong" tab settings. As above, I'd recommend just de-tabifying all the *.test files.
@@ -1400,7 +1400,7 @@ func TestNativeHistogramsInRecordingRules(t *testing.T) {

	expHist := hists[0].ToFloat(nil)
	for _, h := range hists[1:] {
-		expHist = expHist.Add(h.ToFloat(nil))
+		expHist, _ = expHist.Add(h.ToFloat(nil))
Let's add an easy assertion that the error is nil?
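The (result, error) shape of the new Add signature is what makes that assertion cheap to add. The sketch below uses a toy histogram type (`toyHist` is illustrative, not the real FloatHistogram) to show the pattern:

```go
package main

import (
	"errors"
	"fmt"
)

// toyHist stands in for FloatHistogram; only the (result, error)
// signature of Add matters for this illustration.
type toyHist struct {
	schema int
	sum    float64
}

// Add returns an error instead of panicking when the schemas are
// incompatible, mirroring the new error-returning signature in the diff.
func (h *toyHist) Add(other *toyHist) (*toyHist, error) {
	if h.schema != other.schema {
		return nil, errors.New("cannot add histograms with incompatible schemas")
	}
	h.sum += other.sum
	return h, nil
}

func main() {
	a := &toyHist{schema: -53, sum: 5}
	b := &toyHist{schema: -53, sum: 7}
	sum, err := a.Add(b)
	if err != nil { // in the test this would be: require.NoError(t, err)
		panic(err)
	}
	fmt.Println(sum.sum) // 12
}
```

Checking the error explicitly (rather than discarding it with `_`) means a schema mismatch fails the test immediately instead of silently producing a wrong expected histogram.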
@@ -366,15 +366,15 @@ type bucketLimitAppender struct {

func (app *bucketLimitAppender) AppendHistogram(ref storage.SeriesRef, lset labels.Labels, t int64, h *histogram.Histogram, fh *histogram.FloatHistogram) (storage.SeriesRef, error) {
	if h != nil {
		for len(h.PositiveBuckets)+len(h.NegativeBuckets) > app.limit {
-			if h.Schema == -4 {
+			if h.Schema <= histogram.ExponentialSchemaMin || h.Schema > histogram.ExponentialSchemaMax {
Does it really make sense to test for `h.Schema > histogram.ExponentialSchemaMax` in this loop? We only reduce the schema in the loop.
				return 0, errBucketLimit
			}
			h = h.ReduceResolution(h.Schema - 1)
		}
	}
	if fh != nil {
		for len(fh.PositiveBuckets)+len(fh.NegativeBuckets) > app.limit {
-			if fh.Schema == -4 {
+			if fh.Schema <= histogram.ExponentialSchemaMin || fh.Schema > histogram.ExponentialSchemaMax {
Same as above.
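One way to address the note above is to hoist the upper-bound check out of the loop, since reducing resolution only ever lowers the schema. The sketch below uses toy names (`limitBuckets`, `reduceResolution`, and the schema constants are assumptions, not the real Prometheus code):

```go
package main

import "fmt"

const (
	minSchema = -4 // assumed value of histogram.ExponentialSchemaMin
	maxSchema = 8  // assumed value of histogram.ExponentialSchemaMax
)

type toyHist struct {
	schema  int
	buckets int
}

// reduceResolution models one schema step: roughly halving the buckets.
func (h toyHist) reduceResolution() toyHist {
	return toyHist{schema: h.schema - 1, buckets: (h.buckets + 1) / 2}
}

// limitBuckets mirrors the loop under review, but checks the upper
// schema bound once before the loop: inside the loop the schema only
// decreases, so only the lower bound needs re-checking.
func limitBuckets(h toyHist, limit int) (toyHist, error) {
	if h.schema > maxSchema {
		return h, fmt.Errorf("invalid schema %d", h.schema)
	}
	for h.buckets > limit {
		if h.schema <= minSchema {
			return h, fmt.Errorf("bucket limit exceeded")
		}
		h = h.reduceResolution()
	}
	return h, nil
}

func main() {
	out, err := limitBuckets(toyHist{schema: 3, buckets: 40}, 10)
	fmt.Println(out.schema, out.buckets, err == nil)
}
```

This keeps the per-iteration work to the one condition that can actually change, which is the substance of the review comment.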
// we can convert the value to a float64 by subtracting 1 and dividing by 1000.
func putCustomBound(b *bstream, f float64) {
	tf := f * 1000
	// 33554431-1 comes from the maximum that can be stored in a varint in 4
varint → varbit
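The fixed-point idea in that snippet can be sketched as a round trip: scale by 1000, store the integer plus 1 when it fits, and decode by subtracting 1 and dividing by 1000. The bound constant is inferred from the 33554431 figure in the diff; treat it and the function names as assumptions, not the actual storage code:

```go
package main

import (
	"fmt"
	"math"
)

// maxFixedPoint is inferred from the 33554431-1 figure quoted in the
// diff; the exact limit in the real encoding may differ.
const maxFixedPoint = 33554430

// encodeBound returns the integer representation of f if it fits the
// fixed-point scheme (at most three decimal places, bounded range);
// otherwise the caller must fall back to storing the raw float bits.
func encodeBound(f float64) (uint64, bool) {
	tf := f * 1000
	if tf < 0 || tf > maxFixedPoint || tf != math.Trunc(tf) {
		return 0, false
	}
	return uint64(tf) + 1, true // +1 so the decoder can subtract 1
}

// decodeBound inverts encodeBound: subtract 1 and divide by 1000.
func decodeBound(v uint64) float64 {
	return float64(v-1) / 1000
}

func main() {
	v, ok := encodeBound(2.5)
	fmt.Println(ok, v, decodeBound(v)) // true 2501 2.5
}
```

A bound like 2.5 round-trips exactly through the integer path, while something like 0.0001 (more than three decimal places) falls back to the float path.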
Implements parts of #13485, #13486, #13487, #13508.
In short: implements the internal model, storage (TSDB) and basic unit tests for the storage and query.
No user-visible impact yet. No scraping, remote write, or configuration changes.