Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve reset hint during histogram deduplication #7092

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

fpetkovski
Copy link
Contributor

@fpetkovski fpetkovski commented Jan 26, 2024

Penalty based deduplication can over-skip samples when switching between replicas because it uses a penalty of 2 times the estimated scrape interval. This works okay for float counters because the reset is detected based on the value of the counter. With native histograms, detecting a reset is much more expensive which is why there is a hint stored in the first sample of a chunk, indicating whether the chunk has been created because of a reset.

It can easily happen that the dedup iterator switches replicas, and skips the first sample of a chunk because of the added penalty. In this case we will not read the hint and will assume that no histogram reset happened.

This commit fixes the issue by explicitly calling DetectReset if a replica stream has been switched. The result is then stored and set in the histogram returned by AtHistogram and AtFloatHistogram.

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Fix counter reset detection during deduplication of native histograms.

Verification

I added a unit test, and also verified this against a case we saw in production.

@fpetkovski fpetkovski force-pushed the dedup-histogram-reset branch 4 times, most recently from f8cf2fe to 6cd1b77 Compare January 29, 2024 15:14
@pull-request-size pull-request-size bot added size/M and removed size/L labels Jan 29, 2024
Copy link
Member

@GiedriusS GiedriusS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could add a test case? 🤔

func newHistogramResetDetector(iterator chunkenc.Iterator) *histogramResetDetector {
return &histogramResetDetector{
Iterator: iterator,
i: -1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the only purpose of this is to check whether Next() has been called? Maybe we could convert it into a simple boolean? Otherwise this might overflow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable tracks whether we are at the first sample. We could use a boolean pointer because we need to represent 3 states, one being Next not being called at all. AFAIK a chunk has 120 samples so we should not have a risk of overflow, and even if it does the side-effect will be that we recalculate a reset one more time in the engine.

@fpetkovski
Copy link
Contributor Author

Will do, looks like the test got lost when I moved the struct to a new package.

@GiedriusS
Copy link
Member

Any update on this?

@fpetkovski
Copy link
Contributor Author

I'll get it done this week, got tangled up in something else.

@pull-request-size pull-request-size bot added size/L and removed size/M labels Feb 23, 2024
Penalty based deduplication can over-skip samples because when switching
between replicas, because it uses a penalty of 2 times the estimated scrape interval.
This works okay for float counters because the reset is detected based on the value of the counter.
With native histograms, detecting a reset is much more expensive which is why there is a
hint stored in the first sample of a chunk, which indicates whether the chunk has been created
because of a reset.

It can easily happen that the dedup iterator switches replicas, and skips the first sample of a chunk
because of the added penalty. In this case we will not read the hint and will assume that no
histogram reset happened.

This commit fixes the issue by explicitly calling DetectReset if a replica stream has been switched.
The result is then stored and set in the histogram returned by AtHistogram and AtFloatHistogram.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
@fpetkovski fpetkovski force-pushed the dedup-histogram-reset branch 2 times, most recently from 6491ab9 to 759ce0c Compare February 23, 2024 09:46
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
@@ -1033,7 +1036,8 @@ func (s *mockedSeriesIterator) At() (t int64, v float64) {

// TODO(rabenhorst): Needs to be implemented for native histogram support.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the comment is not valid anymore, right?

Copy link
Contributor

@MichaHoffmann MichaHoffmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants