Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TSDB: Don't rely on integer overflow in head compaction check #13755

Merged
merged 5 commits into from
Mar 26, 2024

Conversation

56quarters
Copy link
Contributor

@56quarters 56quarters commented Mar 12, 2024

Don't rely on integer overflow for correct behavior in the Head.compactable() check.

Previously, the logic for determining if the head should be compacted relied on the default values for min and max time and integer overflow when they were checked in Head.compactable(). The check in Head.compactable() effectively did math.MinInt64 - math.MaxInt64 which overflowed and wrapped to 1. Since 1 is less than 1.5 times the chunk range, compaction did not happen. This was the correct behavior but relying on overflow wrapping is surprising.

This change adds a method for checking if the min time for the head is unset and uses it to short-circuit compaction in that case. It also replaces several explicit checks for the default value to determine if the head has not yet had any samples added.

Don't compact the Head block if there have not yet been any samples
appended.

Previously, the logic for determining if the head should be compacted
relied on the default values for min and max time and integer overflow
when they were checked in `Head.compactable()`. The check in
`Head.compactable()` effectively did `math.MinInt64 - math.MaxInt64`
which overflowed and wrapped to `1`. Since `1` is less than `1.5`
times the chunk range, compaction did not happen. This was the correct
behavior but relying on overflow wrapping is surprising.

This change add a method for checking if the min and max time for the
head is unset and uses it to short-circuit compaction in that case.
It also replaces several explicit checks for the default value to
determine if the head has not yet had any samples added.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
tsdb/head.go Outdated Show resolved Hide resolved
Copy link
Contributor

@pstibrany pstibrany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great find.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
Copy link
Collaborator

@machine424 machine424 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you adjust the PR title and the commit? They insinuates the compaction is currently triggered when the head isn't initialized, which isn't what you explain in the description.


How about embracing/making use of the overflow instead and just add an explicit regression test for it?

tsdb/head_test.go Outdated Show resolved Hide resolved
tsdb/head.go Outdated Show resolved Hide resolved
tsdb/head.go Outdated Show resolved Hide resolved
@56quarters
Copy link
Contributor Author

56quarters commented Mar 15, 2024

Could you adjust the PR title and the commit? They insinuates the compaction is currently triggered when the head isn't initialized, which isn't what you explain in the description.

Sure.

How about embracing/making use of the overflow instead and just add an explicit regression test for it?

I'm not sure what you mean. Let the check overflow but add a test that ensures Head.compactable() always returns false with default values? I'd rather be explicit about not compacting the head when it hasn't received any samples in Head.compactable() since then looking at the code doesn't require knowing how go handles integer overflow to understand.

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
@56quarters 56quarters changed the title TSDB: Don't compact the head block when empty TSDB: Don't rely on integer overflow in head compaction check Mar 15, 2024
Copy link
Collaborator

@machine424 machine424 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, both approaches suit me.
Thanks.
Let's see what the others think.

@krajorama krajorama self-requested a review March 18, 2024 15:06
Copy link
Member

@krajorama krajorama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks sensible and makes the code more consistent. The one thing that I would like to see is a testcase in TestNoEmptyBlocks in db_test.go that fails with the old code - due to it not checking that the data is valid.

@56quarters
Copy link
Contributor Author

The one thing that I would like to see is a testcase in TestNoEmptyBlocks in db_test.go that fails with the old code - due to it not checking that the data is valid.

I'm not actually sure I can do this. The old code worked correctly, it just relied on integer overflow which was confusing. In both the old code and the new code, Head.compactable() returns the correct result.

@krajorama
Copy link
Member

The one thing that I would like to see is a testcase in TestNoEmptyBlocks in db_test.go that fails with the old code - due to it not checking that the data is valid.

I'm not actually sure I can do this. The old code worked correctly, it just relied on integer overflow which was confusing. In both the old code and the new code, Head.compactable() returns the correct result.

This synthetic test fails on main I think:

func TestHeadCompactibleCornerCase(t *testing.T) {
	dir := t.TempDir()
	wal, err := wlog.NewSize(nil, nil, filepath.Join(dir, "wal"), 32768, wlog.CompressionSnappy)
	require.NoError(t, err)

	opts := DefaultHeadOptions()
	opts.ChunkRange = 1  // This is the extreme bit
	opts.ChunkDirRoot = dir

	h, err := NewHead(nil, nil, wal, nil, opts, nil)
	require.NoError(t, err)
	defer func() {
		require.NoError(t, h.Close())
	}()

	require.False(t, h.compactable())
}

@56quarters
Copy link
Contributor Author

This synthetic test fails on main I think:

Ah, I see. I'll add something like that. Thank you!

Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
Copy link
Member

@krajorama krajorama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@krajorama krajorama requested a review from bwplotka March 20, 2024 05:11
tsdb/head_test.go Outdated Show resolved Hide resolved
Signed-off-by: Nick Pillitteri <nick.pillitteri@grafana.com>
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@jesusvazquez jesusvazquez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@krajorama krajorama merged commit 481f14e into prometheus:main Mar 26, 2024
24 checks passed
@56quarters 56quarters deleted the unset-min-max-time branch March 26, 2024 14:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants