Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Temporal filters have an off-by-one error #27059

Open
frankmcsherry opened this issue May 13, 2024 · 1 comment
Open

Temporal filters have an off-by-one error #27059

frankmcsherry opened this issue May 13, 2024 · 1 comment
Labels
C-bug Category: something is broken

Comments

@frankmcsherry
Copy link
Contributor

frankmcsherry commented May 13, 2024

What version of Materialize are you using?

main

What is the issue?

We introduced in #14463 an off-by-one error, recorded in temporal.slt:

# We don't currently support specifying the max timestamp due to a limitation in linear.rs and the step_mz_timestamp internal function.
# That big number there should be u64::MAX.
statement ok
CREATE VIEW intervals_max (a, b) AS VALUES (0, 18446744073709551615)

The issue is that rather than perform a += 1 in the space of numerics and then determine if the result fits in a u64, we do the increment in a u64 natively and produce a EvalError::MzTimestampStepOverflow if it overflows. This means that the behavior of selecting from a index and a view can be different if the data contains u64::MAX (the index will error, the view will not).

I think in principle one should be able to guard evaluation in linear.rs with the logic looks for this error specifically and when observed performs the pre-#14463 behavior (which depends on whether it is lower or upper, but largely "ignoring the bound" rather than erroring).

Edit: to reframe, the issue is that there is different behavior with and without an index. Our two evaluation strategies, 1. replace mz_now() by a known timestamp and 2. build a dataflow widget, produce different outputs on some inputs. Those inputs affected are few, as off-by-one errors tend to be, but the difference between the two means has the potential to introduce unknown bad behavior.

@frankmcsherry frankmcsherry added the C-bug Category: something is broken label May 13, 2024
@ggevay ggevay removed their assignment May 15, 2024
@ggevay
Copy link
Contributor

ggevay commented May 15, 2024

We'll get back to this later. Or maybe it could be considered as part of #27001, where more general occurrences of this problem are being discussed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: something is broken
Projects
None yet
Development

No branches or pull requests

2 participants