adapter/compute/storage: cardinality statistics frequently timeout #26986

Open
Tracked by #26869
mgree opened this issue May 8, 2024 · 1 comment
Labels
A-ADAPTER Topics related to the ADAPTER layer A-COMPUTE Topics related to the Compute layer A-optimization Area: query optimization and transformation A-STORAGE Topics related to the Storage layer C-bug Category: something is broken

Comments

@mgree
Contributor

mgree commented May 8, 2024

What version of Materialize are you using?

v0.97

What is the issue?

The CachedStatsOracle often works locally, but outside local runs it frequently times out, even with substantially higher timeouts.

Stats are currently ignored in join implementation, so this is not high impact.

A few thoughts on what could be causing the problem here, from discussion with @bkirwi:

  • timestamp selection (we use query_as_of, which may be too early) and/or interaction between strict serializability and timestamp selection
  • actually slow communication around stats collection
  • programmer error in the async/timeout handling
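
For reference while debugging the last bullet, here is a minimal sketch of the behavior we want from a bounded stats lookup: wait up to a deadline, and fall back to planning without stats if nothing arrives. This is not the CachedStatsOracle code. `Cardinality` and `stats_with_timeout` are hypothetical names, and the sketch uses a plain thread and channel from the standard library rather than our async runtime.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a cardinality estimate (not a real Materialize type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Cardinality(pub u64);

/// Runs `fetch` on a helper thread and waits at most `timeout` for its result.
/// Returns `None` if the deadline passes (or the fetch panics), so the caller
/// can plan without statistics instead of blocking.
pub fn stats_with_timeout<F>(fetch: F, timeout: Duration) -> Option<Cardinality>
where
    F: FnOnce() -> Cardinality + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // If the receiver has already given up, this send fails and is ignored.
        let _ = tx.send(fetch());
    });
    // `recv_timeout` errors on timeout or disconnect; both map to `None`.
    rx.recv_timeout(timeout).ok()
}
```

Note that the helper thread is not cancelled when the deadline passes; it keeps running and its result is dropped. If the real code has a similar shape, a fetch that never completes would look exactly like a timeout regardless of how high the timeout is set.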
@mgree mgree added C-bug Category: something is broken A-optimization Area: query optimization and transformation A-STORAGE Topics related to the Storage layer A-COMPUTE Topics related to the Compute layer A-ADAPTER Topics related to the ADAPTER layer labels May 8, 2024
@bkirwi
Contributor

bkirwi commented May 8, 2024

To expand on a couple of things:

  • In general it's entirely possible for the query to select a timestamp that's not available in the sources yet, especially under strict serializability. That means the query will block, which is fine, but we'd prefer that the optimizer didn't block as well.
    • OTOH, we probably don't want the stats at the latest available frontier, in case the sources are way ahead.
    • One possibility we discussed is using the query_as_of as an upper bound.
  • It sounded like we were seeing timeouts even when the timeout was set at multiple seconds... which seems too high for the above to explain.
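
A rough sketch of the "query_as_of as an upper bound" idea, under some loud assumptions: timestamps are plain `u64`s, `upper` is an exclusive frontier (every time strictly below it is readable), and compaction (`since`) is ignored. `stats_timestamp` is a hypothetical helper, not an existing function.

```rust
/// Picks a timestamp for the stats lookup that is already readable and never
/// later than `query_as_of`.
///
/// Assumption: `upper` is the first timestamp that is NOT yet complete in the
/// source, so the latest readable time is `upper - 1`.
/// Returns `None` when nothing is readable yet (`upper == 0`).
pub fn stats_timestamp(query_as_of: u64, upper: u64) -> Option<u64> {
    // `checked_sub` avoids underflow when the source has no complete data.
    let latest_readable = upper.checked_sub(1)?;
    // Clamp: use `query_as_of` if the source has caught up, else the latest
    // readable time, so the optimizer never waits on the source.
    Some(query_as_of.min(latest_readable))
}
```

With this shape, a source that is behind the query (`upper <= query_as_of`) yields slightly stale stats immediately, and a source that is way ahead still yields stats at `query_as_of` rather than at its latest frontier.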
