Add size statistics introduced in PARQUET-2261 #5486

Draft
wants to merge 8 commits into
base: master

@etseidl etseidl commented Mar 8, 2024

Which issue does this PR close?

Closes #5022

Rationale for this change

Implements the new page and column chunk size statistics introduced in PARQUET-2261.

What changes are included in this PR?

Adds the necessary structures from the updated parquet.thrift, and adds the code necessary to populate them.
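
For readers unfamiliar with PARQUET-2261, the following is a minimal Rust sketch of the kind of structure involved and how it might be populated. It is not the code in this PR. The field names follow the `SizeStatistics` struct in parquet.thrift, while `level_histogram` and `page_size_statistics` are hypothetical helpers invented for illustration, and the sketch assumes every level lies in `0..=max_level`.

```rust
/// Illustrative mirror of the Thrift `SizeStatistics` struct.
/// Hypothetical sketch; not the type added by this PR.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct SizeStatistics {
    /// Bytes of BYTE_ARRAY values before encoding, excluding length prefixes.
    pub unencoded_byte_array_data_bytes: Option<i64>,
    /// Entry `i` counts the values whose repetition level is `i`.
    pub repetition_level_histogram: Option<Vec<i64>>,
    /// Entry `i` counts the values whose definition level is `i`.
    pub definition_level_histogram: Option<Vec<i64>>,
}

/// Hypothetical helper: counts how often each level in `0..=max_level` occurs.
/// Assumes `max_level >= 0` and that every level is in range; panics otherwise.
pub fn level_histogram(levels: &[i16], max_level: i16) -> Vec<i64> {
    let mut histogram = vec![0i64; max_level as usize + 1];
    for &level in levels {
        histogram[level as usize] += 1;
    }
    histogram
}

/// Hypothetical helper: builds the statistics for one page.
/// `byte_array_values` is `None` for columns that are not BYTE_ARRAY.
pub fn page_size_statistics(
    rep_levels: &[i16],
    def_levels: &[i16],
    max_rep_level: i16,
    max_def_level: i16,
    byte_array_values: Option<&[&[u8]]>,
) -> SizeStatistics {
    SizeStatistics {
        // Sum the raw value lengths only; length prefixes are not counted.
        unencoded_byte_array_data_bytes: byte_array_values
            .map(|values| values.iter().map(|v| v.len() as i64).sum::<i64>()),
        repetition_level_histogram: Some(level_histogram(rep_levels, max_rep_level)),
        definition_level_histogram: Some(level_histogram(def_levels, max_def_level)),
    }
}
```

Every field is optional in the Thrift definition, so a writer could leave any of them unset; this sketch always fills the histograms to keep the example short.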

Are there any user-facing changes?

No

@github-actions github-actions bot added the parquet Changes to the parquet crate label Mar 8, 2024
@etseidl etseidl (Author) commented Mar 8, 2024

This is my first foray into Rust programming, so I'm not sure everything is done as idiomatically as possible. I'm submitting this now to get early feedback on my approach. I'm also wondering how much testing to add, and whether there should be a configuration parameter to turn off generation of the statistics.

@alamb alamb (Contributor) commented Mar 12, 2024

I triggered the CI checks and will try to review this over the next few days if no one beats me to it.

Development

Successfully merging this pull request may close these issues.

Tracking for Parquet size statistics
2 participants