Parquet Writer Ignores "max statistics size" specification in WriterProperties #2033

Open
alamb opened this issue Jul 8, 2022 · 0 comments
Labels
bug parquet Changes to the parquet crate

Comments

Contributor

alamb commented Jul 8, 2022

Describe the bug

The parquet writer ignores the `WriterProperties::max_statistics_size` setting.

https://docs.rs/parquet/17.0.0/parquet/file/properties/struct.WriterProperties.html#method.max_statistics_size

To Reproduce
Set the max statistics size to 1 (byte) in `WriterProperties`, write a file, and notice that statistics are still happily created.

Expected behavior
The statistics size limit should be respected. It should also be documented more carefully: for example, is it the total size of all statistics? If the limit is exceeded, will partial statistics already have been written? Looking at the Java or C++ parquet writers for inspiration is likely a good idea.
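
To make the open questions concrete, here is a minimal sketch of one possible interpretation, not the parquet crate's actual behavior or API. It assumes the limit applies to the combined encoded size of the min and max values of a single column chunk, and that statistics over the limit are dropped entirely rather than partially written. `ChunkStats` and `apply_max_statistics_size` are hypothetical names used only for illustration.

```rust
/// Hypothetical stand-in for one column chunk's min/max statistics,
/// stored as already-encoded bytes. Not a type from the parquet crate.
#[derive(Debug, Clone, PartialEq)]
struct ChunkStats {
    min: Vec<u8>,
    max: Vec<u8>,
}

/// Returns the statistics unchanged if their combined encoded size fits
/// within `max_statistics_size` bytes, and `None` otherwise.
///
/// Assumption: the limit is per column chunk and covers min + max together.
/// Oversized statistics are dropped as a whole, so no partial statistics
/// are ever written.
fn apply_max_statistics_size(
    stats: ChunkStats,
    max_statistics_size: usize,
) -> Option<ChunkStats> {
    let encoded_len = stats.min.len() + stats.max.len();
    if encoded_len <= max_statistics_size {
        Some(stats)
    } else {
        None
    }
}
```

Under this interpretation, a limit of 1 byte would suppress statistics for a string column whose min and max are `"apple"` and `"zebra"` (10 bytes combined), while a limit of 10 bytes or more would keep them. Whichever semantics are chosen, the documentation should state them explicitly.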

Additional context

In #2022, @tustvold fixed the writer to respect the "do/don't compute stats" setting; however, the "max size of computed statistics" setting is still ignored.
