Change default parquet compression format from Snappy to LZ4 #10647

Open. mrocklin wants to merge 1 commit into main.

mrocklin
Member

Snappy's status as the default is probably just historical: Snappy had better Java support, and LZ4 wasn't always available in systems like Spark. Today Spark and other systems support LZ4 as well, and LZ4 generally performs a bit better, especially on decompression.

This is a significant change, but I think the only reason not to make it is historical, and that probably isn't a good enough reason these days.
