Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support lz4 block format #53

Merged
merged 2 commits into from Apr 14, 2021
Merged

Support lz4 block format #53

merged 2 commits into from Apr 14, 2021

Conversation

milesgranger
Copy link
Owner

To help with dask/fastparquet#580

@martindurant
Copy link

benchmarks look good

@milesgranger
Copy link
Owner Author

milesgranger commented Apr 14, 2021

This will add block format for lz4, but not (yet) _into options as lz4 doesn't support de/compression into slices, only outputting new Vecs.

There is a possibility _intos could be added later as lz4_flex may be adding this API in the future, trade off there being no guarantee their the API would follow a similar Python API as lz4 crate does now; so would almost certainly require re-implementing the store_size component. Could always get creative and use some unsafe code to trick whatever slice we have into a Vec if it's really desired; as I'm pretty sure lz4 crate won't be changing its API anytime soon; but have opened an issue anyway

@milesgranger milesgranger marked this pull request as ready for review April 14, 2021 13:29
@milesgranger
Copy link
Owner Author

@martindurant is this good with you? I'll come back for the _intos depending on what becomes of that issue I made.

@martindurant
Copy link

Yes, totally good.
It turns out there is also a hadoop-specific version of lz4 that was used by map-reduce, but I think we can safely ignore it.

@milesgranger milesgranger merged commit 484cddd into master Apr 14, 2021
@milesgranger milesgranger deleted the lz4-block-support branch April 14, 2021 21:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants