Allow obtaining and restoring the streaming hashes' internal state. #112

Open
wants to merge 1 commit into
base: master

rgallardo-netflix

This feature enables a "very large stream" hashing use case: instead of reading a large remote object in a single pass, one can fetch it a chunk at a time and save the hash state after each chunk as a checkpoint. If a failure occurs, the saved checkpoint can be used to resume from that point of the stream onwards, rather than starting over.
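A minimal sketch of the checkpoint-and-resume idea follows. It does not use this PR's classes: the hash is a plain FNV-1a 64-bit loop chosen because its state is tiny, and every name in it (`CheckpointedHashSketch`, `StreamingFnv1a64`, `State`, `getState`, `setState`) is a hypothetical stand-in for illustration only.

```java
import java.nio.charset.StandardCharsets;

class CheckpointedHashSketch {
    static final long FNV_OFFSET_BASIS = 0xcbf29ce484222325L;
    static final long FNV_PRIME = 0x100000001b3L;

    /** Hypothetical checkpoint: everything needed to resume hashing later. */
    static final class State {
        final long hash;
        final long bytesConsumed;

        State(long hash, long bytesConsumed) {
            this.hash = hash;
            this.bytesConsumed = bytesConsumed;
        }
    }

    /** Streaming FNV-1a 64; a stand-in for a real streaming hash. */
    static final class StreamingFnv1a64 {
        private long hash = FNV_OFFSET_BASIS;
        private long bytesConsumed = 0;

        void update(byte[] buf, int off, int len) {
            for (int i = off; i < off + len; i++) {
                hash ^= (buf[i] & 0xff);
                hash *= FNV_PRIME;
            }
            bytesConsumed += len;
        }

        long getValue() { return hash; }

        State getState() { return new State(hash, bytesConsumed); }

        void setState(State s) {
            hash = s.hash;
            bytesConsumed = s.bytesConsumed;
        }
    }

    static long hashOnePass(byte[] data) {
        StreamingFnv1a64 hasher = new StreamingFnv1a64();
        hasher.update(data, 0, data.length);
        return hasher.getValue();
    }

    /**
     * Hashes data chunk by chunk, saving a checkpoint after each chunk.
     * A failure is simulated partway through chunk number failingChunk
     * (pass -1 for no failure); the hasher is then discarded and a new
     * one resumes from the last checkpoint.
     */
    static long hashWithRetry(byte[] data, int chunkSize, int failingChunk) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        StreamingFnv1a64 hasher = new StreamingFnv1a64();
        State checkpoint = hasher.getState();
        boolean failedOnce = false;
        int chunkIndex = 0;
        while (checkpoint.bytesConsumed < data.length) {
            int off = (int) checkpoint.bytesConsumed;
            int len = Math.min(chunkSize, data.length - off);
            if (chunkIndex == failingChunk && !failedOnce) {
                // Simulated failure: part of the chunk was consumed, then the
                // transfer broke. Throw the hasher away and restore the checkpoint.
                hasher.update(data, off, len / 2);
                failedOnce = true;
                hasher = new StreamingFnv1a64();
                hasher.setState(checkpoint);
                continue;
            }
            hasher.update(data, off, len);
            checkpoint = hasher.getState(); // in practice, persist this somewhere durable
            chunkIndex++;
        }
        return hasher.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "a large remote object, fetched a chunk at a time"
                .getBytes(StandardCharsets.US_ASCII);
        long onePass = hashOnePass(data);
        long resumed = hashWithRetry(data, 7, 2);
        System.out.println("hashes match: " + (onePass == resumed));
    }
}
```

The point of the sketch is that the checkpoint records both the hash state and the stream offset, so a retry knows where to resume fetching as well as how to continue hashing.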

The state classes should be serializable by any JSON library. I did not add annotations or the like, to avoid burdening this library with extra dependencies.
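To illustrate what "serializable without annotations" can look like, here is a sketch of a plain state holder with a public no-arg constructor, private fields, and getter/setter pairs, which is the shape reflection-based mappers generally handle out of the box. The class and field names (`PlainStateSketch`, `HashState`, `totalLength`, `accumulator`, `pending`) are assumptions for illustration and are not the PR's actual classes. The `toMap`/`fromMap` helpers only mimic, with `java.lang.reflect`, the field-level round trip a JSON library would perform.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

class PlainStateSketch {
    /** Hypothetical state holder; field names are illustrative only. */
    public static class HashState {
        private long totalLength;
        private long accumulator;
        private byte[] pending = new byte[0];

        public HashState() {}

        public long getTotalLength() { return totalLength; }
        public void setTotalLength(long totalLength) { this.totalLength = totalLength; }

        public long getAccumulator() { return accumulator; }
        public void setAccumulator(long accumulator) { this.accumulator = accumulator; }

        public byte[] getPending() { return pending; }
        public void setPending(byte[] pending) { this.pending = pending; }
    }

    /** Flattens the instance fields of a bean into a name-to-value map. */
    static Map<String, Object> toMap(Object bean) throws IllegalAccessException {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Field f : bean.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
            f.setAccessible(true);
            out.put(f.getName(), f.get(bean));
        }
        return out;
    }

    /** Rebuilds a bean from a map using only its no-arg constructor and fields. */
    static <T> T fromMap(Map<String, Object> values, Class<T> type)
            throws ReflectiveOperationException {
        T bean = type.getDeclaredConstructor().newInstance();
        for (Field f : type.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
            if (!values.containsKey(f.getName())) continue;
            f.setAccessible(true);
            f.set(bean, values.get(f.getName()));
        }
        return bean;
    }

    public static void main(String[] args) throws Exception {
        HashState original = new HashState();
        original.setTotalLength(4096L);
        original.setAccumulator(0x1234abcdL);
        original.setPending(new byte[] {1, 2, 3});

        HashState restored = fromMap(toMap(original), HashState.class);
        System.out.println("round trip ok: "
                + (restored.getTotalLength() == original.getTotalLength()
                && restored.getAccumulator() == original.getAccumulator()
                && Arrays.equals(restored.getPending(), original.getPending())));
    }
}
```

Keeping the state free of annotations means the choice of JSON library stays with the caller, at the cost of relying on each library's default conventions for field or property discovery.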

I did not attempt to implement this for the JNI versions. For my use case the Java implementations are sufficient, since for very large objects the limiting factor is download bandwidth, not CPU.

@odaira
Member

odaira commented Jan 18, 2018

Interesting feature. I can imagine its usefulness. Could you provide a small code sample so I can try it out?

2 participants