
Consider caching decrypted and decompressed data #158

Open
commial opened this issue Apr 26, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

Contributor

commial commented Apr 26, 2023

When a seek is performed in an archive, the reader has to:

  1. go back to the beginning of the corresponding encrypted block
  2. decrypt it
  3. decompress from the start of the compressed block

If another seek occurs, the uncompressed data and the decrypted data are lost, and the work must be done again.
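The cost of the three steps above can be sketched as follows. This is a toy model, not MLA's actual API: it only counts the bytes that must be decrypted (the whole enclosing block) and decompressed-and-discarded (everything from the block start up to the target offset) for a single seek.

```rust
/// Toy model of the seek cost described above (hypothetical helper,
/// not part of MLA): seeking to `offset` rewinds to the enclosing
/// block, decrypts the whole block, then decompresses from the block
/// start up to `offset`.
/// Returns (bytes decrypted, bytes decompressed and discarded).
fn seek_cost_bytes(offset: u64, block_size: u64) -> (u64, u64) {
    let block_start = offset - (offset % block_size);
    let decrypted = block_size; // the whole block must be decrypted
    let discarded = offset - block_start; // decompressed just to reach `offset`
    (decrypted, discarded)
}
```

With a 1024-byte block, seeking to offset 1500 decrypts 1024 bytes and decompresses 476 bytes before any useful data is returned; a second seek into the same block pays the full cost again, since nothing is kept.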

This is not a problem for linear extraction, but some access patterns suffer from this behavior, for instance:

  1. Create an archive with interleaved files, i.e. [File 1 content 1][File 2 content 1][File 1 content 2][File 2 content 2], etc.
  2. Iterate over the file list (File 1, File 2) and read them

In the worst case of n tiny parts for each of n files, every block could be decrypted and decompressed n times.
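This blow-up can be demonstrated with a small simulation. The layout below is a toy assumption (a flat `part_to_block` mapping, not MLA's real format): it counts how often each block would be re-decompressed for a given access order, assuming no caching.

```rust
use std::collections::HashMap;

/// Count per-block decompressions for a given access order, with no cache.
/// `part_to_block[i]` gives the block holding part i (toy layout, not
/// MLA's real format).
fn decompress_counts(
    access_order: &[usize],
    part_to_block: &[usize],
) -> HashMap<usize, usize> {
    let mut counts = HashMap::new();
    for &part in access_order {
        // Without a cache, every access re-decrypts and re-decompresses
        // the whole containing block.
        *counts.entry(part_to_block[part]).or_insert(0) += 1;
    }
    counts
}
```

For 2 interleaved files of 2 tiny parts all stored in block 0 (parts 0 and 2 belong to File 1, parts 1 and 3 to File 2), reading the files one after the other yields the access order `[0, 2, 1, 3]`, and block 0 is decompressed 4 times instead of once.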

To avoid this, a cache could be used between layers. The implementation:

  • can be provided to each layer
  • could be a layer of its own
  • must have a strategy to avoid consuming all available RAM for very large files (maybe an LRU?)
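A minimal sketch of the last point, assuming blocks are cached by index and bounded by an LRU eviction policy (the type names and layout are hypothetical, not MLA's API):

```rust
use std::collections::{HashMap, VecDeque};

/// Minimal LRU cache for decrypted/decompressed blocks, keyed by block
/// index. Capacity is a block count; a real implementation would likely
/// bound total bytes instead.
struct BlockCache {
    capacity: usize,
    map: HashMap<u64, Vec<u8>>,
    order: VecDeque<u64>, // front = least recently used
}

impl BlockCache {
    fn new(capacity: usize) -> Self {
        Self { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    fn get(&mut self, block: u64) -> Option<&Vec<u8>> {
        if self.map.contains_key(&block) {
            // Mark as most recently used
            self.order.retain(|&b| b != block);
            self.order.push_back(block);
            self.map.get(&block)
        } else {
            None
        }
    }

    fn put(&mut self, block: u64, data: Vec<u8>) {
        if self.map.contains_key(&block) {
            self.order.retain(|&b| b != block);
        } else if self.map.len() >= self.capacity {
            // Evict the least recently used block to bound memory
            if let Some(evicted) = self.order.pop_front() {
                self.map.remove(&evicted);
            }
        }
        self.order.push_back(block);
        self.map.insert(block, data);
    }
}
```

With such a cache in front of the decrypt+decompress path, the interleaved-read pattern above hits the cache on every access after the first, so each block is processed once as long as it stays resident.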

Implementing #156 would be a way to measure whether the performance actually improves.

@commial commial added the enhancement New feature or request label Apr 26, 2023