Further optimizations #25

Open
2 of 5 tasks
KillingSpark opened this issue May 27, 2022 · 0 comments


KillingSpark commented May 27, 2022

I need a place to put down some ideas for further optimizing this crate:

  1. We only need to call reserve once per block of sequences, because we can calculate up front how many bytes a list of sequences will add to the decode buffer. This might save some reallocations.
  2. The way the zstd_streaming binary works is not optimal. It reads into an intermediary buffer when it should just use the drain_to_writer() functions. That's why we have these functions.
  3. Read https://fgiesen.wordpress.com/2018/02/19/reading-bits-in-far-too-many-ways-part-1/ and https://fgiesen.wordpress.com/2018/02/20/reading-bits-in-far-too-many-ways-part-2/ again carefully and optimize the bitreaders further.
  4. The ReversedBitreader performance can be improved quite a bit by making it less useful in the generic case. Just returning wrong values for requests of >56 bits eliminates the need for error handling on calls to get_bits_(triple). Started in #58 (don't return errors on too large requests on a reversed bitreader).
  5. RingBuffer::extend_from_within does a lot of small memcpy calls. These can be sped up a lot by not caring about precise copying of values behind the range we want to copy. Copying one or more u128 values at a time (where possible) makes this much faster.
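
To make idea 1 concrete, here is a minimal sketch of reserving once per block. It is not the crate's code: `Sequence`, `decoded_len` and `execute_block` are hypothetical names, a plain `Vec<u8>` stands in for the decode buffer, and the literals left over after the last sequence are not handled.

```rust
/// Hypothetical, simplified sequence: copy `literal_length` literal bytes,
/// then repeat `match_length` bytes starting `offset` bytes back in the output.
#[derive(Clone, Copy)]
pub struct Sequence {
    pub literal_length: usize,
    pub match_length: usize,
    pub offset: usize,
}

/// Total number of bytes the sequences will append to the decode buffer.
pub fn decoded_len(seqs: &[Sequence]) -> usize {
    seqs.iter().map(|s| s.literal_length + s.match_length).sum()
}

/// Executes all sequences of one block with a single up-front `reserve`.
/// Assumes well-formed input: enough literals, and
/// 0 < offset <= number of bytes already in `out`.
pub fn execute_block(out: &mut Vec<u8>, literals: &[u8], seqs: &[Sequence]) {
    // One reservation for the whole block instead of one per sequence.
    out.reserve(decoded_len(seqs));

    let mut lit_pos = 0;
    for s in seqs {
        // Literal part of the sequence.
        out.extend_from_slice(&literals[lit_pos..lit_pos + s.literal_length]);
        lit_pos += s.literal_length;

        // Match part: byte by byte, because the source may overlap
        // the bytes being written (offset < match_length).
        let start = out.len() - s.offset;
        for i in 0..s.match_length {
            let byte = out[start + i];
            out.push(byte);
        }
    }
}
```

Whether this actually saves reallocations depends on how the real decode buffer grows, so it would need measuring.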
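
Idea 5 could look roughly like the sketch below. It only shows the "copy whole u128 chunks and ignore the overshoot" part, under loud assumptions: `copy_rounded_up` is a hypothetical helper, source and destination are separate slices (the real extend_from_within copies within one buffer and has to deal with overlap and wrap-around), and both slices are assumed to have enough slack past `len` for the rounded-up copy.

```rust
use std::convert::TryInto;

/// Chunk size in bytes: one u128.
const CHUNK: usize = 16;

/// Copies at least `len` bytes from `src` to `dst`, always in whole
/// 16-byte chunks. Up to 15 bytes after `dst[len]` may be overwritten,
/// so the caller must not care about their contents.
pub fn copy_rounded_up(src: &[u8], dst: &mut [u8], len: usize) {
    // Round the length up to a multiple of the chunk size.
    let rounded = (len + CHUNK - 1) / CHUNK * CHUNK;
    assert!(
        src.len() >= rounded && dst.len() >= rounded,
        "both slices need slack for the rounded-up copy"
    );

    let chunks = src[..rounded]
        .chunks_exact(CHUNK)
        .zip(dst[..rounded].chunks_exact_mut(CHUNK));
    for (s, d) in chunks {
        // Load and store one u128 per iteration instead of a
        // variable-length memcpy.
        let word = u128::from_ne_bytes(s.try_into().unwrap());
        d.copy_from_slice(&word.to_ne_bytes());
    }
}
```

The point of the fixed chunk size is that each iteration has a length known at compile time, which avoids the per-call overhead of many small variable-length copies.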