Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zstd decoder not re-using memory effectively between Reset() and Read() #666

Closed
bboreham opened this issue Sep 23, 2022 · 2 comments · Fixed by #668
Closed

zstd decoder not re-using memory effectively between Reset() and Read() #666

bboreham opened this issue Sep 23, 2022 · 2 comments · Fixed by #668

Comments

@bboreham
Copy link

I've been stepping through some decompression code to try to figure out why zstd memory use is much higher than other compression types.

I see that Reset() uses the buffer d.current.b:

dst = d.current.b

Then in my case DecodeAll runs and the buffer grows to ~300KB.

Each time Read() is called, it advances the start of the buffer:

d.current.b = d.current.b[filled:]

After we read the whole thing, b is still pointing into a 300KB buffer, but towards the end.
When we re-use the Decoder, this buffer has len zero and cap ~35KB. Because 300KB is needed, a new block is allocated.

Could we have the Decoder hold a reference to the beginning of the buffer, so the whole size can be re-used?

Apologies if I'm mis-reading this.

@bboreham
Copy link
Author

It's not doing exactly the same as my case, but I think BenchmarkDecoder_DecoderSmall is a fair illustration:

BenchmarkDecoder_DecoderSmall/kppkn.gtb.zst-4                178          10176051 ns/op         144.90 MB/s     4872625 B/op         61 allocs/op
BenchmarkDecoder_DecoderSmall/geo.protodata.zst-4            648           2767123 ns/op         342.85 MB/s     3110672 B/op         35 allocs/op
BenchmarkDecoder_DecoderSmall/plrabn12.txt.zst-4              78          19421308 ns/op         198.49 MB/s    12835731 B/op        113 allocs/op
BenchmarkDecoder_DecoderSmall/lcet10.txt.zst-4               103          11378717 ns/op         300.04 MB/s    11309187 B/op        112 allocs/op
BenchmarkDecoder_DecoderSmall/asyoulik.txt.zst-4             385           4217998 ns/op         237.42 MB/s     3286064 B/op         35 allocs/op
BenchmarkDecoder_DecoderSmall/alice29.txt.zst-4              238           4390828 ns/op         277.10 MB/s     4000791 B/op         64 allocs/op
BenchmarkDecoder_DecoderSmall/html_x_4.zst-4                 231           5616465 ns/op         583.43 MB/s    10776598 B/op         42 allocs/op
BenchmarkDecoder_DecoderSmall/paper-100k.pdf.zst-4          5319           1187369 ns/op         689.93 MB/s     2689064 B/op         30 allocs/op
BenchmarkDecoder_DecoderSmall/fireworks.jpeg.zst-4          1124           1954033 ns/op         503.95 MB/s     3232699 B/op          5 allocs/op
BenchmarkDecoder_DecoderSmall/urls.10K.zst-4                  75          16159240 ns/op         347.58 MB/s    18710788 B/op        164 allocs/op

It's calling Reset(); the Bytes/op should be close to zero.

@klauspost
Copy link
Owner

You are correct. This is a regression from #649 (and a smaller one from #598)

klauspost added a commit that referenced this issue Sep 24, 2022
Adds `WithDecodeBuffersBelow` to tweak buffer switch-over.

Fixes #666 and generally reduces Reader allocations.
klauspost added a commit that referenced this issue Sep 25, 2022
Adds `WithDecodeBuffersBelow` to tweak buffer switch-over.

Fixes #666 and generally reduces Reader allocations.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants