zstd: Improve zstd best efficiency #784

Merged 4 commits on Mar 23, 2023

Owner

@klauspost klauspost commented Mar 21, 2023

@greatroar

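Before/after compared to master (in each pair, the first row is master and the second row is this PR):

| file | out | level | insize | outsize | millis | mb/s |
|---|---|---|---|---|---|---|
| enwik8 | zskp | 4 | 100000000 | 29798035 | 5596 | 17.04 |
| enwik8 | zskp | 4 | 100000000 | 29298617 | 5678 | 16.79 |
| silesia.tar | zskp | 4 | 211947520 | 59904436 | 10622 | 19.03 |
| silesia.tar | zskp | 4 | 211947520 | 59311059 | 10818 | 18.68 |
| TS40.txt | zskp | 4 | 400000000 | 123504912 | 22005 | 17.34 |
| TS40.txt | zskp | 4 | 400000000 | 121367006 | 23661 | 16.12 |
| apache.log | zskp | 4 | 2622574440 | 113537595 | 21239 | 117.76 |
| apache.log | zskp | 4 | 2622574440 | 110106407 | 24301 | 102.92 |
| github-ranks-backup.bin | zskp | 4 | 1862623243 | 377801114 | 75884 | 23.41 |
| github-ranks-backup.bin | zskp | 4 | 1862623243 | 373322075 | 76540 | 23.21 |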

Doubles table size.

greatroar and others added 3 commits March 19, 2023 12:06
The SpeedBestCompression encoder now extends matches backwards before
estimating their encoded size, rather than doing this after selecting
the best match. This is a bit slower, but produces smaller output.

Benchmarks on amd64:

name                              old speed      new speed      delta
Encoder_EncodeAllSimple/best-8    20.7MB/s ± 3%  19.0MB/s ± 1%  -8.04%  (p=0.000 n=19+18)
Encoder_EncodeAllSimple4K/best-8  19.2MB/s ± 6%  17.9MB/s ± 1%  -6.86%  (p=0.000 n=20+20)

Output sizes on Silesia and enwik9:

dickens    3220994    3179697 (× 0.987179)
enwik9   259846164  257481474 (× 0.990900)
mozilla   16912437   16895142 (× 0.998977)
mr         3502823    3473770 (× 0.991706)
nci        2306320    2300580 (× 0.997511)
ooffice    2896907    2888715 (× 0.997172)
osdb       3390548    3368411 (× 0.993471)
reymont    1657380    1639490 (× 0.989206)
samba      4329898    4315020 (× 0.996564)
sao        5416648    5383855 (× 0.993946)
webster    9972808    9887560 (× 0.991452)
xml         542277     541018 (× 0.997678)
x-ray      5733121    5681186 (× 0.990941)
total    319728325  317035918 (× 0.991579)

Wall clock time for compressing enwik9 goes up a bit, but is still close
to what it was before #776.
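
The backward extension described in the commit message above can be sketched roughly as follows. This is only an illustration, not the encoder's code: `extendBackward`, its parameters, and the lower bound `lo` are hypothetical names, and the real encoder works on its own match struct and history buffer.

```go
package main

import "fmt"

// extendBackward is a hypothetical helper. It grows a match that starts at
// src[s] and copies from src[offset] towards the front of the buffer for as
// long as the preceding bytes are equal. lo is the lowest position the match
// may claim (for example, the end of the previously emitted sequence).
// It returns the adjusted start, offset, and length.
func extendBackward(src []byte, s, offset, length, lo int) (int, int, int) {
	for s > lo && offset > 0 && src[s-1] == src[offset-1] {
		s--
		offset--
		length++
	}
	return s, offset, length
}

func main() {
	src := []byte("xxabcdefyyabcdef")
	// Assume a hash lookup found "cdef" at position 12 matching position 4.
	s, offset, length := extendBackward(src, 12, 4, 4, 8)
	// The match grows to cover "abcdef": start 10, source 2, length 6.
	fmt.Println(s, offset, length)
}
```

Doing this before the size estimate means every candidate is compared at its full length, which is where the extra work and the smaller output both come from.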
@greatroar
Contributor

I'd reconstructed most of e283cac and I can confirm that it produces about 1.6% smaller files with acceptable speed loss (though almost double the memory use, I guess).

@@ -205,7 +205,22 @@ encodeLoop:
panic(fmt.Sprintf("first match mismatch: %v != %v, first: %08x", src[s:s+4], src[offset:offset+4], first))
}
}

// Try to quick reject if we already have a long match.
Contributor

Does this help? I'd tried to check for overlap between the best match and the candidate before the first load3232, but found that it slows compression down. From memory, what I did was

overlaps := m.rep > 0 && offset >= m.offset && offset < m.offset+m.length
if s-offset >= e.maxMatchOff || overlaps || load3232(src, offset) != first {
    return
}

Owner Author

@klauspost klauspost Mar 22, 2023

The cheap version of your check is to check if the offset is the same. I had it at some point but very few hits. Also tried keeping all previously tested offsets - but horrible.

This piece of code does something else. It is a bit easier to explain with an example.

Say we have a match of 50 bytes already.
When we check if a new match is better, we start by checking if bytes [42...46] match. If they don't, we will at best get a match that is 45 bytes, which will always be worse than the 50 byte match, without further tests.

In most cases we can reject on this. Some do the entire matchlen in reverse first and forward afterwards. Too tedious for me.

We could check 8 bytes - but I don't think that will make much difference.
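
A minimal sketch of the quick reject from the 50-byte example above, under stated assumptions: `passesQuickReject` and its parameters are hypothetical names, and the 4-byte comparison is written with `bytes.Equal` rather than the encoder's own 32-bit loads.

```go
package main

import (
	"bytes"
	"fmt"
)

// passesQuickReject is a hypothetical helper. bestLen is the length of the
// best match found so far at position s, and offset is where a new candidate
// would copy from (offset < s). It compares the 4 bytes that end 4 bytes
// before the end of the current best match. If they differ, the candidate can
// be at most bestLen-5 bytes long, so it cannot beat the best match.
func passesQuickReject(src []byte, s, offset, bestLen int) bool {
	if bestLen < 8 || s+bestLen > len(src) {
		return true // too short or out of range for the shortcut; test in full
	}
	a := s + bestLen - 8
	b := offset + bestLen - 8
	return bytes.Equal(src[a:a+4], src[b:b+4])
}

func main() {
	pattern := bytes.Repeat([]byte("0123456789"), 5) // 50 bytes
	altered := append([]byte(nil), pattern...)
	altered[43] = 'X' // differs inside bytes [42, 46)

	// Layout: pattern at 0, altered copy at 50, pattern again at 100.
	src := append(append(append([]byte(nil), pattern...), altered...), pattern...)

	fmt.Println(passesQuickReject(src, 100, 0, 50))  // candidate still worth testing
	fmt.Println(passesQuickReject(src, 100, 50, 50)) // rejected after one comparison
}
```

With a 50-byte best match, the comparison covers relative bytes 42 to 45, so a mismatch caps the candidate at 45 bytes, as in the comment above.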

@klauspost
Owner Author

I tried

		improve := func(m *match, offset int32, s int32, first uint32, rep int32) {
			if offset == m.offset {
				if m.rep > 0 || rep < 0 {
					// Existing is repeat or new isn't...
					return
				}
				// If base offset matches we can use it as is.
				if m.s == s {
					// We can just use the repeat value
					m.rep = rep
					// Recalc...
					m.estBits(bitsPerByte)
					return
				}
			}

Only "apache.log" improved (it has very long matches). The rest were slower.

I would rather improve the ones that are slow than the ones that are already fast.

@greatroar
Contributor

LGTM.

@klauspost klauspost merged commit 2f99358 into master Mar 23, 2023
@klauspost klauspost deleted the great-zstd-match-backward branch March 23, 2023 08:59
klauspost added a commit that referenced this pull request Mar 26, 2023
Since we expand backwards early, we may be in a situation where best.s+2 has already been indexed.

This will result in picking up a 0 or negative offset, which leads to corrupted data.

Skip this check if best.s is less than or equal to s-2.

Regression from #784 (not released)
kodiakhq bot pushed a commit to cloudquery/filetypes that referenced this pull request May 1, 2023
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/klauspost/compress](https://togithub.com/klauspost/compress) | indirect | patch | `v1.16.3` -> `v1.16.5` |

---

### Release Notes

<details>
<summary>klauspost/compress</summary>

### [`v1.16.5`](https://togithub.com/klauspost/compress/releases/tag/v1.16.5)

[Compare Source](https://togithub.com/klauspost/compress/compare/v1.16.4...v1.16.5)

#### What's Changed

-   zstd: readByte needs to use io.ReadFull by [@jnoxon](https://togithub.com/jnoxon) in klauspost/compress#802
-   gzip: Fix WriterTo after initial read by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#804

#### New Contributors

-   [@jnoxon](https://togithub.com/jnoxon) made their first contribution in klauspost/compress#802

**Full Changelog**: klauspost/compress@v1.16.4...v1.16.5

### [`v1.16.4`](https://togithub.com/klauspost/compress/releases/tag/v1.16.4)

[Compare Source](https://togithub.com/klauspost/compress/compare/v1.16.3...v1.16.4)

#### What's Changed

-   s2: Fix huge block overflow by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#779
-   s2: Allow CustomEncoder fallback by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#780
-   zstd: Fix amd64 not always detecting corrupt data by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#785
-   zstd: Improve zstd best efficiency by [@klauspost](https://togithub.com/klauspost) and [@greatroar](https://togithub.com/greatroar) in klauspost/compress#784
-   zstd: Make load(32|64)32 safer and smaller by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#788
-   zstd: Fix quick reject on long backmatches by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#787
-   zstd: Revert table size change by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#789
-   zstd: Respect WithAllLitEntropyCompression by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#792
-   zstd: Fix back-referenced offset by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#793
-   zstd: Load source value at start of loop by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#794
-   zstd: Shorten checksum code by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#795
-   zstd: Fix fallback on incompressible block by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#798
-   gzhttp: Suppport ResponseWriter Unwrap() in gzhttp handler by [@jgimenez](https://togithub.com/jgimenez) in klauspost/compress#799

#### New Contributors

-   [@jgimenez](https://togithub.com/jgimenez) made their first contribution in klauspost/compress#799

**Full Changelog**: klauspost/compress@v1.16.3...v1.16.4

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "before 3am on the first day of the month" (UTC), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://togithub.com/renovatebot/renovate).
kodiakhq bot pushed a commit to cloudquery/plugin-sdk that referenced this pull request Jul 1, 2023
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/klauspost/compress](https://togithub.com/klauspost/compress) | indirect | patch | `v1.16.0` -> `v1.16.6` |

---

### Release Notes

<details>
<summary>klauspost/compress (github.com/klauspost/compress)</summary>

### [`v1.16.6`](https://togithub.com/klauspost/compress/releases/tag/v1.16.6)

[Compare Source](https://togithub.com/klauspost/compress/compare/v1.16.5...v1.16.6)

#### What's Changed

-   zstd: correctly ignore WithEncoderPadding(1) by [@ianlancetaylor](https://togithub.com/ianlancetaylor) in klauspost/compress#806
-   gzhttp: Handle informational headers by [@rtribotte](https://togithub.com/rtribotte) in klauspost/compress#815
-   zstd: Add amd64 match length assembly by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#824
-   s2: Improve Better compression slightly by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#663
-   s2: Clean up matchlen assembly by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#825

#### New Contributors

-   [@rtribotte](https://togithub.com/rtribotte) made their first contribution in klauspost/compress#815
-   [@dveeden](https://togithub.com/dveeden) made their first contribution in klauspost/compress#816

**Full Changelog**: klauspost/compress@v1.16.5...v1.16.6

### [`v1.16.3`](https://togithub.com/klauspost/compress/releases/tag/v1.16.3)

[Compare Source](https://togithub.com/klauspost/compress/compare/v1.16.2...v1.16.3)

**Full Changelog**: klauspost/compress@v1.16.2...v1.16.3

### [`v1.16.2`](https://togithub.com/klauspost/compress/releases/tag/v1.16.2)

[Compare Source](https://togithub.com/klauspost/compress/compare/v1.16.1...v1.16.2)

#### What's Changed

-   Fix Goreleaser permissions by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#777

**Full Changelog**: klauspost/compress@v1.16.1...v1.16.2

### [`v1.16.1`](https://togithub.com/klauspost/compress/releases/tag/v1.16.1)

[Compare Source](https://togithub.com/klauspost/compress/compare/v1.16.0...v1.16.1)

#### What's Changed

-   zstd: Speed up + improve best encoder by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#776
-   s2: Add Intel LZ4s converter by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#766
-   gzhttp: Add BREACH mitigation by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#762
-   gzhttp: Remove a few unneeded allocs by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#768
-   gzhttp: Fix crypto/rand.Read usage by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#770
-   gzhttp: Use SHA256 as paranoid option by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#769
-   gzhttp: Use strings for randomJitter to skip a copy by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#767
-   zstd: Fix ineffective block size check by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#771
-   zstd: Check FSE init values by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#772
-   zstd: Report EOF from byteBuf.readBig by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#773
-   huff0: Speed up compress1xDo by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#774
-   tests: Remove fuzz printing by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#775
-   tests: Add CICD Fuzz testing by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#763
-   ci: set minimal permissions to GitHub Workflows by [@diogoteles08](https://togithub.com/diogoteles08) in klauspost/compress#765

#### New Contributors

-   [@diogoteles08](https://togithub.com/diogoteles08) made their first contribution in klauspost/compress#765

**Full Changelog**: klauspost/compress@v1.16.0...v1.16.1

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "before 4am on the first day of the month" (UTC), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://togithub.com/renovatebot/renovate).
kodiakhq bot pushed a commit to cloudquery/plugin-pb-go that referenced this pull request Aug 1, 2023
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/klauspost/compress](https://togithub.com/klauspost/compress) | indirect | minor | `v1.15.15` -> `v1.16.7` |

---

### Release Notes

<details>
<summary>klauspost/compress (github.com/klauspost/compress)</summary>

### [`v1.16.7`](https://togithub.com/klauspost/compress/releases/tag/v1.16.7)

[Compare Source](https://togithub.com/klauspost/compress/compare/v1.16.6...v1.16.7)

#### What's Changed

-   zstd: Fix default level first dictionary encode by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#829
-   docs: Fix typo in security advisory URL by [@vcabbage](https://togithub.com/vcabbage) in klauspost/compress#830
-   s2: add GetBufferCapacity() method by [@GiedriusS](https://togithub.com/GiedriusS) in klauspost/compress#832

#### New Contributors

-   [@vcabbage](https://togithub.com/vcabbage) made their first contribution in klauspost/compress#830
-   [@GiedriusS](https://togithub.com/GiedriusS) made their first contribution in klauspost/compress#832

**Full Changelog**: klauspost/compress@v1.16.6...v1.16.7

### [`v1.16.0`](https://togithub.com/klauspost/compress/releases/tag/v1.16.0)

[Compare Source](https://togithub.com/klauspost/compress/compare/v1.15.15...v1.16.0)

#### What's Changed

-   s2: Add Dictionary support by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#685
-   s2: Add Compression Size Estimate by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#752
-   s2: Add support for custom stream encoder by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#755
-   s2: Add LZ4 block converter by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#748
-   s2: Support io.ReaderAt in ReadSeeker by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#747
-   s2c/s2sx: Use concurrent decoding by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#746
-   tests: Upgrade to Go 1.20 by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#749
-   Update all (command) dependencies by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#758

**Full Changelog**: klauspost/compress@v1.15.15...v1.16.0

</details>

klauspost added a commit that referenced this pull request Oct 22, 2023
Regression from #784 and followup #793

Fixes #875

A 0 offset backreference was possible when "improve" was successful twice in a row in the "skipBeginning" part, finding only 2 (previously unmatched) length 4 matches, but where the start offset decreased by 2 in both cases.

This would result in output where the end offset would be equal to the next 's', thereby doing a self-reference.

Add a general check in "improve" and just reject these. Will also guard against similar issues in the future.

This also hints at some potentially suboptimal hash indexing - but I will take that improvement separately.

Fuzz test set updated.
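
The general check mentioned in this commit message might look roughly like the sketch below. It is an assumption-laden illustration, not the code that was committed: `validBackref` is a hypothetical name, and only the `s-offset >= maxMatchOff` bound is taken from the snippet quoted earlier in this conversation.

```go
package main

import "fmt"

// validBackref is a hypothetical guard. s is the position being encoded and
// offset is the position a candidate match would copy from. The distance
// s-offset must be at least 1 (a distance of 0 would make the match refer to
// itself) and must stay below the maximum match offset.
func validBackref(s, offset, maxMatchOff int32) bool {
	delta := s - offset
	return delta > 0 && delta < maxMatchOff
}

func main() {
	const maxOff = 1 << 20 // illustrative window size, not the encoder's value
	fmt.Println(validBackref(100, 60, maxOff))  // ordinary back-reference
	fmt.Println(validBackref(100, 100, maxOff)) // zero offset: rejected
	fmt.Println(validBackref(100, 102, maxOff)) // negative offset: rejected
}
```

Rejecting on the distance itself, rather than on how the candidate was found, is what lets one check cover both the earlier indexing regression and this one.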
klauspost added a commit that referenced this pull request Oct 22, 2023
* zstd: Fix corrupted output in "best"