deflate: Improve level 7-9 #739

Merged
merged 5 commits into from
Jan 14, 2023

@klauspost klauspost commented Jan 12, 2023

Check a post-match alternative at a 2-byte offset, which allows longer matches at the cost of adding up to 2 literals.

Adjust minimum match cost on level 9.

Before/after pairs (in each pair, the first row is before this change and the second row is after):

file	out	level	insize	outsize	millis	mb/s
github-june-2days-2019.json	gzkp	7	6273951764	919314231	42313	141.41
github-june-2days-2019.json	gzkp	7	6273951764	924473679	42317	141.39

github-june-2days-2019.json	gzkp	8	6273951764	904763581	53796	111.22
github-june-2days-2019.json	gzkp	8	6273951764	905294390	53747	111.32

github-june-2days-2019.json	gzkp	9	6273951764	897031618	105876	56.51
github-june-2days-2019.json	gzkp	9	6273951764	895561157	105651	56.63

nyc-taxi-data-10M.csv	gzkp	7	3325605752	741407017	46656	67.98
nyc-taxi-data-10M.csv	gzkp	7	3325605752	731632472	45047	70.40

nyc-taxi-data-10M.csv	gzkp	8	3325605752	725159779	69767	45.46
nyc-taxi-data-10M.csv	gzkp	8	3325605752	718753419	69238	45.81

nyc-taxi-data-10M.csv	gzkp	9	3325605752	709819738	130439	24.31
nyc-taxi-data-10M.csv	gzkp	9	3325605752	702234731	130544	24.29


gob-stream	gzkp	7	1911399616	298204449	12916	141.12
gob-stream	gzkp	7	1911399616	295193952	12497	145.86

gob-stream	gzkp	8	1911399616	286865198	17153	106.27
gob-stream	gzkp	8	1911399616	285374252	16652	109.46

gob-stream	gzkp	9	1911399616	277771119	41745	43.67
gob-stream	gzkp	9	1911399616	276757881	41077	44.38


silesia.tar	gzkp	7	211947520	69605688	3034	66.62
silesia.tar	gzkp	7	211947520	69331085	2934	68.88

silesia.tar	gzkp	8	211947520	68745247	4108	49.2
silesia.tar	gzkp	8	211947520	68555973	3989	50.67

silesia.tar	gzkp	9	211947520	68254323	11843	17.07
silesia.tar	gzkp	9	211947520	68158666	12278	16.46


enwik9	gzkp	7	1000000000	325327111	17413	54.77
enwik9	gzkp	7	1000000000	324323925	17014	56.05

enwik9	gzkp	8	1000000000	323311871	21067	45.27
enwik9	gzkp	8	1000000000	322268755	20626	46.24

enwik9	gzkp	9	1000000000	321637537	34016	28.04
enwik9	gzkp	9	1000000000	320812830	33955	28.09

webdevdata.org-2015-01-07-subset	TRUE	gzkp	7	53927	4014735833	746309058	34562	110.78
webdevdata.org-2015-01-07-subset	true	gzkp	7	53927	4014735833	746308934	34062	112.40

webdevdata.org-2015-01-07-subset	TRUE	gzkp	8	53927	4014735833	735523067	43270	88.48
webdevdata.org-2015-01-07-subset	true	gzkp	8	53927	4014735833	734840811	42613	89.85

webdevdata.org-2015-01-07-subset	TRUE	gzkp	9	53927	4014735833	727900632	94305	40.6
webdevdata.org-2015-01-07-subset	true	gzkp	9	53927	4014735833	726973932	95443	40.12
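
To make the pairs easier to read, the sketch below recomputes two derived numbers from a pair of rows: throughput and the relative change in output size. The `row` type and both function names are made up for illustration. The `mb/s` column appears to be MiB/s (2^20 bytes per second), since that reproduces the listed values to within rounding of the millisecond timings; treat that as an inference from the numbers, not a documented fact.

```go
package main

import "fmt"

// row holds the numeric columns of one table line (hypothetical type).
type row struct {
	inSize  int64 // input size in bytes
	outSize int64 // compressed size in bytes
	millis  int64 // wall time in milliseconds
}

// mibPerSec returns throughput in MiB/s (2^20 bytes per second).
func mibPerSec(r row) float64 {
	mib := float64(r.inSize) / (1 << 20)
	seconds := float64(r.millis) / 1000
	return mib / seconds
}

// sizeDeltaPct returns the change in output size from before to after,
// in percent of the before size. Negative means the output got smaller.
func sizeDeltaPct(before, after row) float64 {
	return 100 * float64(after.outSize-before.outSize) / float64(before.outSize)
}

func main() {
	// nyc-taxi-data-10M.csv at level 7, copied from the table above.
	before := row{inSize: 3325605752, outSize: 741407017, millis: 46656}
	after := row{inSize: 3325605752, outSize: 731632472, millis: 45047}

	fmt.Printf("before: %.2f MiB/s\n", mibPerSec(before))
	fmt.Printf("after:  %.2f MiB/s\n", mibPerSec(after))
	fmt.Printf("size change: %+.2f%%\n", sizeDeltaPct(before, after))
}
```

Running the same calculation on the github-june-2days-2019.json level 7 pair gives a positive size change, so the improvement is not uniform across every file and level.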

@klauspost klauspost marked this pull request as ready for review January 12, 2023 20:02
@klauspost klauspost merged commit 781b247 into master Jan 14, 2023
@klauspost klauspost deleted the improve-gzip-high-levels branch January 14, 2023 10:16
kodiakhq bot pushed a commit to cloudquery/cloudquery that referenced this pull request Feb 1, 2023
…7575)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/klauspost/compress](https://togithub.com/klauspost/compress) | indirect | patch | `v1.15.11` -> `v1.15.15` |

---

### Release Notes

<details>
<summary>klauspost/compress</summary>

### [`v1.15.15`](https://togithub.com/klauspost/compress/releases/tag/v1.15.15)

[Compare Source](https://togithub.com/klauspost/compress/compare/v1.15.14...v1.15.15)

#### What's Changed

-   zstd: Add delta encoding support by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#728
-   huff0: Reduce bounds checking by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#734
-   huff0: Assembler improvements by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#736
-   deflate: Improve level 7-9 by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#739
-   gzhttp: Add SuffixETag() and DropETag() options to prevent ETag collisions on compressed responses by [@willbicks](https://togithub.com/willbicks) in klauspost/compress#740
-   zstd: Don't allocate dataStorage when using byteBuf by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#741
-   huff0: Speed up compression of short blocks by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#744
-   zstd: Handle dicts by pointer, always by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#743
-   fse: Optimize compression by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#745
-   Retract v1.14.1-v.1.14.3 by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#742

#### New Contributors

-   [@willbicks](https://togithub.com/willbicks) made their first contribution in klauspost/compress#740

**Full Changelog**: klauspost/compress@v1.15.14...v1.15.15

### [`v1.15.14`](https://togithub.com/klauspost/compress/releases/tag/v1.15.14)

[Compare Source](https://togithub.com/klauspost/compress/compare/v1.15.13...v1.15.14)

#### What's Changed

-   flate: Improve speed in big stateless blocks. by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#718
-   zstd: Trigger BCE by switching on lengths by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#716
-   zstd: Shave some instructions off the amd64 asm by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#720
-   export NoGzipResponseWriter for custom ResponseWriter wrappers by [@harshavardhana](https://togithub.com/harshavardhana) in klauspost/compress#722
-   s2: Add example for indexing and existing stream by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#723
-   tests: Tweak fuzz tests by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#719

#### New Contributors

-   [@harshavardhana](https://togithub.com/harshavardhana) made their first contribution in klauspost/compress#722

**Full Changelog**: klauspost/compress@v1.15.13...v1.15.14

### [`v1.15.13`](https://togithub.com/klauspost/compress/releases/tag/v1.15.13)

[Compare Source](https://togithub.com/klauspost/compress/compare/v1.15.12...v1.15.13)

#### What's Changed

-   zstd: Add MaxEncodedSize to encoder by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#691
-   zstd: Improve "best" end search by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#693
-   zstd: Replace bytes.Equal with smaller comparisons by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#695
-   zstd: Faster CRC checking/skipping by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#696
-   zstd: Rewrite matchLen to make it inlineable by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#701
-   zstd: Write table clearing in a way that the compiler recognizes by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#702
-   zstd: Use individual reset threshold by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#703
-   huff0: Check for zeros earlier in Scratch.countSimple by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#704
-   zstd: Improve best compression's match selection by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#705
-   zstd: Select best match using selection trees by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#706
-   zstd: sync xxhash with final accepted patch upstream by [@lizthegrey](https://togithub.com/lizthegrey) in klauspost/compress#707
-   zstd: Import xxhash v2.2.0 by [@greatroar](https://togithub.com/greatroar) in klauspost/compress#708

**Full Changelog**: klauspost/compress@v1.15.12...v1.15.13

### [`v1.15.12`](https://togithub.com/klauspost/compress/releases/tag/v1.15.12)

[Compare Source](https://togithub.com/klauspost/compress/compare/v1.15.11...v1.15.12)

#### What's Changed

-   zstd: Tweak decoder allocs. by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#680
-   gzhttp: Always delete `HeaderNoCompression` by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#683

**Full Changelog**: klauspost/compress@v1.15.11...v1.15.12

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "before 3am on the first day of the month" (UTC), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://togithub.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNC4xMDkuMSIsInVwZGF0ZWRJblZlciI6IjM0LjEwOS4xIn0=-->
kodiakhq bot pushed a commit to cloudquery/filetypes that referenced this pull request Mar 1, 2023
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/klauspost/compress](https://togithub.com/klauspost/compress) | indirect | minor | `v1.15.11` -> `v1.16.0` |

---

### ⚠ Dependency Lookup Warnings ⚠

Warnings were logged while processing this repo. Please check the Dependency Dashboard for more information.

---

### Release Notes

<details>
<summary>klauspost/compress</summary>

### [`v1.16.0`](https://togithub.com/klauspost/compress/releases/tag/v1.16.0)

[Compare Source](https://togithub.com/klauspost/compress/compare/v1.15.15...v1.16.0)

#### What's Changed

-   s2: Add Dictionary support by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#685
-   s2: Add Compression Size Estimate by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#752
-   s2: Add support for custom stream encoder by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#755
-   s2: Add LZ4 block converter by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#748
-   s2: Support io.ReaderAt in ReadSeeker by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#747
-   s2c/s2sx: Use concurrent decoding by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#746
-   tests: Upgrade to Go 1.20 by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#749
-   Update all (command) dependencies by [@klauspost](https://togithub.com/klauspost) in klauspost/compress#758

**Full Changelog**: klauspost/compress@v1.15.15...v1.16.0

</details>
