Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compression: support zstd with skippable frame #42862

Merged
merged 1 commit into from Oct 21, 2021

Conversation

dkkb
Copy link
Contributor

@dkkb dkkb commented Sep 17, 2021

Signed-off-by: Da McGrady dabkb@aol.com

- What I did

As a matter of fact, there are two frame formats defined by Zstandard: Zstandard frames and Skippable frames.
So we should probably support zstd algorithms with skippable frames.

See https://tools.ietf.org/id/draft-kucherawy-dispatch-zstd-00.html#rfc.section.2 for more details.
The structure of a single Zstandard frame is as follows, the magic number of Zstandard frame is 0xFD2FB528.

  +--------------------+------------+
  |    Magic_Number    | 4 bytes    |
  +--------------------+------------+
  |    Frame_Header    | 2-14 bytes |
  +--------------------+------------+
  |     Data_Block     | n bytes    |
  +--------------------+------------+
  | [More Data Blocks] |            |
  +--------------------+------------+
  | [Content Checksum] | 0-4 bytes  |
  +--------------------+------------+

Skippable frames allow the insertion of user-defined data into a flow of concatenated frames. Its design is pretty straightforward, with the sole objective to allow the decoder to quickly skip over user-defined data and continue decoding.

  +--------------+------------+-----------+
  | Magic_Number | Frame_Size | User_Data |
  +--------------+------------+-----------+
  |    4 bytes   |   4 bytes  |  n bytes  |
  +--------------+------------+-----------+

Magic_Number: 0x184D2A5?, which means any value from 0x184D2A50 to 0x184D2A5F.
Frame_Size: This is the size n of the following UserData, 4 bytes, little-endian format, unsigned 32-bits.

- How I did it
In order to accommodate more complicated detectors for Zstd, I changed the DetectCompression function.

- How to verify it

To verify this PR, I created a new test named TestDetectCompression.

- Description for the changelog

- A picture of a cute animal (not mandatory but encouraged)

pkg/archive/archive.go Outdated Show resolved Hide resolved
@thaJeztah
Copy link
Member

Left a quick thought above.

Also, would it be possible to amend the commit message to use your (quite informative) PR description in the commit message itself?

@dkkb dkkb force-pushed the feature/zstd_with_skippable_frame branch from ee5e675 to d6ba284 Compare September 17, 2021 13:11
@dkkb
Copy link
Contributor Author

dkkb commented Sep 17, 2021

Left a quick thought above.

Also, would it be possible to amend the commit message to use your (quite informative) PR description in the commit message itself?

Sure.

pkg/archive/archive.go Outdated Show resolved Hide resolved
pkg/archive/archive.go Outdated Show resolved Hide resolved
pkg/archive/archive.go Outdated Show resolved Hide resolved
pkg/archive/archive_test.go Outdated Show resolved Hide resolved
pkg/archive/archive_test.go Outdated Show resolved Hide resolved
pkg/archive/archive_test.go Outdated Show resolved Hide resolved
pkg/archive/archive_test.go Outdated Show resolved Hide resolved
@dkkb dkkb force-pushed the feature/zstd_with_skippable_frame branch 2 times, most recently from d8f3cde to e155411 Compare September 23, 2021 09:20
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@thaJeztah
Copy link
Member

@tonistiigi PTAL (perhaps we need a test image for this as well)

@thaJeztah
Copy link
Member

@dkkb - looking at this with @tonistiigi and he pointed out that DecompressStream takes 10 bytes for the detection, which may not be enough for the detection needed here;

moby/pkg/archive/archive.go

Lines 178 to 182 in 6014c1e

// DecompressStream decompresses the archive and returns a ReaderCloser with the decompressed archive.
func DecompressStream(archive io.Reader) (io.ReadCloser, error) {
p := pools.BufioReader32KPool
buf := p.Get(archive)
bs, err := buf.Peek(10)

So, the test in the PR passes, but detection may not work as expected when used by DecompressStream

@dkkb dkkb force-pushed the feature/zstd_with_skippable_frame branch from e155411 to eb0eb4e Compare October 15, 2021 09:02
@dkkb
Copy link
Contributor Author

dkkb commented Oct 15, 2021

@dkkb - looking at this with @tonistiigi and he pointed out that DecompressStream takes 10 bytes for the detection, which may not be enough for the detection needed here;

moby/pkg/archive/archive.go

Lines 178 to 182 in 6014c1e

// DecompressStream decompresses the archive and returns a ReaderCloser with the decompressed archive.
func DecompressStream(archive io.Reader) (io.ReadCloser, error) {
p := pools.BufioReader32KPool
buf := p.Get(archive)
bs, err := buf.Peek(10)

So, the test in the PR passes, but detection may not work as expected when used by DecompressStream

To avoid misidentifying Zstd, I use an ordered list to detect the compression algorithm for compatibility and performance so that we can quickly identify the skippable frame and remove the unnecessary code.
Do you think it is a feasible plan?

@thaJeztah
Copy link
Member

thaJeztah commented Oct 15, 2021

To avoid misidentifying Zstd, I use an ordered list to detect the compression algorithm for compatibility and performance so that we can quickly identify the skippable frame and remove the unnecessary code. Do you think it is a feasible plan?

@tonistiigi could you have a look?

As a matter of fact, there are two frame formats defined by Zstandard: Zstandard frames and Skippable frames.
So we should probably support zstd algorithms with skippable frames.
See https://tools.ietf.org/id/draft-kucherawy-dispatch-zstd-00.html#rfc.section.2 for more details.

Signed-off-by: Da McGrady <dabkb@aol.com>
@dkkb dkkb force-pushed the feature/zstd_with_skippable_frame branch from eb0eb4e to 23abee4 Compare October 15, 2021 09:24
@tonistiigi
Copy link
Member

@klauspost @giuseppe sgty?

If accepted we need the same change in containerd & buildkit as all of them only check single magic atm.

Copy link

@klauspost klauspost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thaJeztah thaJeztah merged commit 872c64c into moby:master Oct 21, 2021
@thaJeztah
Copy link
Member

Thank you!

@tonistiigi
Copy link
Member

@dkkb Could you make the same update in containerd and buildkit repos?

@dkkb
Copy link
Contributor Author

dkkb commented Oct 29, 2021

@dkkb Could you make the same update in containerd and buildkit repos?

Sure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants