Document eStargz with `zstd` compression instead of only `gzip` #1596

aochagavia · 2024-03-06T13:07:10Z

Currently, the docs on the structure of eStargz mention that layers are compressed using gzip. However, as far as I understand, eStargz also supports zstd as a compression mechanism (it is mentioned in the nerdctl docs, though they call it zstdchunked).

It would be great to update the docs to reflect zstd support. Specifically, the following questions need to be answered when using zstd as the compression format:

- Are gzip blobs replaced by zstd frames (as defined in the zstd specification)? Are chunks also zstd frames?
- What does the footer look like? The docs define it in terms that are closely tied to gzip, and it's not clear to me how it translates to zstd.

For context, I'm working on an image bakery application in Rust and I want to support eStargz with zstd compression.

ktock · 2024-03-06T15:05:38Z

> It would be great to update the docs to reflect zstd support.

SGTM

> Are gzip blobs replaced by zstd frames (as defined in the zstd specification)? Are chunks also zstd frames?

Yes

> What does the footer look like? The docs define it in terms that are closely tied to gzip, and it's not clear to me how it translates to zstd.

- zstd skippable frame header (64 bits)
- TOC offset (64 bits)
- zstd compressed TOC length (64 bits)
- Uncompressed TOC length (64 bits)
- manifest type (=1) (64 bits)
- zstd:chunked magic number (64 bits)

Please also see the source code:

stargz-snapshotter/estargz/zstdchunked/zstdchunked.go

Lines 186 to 201 in 6e5d5a0

    
    // zstdFooterBytes returns the 40 bytes footer.
    func zstdFooterBytes(tocOff, tocRawSize, tocCompressedSize uint64) []byte {
    	footer := make([]byte, FooterSize)
    	binary.LittleEndian.PutUint64(footer, tocOff)
    	binary.LittleEndian.PutUint64(footer[8:], tocCompressedSize)
    	binary.LittleEndian.PutUint64(footer[16:], tocRawSize)
    	binary.LittleEndian.PutUint64(footer[24:], manifestTypeCRFS)
    	copy(footer[32:40], zstdChunkedFrameMagic)
    	return footer
    }

    func appendSkippableFrameMagic(b []byte) []byte {
    	size := make([]byte, 4)
    	binary.LittleEndian.PutUint32(size, uint32(len(b)))
    	return append(append(skippableFrameMagic, size...), b...)
    }

Or see the discussion threads: containers/storage#775 and #293
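
For anyone writing an independent reader, the footer layout listed above can be decoded roughly as follows. This is a minimal sketch, not the project's parser: `Footer`, `parseFooter`, and `encodeFooter` are hypothetical names, and the 8-byte zstd:chunked magic is passed in as a parameter because its value is not quoted in this thread. The skippable-frame magic range (`0x184D2A50` to `0x184D2A5F`) comes from the zstd format itself.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	skippableMagic      = 0x184D2A50 // first value of zstd's skippable-frame magic range
	skippableHeaderSize = 8          // 4-byte magic + 4-byte little-endian payload size
	footerPayloadSize   = 40         // the "40 bytes footer" built by zstdFooterBytes
	manifestTypeCRFS    = 1          // "manifest type (=1)" in the list above
)

// Footer holds the values a reader needs to locate the TOC (hypothetical type).
type Footer struct {
	TOCOffset         uint64
	TOCCompressedSize uint64
	TOCRawSize        uint64
}

// parseFooter decodes the last 48 bytes of a blob: an 8-byte skippable frame
// header followed by the 40-byte payload. chunkedMagic is the expected
// zstd:chunked magic, supplied by the caller.
func parseFooter(tail, chunkedMagic []byte) (Footer, error) {
	if len(tail) != skippableHeaderSize+footerPayloadSize {
		return Footer{}, fmt.Errorf("want %d bytes, got %d", skippableHeaderSize+footerPayloadSize, len(tail))
	}
	// Any magic in 0x184D2A50..0x184D2A5F marks a skippable frame.
	if binary.LittleEndian.Uint32(tail[0:4])&0xFFFFFFF0 != skippableMagic {
		return Footer{}, errors.New("not a zstd skippable frame")
	}
	if binary.LittleEndian.Uint32(tail[4:8]) != footerPayloadSize {
		return Footer{}, errors.New("unexpected skippable frame size")
	}
	p := tail[skippableHeaderSize:]
	if binary.LittleEndian.Uint64(p[24:32]) != manifestTypeCRFS {
		return Footer{}, errors.New("unexpected manifest type")
	}
	if !bytes.Equal(p[32:40], chunkedMagic) {
		return Footer{}, errors.New("zstd:chunked magic mismatch")
	}
	return Footer{
		TOCOffset:         binary.LittleEndian.Uint64(p[0:8]),
		TOCCompressedSize: binary.LittleEndian.Uint64(p[8:16]),
		TOCRawSize:        binary.LittleEndian.Uint64(p[16:24]),
	}, nil
}

// encodeFooter builds the same 48 bytes; it exists only to exercise parseFooter.
func encodeFooter(f Footer, chunkedMagic []byte) []byte {
	tail := make([]byte, skippableHeaderSize+footerPayloadSize)
	binary.LittleEndian.PutUint32(tail[0:4], skippableMagic)
	binary.LittleEndian.PutUint32(tail[4:8], footerPayloadSize)
	p := tail[skippableHeaderSize:]
	binary.LittleEndian.PutUint64(p[0:8], f.TOCOffset)
	binary.LittleEndian.PutUint64(p[8:16], f.TOCCompressedSize)
	binary.LittleEndian.PutUint64(p[16:24], f.TOCRawSize)
	binary.LittleEndian.PutUint64(p[24:32], manifestTypeCRFS)
	copy(p[32:40], chunkedMagic)
	return tail
}

func main() {
	// Placeholder magic for illustration only; take the real value from zstdchunked.go.
	magic := []byte("EXAMPLE!")
	tail := encodeFooter(Footer{TOCOffset: 1000, TOCCompressedSize: 200, TOCRawSize: 900}, magic)
	f, err := parseFooter(tail, magic)
	fmt.Printf("%+v %v\n", f, err)
}
```

In practice the 48-byte tail would come from a range request for the end of the layer blob, so the footer can be read without downloading the whole layer.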

aochagavia · 2024-03-06T15:10:21Z

Perfect, thanks! Once I have a clear idea of how this all works I might open a PR to update the docs 👍

aochagavia · 2024-03-15T15:28:59Z

I'm not yet confident enough in my knowledge to update the docs, but here's some relevant information for whoever is interested in creating an independent implementation of eStargz + zstd:

- When using zstd, the TOC is not included in the layer's tar archive. Instead, it's included as a raw string inside a skippable frame. This diverges from what gzip does (according to the diagram in the docs).
- As a consequence of the previous point, the tocOffset doesn't point to a tar header (there is none). Instead, it points directly to the start of the TOC's JSON.
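
The two points above suggest that a reader can slice the TOC straight out of the blob once the footer is known. The sketch below is one way to do that under stated assumptions: `tocBytes` and `wrapSkippable` are hypothetical helpers, the TOC payload is assumed to start exactly 8 bytes after the beginning of its skippable frame, and the frame's size field is assumed to equal the compressed TOC length from the footer. Since the footer records both a compressed and an uncompressed length, the returned bytes may still need zstd decompression before JSON parsing; that step is left out to keep the example stdlib-only.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	skippableMagic = 0x184D2A50 // first value of zstd's skippable-frame magic range
	headerSize     = 8          // 4-byte magic + 4-byte little-endian payload size
)

// tocBytes returns the TOC payload located at tocOffset, after checking that
// it is preceded by a skippable frame header of the expected size.
func tocBytes(blob []byte, tocOffset, tocCompressedSize uint64) ([]byte, error) {
	n := uint64(len(blob))
	if tocOffset < headerSize || tocOffset > n || tocCompressedSize > n-tocOffset {
		return nil, errors.New("TOC range out of bounds")
	}
	hdr := blob[tocOffset-headerSize : tocOffset]
	if binary.LittleEndian.Uint32(hdr[0:4])&0xFFFFFFF0 != skippableMagic {
		return nil, errors.New("TOC is not preceded by a skippable frame header")
	}
	if got := uint64(binary.LittleEndian.Uint32(hdr[4:8])); got != tocCompressedSize {
		return nil, fmt.Errorf("frame size %d does not match footer length %d", got, tocCompressedSize)
	}
	return blob[tocOffset : tocOffset+tocCompressedSize], nil
}

// wrapSkippable prefixes payload with a skippable frame header, in the spirit
// of appendSkippableFrameMagic; it is used here only to build sample data.
func wrapSkippable(payload []byte) []byte {
	out := make([]byte, headerSize, headerSize+len(payload))
	binary.LittleEndian.PutUint32(out[0:4], skippableMagic)
	binary.LittleEndian.PutUint32(out[4:8], uint32(len(payload)))
	return append(out, payload...)
}

func main() {
	layer := []byte("layer-data") // stands in for the compressed tar contents
	payload := []byte(`{"version":1}`)
	blob := append(append([]byte{}, layer...), wrapSkippable(payload)...)

	toc, err := tocBytes(blob, uint64(len(layer)+headerSize), uint64(len(payload)))
	fmt.Println(string(toc), err)
}
```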

aochagavia · 2024-03-27T12:39:53Z

Here's another bit of information (not sure whether it's zstd-specific or also applies to gzip): in the TOC, the offset field for .no.prefetch.landmark doesn't link to the beginning of the header in the tar archive, but links directly to its body instead (header and body are each compressed in their own zstd frame to allow this). This diverges from the way other files are handled (normally the header and the body are inside the same zstd frame and the offset points to the beginning of the frame).

Update: my comment above is incorrect. For all files (including the landmark), the offset indeed links directly to the start of the compressed body, which is obvious because any non-empty body gets its own zstd frame. Sorry for the confusion.
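
A cheap way to sanity-check the corrected statement against a real layer is to confirm that each TOC entry's offset lands on a zstd frame boundary. The sketch below only tests for the standard zstd frame magic (`0xFD2FB528`, stored little-endian) at a given offset; it does not validate the rest of the frame. `startsZstdFrame` is a hypothetical helper, and the sample blob is fabricated for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// zstdFrameMagic is the magic number that begins every regular zstd frame.
const zstdFrameMagic = 0xFD2FB528

// startsZstdFrame reports whether the four bytes at offset are the zstd
// frame magic. Out-of-range offsets return false.
func startsZstdFrame(blob []byte, offset int64) bool {
	if offset < 0 || offset > int64(len(blob))-4 {
		return false
	}
	return binary.LittleEndian.Uint32(blob[offset:offset+4]) == zstdFrameMagic
}

func main() {
	// Fabricated blob: a "header" frame at offset 0 and a "body" frame at offset 6.
	// Only the magic bytes are real; the rest is padding.
	blob := []byte{0x28, 0xB5, 0x2F, 0xFD, 0, 0, 0x28, 0xB5, 0x2F, 0xFD, 1}
	for _, off := range []int64{0, 1, 6} {
		fmt.Println(off, startsZstdFrame(blob, off))
	}
}
```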

ktock added the documentation Improvements or additions to documentation label Mar 6, 2024
