Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor blob copying and compression #1579

Merged
merged 35 commits into from Jun 16, 2022

Conversation

mtrmac
Copy link
Collaborator

@mtrmac mtrmac commented Jun 15, 2022

This is the last refactor PR for #1574 , I promise.

copyBlobFromStream has grown into a 300-line function carrying a lot of variables, and it’s become increasingly untenable to add more logic to it, even if that logic is conceptually quite local. It’s well past the time it should have been split up. I have hoped to do that after unit tests exist, sadly that’s not panned out.

Major themes:

  • Move copyBlobFromStream and closest helpers to copy/blob.go.
  • Use a much more regular structure for the copy pipeline, with the stream processing pipeline maintained as a sourceStream. The copy pipeline is still not just ignorant plumbing, it carries around all the state (e.g. how encryption and decryption interact, and how the outcome of modifying the stream needs to be reflected in the updated blob descriptor); but it’s much more encapsulated now:
    • Actually modifying the stream (e.g. compression) happens in various blobPipeline…Step functions; those functions return blobPipeline…StepData structures. The top-level pipeline, and the following pipeline stops, thus still have access to the same data as before, but with the …StepData wrappers, it’s much fewer variables to track, and the step interdependencies are clearly visible.
    • Other changes (e.g. updates to the descriptor or the blob info cache) are now methods on the …StepData objects, allowing more encapsulation. The top-level code is not completely ignorant, it e.g. it needs to be aware of how the various edits interact (like the inability to compress and decompress at the same time).
  • Encryption/decryption are now in copy/encryption.go, mostly unchanged.
  • The compression/decompression step, as the largest pipeline element, ends up in copy/compression.go. The broad outline is the same, but it’s split even further:
    • Every case (compress, decompress, re-compress, …) is now a separate function, which checks whether that case is applicable/possible, and if so, performs it and returns the right metadata.
    • Ultimately — this is the goal: we can build more complex logic for the “applicable/possible” conditions, like “is the requested compression format supported in the relevant manifest schema”?, without cluttering the top-level of copyBlobFromStream with extra variables.

Various corner cases of the copy pipeline are somewhat dubious, and that’s been made maybe more visible by these changes (e.g. how we don’t immediately update MediaType on encryption/decryption, but make checks based on its value anyway; how some decisions are done on the original data even if we have decrypted it in the meantime; and more). For the most part, this PR leans towards not changing those things, to decrease risk of destabilizing the codebase.

See the individual commits for details. I’m afraid this is large, and with minimal test coverage. Ultimately, I think this is a good longer-term direction, but I could accept an opinion that this refactoring is currently not worth it and we should just live with making copyBlobFromStream uglier for some(?) more time.

The over-300-line-long function would benefit from splitting up,
and copy/copy.go has grown entirely too long.

As a first step, move it to a new copy/blob.go.

Only moves unmodified code, should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... so that the file reads from top-level
callers down to implementation details.

Only moves unchanged code, should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
We will use srcStream for another purpose.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
The purpose of sourceStream is to make it a bit easier
to split the blob copy pipeline stages into separate functions;
they will update a *sourceStream, which we can pass a single parameter.

For now, build it and use it to initialize a digestingReader; then
update other code that should consume the current pipeline data.

NOTE: That is a judgement call, it is not always obvious what data
was intended to be used and what is a bug.

As a geneeral principle, use srcInfo (the original unmodified data)
in user-visible error reports, notably using the original digest.
Otherwise, lean towards using srcStream.info.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
One part of moving to use stream throughout the pipeline.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Use stream.info throughout.

Should not change behavior.  (In some cases that's
dubious, we are currently losing data like annotations
compression/decompression.  This commit doesn't
change that, only adds a FIXME? to eventually review.)

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…cryptionStep

Only moves the code with minimal necessary adjustments, should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
mtrmac added 22 commits June 16, 2022 01:31
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This smaller function can use simpler identifiers.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…cryptionStep

Only moves the code with minimal necessary adjustments, should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Make the "not encrypting" case truly special, and then we
don't need extra variables across the function.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
It contains decompression code as well.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…nStep

Only moves the code with minimal necessary adjustments, should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
We can use shorter variable names in this short function now.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…n edits

Only moves code, should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Only moves unchanged code, should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Only moves the code with minimal necessary adjustments, should not change behavior.

Those adjustments are not completely trivial in this case: we could just do a
"defer stream.Close()" in copyBlobFromStream, now we need a separate closers field in
blobPipelineCompressionStepData.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
The compression meaning is clear in context here.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Except for the encrypted content case, we already have the data available,
and this will allow us to just pass around a *blobPipelineDetectCompressionStepData
without an extra parameter for srcCompressorName.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
That's actually shorter because in most cases we just
use a field initializer instead of setting a variable;
and we don't need that variable in the first place.

For now, do just the minimal edit, we will clean this up
a bit later.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Don't use else after each step that returns early.

The function nows looks like a set of ~special cases,
with the general fallback of doing nothing.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
to avoid ambiguity.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
…nStep

... so that we only implement the pipe handling once.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
The logic is a simple sequence of condition + decision,
which is fairly regullar, and doesn't have to be in a single function.

So, split it up.

Eliminate the recently-introduced closers array again, in most cases.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... now that it only could effectively apply to a single element.

Shoud not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
The alternatives are, mostly, things to _try_; we can typically
fall back to just using the original.

So, just use a loop to take advantage of the regular structure.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@vrothberg
Copy link
Member

Wooooow, I need to grab a coffee before reviewing 😄

Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice work, @mtrmac.

Thank you for splitting up in smaller commits. Surgical precision.

LGTM

@vrothberg vrothberg merged commit 08e961f into containers:main Jun 16, 2022
@mtrmac mtrmac deleted the copy-layers-refactor branch June 16, 2022 15:13
@mtrmac mtrmac mentioned this pull request Jun 16, 2022
mtrmac added a commit to mtrmac/image that referenced this pull request Jul 21, 2022
copy.Image can now copy non-image OCI artifacts.

Added support for sigstore signatures: they (and related cosign
attachments) can be copied along with images after opt-in in registries.d.
Signatures can be created by copy.Image and enforced via policy.json
(currently with public/private key pairs only).

Now requires Go 1.17.
GPGMe now must be new enough to be visible via pkg-config.

github.com/pkg/errors is no longer used; that might affect
caller-observable error types (in particular, errors.{As,Is}
might need to be used instead of pkg/errors.Cause).

Changes default paths on FreeBSD.

- Remove unused Makefile variables
- Config files should live in /usr/local on FreeBSD
- docker: validate received parts
- Use go env to fetch the go path
- docker: add workaround for CloudFront
- Improve errors messages when image missing from list
- Stop calling gpgme-config
- Fix codespell errors
- Make sure github.com/opencontainers/runc >= 1.1.2 is used
- Cirrus: use Ubuntu 22.04 LTS
- Merge pull request containers#1576 from mtrmac/private-image
- Merge pull request containers#1577 from mtrmac/mocks
- Merge pull request containers#1571 from mtrmac/go1.17
- Merge pull request containers#1578 from mtrmac/sourced-image-struct
- Fix error on parallel multiple image pullings with additionallayerstore
- Merge pull request containers#1579 from mtrmac/copy-layers-refactor
- Reject OCI artifacts in manifest.OCI1.ImageID
- Reject OCI artifacts in manifest.OCI1.Inspect
- Refuse to convert non-image OCI artifacts to Docker formats
- Reject OCI artifacts in image.manifestOCI1.OCIConfig
- Introduce SourcedImage.CanChangeLayerCompression, use it in copy.Image
- Use an updated CI image
- Use strings.ReplaceAll instead of strings.Replace(..., -1)
- Move the main helper removal case to the main path on RemoveAllAuthentication
- Merge pull request containers#1588 from mtrmac/pkg_errors
- Merge pull request containers#1589 from mtrmac/private-dest-impls
- Merge pull request containers#1590 from mtrmac/private-src-impls
- Merge pull request containers#1592 from mtrmac/blobcache-wrap-private
- Use "io.ReadAll" instead of "os.ReadAll"
- Merge pull request containers#1596 from mtrmac/cosign-payload
- Generalize copy.Image to be able to copy signatures with any format
- Merge pull request containers#1593 from mtrmac/cosign-sigs
- Introduce signature.Cosign as a format
- Add use-cosign-attachments to registries.d/*.yaml
- Add support for reading and writing Cosign attachments, incl. signatures
- Merge pull request containers#1595 from mtrmac/cosign-docker
- Add support for creating Cosign signatures
- Fix a long-standing incorrect comment
- Fix JSON syntax in the policy.json(5) man page
- Correctly decode Cosign-generated payloads
- Add Cosign verification support
- s/sigstore/lookaside/g in comments and documentation
- Refer to lookasideStorage instead of signatureStorage in code
- Add lookaside and lookaside-staging, hide sigstore and sigstore-staging
- Merge pull request containers#1605 from mtrmac/sigstore
- Fix a typo in error messages
- Remove a copy&pasted test entry
- Add context to some test failures
- Use more valid data in TestPRSignedByIsSignatureAuthorAccepted
- Generalize keyPath/keyData exclusivity checks
- Remove repetition in tests
- Accept multiple keyrings in newEphemeralGPGSigningMechanism
- Allow accepting multiple GPG keyrings via signedBy.keyPaths
- Switch to golang native error wrapping
- Point out use-sigstore-registries in sigstoreSigned documentation
- Use .pub extension for public keys in sigstoreSigned examples
- copy: print copy info once when writer==io.Discard
- Silence a "potentially unused parameter" warning
- Read signatures from UnparsedImage instead of ImageSource directly
- Consolidate reading messages, and checking for support, into a helper
- build(deps): bump github.com/containers/storage from 1.40.0 to 1.40.2
- build(deps): bump github.com/docker/docker
- build(deps): bump github.com/klauspost/compress from 1.15.2 to 1.15.3
- build(deps): bump github.com/klauspost/compress from 1.15.3 to 1.15.4
- build(deps): bump github.com/docker/docker
- build(deps): bump github.com/proglottis/gpgme from 0.1.1 to 0.1.2
- build(deps): bump github.com/vbauerster/mpb/v7 from 7.4.1 to 7.4.2
- build(deps): bump github.com/imdario/mergo from 0.3.12 to 0.3.13
- build(deps): bump github.com/klauspost/compress from 1.15.4 to 1.15.5
- build(deps): bump github.com/sylabs/sif/v2 from 2.7.0 to 2.7.1
- build(deps): bump github.com/klauspost/compress from 1.15.5 to 1.15.6
- build(deps): bump github.com/stretchr/testify from 1.7.1 to 1.7.2
- build(deps): bump github.com/docker/docker
- build(deps): bump github.com/stretchr/testify from 1.7.2 to 1.7.4
- build(deps): bump github.com/stretchr/testify from 1.7.4 to 1.7.5
- build(deps): bump github.com/stretchr/testify from 1.7.5 to 1.8.0
- build(deps): bump github.com/klauspost/compress from 1.15.6 to 1.15.7
- build(deps): bump github.com/proglottis/gpgme from 0.1.2 to 0.1.3
- build(deps): bump github.com/klauspost/compress from 1.15.7 to 1.15.8
- build(deps): bump github.com/sirupsen/logrus from 1.8.1 to 1.9.0
- build(deps): bump github.com/theupdateframework/go-tuf
- build(deps): bump github.com/BurntSushi/toml from 1.1.0 to 1.2.0

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@mtrmac mtrmac mentioned this pull request Jul 21, 2022
mtrmac added a commit to mtrmac/image that referenced this pull request Jul 21, 2022
copy.Image can now copy non-image OCI artifacts.

Added support for sigstore signatures: they (and related cosign
attachments) can be copied along with images after opt-in in registries.d.
Signatures can be created by copy.Image and enforced via policy.json
(currently with public/private key pairs only).

Now requires Go 1.17.
GPGMe now must be new enough to be visible via pkg-config.

github.com/pkg/errors is no longer used; that might affect
caller-observable error types (in particular, errors.{As,Is}
might need to be used instead of pkg/errors.Cause).

Changes default paths on FreeBSD.

- Remove unused Makefile variables
- Config files should live in /usr/local on FreeBSD
- docker: validate received parts
- Use go env to fetch the go path
- docker: add workaround for CloudFront
- Improve errors messages when image missing from list
- Stop calling gpgme-config
- Fix codespell errors
- Make sure github.com/opencontainers/runc >= 1.1.2 is used
- Cirrus: use Ubuntu 22.04 LTS
- Merge pull request containers#1576 from mtrmac/private-image
- Merge pull request containers#1577 from mtrmac/mocks
- Merge pull request containers#1571 from mtrmac/go1.17
- Merge pull request containers#1578 from mtrmac/sourced-image-struct
- Fix error on parallel multiple image pullings with additionallayerstore
- Merge pull request containers#1579 from mtrmac/copy-layers-refactor
- Reject OCI artifacts in manifest.OCI1.ImageID
- Reject OCI artifacts in manifest.OCI1.Inspect
- Refuse to convert non-image OCI artifacts to Docker formats
- Reject OCI artifacts in image.manifestOCI1.OCIConfig
- Introduce SourcedImage.CanChangeLayerCompression, use it in copy.Image
- Use an updated CI image
- Use strings.ReplaceAll instead of strings.Replace(..., -1)
- Move the main helper removal case to the main path on RemoveAllAuthentication
- Merge pull request containers#1588 from mtrmac/pkg_errors
- Merge pull request containers#1589 from mtrmac/private-dest-impls
- Merge pull request containers#1590 from mtrmac/private-src-impls
- Merge pull request containers#1592 from mtrmac/blobcache-wrap-private
- Use "io.ReadAll" instead of "os.ReadAll"
- Merge pull request containers#1596 from mtrmac/cosign-payload
- Generalize copy.Image to be able to copy signatures with any format
- Merge pull request containers#1593 from mtrmac/cosign-sigs
- Introduce signature.Cosign as a format
- Add use-cosign-attachments to registries.d/*.yaml
- Add support for reading and writing Cosign attachments, incl. signatures
- Merge pull request containers#1595 from mtrmac/cosign-docker
- Add support for creating Cosign signatures
- Fix a long-standing incorrect comment
- Fix JSON syntax in the policy.json(5) man page
- Correctly decode Cosign-generated payloads
- Add Cosign verification support
- s/sigstore/lookaside/g in comments and documentation
- Refer to lookasideStorage instead of signatureStorage in code
- Add lookaside and lookaside-staging, hide sigstore and sigstore-staging
- Merge pull request containers#1605 from mtrmac/sigstore
- Fix a typo in error messages
- Remove a copy&pasted test entry
- Add context to some test failures
- Use more valid data in TestPRSignedByIsSignatureAuthorAccepted
- Generalize keyPath/keyData exclusivity checks
- Remove repetition in tests
- Accept multiple keyrings in newEphemeralGPGSigningMechanism
- Allow accepting multiple GPG keyrings via signedBy.keyPaths
- Switch to golang native error wrapping
- Point out use-sigstore-registries in sigstoreSigned documentation
- Use .pub extension for public keys in sigstoreSigned examples
- copy: print copy info once when writer==io.Discard
- Silence a "potentially unused parameter" warning
- Read signatures from UnparsedImage instead of ImageSource directly
- Consolidate reading messages, and checking for support, into a helper
- build(deps): bump github.com/containers/storage from 1.40.0 to 1.40.2
- build(deps): bump github.com/docker/docker
- build(deps): bump github.com/klauspost/compress from 1.15.2 to 1.15.3
- build(deps): bump github.com/klauspost/compress from 1.15.3 to 1.15.4
- build(deps): bump github.com/docker/docker
- build(deps): bump github.com/proglottis/gpgme from 0.1.1 to 0.1.2
- build(deps): bump github.com/vbauerster/mpb/v7 from 7.4.1 to 7.4.2
- build(deps): bump github.com/imdario/mergo from 0.3.12 to 0.3.13
- build(deps): bump github.com/klauspost/compress from 1.15.4 to 1.15.5
- build(deps): bump github.com/sylabs/sif/v2 from 2.7.0 to 2.7.1
- build(deps): bump github.com/klauspost/compress from 1.15.5 to 1.15.6
- build(deps): bump github.com/stretchr/testify from 1.7.1 to 1.7.2
- build(deps): bump github.com/docker/docker
- build(deps): bump github.com/stretchr/testify from 1.7.2 to 1.7.4
- build(deps): bump github.com/stretchr/testify from 1.7.4 to 1.7.5
- build(deps): bump github.com/stretchr/testify from 1.7.5 to 1.8.0
- build(deps): bump github.com/klauspost/compress from 1.15.6 to 1.15.7
- build(deps): bump github.com/proglottis/gpgme from 0.1.2 to 0.1.3
- build(deps): bump github.com/klauspost/compress from 1.15.7 to 1.15.8
- build(deps): bump github.com/sirupsen/logrus from 1.8.1 to 1.9.0
- build(deps): bump github.com/theupdateframework/go-tuf
- build(deps): bump github.com/BurntSushi/toml from 1.1.0 to 1.2.0

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants