Add --zero-file-timestamps flag #2477

Open. zx96 wants to merge 3 commits into main.

zx96 commented Apr 22, 2023

Description

This MR adds a new flag (--zero-file-timestamps) to zero timestamps in layer tarballs without making a fully reproducible image. This flag provides a workaround (but not a complete solution) for #862 and #1960.

My use case for this is maintaining a large image with build tooling. I have a multi-stage Dockerfile that generates an image containing several toolchains for cross-compilation, with each toolchain being prepared in a separate stage before being COPY'd into the final image. This is a very large image, and while it's incredibly convenient for development, making a change as simple as adding one new tool tends to invalidate caches and force the devs to download another 10+ GB image.

If timestamps were removed from each layer, these images would be mostly unchanged with each minor update, greatly reducing disk space needed for keeping old versions around and time spent downloading updated images.

I wanted to use Kaniko's --reproducible flag to help with this, but ran into issues with memory consumption (#862) and build time (#1960). Additionally, I didn't really care about reproducibility - I mainly cared about the layers having identical contents so Docker could skip pulling and storing redundant layers from a registry.

This solution works around these problems by stripping out timestamps as the layer tarballs are built. It removes the need for a separate postprocessing step, and preserves image metadata so we can still see when the image itself was built.
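As a minimal sketch of the idea (not Kaniko's actual Go implementation), the snippet below builds the same file set into an uncompressed tar twice with different modification times, then again with the times zeroed. `layer_digest` is a hypothetical helper, and "zero" is assumed here to mean Unix epoch 0.

```python
import hashlib
import io
import tarfile


def layer_digest(files, mtime, zero_timestamps=False):
    """Build an uncompressed tar in memory and return its SHA-256 hex digest."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        # Sort names so entry order never depends on dict insertion order.
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            # Assumption: a zeroed timestamp is Unix epoch 0.
            info.mtime = 0 if zero_timestamps else mtime
            tar.addfile(info, io.BytesIO(data))
    return hashlib.sha256(buf.getvalue()).hexdigest()


files = {"bin/tool": b"#!/bin/sh\necho hello\n", "etc/tool.conf": b"level=1\n"}

# Same contents, "built" at two different times.
plain_a = layer_digest(files, mtime=1_682_121_600)
plain_b = layer_digest(files, mtime=1_691_539_200)
zeroed_a = layer_digest(files, mtime=1_682_121_600, zero_timestamps=True)
zeroed_b = layer_digest(files, mtime=1_691_539_200, zero_timestamps=True)

print("plain digests equal: ", plain_a == plain_b)
print("zeroed digests equal:", zeroed_a == zeroed_b)
```

Because the mtime is part of every tar header, the two plain archives hash differently even though the file contents are identical, while the zeroed archives are byte-for-byte the same.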

An alternative solution would be to use mutate.Time much like Kaniko currently uses mutate.Canonical to implement --reproducible, but that would not be a satisfactory solution for me until google/go-containerregistry#1168 is addressed. Given my lack of Go experience, I don't feel comfortable tackling that myself, and this seems like a simple and useful workaround in the meantime.

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you review them:

  • Includes unit tests
  • Adds integration tests if needed.

See the contribution guide for more details.

Reviewer Notes

  • The code flow looks good.
  • Unit tests and/or integration tests added.

Release Notes

- Add `--zero-file-timestamps` flag to make layer contents reproducible without stripping metadata

This allows the behavior of --zero-timestamps to better match the
behavior of --reproducible. Layers generated using both methods should
now have the same digest.

peter-volkov commented Jun 19, 2023

Dear maintainers @aaron-prindle @chuangw6,
please take a look at this PR.
It would be good if --reproducible=true did not require so much memory for big images.

zx96 (author) commented Aug 9, 2023

It's been a few months and I still would really like this feature; is there anything I can do to help the chances of this getting merged?

This change solves an immediate problem that I (and several others) have, and it seems like a useful feature to me in general (I think that having reproducible layer contents but preserving the image metadata itself would be valuable).


derari commented Aug 10, 2023

> I didn't really care about reproducibility - I mainly cared about the layers having identical contents

Isn't this exactly what "reproducible" means?

But I, too, would prefer this implementation over the existing one


jkalez commented Aug 10, 2023

@zx96, has this been tested? I'm attempting to use your branch and am still getting cache misses in use cases where --reproducible gets cache hits. Granted, the time overhead of --reproducible completely negates any cache performance benefits, so a fix like the one you're proposing is absolutely necessary.

zx96 (author) commented Aug 11, 2023

@derari In my head, "reproducible" generally means you get the exact same thing out and can verify it with the hashes; this flag doesn't give you the exact same image, just identical layer contents (which is good enough for Docker to avoid storing the same thing multiple times).
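To illustrate that distinction, here is a rough sketch assuming an OCI-style image config that records a `created` time alongside the layer `diff_ids`. The `image_config` helper and the sample values are hypothetical, and real configs carry many more fields.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def image_config(created, diff_ids):
    """Serialize a stripped-down, hypothetical image config."""
    cfg = {"created": created, "rootfs": {"type": "layers", "diff_ids": diff_ids}}
    return json.dumps(cfg, sort_keys=True).encode()


# Stand-in for a layer tarball whose file timestamps were zeroed.
layer = b"identical layer tarball bytes"
diff_ids = ["sha256:" + sha256_hex(layer)]

# Two builds of the same content on different days.
config_monday = image_config("2023-08-07T10:00:00Z", diff_ids)
config_friday = image_config("2023-08-11T10:00:00Z", diff_ids)

same_layers = json.loads(config_monday)["rootfs"] == json.loads(config_friday)["rootfs"]
same_image = sha256_hex(config_monday) == sha256_hex(config_friday)

print("layers identical:", same_layers)
print("image identical: ", same_image)
```

The layer digests line up, so a registry or Docker can reuse the stored blobs, but the config (and therefore the image digest) still changes with each build.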

@jkalez I did test it back before I initially opened the PR, but my testing was mostly just ensuring I got the same layer hashes out reliably and that they matched the layer hashes I got with --reproducible. Could you tell me more about how you're using the cache (what command are you running Kaniko with) so I can look into it?


jkalez commented Aug 11, 2023

I'm using commit bd1dd69b59f164e3702493b66c8edb7ae7059614 from your repository, and built a kaniko image from that commit via `docker build -t kaniko_zero_file_timestamps:debug -f deploy/Dockerfile_debug .`

My kaniko arguments look as follows:

      /kaniko/executor
      --cache=true
      --cache-copy-layers=true
      --cache-run-layers=true
      --cache-ttl=24h
      --cache-repo="$CI_REGISTRY_IMAGE/container-cache"
      --compressed-caching=false
      --zero-file-timestamps
      --snapshot-mode=redo
      --context="$CI_PROJECT_DIR/images/$IMAGE_NAME"
      --build-arg BASE_IMAGE
      --destination "$CI_REGISTRY_IMAGE/$IMAGE_NAME:$CI_COMMIT_REF_SLUG"
      --registry-certificate "$CI_REGISTRY=$CI_SERVER_TLS_CA_FILE"

All the environment variables are appropriately populated. For this test, I'm just using a local container registry like this: https://docs.docker.com/registry/deploying/

The context directory "$CI_PROJECT_DIR/images/$IMAGE_NAME" contains a setup.sh script and a very small Dockerfile

ARG BASE_IMAGE
FROM $BASE_IMAGE

RUN apt update
RUN apt install -y xvfb libxcb1-dev
COPY script.sh .
ENTRYPOINT ["./script.sh"]

If I run kaniko like this twice with a cold cache, I expect the following:

  • 1st run: build executes in its entirety and pushes layers to the cache
  • 2nd run: build pulls from cached layers and runs significantly faster

However, the 2nd run never actually pulls anything from the cache. I get logs like the following:

INFO[0000] Checking for cached layer <$CI_REGISTRY_IMAGE>/container-cache:fd283800bb67e1fd789bf395e0297aacaee9c8682640ef3c2fdb37175a77745b...
INFO[0000] No cached layer found for cmd RUN apt update
INFO[0000] Unpacking rootfs as cmd RUN apt update requires it.

Note in the 1st run, I got the following:

INFO[0000] Checking for cached layer <$CI_REGISTRY_IMAGE>/container-cache:bcd9d2b1730856f75ee1324198f7176c6d67ea896ba8a0f9e7a567f16c8f14ee...
INFO[0000] No cached layer found for cmd RUN apt update
INFO[0000] Unpacking rootfs as cmd RUN apt update requires it.
...
INFO[0030] Pushing layer <$CI_REGISTRY_IMAGE>/container-cache:bcd9d2b1730856f75ee1324198f7176c6d67ea896ba8a0f9e7a567f16c8f14ee to cache now
INFO[0030] Pushing image to <$CI_REGISTRY_IMAGE>/container-cache:bcd9d2b1730856f75ee1324198f7176c6d67ea896ba8a0f9e7a567f16c8f14ee

The key checked in the 2nd run (fd283800...) does not match the key pushed in the 1st run (bcd9d2b1...), so the lookup misses. Note that if I run the exact same test with --reproducible instead of --zero-file-timestamps, the second run pulls from the cache:

INFO[0000] Checking for cached layer <$CI_REGISTRY_IMAGE>/container-cache:352dbd5bf4eb089c4c765d74b1d32b86dc2c2299f2cadf74348fee35beb2b665...
INFO[0000] Using caching version of cmd: RUN apt update

And this is the first run when run with --reproducible:

INFO[0031] Taking snapshot of full filesystem...
INFO[0032] Pushing layer <$CI_REGISTRY_IMAGE>/container-cache:352dbd5bf4eb089c4c765d74b1d32b86dc2c2299f2cadf74348fee35beb2b665 to cache now
INFO[0032] Pushing image to <$CI_REGISTRY_IMAGE>/container-cache:352dbd5bf4eb089c4c765d74b1d32b86dc2c2299f2cadf74348fee35beb2b665

Note here that the hashes match.
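For anyone repeating this comparison, a small sketch along these lines can check the logs mechanically. It assumes the log line shapes quoted above; `cache_keys` and `second_run_can_hit` are hypothetical helper names, not part of Kaniko.

```python
import re

# Matches the hex key that follows "container-cache:" in the quoted log lines.
KEY_RE = re.compile(r"container-cache:([0-9a-f]+)")


def cache_keys(log_text, marker):
    """Return the cache keys found on log lines that contain `marker`."""
    keys = []
    for line in log_text.splitlines():
        if marker in line:
            match = KEY_RE.search(line)
            if match:
                keys.append(match.group(1))
    return keys


def second_run_can_hit(first_run_log, second_run_log):
    """True if every key checked in run 2 was pushed to the cache in run 1."""
    pushed = set(cache_keys(first_run_log, "Pushing layer"))
    checked = cache_keys(second_run_log, "Checking for cached layer")
    return bool(checked) and all(key in pushed for key in checked)


zero_ts_run1 = "INFO[0030] Pushing layer <$CI_REGISTRY_IMAGE>/container-cache:bcd9d2b1730856f75ee1324198f7176c6d67ea896ba8a0f9e7a567f16c8f14ee to cache now"
zero_ts_run2 = "INFO[0000] Checking for cached layer <$CI_REGISTRY_IMAGE>/container-cache:fd283800bb67e1fd789bf395e0297aacaee9c8682640ef3c2fdb37175a77745b..."
repro_run1 = "INFO[0032] Pushing layer <$CI_REGISTRY_IMAGE>/container-cache:352dbd5bf4eb089c4c765d74b1d32b86dc2c2299f2cadf74348fee35beb2b665 to cache now"
repro_run2 = "INFO[0000] Checking for cached layer <$CI_REGISTRY_IMAGE>/container-cache:352dbd5bf4eb089c4c765d74b1d32b86dc2c2299f2cadf74348fee35beb2b665..."

zero_ts_hit = second_run_can_hit(zero_ts_run1, zero_ts_run2)
repro_hit = second_run_can_hit(repro_run1, repro_run2)

print("--zero-file-timestamps can hit cache:", zero_ts_hit)
print("--reproducible can hit cache:        ", repro_hit)
```

This only confirms that the keys differ between runs; it says nothing about why the key changes.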

Let me know if there's anything else I can provide to you to help you repro this!


jkalez commented Aug 24, 2023

@zx96 any luck reproducing what I saw in your branch?

zx96 (author) commented Aug 24, 2023

@jkalez I've been busy and haven't had a chance to look into it yet... 😅

I might get a chance to look at it this weekend; I'll update here if I figure something out.


bh-tt commented Nov 6, 2023

@zx96 could this perhaps be merged with a note that setting this flag can have some issues with layer caching? That would at least enable the feature, which likely provides a larger benefit to speed than the cache does in the first place.


philippe-granet commented Dec 23, 2023

The reproducible-builds community seems to be settling on the SOURCE_DATE_EPOCH environment variable: when it is set, its value is used for all date-specific operations, so the dates stay the same across identical builds.

BuildKit uses it:
https://www.docker.com/blog/highlights-buildkit-v0-11-release/
https://github.com/moby/buildkit/blob/master/docs/build-repro.md#source_date_epoch

Could Kaniko use this convention?
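As a rough sketch of what honoring the variable could look like, the snippet below reads SOURCE_DATE_EPOCH as an integer number of Unix seconds and clamps newer file times down to it. The clamping policy and the helper names are assumptions for illustration, not a description of what BuildKit or Kaniko does.

```python
import os


def source_date_epoch(environ=os.environ):
    """Return SOURCE_DATE_EPOCH as an int, or None when it is not set."""
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is None:
        return None
    return int(raw)


def clamp_mtime(mtime, epoch):
    """Assumed policy: never report a file time newer than the epoch."""
    if epoch is None:
        return mtime
    return min(mtime, epoch)


# A plain dict stands in for the process environment in this example.
env = {"SOURCE_DATE_EPOCH": "1700000000"}
epoch = source_date_epoch(env)

older = clamp_mtime(1_600_000_000, epoch)  # already older, left alone
newer = clamp_mtime(1_800_000_000, epoch)  # newer, pulled down to the epoch
unset = clamp_mtime(1_800_000_000, source_date_epoch({}))  # no variable, unchanged

print(older, newer, unset)
```

A build tool following this pattern would keep genuine old timestamps and only rewrite the ones that would otherwise vary from build to build.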
