
multistage build in same container fails because cross stage deps are not cleaned up ( symlinks ) #1406

Open
RoSk0 opened this issue Aug 31, 2020 · 20 comments · May be fixed by #3130
Labels
area/multi-stage builds, area/symlinks, categorized, differs-from-docker, feat/cleanup, gitlab, kind/bug, priority/p0, priority/p1, work-around-available, works-with-docker

Comments


RoSk0 commented Aug 31, 2020

Actual behavior
I want to set up image building for our project as part of a CI pipeline using GitLab CI. Following https://docs.gitlab.com/13.2/ee/ci/docker/using_kaniko.html#building-a-docker-image-with-kaniko I set up the CI configuration, and it works perfectly if you build one image per job. This is not a GitLab issue, just bear with me.

We have a multi-stage Dockerfile to build our images. If you try to build multiple targets inside the same (and this is crucial) container, it fails with:

error building image: could not save file: symlink ../chi-teck/drupal-code-generator/bin/dcg /kaniko/0/app/vendor/bin/dcg: file exists
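
That error is plain EEXIST behavior from creating a symlink whose link name already exists, the same failure ln -s gives without -f; a quick shell illustration (exact message wording varies by implementation):

$ ln -s ../target link
$ ln -s ../target link
ln: link: File exists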

Expected behavior

Two (in my case) images built.

To Reproduce
Output of commands that ran successfully is omitted:

$ docker run --rm  --interactive  --tty  --volume $PWD:/app  --user $(id -u):$(id -g)  composer:1  create-project --ignore-platform-reqs drupal/recommended-project kaniko-test
$ cd kaniko-test
$ docker run --rm  --interactive  --tty  --volume $PWD:/app  --user $(id -u):$(id -g)  composer:1  require --ignore-platform-reqs drush/drush:^10
$ cat <<EOF >> Dockerfile
FROM composer:1 AS full-code-base
WORKDIR /app
COPY composer.json composer.lock /app/
RUN composer install --ignore-platform-reqs --no-dev --working-dir=/app
COPY web /app/web
RUN composer dump-autoload --optimize --working-dir=/app

FROM php:7.4-fpm-buster AS project-php
COPY --from=full-code-base /app /app

FROM nginx:1 AS project-nginx
COPY --from=full-code-base /app /app

EOF
$ docker run --network=host -v $(pwd):/workspace  --entrypoint '' --rm -it gcr.io/kaniko-project/executor:debug sh
inside container $ executor --target project-php --destination kanico-test-image:php-latest --no-push
inside container $ executor --target project-nginx --destination kanico-test-image:nginx-latest --no-push
INFO[0000] Resolved base name composer:1 to full-code-base
INFO[0000] Resolved base name php:7.4-fpm-buster to project-php
INFO[0000] Resolved base name nginx:1 to project-nginx
INFO[0000] Retrieving image manifest composer:1
INFO[0000] Retrieving image composer:1
INFO[0003] Retrieving image manifest composer:1
INFO[0003] Retrieving image composer:1
INFO[0006] Retrieving image manifest php:7.4-fpm-buster
INFO[0006] Retrieving image php:7.4-fpm-buster
INFO[0009] Retrieving image manifest php:7.4-fpm-buster
INFO[0009] Retrieving image php:7.4-fpm-buster
INFO[0012] Retrieving image manifest nginx:1
INFO[0012] Retrieving image nginx:1
INFO[0014] Retrieving image manifest nginx:1
INFO[0014] Retrieving image nginx:1
INFO[0017] Built cross stage deps: map[0:[/app /app]]
INFO[0017] Retrieving image manifest composer:1
INFO[0017] Retrieving image composer:1
INFO[0019] Retrieving image manifest composer:1
INFO[0019] Retrieving image composer:1
INFO[0022] Executing 0 build triggers
INFO[0022] Unpacking rootfs as cmd COPY composer.json composer.lock /app/ requires it.
INFO[0033] WORKDIR /app
INFO[0033] cmd: workdir
INFO[0033] Changed working directory to /app
INFO[0033] No files changed in this command, skipping snapshotting.
INFO[0033] COPY composer.json composer.lock /app/
INFO[0033] Taking snapshot of files...
INFO[0033] RUN composer install --ignore-platform-reqs --no-dev --working-dir=/app
INFO[0033] Taking snapshot of full filesystem...
INFO[0036] cmd: /bin/sh
INFO[0036] args: [-c composer install --ignore-platform-reqs --no-dev --working-dir=/app]
INFO[0036] Running: [/bin/sh -c composer install --ignore-platform-reqs --no-dev --working-dir=/app]
Loading composer repositories with package information
Installing dependencies from lock file
Nothing to install or update
Generating autoload files
INFO[0036] Taking snapshot of full filesystem...
INFO[0037] Taking snapshot of files...
INFO[0037] COPY web /app/web
INFO[0039] Taking snapshot of files...
INFO[0041] RUN composer dump-autoload --optimize --working-dir=/app
INFO[0041] cmd: /bin/sh
INFO[0041] args: [-c composer dump-autoload --optimize --working-dir=/app]
INFO[0041] Running: [/bin/sh -c composer dump-autoload --optimize --working-dir=/app]
Generating optimized autoload files
Generated optimized autoload files containing 4906 classes
INFO[0042] Taking snapshot of full filesystem...
INFO[0046] Saving file app for later use
error building image: could not save file: symlink ../chi-teck/drupal-code-generator/bin/dcg /kaniko/0/app/vendor/bin/dcg: file exists

I've tried raising the verbosity level to debug, which showed nothing useful. With trace it shows far too much to digest.

The directory contents of vendor/bin:

$ ll vendor/bin/
total 8
drwxr-xr-x  2 kirill kirill 4096 Aug 31 16:45 ./
drwxr-xr-x 31 kirill kirill 4096 Aug 31 16:45 ../
lrwxrwxrwx  1 kirill kirill   41 Aug 31 16:45 dcg -> ../chi-teck/drupal-code-generator/bin/dcg*
lrwxrwxrwx  1 kirill kirill   20 Aug 31 16:45 drush -> ../drush/drush/drush*
lrwxrwxrwx  1 kirill kirill   33 Aug 31 16:45 php-parse -> ../nikic/php-parser/bin/php-parse*
lrwxrwxrwx  1 kirill kirill   22 Aug 31 16:45 psysh -> ../psy/psysh/bin/psysh*
lrwxrwxrwx  1 kirill kirill   44 Aug 31 16:45 release -> ../consolidation/self-update/scripts/release*
lrwxrwxrwx  1 kirill kirill   26 Aug 31 16:45 robo -> ../consolidation/robo/robo*
lrwxrwxrwx  1 kirill kirill   51 Aug 31 16:41 var-dump-server -> ../symfony/var-dumper/Resources/bin/var-dump-server*

Additional Information

  • Dockerfile
    Included in the steps to reproduce
  • Build Context
    Included in the steps to reproduce
  • Kaniko Image (fully qualified with digest)
$ docker inspect gcr.io/kaniko-project/executor:debug
[
    {
        "Id": "sha256:b0070f18add278df20229ce34172fc16a4c76392fc28d33df7837396a2b882c0",
        "RepoTags": [
            "gcr.io/kaniko-project/executor:debug"
        ],
        "RepoDigests": [
            "gcr.io/kaniko-project/executor@sha256:0f27b0674797b56db08010dff799c8926c4e9816454ca56cc7844df228c53485"
        ],
        "Created": "2020-08-18T02:40:08.570969026Z",
        "DockerVersion": "19.03.8",
        "Architecture": "amd64",
        "Os": "linux",
}

Triage Notes for the Maintainers

Description (Yes/No)
  • Please check if this is a new feature you are proposing
  • Please check if the build works in docker but not in kaniko
  • Please check if this error is seen when you use the --cache flag
  • Please check if your dockerfile is a multistage dockerfile
@alanhughes

I had the same problem; you need to run with --cleanup if you wish to reuse the same kaniko container:

https://github.com/GoogleContainerTools/kaniko#--cleanup


RoSk0 commented Nov 15, 2020

Thanks for the suggestion @alanhughes. I've tested with the image from the original report (repo digest gcr.io/kaniko-project/executor@sha256:0f27b0674797b56db08010dff799c8926c4e9816454ca56cc7844df228c53485) by adding --cleanup to the call, like:

executor --cleanup --target project-php --destination kanico-test-image:php-latest --no-push
executor --cleanup --target project-nginx --destination kanico-test-image:nginx-latest --no-push

but the result is the same: error building image: could not save file: symlink ../chi-teck/drupal-code-generator/bin/dcg /kaniko/0/app/vendor/bin/dcg: file exists.

Then I updated Kaniko image:

$ docker inspect gcr.io/kaniko-project/executor:debug
[
    {
        "Id": "sha256:ffca8c9f01a23d0886106b46f9bdd68dc5ca29d3377434bb69020df0cb2982a8",
        "RepoTags": [
            "gcr.io/kaniko-project/executor:debug"
        ],
        "RepoDigests": [
            "gcr.io/kaniko-project/executor@sha256:473d6dfb011c69f32192e668d86a47c0235791e7e857c870ad70c5e86ec07e8c"
        ],
        "Parent": "",
        "Comment": "",
        "Created": "2020-10-29T17:27:40.548045213Z",
        "DockerVersion": "19.03.8",
    }
]

which is Kaniko version v1.3.0, and ran the steps to reproduce again. Same result, despite an additional INFO[0153] Deleting filesystem... output entry when building the first container.


cmorty commented Nov 18, 2020

@RoSk0 Have you found a solution? I'm running out of ideas.


RoSk0 commented Nov 19, 2020 via email


morsik commented Mar 12, 2021

I just ran into the same problem ;(

I'm using npm for Node.js package management; it creates a node_modules/.bin directory containing a lot of symlinks to various Node module scripts, and the build fails on this ;(

INFO[0076] Saving file code/node_modules for later use  
error building image: could not save file: symlink ../google-p12-pem/build/src/bin/gp12-pem.js /kaniko/0/code/node_modules/.bin/gp12-pem: file exists


bdols commented Mar 21, 2021

I have also encountered this issue and have split my job into three separate jobs. The --cleanup, --no-push combo did not resolve it for me.


RoSk0 commented May 12, 2021

Unfortunately it is still an issue with the latest release :(

executor version
Kaniko version :  v1.6.0


AndreKR commented Jul 24, 2021

This is basically a duplicate of my (currently closed) issue #1217 and I provided a minimal reproduction there, which I just updated to v1.6.0:

  • Command line
  • Dockerfile
  • Full log

@michaellmonaghan

I am also seeing this issue with the latest release. It only occurs when building multiple images in the same container with the --cleanup flag.
gcr.io/kaniko-project/executor:debug 7053f62a27a8


aeimer commented Oct 26, 2021

I can replicate this with npm and a two-stage build, but it happens in the first stage:

INFO[0048] Saving file app for later use                
INFO[0050] Saving file app/dist for later use           
INFO[0050] Saving file app/node_modules for later use   
error building image: could not save file: symlink ../acorn/bin/acorn /kaniko/0/app/node_modules/.bin/acorn: file exists
Command exited with non-zero status 1

Running on AKS with GitLab CI.
Using gcr.io/kaniko-project/executor:v1.6.0-debug

time /kaniko/executor \
  --context "${CI_PROJECT_DIR}" \
  --dockerfile "${CI_PROJECT_DIR}/Dockerfile" \
  --cache=true \
  --destination "${IMAGE_TAG}" \
  --build-arg NODE_IMAGE="${NODE_IMAGE}" \
  --build-arg VERSION="${VERSION}" \
  --build-arg VERSION_SEMVER="${VERSION_SEMVER}"
The Dockerfile:

ARG NODE_IMAGE
ARG VERSION="not_set"

FROM $NODE_IMAGE as build

ARG VERSION
ENV APP_VERSION=${VERSION}
ARG VERSION_SEMVER="not_set"

WORKDIR /app

COPY package.json package-lock.json .npmrc ./
RUN npm version "${VERSION_SEMVER}" \
    && npm ci

COPY . .
RUN npm run build:ci

# -----------------------------------------------------

FROM $NODE_IMAGE

# ...


slavashvets commented Jan 14, 2022

The following workaround works for me. After each execution I add:

rm -rf /kaniko/0

For example:

execute() {
  /kaniko/executor --context . --build-arg=MYARG=$1 --cleanup --destination myregistry.com/repo:tag-$1
  rm -rf /kaniko/0
}

while read -r line; do
  execute "$line"
done < my_file


pandino commented Aug 15, 2022

I had the same issue, and on top of that, if you have more than two stages, kaniko will also create /kaniko/1 and so on.


sinkr commented Nov 11, 2022

Having the same problem here, up to and including 1.9.1.

@aaron-prindle changed the title from "Cannot build multiple images inside single container ( symlinks )" to "multistage build in same container fails because cross stage deps are not cleaned up ( symlinks )" on Jun 14, 2023
@aaron-prindle added the area/multi-stage builds, feat/cleanup, and kind/bug labels on Jun 22, 2023
@aaron-prindle added the area/symlinks, work-around-available, differs-from-docker, works-with-docker, priority/p0, priority/p1, and categorized labels on Jul 5, 2023

zzhzero commented Jul 17, 2023

Is there any plan to fix this?


dadurex commented Sep 27, 2023

I can still reproduce it with version gcr.io/kaniko-project/executor:v1.16.0-debug, using the Dockerfile @AndreKR mentioned in #1406 (comment).

Also, I don't know exactly what's going on under the hood in kaniko, but is it intended that while building an image we are working on the / filesystem of the kaniko container itself? With a Dockerfile like this:

FROM alpine:3.17.1
RUN rm -rf /kaniko

I'm able to remove the kaniko executor inside the kaniko container:

➜  ~  docker run -it --rm --name kaniko-test --entrypoint="/busybox/sh" gcr.io/kaniko-project/executor:v1.16.0-debug

/workspace # 
/workspace # vi Dockerfile
/workspace # 
/workspace # cat Dockerfile 
FROM alpine:3.17.1
RUN rm -rf /kaniko
/workspace # 
/workspace # /kaniko/executor \
>     --no-push \
>     --log-format=text \
>     --cleanup
time="2023-09-27T13:26:10Z" level=info msg="Retrieving image manifest alpine:3.17.1"
time="2023-09-27T13:26:10Z" level=info msg="Retrieving image alpine:3.17.1 from registry index.docker.io"
time="2023-09-27T13:26:12Z" level=info msg="Built cross stage deps: map[]"
time="2023-09-27T13:26:12Z" level=info msg="Retrieving image manifest alpine:3.17.1"
time="2023-09-27T13:26:12Z" level=info msg="Returning cached image manifest"
time="2023-09-27T13:26:12Z" level=info msg="Executing 0 build triggers"
time="2023-09-27T13:26:12Z" level=info msg="Building stage 'alpine:3.17.1' [idx: '0', base-idx: '-1']"
time="2023-09-27T13:26:12Z" level=info msg="Unpacking rootfs as cmd RUN rm -rf /kaniko requires it."
time="2023-09-27T13:26:17Z" level=info msg="RUN rm -rf /kaniko"
time="2023-09-27T13:26:17Z" level=info msg="Initializing snapshotter ..."
time="2023-09-27T13:26:17Z" level=info msg="Taking snapshot of full filesystem..."
time="2023-09-27T13:26:17Z" level=info msg="Cmd: /bin/sh"
time="2023-09-27T13:26:17Z" level=info msg="Args: [-c rm -rf /kaniko]"
time="2023-09-27T13:26:17Z" level=info msg="Running: [/bin/sh -c rm -rf /kaniko]"
error building image: error building stage: failed to take snapshot: open /kaniko/2366305773: no such file or directory
/workspace # ls -la /kaniko
ls: /kaniko: No such file or directory
/workspace # 


dadurex commented Sep 29, 2023

So I dug into this for a while and found that /kaniko/0 won't be deleted because the path /kaniko and all children of that directory are on the defaultIgnoreList (https://github.com/GoogleContainerTools/kaniko/blob/main/pkg/util/fs_util.go#L63); this list contains the paths excluded from deletion in the DeleteFilesystem() function (https://github.com/GoogleContainerTools/kaniko/blob/main/pkg/util/fs_util.go#L245).

Paths like /workspace are cleaned up, so I suggest that kaniko should not store data in the /kaniko/0 directory but instead in something like /layers/0 (that is, config.RootDir/layers/...). The same happens for snapshot files (files whose names contain only numbers, like /kaniko/0123123): they are also created in the /kaniko directory, which means they will never be deleted. Snapshot files should therefore live somewhere like config.RootDir/snapshots/... (https://github.com/GoogleContainerTools/kaniko/blob/main/pkg/snapshot/snapshot.go#L66).
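
For illustration, a minimal shell sketch of the manual cleanup this analysis implies, assuming the default /kaniko root and that nothing else in /kaniko has a name starting with a digit:

# Hypothetical cleanup between builds: removes the per-stage cross-stage
# directories (/kaniko/0, /kaniko/1, ...) and the numeric snapshot files
# that DeleteFilesystem() leaves behind.
rm -rf /kaniko/[0-9]*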

After reconsidering the topic, I think the best solution for this issue is to distinguish between the filesystem cleanup that runs after a stage and the cleanup triggered by the --cleanup flag, as currently the same function is used for both cases; the two should be handled separately.


dadurex commented Oct 9, 2023

I created a simple PR with changes that remove all leftovers using regex. I'm open to discussion :)

@ricardllop

I hope this gets merged. Thanks. However, I wanted to post this in case someone else runs into the problem I managed to solve.

After finding this issue and trying a lot of things to get a generic fix for my case, I found something that I think solves most of my error cases. It fixed all the failing builds across the Dockerfiles of around 15 projects (not all were failing, but the ones with more stages were more prone to fail).

My use of kaniko is inside a Jenkins pipeline that uses the Kubernetes plugin to run jobs in Kubernetes agent pods. Those agents define a single kaniko container, and I needed to build the image twice with it: once as a tar to scan with Trivy (a container scanning tool), and then, after some quality checks pass, to build the image again and upload it to ECR.

My solution was adding this to my first call that builds the image as a tar: && rm -rf /kaniko/*[0-9]* && rm -rf /kaniko/Dockerfile && mkdir -p /workspace

The call ends up like this:

/kaniko/executor -f $(pwd)/docker/Dockerfile -c $(pwd) --tar-path=$(pwd)/image.tar --single-snapshot --no-push --destination=image --cleanup && rm -rf /kaniko/*[0-9]* && rm -rf /kaniko/Dockerfile && mkdir -p /workspace

I'm not a huge kaniko user myself, but I found that the /kaniko directory was filled with files after the first execution, as others in this thread mentioned, and those files were breaking the next execution. The extra commands after the first build remove the problematic files, and the second execution works like a charm.

Hope this helps other people who find this issue. Thanks.


aeimer commented Feb 2, 2024

@ricardllop you can also use crane to upload a container tar, so there is no need to rebuild the image. I don't know your use case in detail, but it sounds like it would fit.

https://github.com/google/go-containerregistry/blob/main/cmd/crane/doc/crane.md
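
A minimal sketch of that approach, assuming the tar was produced with --tar-path=image.tar as in the earlier comment (the registry path is a placeholder):

# Push the already-built tarball instead of running kaniko a second time.
crane push image.tar myregistry.com/repo:tag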

@BenDenney

Is it possible for files in these locations to be overwritten rather than failing because they already exist?
Are all of these files used in the next stage of the image build, or is there a "list" of the files stashed for that particular run that could be updated, so that only the relevant files need to be overwritten or pulled into the next stage?
