Unable to build a simple Dockerfile with buildx where userns-remap and the containerd backend is enabled #47377

Open
gordz opened this issue Feb 13, 2024 · 5 comments
Labels
area/builder/buildkit Issues affecting buildkit area/builder area/security/userns containerd-integration Issues and PRs related to containerd integration kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. status/0-triage version/25.0

Comments

@gordz

gordz commented Feb 13, 2024

Description

We are unable to build a simple image using buildx, with the docker buildkit driver, where the docker daemon is running with the following configuration:

  • userns remap is enabled
  • storage driver is set to overlayfs
  • the containerd snapshotter is enabled
  • using the built in docker buildx driver

The problem can be reproduced relatively easily with a simple image such as:

FROM alpine:latest
RUN echo "hello world"

Reproduce

Command:
docker buildx build -f Dockerfile .

Result:

docker buildx build -f Dockerfile .

#0 building with "default" instance using docker driver
#1 [internal] load build definition from Dockerfile.alpine.buildx
#1 transferring dockerfile: 92B done
#1 DONE 0.0s
#2 [internal] load metadata for docker.io/library/alpine:latest
#2 DONE 0.3s
#3 [internal] load .dockerignore
#3 transferring context: 2B done
#3 DONE 0.0s
#4 [1/2] FROM docker.io/library/alpine:latest@sha256:c5b1261d6d3e43071626931fc004f70149baeba2c8ec672bd4f27761f8e1ad6b
#4 resolve docker.io/library/alpine:latest@sha256:c5b1261d6d3e43071626931fc004f70149baeba2c8ec672bd4f27761f8e1ad6b done
#4 sha256:4abcf20661432fb2d719aaf90656f55c287f8ca915dc1c92ec14ff61e67fbaf8 3.41MB / 3.41MB 0.1s done
#4 extracting sha256:4abcf20661432fb2d719aaf90656f55c287f8ca915dc1c92ec14ff61e67fbaf8 0.1s done
#4 DONE 0.2s
#5 [2/2] RUN echo "hello world"
#5 0.027 runc run failed: unable to start container process: error during container init: error mounting "/var/lib/docker/165536.165536/buildkit/executor/resolv.conf" to rootfs at "/etc/resolv.conf": open /var/lib/docker/165536.165536/buildkit/executor/30g5og94pc1it3ymc8ymjdpd8/rootfs/etc/resolv.conf: permission denied
#5 ERROR: process "/bin/sh -c echo \"hello world\"" did not complete successfully: exit code: 1
------
 > [2/2] RUN echo "hello world":
0.027 runc run failed: unable to start container process: error during container init: error mounting "/var/lib/docker/165536.165536/buildkit/executor/resolv.conf" to rootfs at "/etc/resolv.conf": open /var/lib/docker/165536.165536/buildkit/executor/30g5og94pc1it3ymc8ymjdpd8/rootfs/etc/resolv.conf: permission denied
------
[Dockerfile.alpine.buildx](Dockerfile.alpine.buildx):2
--------------------
   1 |     FROM alpine:latest
   2 | >>> RUN echo "hello world"
--------------------
ERROR: failed to solve: process "/bin/sh -c echo \"hello world\"" did not complete successfully: exit code: 1

Expected behavior

The container should be built successfully.

docker version

+ docker version
Client:
 Version:           25.0.3
 API version:       1.44
 Go version:        go1.21.6
 Git commit:        4debf41
 Built:             Tue Feb  6 21:13:00 2024
 OS/Arch:           linux/amd64
 Context:           default
Server: Docker Engine - Community
 Engine:
  Version:          25.0.3
  API version:      1.44 (minimum version 1.24)
  Go version:       go1.21.6
  Git commit:       f417435
  Built:            Tue Feb  6 21:13:08 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.7.13
  GitCommit:        7c3aca7a610df76212171d200ca3811ff6096eb8
 runc:
  Version:          1.1.12
  GitCommit:        v1.1.12-0-g51d5e94
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

Client:
 Version:    25.0.3
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.12.1
    Path:     /usr/local/lib/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.24.5
    Path:     /usr/local/libexec/docker/cli-plugins/docker-compose
Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 25.0.3
 Storage Driver: overlayfs
  driver-type: io.containerd.snapshotter.v1
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Authorization: pipelines
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7c3aca7a610df76212171d200ca3811ff6096eb8
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  apparmor
WARNING: API is accessible on http://0.0.0.0:2375 without encryption.
         Access to the remote API is equivalent to root access on the host. Refer
         to the 'Docker daemon attack surface' section in the documentation for
         more information: https://docs.docker.com/go/attack-surface/
  seccomp
   Profile: builtin
  userns
 Kernel Version: 5.15.0-1052-aws
 Operating System: Alpine Linux v3.19 (containerized)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 30.89GiB
 Name: 371bfe59-1b94-4f27-a7ab-c2e3a417c200-rnbdb
 ID: 0f3438f6-24c2-4ed2-b1df-96ff9cdb8cbc
 Docker Root Dir: /var/lib/docker/165536.165536
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Registry Mirrors:
  http://10.201.201.163:5000/
 Live Restore Enabled: false
 Product License: Community Engine

Additional Info

I ran an experiment and found that if I run the following on the daemon, the problem disappears. However, my knowledge here is currently lacking and I don't yet understand why this is the case without diving deeper:

 chown -R 165536:165536 /var/lib/docker/165536.165536/containerd/daemon/io.containerd.snapshotter.v1.overlayfs/snapshots
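
To see what that chown actually changes, it may help to first list which entries under the snapshots directory are owned by IDs outside the remapped range. Below is a minimal, read-only sketch in Go (Linux only). The names `foreignOwned` and `scan` are made up for illustration, and the range size of 65536 is an assumption: only the base 165536 appears in this report.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"syscall"
)

// foreignOwned reports whether uid or gid falls outside the remapped
// range [base, base+size).
func foreignOwned(uid, gid, base, size uint32) bool {
	inRange := func(id uint32) bool { return id >= base && id-base < size }
	return !inRange(uid) || !inRange(gid)
}

// scan walks root without following symlinks and returns every path whose
// owner or group is outside the remapped range.
func scan(root string, base, size uint32) ([]string, error) {
	var out []string
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		// Ownership is only available through the Linux stat structure.
		st, ok := info.Sys().(*syscall.Stat_t)
		if ok && foreignOwned(st.Uid, st.Gid, base, size) {
			out = append(out, p)
		}
		return nil
	})
	return out, err
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: scan <snapshots-dir>")
		return
	}
	// Assumed range: base 165536 (from the report), size 65536 (a guess).
	paths, err := scan(os.Args[1], 165536, 65536)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

Running it against the snapshots directory before and after the chown should show whether the unpacked layers were owned by host root to begin with.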

We launch our daemon with the following options. There is an auth plugin we have wired up; however, that can be disregarded here.

"--authorization-plugin=<redacted>
    --storage-driver=overlayfs 
    --registry-mirror http://<host>:<port> 
    --userns-remap=default 
    --log-level warn"

daemon.json:

{
  "features": {
    "containerd-snapshotter": true
  }
}

I was able to find this runc log file:

/var/lib/docker/165536.165536/buildkit/executor # cat runc-log.json
{"level":"error","msg":"runc run failed: unable to start container process: error during container init: error mounting \"/var/lib/docker/165536.165536/buildkit/executor/resolv.conf\" to rootfs at \"/etc/resolv.conf\": open /var/lib/docker/165536.165536/buildkit/executor/jmzvvv7hgk3inuk6w35gp2pyu/rootfs/etc/resolv.conf: permission denied","time":"2024-02-13T05:00:49Z"}
{"level":"error","msg":"container does not exist","time":"2024-02-13T05:00:49Z"}
@gordz gordz added kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. status/0-triage labels Feb 13, 2024
@vvoland vvoland added area/builder containerd-integration Issues and PRs related to containerd integration area/builder/buildkit Issues affecting buildkit labels Feb 13, 2024
@thaJeztah
Member

I recall some recent issue with rootless, but don't think it's directly related here;

ISTR I've seen a discussion elsewhere though, but maybe it was in the BuildKit repo 🤔 /cc @AkihiroSuda do you recall?

@vvoland
Contributor

vvoland commented Feb 13, 2024

Thanks! I reproduced this on my side and I'm digging into it. I think I have an initial idea of what's wrong, but still need to confirm it (if that's true, it will probably involve some changes on the buildkit side though).

@vvoland
Contributor

vvoland commented Feb 13, 2024

So, the lesser issue is that the containerd worker created by buildkit always sets a nil identity mapping (which means root): https://github.com/moby/buildkit/blob/47d6583cdf58b952c3cb9c719f2f9b45be825c1f/worker/containerd/containerd.go#L156

But even if it set the desired mapping, that still doesn't change the fact that all unpacked image rootfs snapshots are stored as-is, without any user remapping. The remapping is performed in a separate layer added on top of the image layer when creating a new snapshot for the container to run:

if !i.idMapping.Empty() {
	return i.remapSnapshot(ctx, snapshotter, id, id+"-init")
}

This is fine for docker run/create, but not for buildkit, as it creates containers directly via runc, so it doesn't go through the same code path.
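
To make the ownership mismatch concrete, here is a small sketch of the ID translation a user namespace performs. It is not taken from moby or buildkit: `IDMap`, `ToHost` and `ToContainer` are illustrative names, and the single mapping line (container 0 to host 165536, size 65536) is an assumption based on the `165536.165536` root directory in the report.

```go
package main

import "fmt"

// IDMap mirrors one line of /proc/<pid>/uid_map: Size consecutive IDs
// starting at ContainerID inside the namespace correspond to the IDs
// starting at HostID outside of it.
type IDMap struct {
	ContainerID, HostID, Size int
}

// ToHost translates an ID seen inside the namespace to the host ID.
// The second result is false when the ID has no mapping.
func ToHost(id int, maps []IDMap) (int, bool) {
	for _, m := range maps {
		if id >= m.ContainerID && id < m.ContainerID+m.Size {
			return m.HostID + (id - m.ContainerID), true
		}
	}
	return 0, false
}

// ToContainer translates a host ID to the ID seen inside the namespace.
// The second result is false when the ID has no mapping.
func ToContainer(id int, maps []IDMap) (int, bool) {
	for _, m := range maps {
		if id >= m.HostID && id < m.HostID+m.Size {
			return m.ContainerID + (id - m.HostID), true
		}
	}
	return 0, false
}

func main() {
	// Assumed mapping: the base comes from the report, the size is a guess.
	maps := []IDMap{{ContainerID: 0, HostID: 165536, Size: 65536}}

	host, _ := ToHost(0, maps)
	fmt.Println("container root is host uid", host)

	_, ok := ToContainer(0, maps)
	fmt.Println("host uid 0 mapped inside the namespace:", ok)
}
```

Under this assumed mapping, host uid 0 has no counterpart inside the namespace. Files owned by unmapped IDs show up there as the overflow ID, and root in the namespace gets no capability-based override on them, which would be consistent with the "permission denied" on the unremapped rootfs.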

Not sure what's the best way to solve this, but there might be a couple of options. The easiest solution might be to wrap the buildkit executor to remap users before executing the Run method. It could work, because each executor has its own directory like: /var/lib/docker/165536.165536/buildkit/executor/kh28zau1oz1p6l53lht2r8u67/.
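
As a rough illustration of what "remap users before executing the Run method" could mean at the filesystem level, here is a hedged sketch that shifts the ownership of everything under a directory into the remapped range. This is not BuildKit's executor API. `shift` and `remapTree` are hypothetical helpers, the size of 65536 is an assumption, and it assumes the base is at least as large as the size so that already-shifted IDs can be told apart. It is Linux only and defaults to a dry run.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"syscall"
)

// shift moves an unremapped ID into the remapped range. IDs at or above
// size (which includes IDs that were already shifted, given base >= size)
// are returned unchanged with ok == false.
func shift(id, base, size uint32) (uint32, bool) {
	if id >= size {
		return id, false
	}
	return base + id, true
}

// remapTree re-owns every entry under root. chown is injected so the sketch
// can be dry-run; pass os.Lchown to apply the change for real.
func remapTree(root string, base, size uint32, chown func(string, int, int) error) error {
	return filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		info, err := d.Info() // describes the link itself, not its target
		if err != nil {
			return err
		}
		st, ok := info.Sys().(*syscall.Stat_t)
		if !ok {
			return fmt.Errorf("no ownership information for %s", p)
		}
		uid, uidOK := shift(st.Uid, base, size)
		gid, gidOK := shift(st.Gid, base, size)
		if !uidOK && !gidOK {
			return nil // nothing to change for this entry
		}
		return chown(p, int(uid), int(gid))
	})
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: remap <rootfs-dir>")
		return
	}
	// Dry run: print what would change instead of calling os.Lchown.
	dryRun := func(p string, uid, gid int) error {
		fmt.Printf("%s -> %d:%d\n", p, uid, gid)
		return nil
	}
	if err := remapTree(os.Args[1], 165536, 65536, dryRun); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Whether a walk like this is acceptable on every build step (cost, and copy-up on overlayfs) is exactly the kind of trade-off that would need to be weighed against the other options.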

Not sure if that's the best option though.. cc @tonistiigi @crazy-max

@rumpl
Member

rumpl commented Mar 11, 2024

ping @tonistiigi @crazy-max

@tonistiigi
Member

If moby with containerd storage keeps the storage pre-mapped like before, then the correct identity mapping needs to be passed during invocation. If it keeps files as full root and remaps on mount (weaker security but more flexible for volume access), then such a mode does not exist in BuildKit atm and would need to be implemented first.
