running systemd inside docker arch container hangs or segfaults #3629

flokli · 2014-01-16T14:46:16Z

I tried running the base/arch image in "system container" mode.

However, docker run -i -t base/arch /sbin/init doesn't work as it should.
After I detach (Ctrl-p, Ctrl-q), strace shows /sbin/init running but not doing anything, although it should normally spawn other processes (such as systemd-journald).

When I run docker run -i -t base/arch /bin/bash, and enter /sbin/init --system, I get the following output:

systemd 208 running in system mode. (+PAM -LIBWRAP -AUDIT -SELINUX -IMA -SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ)
Detected virtualization 'lxc'.
Failed to set hostname to <888e6c612435>: Operation not permitted
Failed to enable kbrequest handling: Operation not permitted
No control group support available, not creating root group.
Cannot add dependency job for unit display-manager.service, ignoring: Unit display-manager.service failed to load: No such file or directory.
Segmentation fault (core dumped)

In the same container (but with strace installed), running

# docker run -i -t 1cff36031b68 /bin/bash
[root@bef7f6801a3d /]# strace -fvv /sbin/init --system

segfaults, leaving the following output: https://gist.github.com/flokli/8456044

Do you have any idea what's wrong here? I'd really like to use docker in system container mode, and according to #223, this should already be possible...

Florian

s0undt3ch · 2014-01-17T15:04:17Z

I'm suffering from the same issue...

codekoala · 2014-01-18T07:08:53Z

Seeing similar behavior as well.

mait · 2014-01-20T16:46:51Z

I've tried this on a digitalocean.com arch64 VM.

Docker host:
3.8.4-1-ARCH (updated, except kernel)

Docker client:
ubuntu 13.10

I've used socat with openssl for the remote API call, but I get the same result with a local docker client over SSH.

http://jpetazzo.github.io/2013/10/20/secure-connection-docker-api/

➜ docker version
Client version: 0.7.6
Go version (client): go1.2
Git commit (client): bc3b2ec
Server version: 0.7.6
Git commit (server): bc3b2ec
Go version (server): go1.2
Last stable version: 0.7.6

➜ docker run -d  base/arch /sbin/init
8809f06d66ad288f84d60e077d811c923059a749ec404938a881e8dc0a083d1c
➜ docker attach  880
[nothing happens here]
➜ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
8809f06d66ad        base/arch:latest    /sbin/init          50 seconds ago      Up 48 seconds                           kickass_brown
➜ docker logs 880
Failed to verify GPT partition /dev/dm-2: Operation not permitted
➜ docker stop 880

➜ docker info
Containers: 1
Images: 6
Driver: devicemapper
 Pool Name: docker-254:0-529154-pool
 Data file: /var/lib/docker/devicemapper/devicemapper/data
 Metadata file: /var/lib/docker/devicemapper/devicemapper/metadata
 Data Space Used: 1111.2 Mb
 Data Space Total: 102400.0 Mb
 Metadata Space Used: 1.4 Mb
 Metadata Space Total: 2048.0 Mb
WARNING: No swap limit support

Jan 20 16:12:31 do0 docker[153]: 2014/01/20 16:12:31 POST /v1.8/containers/create
Jan 20 16:12:31 do0 docker[153]: [/var/lib/docker|cbd477d1] +job create()
Jan 20 16:12:31 do0 docker[153]: [/var/lib/docker|cbd477d1] -job create() = OK (0)
Jan 20 16:12:32 do0 docker[153]: 2014/01/20 16:12:32 POST /v1.8/containers/8809f06d66ad288f84d60e077d811c923059a749ec404938a881e8dc0a083d1c/start
Jan 20 16:12:32 do0 docker[153]: [/var/lib/docker|cbd477d1] +job start(8809f06d66ad288f84d60e077d811c923059a749ec404938a881e8dc0a083d1c)
Jan 20 16:12:32 do0 docker[153]: [/var/lib/docker|cbd477d1] -job start(8809f06d66ad288f84d60e077d811c923059a749ec404938a881e8dc0a083d1c) = OK (0)
Jan 20 16:12:41 do0 docker[153]: 2014/01/20 16:12:41 GET /v1.8/containers/json
Jan 20 16:13:12 do0 docker[153]: 2014/01/20 16:13:12 GET /v1.8/containers/880/json
Jan 20 16:13:13 do0 docker[153]: 2014/01/20 16:13:13 POST /v1.8/containers/880/attach?stderr=1&stdout=1&stream=1
Jan 20 16:13:14 do0 docker[153]: 2014/01/20 16:13:14 GET /v1.8/containers/880/json
Jan 20 16:13:21 do0 docker[153]: 2014/01/20 16:13:21 GET /v1.8/containers/json
Jan 20 16:13:36 do0 docker[153]: 2014/01/20 16:13:36 GET /v1.8/containers/880/json
Jan 20 16:13:37 do0 docker[153]: 2014/01/20 16:13:37 POST /v1.8/containers/880/attach?logs=1&stderr=1&stdout=1
Jan 20 16:13:58 do0 docker[153]: 2014/01/20 16:13:58 POST /v1.8/containers/880/stop?t=10
Jan 20 16:13:58 do0 docker[153]: [/var/lib/docker|cbd477d1] +job stop(880)
Jan 20 16:14:00 do0 docker[153]: [error] container.go:468 attach: stderr: write unix @: broken pipe
Jan 20 16:14:00 do0 docker[153]: [error] container.go:499 attach: job 1 returned error write unix @: broken pipe, aborting all jobs
Jan 20 16:14:08 do0 docker[153]: 2014/01/20 16:14:08 Container 8809f06d66ad288f84d60e077d811c923059a749ec404938a881e8dc0a083d1c failed to exit within 10 second
Jan 20 16:14:08 do0 docker[153]: [/var/lib/docker|cbd477d1] -job stop(880) = OK (0)
Jan 20 16:14:12 do0 docker[153]: 2014/01/20 16:14:12 GET /v1.8/containers/json
Jan 20 16:22:11 do0 docker[153]: 2014/01/20 16:22:11 GET /v1.8/info
Jan 20 16:22:11 do0 docker[153]: [/var/lib/docker|cbd477d1] +job info()
Jan 20 16:22:12 do0 docker[153]: [/var/lib/docker|cbd477d1] -job info() = OK (0)

adonm · 2014-01-21T14:14:46Z

Same issue here with systemd 208-10

See https://bitbucket.org/dpaw/dpaw_docker/src/4785d502d806bc002bfc1644adb7d5bbcf7f68c3/arch-base/build.sh?at=default for the build script I've been testing (using archlinux's included lxc-create script), no GPT warning but still hit a segfault (will attach a strace when I get a chance).

ku1ik · 2014-01-28T12:33:34Z

Same problem here.

chrisruffalo · 2014-02-12T05:32:01Z

Same problem on Fedora 20 with 208-9.

My initial impression is that it has something to do with the incomplete mount points in /sys/fs/cgroup and /sys/fs/selinux. When I run systemd under gdb, it fails somewhere in libselinux, at "../sysdeps/x86_64/strlen.S:106". Searching for that error mostly turns up results about missing files, so I'm willing to bet that systemd can't find some SELinux file it's looking for.

My proposed solution would be one of:

Corrected mount points
Fixed Systemd logic so it doesn't error out when it can't find the file
Custom version of Systemd without +SELINUX

Thoughts?

Edit 01: Rebuilding with the --disable-selinux option leads to a segfault too, but at a different point. I had to remove the fsck and fstab related services to move on.

Edit 02: Hm, it looks like something cgroups related, here's the backtrace:

#0 strlen () at ../sysdeps/x86_64/strlen.S:106
#1 0x00007ffff72543fe in __GI___strdup (s=0x0) at strdup.c:41
#2 0x00000000004ca362 in unit_default_cgroup_path (u=0x5b04f0) at src/core/unit.c:2121
#3 0x00000000004583ad in unit_create_cgroups (u=0x5b04f0, mask=(unknown: 0)) at src/core/cgroup.c:392
#4 0x000000000045882e in unit_realize_cgroup_now (u=0x5b04f0) at src/core/cgroup.c:467
#5 0x0000000000458b86 in unit_realize_cgroup (u=0x5b04f0) at src/core/cgroup.c:567
#6 0x00000000004e687f in slice_start (u=0x5b04f0) at src/core/slice.c:200
#7 0x00000000004c7555 in unit_start (u=0x5b04f0) at src/core/unit.c:1253
#8 0x00000000004d0ede in job_run_and_invalidate (j=0x579b60) at src/core/job.c:497
#9 0x000000000040f704 in manager_dispatch_run_queue (source=0x56c930, userdata=0x56c3d0) at src/core/manager.c:1267
#10 0x00000000004bfaff in source_dispatch (s=0x56c930) at src/libsystemd/sd-event/sd-event.c:1825
#11 0x00000000004c0723 in sd_event_run (e=0x56c7e0, timeout=0) at src/libsystemd/sd-event/sd-event.c:2045
#12 0x0000000000411599 in manager_loop (m=0x56c3d0) at src/core/manager.c:1844
#13 0x000000000040aacd in main (argc=2, argv=0x7fffffffed48) at src/core/main.c:1653

And to save some time, here's the relevant code:

    if (unit_has_name(u, SPECIAL_ROOT_SLICE))
            return strdup(u->manager->cgroup_root);

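So it looks to me like either a null reference or a null string: frame #1 of the backtrace shows strdup being called with s=0x0, which suggests u->manager->cgroup_root is NULL at that point. My C/C++ knowledge ends about right here. Maybe someone could take a go at it from here?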

Edit 03: I tried it with disabling other systemd compile options and nothing changed. So, back to figuring out how to mount cgroups in /sys/fs I guess...

Edit 04: Final edit, giving up.

I found a collection of items and ideas that led me to realize a couple of things. The first is that the default docker container does not have the capability (CAP_SYS_ADMIN) to mount or unmount things. In newer builds (0.6 and later) there is the "-privileged" option for "docker run" that allows the container more leeway.

From there I found the option "lxc.mount.auto", which should have allowed me to auto-mount sys, proc, and cgroups into the contained operating system. Running the following command

docker run -t -i -privileged -lxc-conf="lxc.mount.auto = proc:rw sys:rw cgroup-full:mixed" fedora /bin/bash

didn't do any good, as it just produces a bunch of errors:

docker run -t -i -privileged -lxc-conf="lxc.mount.auto = proc:rw sys:rw cgroup-full:mixed" fedora /bin/bash
lxc-start: No such file or directory - failed to use 'proc:rw sys:rw cgroup-full:mixed'
lxc-start: failed to setup the mounts for '14975cd2baa2a3d03004f260930cdd88bfd5f24e2c8d062d8ef17f8e64f9436e'
lxc-start: failed to setup the container
lxc-start: invalid sequence number 1. expected 2
lxc-start: failed to spawn '14975cd2baa2a3d03004f260930cdd88bfd5f24e2c8d062d8ef17f8e64f9436e'
lxc-start: Device or resource busy - failed to remove cgroup '/sys/fs/cgroup/cpuset/lxc/14975cd2baa2a3d03004f260930cdd88bfd5f24e2c8d062d8ef17f8e64f9436e'
lxc-start: Device or resource busy - failed to remove cgroup '/sys/fs/cgroup/cpu,cpuacct/lxc/14975cd2baa2a3d03004f260930cdd88bfd5f24e2c8d062d8ef17f8e64f9436e'
lxc-start: Device or resource busy - failed to remove cgroup '/sys/fs/cgroup/memory/lxc/14975cd2baa2a3d03004f260930cdd88bfd5f24e2c8d062d8ef17f8e64f9436e'
lxc-start: Device or resource busy - failed to remove cgroup '/sys/fs/cgroup/devices/lxc/14975cd2baa2a3d03004f260930cdd88bfd5f24e2c8d062d8ef17f8e64f9436e'
lxc-start: Device or resource busy - failed to remove cgroup '/sys/fs/cgroup/freezer/lxc/14975cd2baa2a3d03004f260930cdd88bfd5f24e2c8d062d8ef17f8e64f9436e'
lxc-start: Device or resource busy - failed to remove cgroup '/sys/fs/cgroup/net_cls/lxc/14975cd2baa2a3d03004f260930cdd88bfd5f24e2c8d062d8ef17f8e64f9436e'
lxc-start: Device or resource busy - failed to remove cgroup '/sys/fs/cgroup/blkio/lxc/14975cd2baa2a3d03004f260930cdd88bfd5f24e2c8d062d8ef17f8e64f9436e'
lxc-start: Device or resource busy - failed to remove cgroup '/sys/fs/cgroup/perf_event/lxc/14975cd2baa2a3d03004f260930cdd88bfd5f24e2c8d062d8ef17f8e64f9436e'
lxc-start: Device or resource busy - failed to remove cgroup '/sys/fs/cgroup/hugetlb/lxc/14975cd2baa2a3d03004f260930cdd88bfd5f24e2c8d062d8ef17f8e64f9436e'
[error] commands.go:2458 Error getting size: bad file descriptor

So... I found some more stuff here: http://blog.docker.io/2013/09/docker-can-now-run-within-docker/

I copied the helper script that he used, or at least parts of it, and I got SELinux and cgroups mounted!

But nothing changed. The segfault still happens at the same place. Maybe someone else can figure out what the heck is going on here.

nekinie · 2014-02-19T22:41:59Z

Confirmed same issue on Arch Linux

hunger · 2014-03-05T23:43:42Z

Maybe #4450 will help once it is applied.

Systemd runs fine inside a container when using systemd-nspawn, so my guess is that the one inside the docker container is not told that it is actually inside a container and thus tries to do things that do not make sense.

hunger · 2014-03-05T23:47:28Z

Yep, running /usr/lib/systemd/systemd-detect-virt inside a docker container reports "none", so systemd attempts a full start. Now... how can I make systemd detect docker?

hunger · 2014-03-06T00:02:52Z

Adding --env=container=docker (or lxc) will make systemd recognize that it is inside a container. That stops it from doing some stupid things, but it still core-dumps:-/
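For anyone who wants to check what PID 1 will actually see, here is a minimal Python sketch. It assumes the detection hinges on a `container` variable in the environment of PID 1, readable as NUL-separated entries from /proc/1/environ; the function names are made up for illustration, and this is not how systemd itself is implemented.

```python
from typing import Optional


def container_from_environ(raw: bytes) -> Optional[str]:
    """Return the value of `container` from a NUL-separated environ blob."""
    for entry in raw.split(b"\0"):
        name, sep, value = entry.partition(b"=")
        # Skip empty entries and entries without an '=' separator.
        if sep and name == b"container":
            return value.decode("utf-8", errors="replace")
    return None


def read_pid1_container(path: str = "/proc/1/environ") -> Optional[str]:
    """Read the variable from PID 1's environment; None if unreadable or unset."""
    try:
        with open(path, "rb") as fh:
            return container_from_environ(fh.read())
    except OSError:
        return None


if __name__ == "__main__":
    # Prints the container type if the variable is set, otherwise "none".
    print(read_pid1_container() or "none")
```

Run inside a container started with --env=container=docker, this should print "docker" as long as the variable reaches PID 1.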

hunger · 2014-05-08T08:29:25Z

There is a blog post from somebody that managed to run systemd in docker here: http://rhatdan.wordpress.com/2014/04/30/running-systemd-within-a-docker-container/

Apparently you need --privileged, have to mount cgroups, and then have to tweak the systemd configuration to stop it from bringing up a lot of unnecessary services.

http://lists.freedesktop.org/archives/systemd-devel/2014-May/018998.html

is the first mail in a thread discussing the blog post mentioned above, with hints from the systemd people on what the environment expected by systemd looks like. It would rock if docker could implement some of the things suggested there, especially mounting /sys RO (which will stop systemd from starting udev and is also sensible from a security point of view).
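As a rough sketch of the first two ingredients, the helper below assembles a docker command line in Python. The bind mount of the host's /sys/fs/cgroup and the container=docker variable are assumptions on my part, not a recipe confirmed in this thread, and `systemd_run_command` is a hypothetical name. The systemd unit tweaks are not covered.

```python
import shlex
from typing import List


def systemd_run_command(image: str, init: str = "/sbin/init") -> List[str]:
    """Build (but do not run) a docker command for a systemd container."""
    return [
        "docker", "run", "-d",
        "--privileged",                            # keeps CAP_SYS_ADMIN etc.
        "-e", "container=docker",                  # assumed hint for systemd
        "-v", "/sys/fs/cgroup:/sys/fs/cgroup:ro",  # assumed cgroup bind mount
        image, init,
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it by hand.
    print(shlex.join(systemd_run_command("base/arch")))
```

Printing the command instead of executing it keeps the sketch harmless; whether systemd then boots still depends on the image and the docker version.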

rjnagal · 2014-05-08T17:34:02Z

#5445 mounts /sys as read-only.

kfox1111 · 2014-05-08T17:43:47Z

#5445 says it enables rw mounts when --privileged is enabled, but hunger's comment says you need --privileged so #5445 won't fix it.

rjnagal · 2014-05-08T18:36:04Z

--privileged requires write access to sys and proc. We wouldn't want to do ro mounts by default.

hunger · 2014-05-08T18:58:18Z

@kfox1111, @rjnagal: Apparently (according to the blog post) systemd needs CAP_SYS_ADMIN. That is dropped when running without --privileged. Maybe docker could leave that around?

vmarmol · 2014-05-09T16:50:15Z

@hunger we should be careful about not dropping CAP_SYS_ADMIN, that brings with it a lot of things we probably don't want unprivileged containers to be able to do.
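To check whether a process inside a container currently holds CAP_SYS_ADMIN, one option is to decode the CapEff mask from /proc/self/status. The Python sketch below assumes the mask is the usual hexadecimal bitmask and that CAP_SYS_ADMIN is bit 21, as defined in linux/capability.h; the helper names are illustrative only.

```python
CAP_SYS_ADMIN = 21  # bit number from <linux/capability.h>


def effective_caps(status_text: str) -> int:
    """Extract the effective capability mask from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no CapEff line found")


def has_cap(mask: int, cap: int) -> bool:
    """Return True if capability bit `cap` is set in `mask`."""
    return bool((mask >> cap) & 1)


if __name__ == "__main__":
    with open("/proc/self/status") as fh:
        mask = effective_caps(fh.read())
    print("CAP_SYS_ADMIN:", has_cap(mask, CAP_SYS_ADMIN))
```

Comparing the output with and without --privileged shows directly which side of the line a container is on.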

rjnagal · 2014-05-09T16:53:20Z

If we don't drop CAP_SYS_ADMIN, we are almost a privileged container :)

I think this should be handled at the container option level to drop/add capabilities. We should keep the defaults for unprivileged containers as secure as possible.

hunger · 2014-05-11T21:39:01Z

@victor: You are right. But then it would be nice to be able to keep some capabilities without getting the rest of the stuff that --privileged does. I admit that I am not 100% sure what that actually is, which makes me all the more uneasy about running containers in privileged mode.

For example, I have one container that needs to be privileged because it needs to initiate port-forwarding. I am pretty sure that one only needs a capability or two and would be fine otherwise.

vmarmol · 2014-05-13T20:16:25Z

I don't think we have a good answer today for something in between unprivileged and privileged. I think we'd hope to have something, since there are many use cases where you only want some privileges. I'm guessing the hard part is how to expose that in the API in a way that makes sense.

Given the prevalence of systemd, we should find a way to make it work though. I know @alexlarsson has been taking a look at that.

alexlarsson · 2014-05-14T10:29:33Z

Yeah, unprivileged systemd is being worked on in #5773.

cpuguy83 · 2014-08-14T04:08:21Z

Closing as resolved in #6968 and #5773

offlinehacker · 2014-08-23T22:11:42Z

This is not really resolved, because #5773 was reverted in c7d1cb227288fa2174bd601b7214d49955f387e3. I don't know what's going on; I just know that without cgroups and /run as tmpfs, systemd can't be started in a container, but with these two it can, and it works fine.
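A quick way to check those two prerequisites from inside a container is to look at /proc/mounts. The Python sketch below assumes the usual whitespace-separated layout (device, mount point, filesystem type, ...) and treats any mount of type cgroup or cgroup2 as "cgroups available"; `missing_prerequisites` is a made-up helper, not part of docker or systemd.

```python
from typing import List, Tuple


def parse_mounts(text: str) -> List[Tuple[str, str]]:
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def missing_prerequisites(text: str) -> List[str]:
    """List which of the two prerequisites appear to be missing."""
    mounts = parse_mounts(text)
    problems = []
    if not any(fs in ("cgroup", "cgroup2") for _, fs in mounts):
        problems.append("no cgroup filesystem mounted")
    # If /run is mounted more than once, the last entry is the visible one.
    run_types = [fs for point, fs in mounts if point == "/run"]
    if not run_types or run_types[-1] != "tmpfs":
        problems.append("/run is not a tmpfs")
    return problems


if __name__ == "__main__":
    with open("/proc/mounts") as fh:
        for problem in missing_prerequisites(fh.read()):
            print(problem)
```

An empty output only means both mounts are present; it does not guarantee that systemd will start.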

offlinehacker · 2014-08-23T22:22:31Z

And here is the pull request that breaks it: docker-archive/libcontainer#30

offlinehacker · 2014-08-23T22:23:45Z

And this one is needed: docker-archive/libcontainer#16

jessfraz · 2015-01-14T21:20:59Z

closing as duplicate of #7395

ccaapton mentioned this issue Feb 19, 2014

LXC containers? dnschneid/crouton#364

Closed

cpuguy83 closed this as completed Aug 14, 2014

cpuguy83 reopened this Aug 23, 2014

offlinehacker mentioned this issue Aug 25, 2014

full nixos inside docker NixOS/nixpkgs#3779

Merged

jessfraz closed this as completed Jan 14, 2015

darix mentioned this issue Mar 23, 2015

Running more than one service openSUSE/docker-containers#2

Closed
