Docker fails to remove containers "driver overlay failed to remove root filesystem: readdirent: no such file or directory" #14474

Closed
stevenschlansker opened this issue Jul 8, 2015 · 53 comments
Labels
area/storage/overlay kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. version/1.6

Comments

@stevenschlansker

root@mesos-slave6-qa-uswest2:~# docker rm 4aaff2f0d37a
Error response from daemon: Cannot destroy container 4aaff2f0d37a: Driver overlay failed to remove root filesystem 4aaff2f0d37ac87b8d5df8f41217a438b9418af841405f70b4e138a664cfe00e: readdirent: no such file or directory
FATA[0000] Error: failed to remove one or more containers

root@mesos-slave6-qa-uswest2:~# docker inspect !$
docker inspect 4aaff2f0d37a
[{
    "Config": {
        "Env": [
            "INSTANCE_NO=1",
            "TASK_HOST=mesos-slave6-qa-uswest2.qasql.opentable.com",
            "TASK_REQUEST_ID=ci-umami-bbi-simulator",
            "TASK_DEPLOY_ID=teamcity.2015.06.29T23.52.53",
            "ESTIMATED_INSTANCE_COUNT=1",
            "OT_ENV=ci-uswest2",
            "OT_ENV_WHOLE=ci-uswest2",
            "PORT=31366",
            "PORT0=31366",
            "MESOS_SANDBOX=/mnt/mesos/sandbox",
            "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
            "APP_DIR=/"
        ],
        "Volumes": null,
        "WorkingDir": ""
    },
    "Created": "2015-07-08T01:54:27.639467476Z",
    "Driver": "overlay",
    "ExecDriver": "native-0.2",
    "HostConfig": {
        "Binds": [
            "/mnt/mesos-slave/slaves/20150610-165203-2986755594-5050-1666-S8/frameworks/Singularity/executors/ci-umami-bbi-simulator-teamcity.2015.06.29T23.52.53-1436320326525-1-mesos_slave6_qa_uswest2.qasql.opentable.com-us_west_2c/runs/94a87778-21fe-4055-b607-0efe1f08c14a:/mnt/mesos/sandbox"
        ]
    },
    "Name": "/mesos-94a87778-21fe-4055-b607-0efe1f08c14a",
    "State": {
        "Dead": true,
        "Error": "",
        "ExitCode": 0,
        "FinishedAt": "2015-07-08T04:29:51.220130171Z",
        "OOMKilled": false,
        "Paused": false,
        "Pid": 0,
        "Restarting": false,
        "Running": false,
        "StartedAt": "2015-07-08T01:54:28.883622948Z"
    },
    "Volumes": {
        "/mnt/mesos/sandbox": "/mnt/mesos-slave/slaves/20150610-165203-2986755594-5050-1666-S8/frameworks/Singularity/executors/ci-umami-bbi-simulator-teamcity.2015.06.29T23.52.53-1436320326525-1-mesos_slave6_qa_uswest2.qasql.opentable.com-us_west_2c/runs/94a87778-21fe-4055-b607-0efe1f08c14a"
    },
    "VolumesRW": {
        "/mnt/mesos/sandbox": true
    }
}
]

Indeed, it looks like there has been only partial cleanup of this container:

root@mesos-slave6-qa-uswest2:/mnt/docker/overlay/4aaff2f0d37ac87b8d5df8f41217a438b9418af841405f70b4e138a664cfe00e# ls
merged
root@mesos-slave6-qa-uswest2:/mnt/docker/overlay/4aaff2f0d37ac87b8d5df8f41217a438b9418af841405f70b4e138a664cfe00e# ls merged/
root@mesos-slave6-qa-uswest2:/mnt/docker/overlay/4aaff2f0d37ac87b8d5df8f41217a438b9418af841405f70b4e138a664cfe00e# grep 4aaff2f0d37a /proc/mounts 
overlay /mnt/docker/overlay/4aaff2f0d37ac87b8d5df8f41217a438b9418af841405f70b4e138a664cfe00e/merged overlay rw,relatime,lowerdir=/mnt/docker/overlay/1a347ace77026d6d0d63453adb63b0ca8d130008cf1845166ac1ff7a84d13f50/root,upperdir=/mnt/docker/overlay/4aaff2f0d37ac87b8d5df8f41217a438b9418af841405f70b4e138a664cfe00e/upper,workdir=/mnt/docker/overlay/4aaff2f0d37ac87b8d5df8f41217a438b9418af841405f70b4e138a664cfe00e/work 0 0
root@mesos-slave6-qa-uswest2# docker version
Client version: 1.6.2
Client API version: 1.18
Go version (client): go1.4.2
Git commit (client): 7c8fca2
OS/Arch (client): linux/amd64
Server version: 1.6.2
Server API version: 1.18
Go version (server): go1.4.2
Git commit (server): 7c8fca2
OS/Arch (server): linux/amd64
root@mesos-slave6-qa-uswest2# docker info
Containers: 132
Images: 692
Storage Driver: overlay
 Backing Filesystem: extfs
Execution Driver: native-0.2
Kernel Version: 4.0.4
Operating System: Ubuntu 14.04.2 LTS
CPUs: 8
Total Memory: 29.45 GiB
Name: mesos-slave6-qa-uswest2
ID: DQ77:L5SM:2ZH6:NHRP:D72D:SIFN:XJDS:7H7Y:MHRR:UX4Y:HTWK:UVB6
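
The /proc/mounts check in the report above can be automated. What follows is a minimal Python sketch, not part of Docker: `overlay_mounts_for` is a hypothetical helper name, and the code assumes the usual whitespace-separated /proc/mounts layout (device, mount point, filesystem type, options).

```python
def overlay_mounts_for(container_id, mounts_text):
    """Return mount points of overlay mounts that mention container_id.

    mounts_text is the content of /proc/mounts (or text in the same format).
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type != "overlay":
            continue
        # The ID can appear in the mount point or in lowerdir/upperdir/workdir.
        if container_id in mount_point or container_id in options:
            hits.append(mount_point)
    return hits
```

Passing the contents of /proc/mounts and the ID printed by docker rm gives a list of mount points. A non-empty result suggests the merged directory is still mounted, which would be consistent with the partial cleanup shown above.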
azurezk added a commit to azurezk/docker that referenced this issue Jul 24, 2015
Signed-off-by: Zhang Kun <zkazure@gmail.com>
jessfraz pushed a commit to jessfraz/docker that referenced this issue Jul 24, 2015
Signed-off-by: Zhang Kun <zkazure@gmail.com>
calavera pushed a commit to calavera/docker that referenced this issue Jul 25, 2015
Signed-off-by: Zhang Kun <zkazure@gmail.com>
tiborvass pushed a commit to tiborvass/docker that referenced this issue Jul 27, 2015
Signed-off-by: Zhang Kun <zkazure@gmail.com>
@dnephin
Member

dnephin commented Aug 28, 2015

We hit this bug (or something closely related) pretty frequently with the compose test suite on jenkins:

"readdirent: no such file or directory"

Other related failures

We run the full suite against docker 1.7.1 and 1.8.1, but all of these failures happened only during the 1.8.1 run.

@thaJeztah thaJeztah added area/storage/overlay kind/bug Bugs are bugs. The cause may or may not be known at triage time so debugging may be needed. labels Aug 30, 2015
@joelacrisp

We're also hitting this. Docker version 1.7.1, build 786b29d, kernel 3.18.3-031803-generic

Could this bug be related to: "Note: The OverlayFS filesystem was merged into the upstream Linux kernel 3.18 and is now Docker's preferred filesystem (instead of AUFS). However, there is a bug in OverlayFS that reports the wrong mnt_id in /proc/<pid>/fdinfo/<fd> and the wrong symlink target path for /proc/<pid>/<fd>. Fortunately, these bugs have been fixed in the kernel v4.2-rc2. See below for instructions on how to apply the relevant patches."

From the CRIU project @ http://criu.org/Docker ?

They also have a number of other recommended kernel patches

@pwnall
Contributor

pwnall commented Sep 11, 2015

I'm also hitting this with Docker 1.8.1 and a 4.1 kernel.

Relevant docker info snip.

Storage Driver: overlay
 Backing Filesystem: extfs
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.1.6-201.fc22.x86_64
Operating System: Fedora 22 (Twenty Two)
CPUs: 4
Total Memory: 7.797 GiB

docker version output.

Client:
 Version:      1.8.1.fc22
 API version:  1.20
 Package Version: docker-1.8.1-3.git32b8b25.fc22.x86_64
 Go version:   go1.4.2
 Git commit:   32b8b25/1.8.1
 Built:        
 OS/Arch:      linux/amd64

Server:
 Version:      1.8.1.fc22
 API version:  1.20
 Package Version: 
 Go version:   go1.4.2
 Git commit:   32b8b25/1.8.1
 Built:        
 OS/Arch:      linux/amd64

@joelacrisp

We went up to 1.8.1 and kernel 4.2 and, touch wood, it seems like the problem is solved or at least vastly mitigated.

@pwnall
Contributor

pwnall commented Sep 11, 2015

@joelacrisp Thank you for the information! I'll figure out a way to get 4.2 on my boxes and see if that fixes things as well.

@joelacrisp

If they're Ubuntu, there is a back-ported 4.2 kernel in the official repos.

@xiaods
Contributor

xiaods commented Sep 12, 2015

So this is a kernel bug, not a Docker bug?

@pwnall
Contributor

pwnall commented Sep 13, 2015

I am still hitting this with a 4.2.0 kernel, and I tried docker 1.8.2 as well. In my case, reverting to docker 1.7.1 makes the bug go away.

Client version: 1.7.1.fc22
Client API version: 1.19
Package Version (client): docker-1.7.1-8.gitb6416b7.fc22.x86_64
Go version (client): go1.4.2
Git commit (client): b6416b7/1.7.1
OS/Arch (client): linux/amd64
Server version: 1.7.1.fc22
Server API version: 1.19
Package Version (server): docker-1.7.1-8.gitb6416b7.fc22.x86_64
Go version (server): go1.4.2
Git commit (server): b6416b7/1.7.1
OS/Arch (server): linux/amd64

@xiaods
Contributor

xiaods commented Sep 13, 2015

@pwnall thanks for your report. What is your backing filesystem: ext4 or xfs? Your environment is Fedora 22; have you tested this case on another OS?

@pwnall
Contributor

pwnall commented Sep 13, 2015

@xiaods Here's my environment. My backend is ext4.

Storage Driver: overlay
 Backing Filesystem: extfs
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.1.6-201.fc22.x86_64
Operating System: Fedora 22 (Twenty Two)
CPUs: 4
Total Memory: 7.797 GiB

If there's interest in fixing this, I can spend some time narrowing things down, e.g. building docker from source and bisecting the commits between 1.7.1 and 1.8.1.

@pwnall
Contributor

pwnall commented Sep 13, 2015

@xiaods Also, I have a Vagrantfile + Ansible scripts that reliably build a VM where this issue reproduces.

@xiaods
Contributor

xiaods commented Sep 14, 2015

Kernel Version: 4.1.6-201.fc22.x86_64

I don't know whether kernel 4.2 can resolve it. @pwnall do you have an environment for testing kernel 4.2 + docker 1.8.1?

root@omegamaster1:/data# docker info
Containers: 24
Images: 113
Storage Driver: overlay
 Backing Filesystem: extfs
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.18.7-031807-generic
Operating System: Ubuntu 14.04 LTS
CPUs: 2
Total Memory: 3.86 GiB
Name: omegamaster1.10-3-11-2.omegacloud.ucloud-bj-bgp-c.dm
ID: JQHX:DDMP:GNO5:MAHL:PRZJ:HYKA:UQD6:NR5U:XTSM:RRGV:MMYJ:RSYI
WARNING: No swap limit support

I also ran into this issue when operating on some containers.

@pwnall
Contributor

pwnall commented Sep 14, 2015

@xiaods Yup, I have Ansible playbooks for deploying docker 1.8.2 from Fedora 22 testing and kernel 4.2 from Fedora 23. In this case, I ran into some SELinux issues first. After setting SELinux to permissive mode, I'm back to the bug here. FWIW, I tried wiping /var/lib/docker before doing my test, and it doesn't change anything.

[vagrant@localhost ~]$ sudo docker info
Containers: 6
Images: 18
Storage Driver: overlay
 Backing Filesystem: extfs
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.2.0-300.fc23.x86_64
Operating System: Fedora 22 (Twenty Two)
CPUs: 1
Total Memory: 993.4 MiB
Name: localhost.localdomain
ID: GPHJ:3XDJ:WMEQ:CJ67:NPL5:VPTZ:G2TI:GPGP:4GJS:DZIM:FJRC:5DVE
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

[vagrant@localhost ~]$ sudo docker version
Client:
 Version:      1.8.2-fc22
 API version:  1.20
 Package Version: docker-1.8.2-1.gitf1db8f2.fc22.x86_64
 Go version:   go1.4.2
 Git commit:   f1db8f2/1.8.2
 Built:        
 OS/Arch:      linux/amd64

Server:
 Version:      1.8.2-fc22
 API version:  1.20
 Package Version: 
 Go version:   go1.4.2
 Git commit:   f1db8f2/1.8.2
 Built:        
 OS/Arch:      linux/amd64

@h0tbird

h0tbird commented Sep 29, 2015

+1

core@core-1 ~ $ docker rm 3eab2afa5b76
Error response from daemon: Cannot destroy container 3eab2afa5b76: Driver overlay failed to remove root filesystem 3eab2afa5b769e8cb70e33c97c2e6c2578286e77fdf221fbeaf06fba168adef2: stat /var/lib/docker/overlay/3eab2afa5b769e8cb70e33c97c2e6c2578286e77fdf221fbeaf06fba168adef2: no such file or directory
Error: failed to remove containers: [3eab2afa5b76]
core@core-1 ~ $ docker info
Containers: 46
Images: 210
Storage Driver: overlay
 Backing Filesystem: extfs
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.2.0-coreos-r1
Operating System: CoreOS 815.0.0
CPUs: 16
Total Memory: 1.863 GiB
Name: core-1.cell-1.ofi.xnood.com
ID: EZIV:45XO:OKLL:KLSF:VPGN:377M:YPML:G2AL:CPKY:KX3J:FGFK:MGJE
core@core-1 ~ $ docker version
Client:
 Version:      1.8.2
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   fae4436-dirty
 Built:        Thu Sep 24 08:01:15 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      1.8.2
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   fae4436-dirty
 Built:        Thu Sep 24 08:01:15 UTC 2015
 OS/Arch:      linux/amd64
core@core-1 ~ $ uname -a
Linux core-1.cell-1.ofi.xnood.com 4.2.0-coreos-r1 #2 SMP Thu Sep 24 08:00:18 UTC 2015 x86_64 Intel(R) Xeon(R) CPU E5620 @ 2.40GHz GenuineIntel GNU/Linux

@etoews
Contributor

etoews commented Oct 30, 2015

Has anyone had any luck removing these dead containers without having to upgrade/downgrade their Docker/kernel version?

Seeing this on ...

$ docker info
Containers: 2
Images: 54
Storage Driver: overlay
 Backing Filesystem: extfs
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.2.2-coreos-r1
Operating System: CoreOS 835.2.0
CPUs: 4
Total Memory: 3.859 GiB
Name: myhost
ID: 2C66:MDZZ:2ALE:CCRL:2GG3:L3LX:T56A:Q6CW:CQ7F:MP3Y:CP3Q:AKER

$ uname -a
Linux myhost 4.2.2-coreos-r1 #2 SMP Wed Oct 28 07:11:11 UTC 2015 x86_64 Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz GenuineIntel GNU/Linux

$ docker version
Client:
 Version:      1.8.3
 API version:  1.20
 Go version:   go1.5.1
 Git commit:   cedd534-dirty
 Built:        Wed Oct 28 07:12:16 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      1.8.3
 API version:  1.20
 Go version:   go1.5.1
 Git commit:   cedd534-dirty
 Built:        Wed Oct 28 07:12:16 UTC 2015
 OS/Arch:      linux/amd64

@andreausu

I'm seeing this as well on CoreOS:

~ $ docker ps -a
CONTAINER ID        IMAGE                           COMMAND                  CREATED             STATUS              PORTS                                                                        NAMES
5a4a0821cf4e        mailgun/vulcand:v0.8.0-beta.3   "/go/bin/vulcand -por"   3 months ago        Dead

~ $ docker rm -v 5a4a0821cf4e
Error response from daemon: Cannot destroy container 5a4a0821cf4e: Driver overlay failed to remove root filesystem 5a4a0821cf4e92f5ab190bcdf3ff58d472ecece27b1a591b6c3830b1d930caa3: stat /var/lib/docker/overlay/5a4a0821cf4e92f5ab190bcdf3ff58d472ecece27b1a591b6c3830b1d930caa3: no such file or directory

~ $ uname -a
Linux core-03 4.2.2-coreos-r1 #2 SMP Sat Dec 5 05:56:36 UTC 2015 x86_64 Intel(R) Xeon(R) CPU E5-2630L v2 @ 2.40GHz GenuineIntel GNU/Linux

~ $ docker info
Containers: 3
Images: 102
Storage Driver: overlay
 Backing Filesystem: extfs
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.2.2-coreos-r1
Operating System: CoreOS 835.9.0
CPUs: 1
Total Memory: 493.6 MiB
Name: core-03
ID: H2RQ:HJXE:4RNG:KGOD:O7DI:MII4:O5PV:4BZN:Q3AT:LYVI:ZIZ6:O2LV

~ $ docker version
Client:
 Version:      1.8.3
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   cedd534-dirty
 Built:        Sat Dec  5 05:57:26 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      1.8.3
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   cedd534-dirty
 Built:        Sat Dec  5 05:57:26 UTC 2015
 OS/Arch:      linux/amd64

~ $ mount
/dev/vda9 on / type ext4 (rw,relatime,seclabel,data=ordered)

Then I tried docker rm -v -f 5a4a0821cf4e and got the same error, but the container is no longer listed in docker ps. I'm not sure whether it was actually pruned from disk, though: I checked /var/lib/docker/overlays and /var/lib/docker/containers and it doesn't seem to be there anymore.

@dmcgowan
Member

dmcgowan commented Jan 5, 2016

@pwnall do you have a reproducible case for this bug? I am trying to reproduce with ...

Server Version: 1.9.1
Storage Driver: overlay
 Backing Filesystem: extfs
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.2.8-300.fc23.x86_64

@pwnall
Contributor

pwnall commented Jan 5, 2016

@dmcgowan thank you very much for looking into this! I'm trying to build a VM with updated software now. I'll make a new post once I have a repro.

@tjlee

tjlee commented Sep 5, 2016

@dmcgowan is this issue fixed or not?
The steps are as described above: remove, pull, and run a batch of Docker containers using docker-compose.

@dmcgowan
Member

dmcgowan commented Sep 7, 2016

@tjlee still looking for a reproducible case. There was one fix #18907 which we expected to solve the original report but there may be other causes or issues users are running into. If you are able to reliably reproduce then please send your docker info as well as any image and compose setup which reproduces.

@tjlee

tjlee commented Sep 21, 2016

@dmcgowan
We are using TeamCity, with docker-compose to set up the environment for integration tests. In TeamCity we execute the following steps.

Before tests:

  • Stop and remove old Docker containers. In this step the first error is raised, on the docker rm b25df8d0c075 command: Failed to remove container (b25df8d0c075): Error response from daemon: Driver overlay failed to remove root filesystem b25df8d0c0755b733b795d64c012936deec7c404984532581868adb14508ff3b: readdirent: no such file or directory
  • The docker-compose step that sets up the environment fails with: Image is up to date for ignatov/db2:latest [17:19:13]W: [Step 3/6] open /var/lib/docker/overlay/81cf4239de8e7737ae38f913523431201b1e6f89261ca4439e202f99b3459b2e/lower-id: no such file or directory [17:19:13]W: [Step 3/6] Process exited with code 1

After tests:

  • Docker rm after failed execution also fails with error docker rm b25df8d0c075 [17:19:13]W: [Step 1/1] Failed to remove container (b25df8d0c075): Error response from daemon: Driver overlay failed to remove root filesystem b25df8d0c0755b733b795d64c012936deec7c404984532581868adb14508ff3b: readdirent: no such file or directory

We run this configuration about 30 times per day, then we set up another agent. But sometimes it fails with the error mentioned earlier and blocks the whole process.

FYI:

$ docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 8
Server Version: 1.10.3
Storage Driver: overlay
 Backing Filesystem: extfs
Execution Driver: native-0.2
Logging Driver: json-file
Plugins:
 Volume: local
 Network: host bridge null
Kernel Version: 3.19.0-61-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64

@pvalsecc

pvalsecc commented Dec 1, 2016

Same here in a Jenkins slave that starts and stops a lot of compositions in parallel on the same machine.

$ docker info
Containers: 74
 Running: 30
 Paused: 0
 Stopped: 44
Images: 1192
Server Version: 1.11.2
Storage Driver: overlay
 Backing Filesystem: extfs
Logging Driver: journald
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: host bridge null
Kernel Version: 4.4.0-45-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 28.76 GiB
Name: infra-prd-2
ID: X7Z3:5SZP:LXUJ:J2VF:UMUJ:FIEL:2ORL:5HJP:FKG7:RGXD:TKTY:KOJ7
Docker Root Dir: /var/lib/docker
Debug mode (client): false
Debug mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support

@JustinLivi

For anyone encountering this issue on Docker for Mac: as implied by the error message, my /var/lib/docker folder just did not exist. The solution was simply $ sudo mkdir /var/lib/docker/

@jeffdupont

I can confirm that creating the directory on the mac solves the issue.

@thaJeztah
Member

@JustinLivi @jeffdupont you created the directory on the Mac? I'm not sure how that could help, because the daemon runs inside a HyperKit VM, so on a different filesystem.

@JustinLivi

JustinLivi commented Dec 22, 2016 via email

@jeffdupont

jeffdupont commented Dec 22, 2016 via email

@xiaods
Contributor

xiaods commented Dec 23, 2016

close it.

@soichih

soichih commented Jan 30, 2017

I am having the same problem on CentOS7 + docker 1.13. So far, my workaround is to restart the Docker engine before doing docker pull && docker rm -f && docker run... to update my container.

I am using /usr/local/docker for the docker graph. I just saw @JustinLivi's comment about sudo mkdir /var/lib/docker/. I did this, so I will see if it cures the issue on CentOS7. This issue occurs if I leave the Docker engine running for a couple of days after a restart, so I have to wait to see if it works or not.

@soichih

soichih commented Feb 6, 2017

OK, mkdir /var/lib/docker didn't fix it for CentOS7. I am still seeing the following error message.

Error response from daemon: Driver overlay failed to remove root filesystem 4303c0dd8364074311eaae3692ef192676ebf41ca00d5865067374dfc210f72a: remove /usr/local/docker/overlay/7fbdcb54bdd26961cbbb0445025ceaac65ba095a307920a9b6dc691a3d7de307/merged: device or resource busy

systemctl restart docker will cure this for at least a few hours, but eventually I have to restart docker again so that I can remove the container.

@cpuguy83
Member

cpuguy83 commented Feb 7, 2017

@soichih

First, let me just say I consider this an issue we need to fix with Docker.
However, it may be some time before it can be fixed, and most likely what's happening is something is mounting /var/lib/docker (or some subdir of /var/lib/docker), causing the mounts to leak into a container namespace, which later causes these device or resource busy errors.

Best thing to do in the short term is to track down anything that might be mounting those directories, as stated above.
Also make sure you never use the force flag with docker rm or you'll end up leaking directories.

The issue you mention though is not related to the issue posted here.
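
One way to hunt for the kind of leaked mount described above is to search every process's mountinfo for the container or layer ID from the error message. The following is a minimal Python sketch, assuming a Linux /proc layout; the helper names `read_mountinfo` and `pids_with_mount` are hypothetical, and this is not something Docker ships.

```python
import os


def read_mountinfo(proc_root="/proc"):
    """Map each readable PID under proc_root to the text of its mountinfo file."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        path = os.path.join(proc_root, entry, "mountinfo")
        try:
            with open(path, errors="replace") as f:
                result[int(entry)] = f.read()
        except OSError:
            continue  # the process exited or access was denied
    return result


def pids_with_mount(mountinfo_by_pid, needle):
    """Return the sorted PIDs whose mount table mentions needle on any line."""
    return sorted(
        pid
        for pid, text in mountinfo_by_pid.items()
        if any(needle in line for line in text.splitlines())
    )
```

Run as root, `pids_with_mount(read_mountinfo(), "7fbdcb54bdd26961cbbb0445025ceaac65ba095a307920a9b6dc691a3d7de307")` would list the processes whose mount namespace still references that layer. Any PID other than the Docker daemon itself is a candidate for whatever is keeping the directory busy.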

@philipn

philipn commented Feb 15, 2017

I am consistently seeing this issue when using docker-compose run on top of the minikube docker machine. This is with docker version 1.13.0 and linux 4.7.2 (on the minikube VM).

Related issue: kubernetes/minikube#1130

@cpuguy83
Member

@philipn I'm not sure how minikube works, but recently saw someone get this error while trying to share image dirs between two docker daemons.

@philipn

philipn commented Feb 15, 2017

@cpuguy83 That's a good lead, thanks for that. It looks like minikube uses a single docker daemon, though. I'm confident that the minikube kubernetes services run on the minikube docker machine are causing the issue, because stopping them allows compose to run as usual. I used docker inspect to make sure none of the running containers (the kubernetes services) are mounting /var/lib/docker/.

@cpuguy83
Member

So what happens here is Docker calls os.RemoveAll(rootfsPath).
This ends up traversing the path to try calling Remove on everything below which will wind up calling readdirent... There seems to be something that happens between calling Remove, getting an error about not empty, and calling readdirent on that dir that causes the no such file or directory.

@philipn Is this specifically happening with overlay for you?
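
To illustrate the window described above, here is a minimal Python sketch of a recursive remove that treats "no such file or directory" at any step as "already gone" instead of failing. It is only an analogy for the Go os.RemoveAll call mentioned in this comment, it is not Docker's code, and it does not claim to be the fix that eventually landed; `remove_all` is a hypothetical name.

```python
import os
import stat


def remove_all(path):
    """Remove path and everything below it, tolerating concurrent removal."""
    try:
        st = os.lstat(path)
    except FileNotFoundError:
        return  # nothing to do: the path is already gone

    if not stat.S_ISDIR(st.st_mode):
        try:
            os.unlink(path)  # regular file, symlink, or other non-directory
        except FileNotFoundError:
            pass  # removed by someone else between lstat and unlink
        return

    try:
        names = os.listdir(path)
    except FileNotFoundError:
        return  # the directory vanished before it could be listed

    for name in names:
        remove_all(os.path.join(path, name))

    try:
        os.rmdir(path)
    except FileNotFoundError:
        pass  # removed by someone else after its children were deleted
```

The design choice is that each step checks for the missing-path case separately, so a directory that disappears between the listing and the removal is not reported as an error. Other failures, such as "device or resource busy", still propagate to the caller.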

@philipn

philipn commented Feb 16, 2017

@cpuguy83 Unfortunately, the only storage drivers available on the minikube VM are overlay and vfs. I tried out vfs but it was prohibitive to testing (e.g. a build took 20 hours on our dev stack).

@yapartase

@soichih having the same issue on CentOS7 too. Did you figure it out? Cheers.

@matejzero

Another CentOS 7 user.

docker version

Client:
Version: 17.03.0-ce
API version: 1.26
Go version: go1.7.5
Git commit: 60ccb22
Built: Thu Feb 23 10:54:03 2017
OS/Arch: linux/amd64

Server:
Version: 17.03.0-ce
API version: 1.26 (minimum version 1.12)
Go version: go1.7.5
Git commit: 60ccb22
Built: Thu Feb 23 10:54:03 2017
OS/Arch: linux/amd64
Experimental: false

@yapartase

@matejzero
In the meantime we updated our kernel to the latest stable one (4.10.8) using elrepo. So far we have not encountered this issue again. Maybe that will help you too.

@matejzero

I noticed that if I do:
docker stop container_id
docker rm container_id

then I don't get an error. If I rebuild the container with docker-compose:
docker-compose -f /path/to/docker-compose.yml up -d --no-deps --build container_id

it gives me an error.

@cognifloyd

See also #22260. I commented there about how I got around this issue without restarting the machine.

@yunghoy

yunghoy commented Jul 18, 2017

It seems it still happens.

$ docker -v
Docker version 17.06.0-ce, build 02c1d87

$ docker rm -f 83326
Error response from daemon: driver "aufs" failed to remove root filesystem for 833261ee3ff41664fd5ec86b1187950fd610f200c785ca4c2b68e03598f01729: no such file or directory

$ docker ps -a
298d557770cf ??????????/rabbitmq:3.6.9-management "docker-entrypoint..." 7 minutes ago Up 7 minutes 4369/tcp, 5671/tcp, 15671/tcp, 25672/tcp, 0.0.0.0:4672->5672/tcp, 0.0.0.0:14672->15672/tcp temp_login-rabbitmq_1
833261ee3ff4 900c2a647020 "/bin/sh -c '#(nop..." 3 hours ago Dead laughing_benz
3563909527d2 f584d2149c60 "/bin/sh -c 'npm i..." 4 hours ago Dead elegant_volhard
75a479e254b9 737a7456fe8d "/bin/sh -c 'npm i..." 4 hours ago Dead sleepy_williams
b94b8feb0112 45e87f85b3c0 "/bin/sh -c '#(nop..." 4 hours ago Dead serene_kilby
589d3c61ee47 b849ce789b5c "/bin/sh -c '#(nop..." 19 hours ago Dead tender_heyrovsky
5ec95787a652 812b48acbff3 "/bin/sh -c 'npm i..." 20 hours ago Dead vigilant_bohr

$ uname -a
Linux APSEO-EG-LINUX8 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.5 LTS
Release: 14.04
Codename: trusty

/var/lib/docker# ls
total 52
drwx--x--x 11 root root 4096 Jul 18 14:37 .
drwxr-xr-x 47 root root 4096 Jul 17 18:18 ..
drwx------ 5 root root 4096 Jul 17 18:18 aufs
drwx------ 57 root root 12288 Jul 18 14:30 containers
drwx------ 3 root root 4096 Jul 17 18:18 image
drwxr-x--- 3 root root 4096 Jul 17 18:18 network
drwx------ 4 root root 4096 Jul 17 18:18 plugins
drwx------ 2 root root 4096 Jul 17 18:18 swarm
drwx------ 2 root root 4096 Jul 18 14:28 tmp
drwx------ 2 root root 4096 Jul 17 18:18 trust
drwx------ 12 root root 4096 Jul 18 14:30 volumes

@thaJeztah
Member

@yunghoy see #33960

@cpuguy83
Member

Closing since the OP's issue is fixed by https://github.com/moby/moby/pull/31012/files#diff-723dfe6d49672e6220c7b87d40f7fdd6R24 and is in 17.06.

@yunghoy your issue is different and is addressed by #33960
