Complete system lock-up Ubuntu 14.04 #21179
The last few lines in syslog before system lock-up:
Unfortunately the log messages in /var/log/upstart/docker.log don't contain timestamps, so I can't provide any accurate output from the docker daemon.
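One possible workaround, not from the thread itself: pipe the log through ts(1) from the moreutils package to prefix each line with a timestamp. Note this only stamps lines as they arrive, so it can only help diagnose future lock-ups, not past ones:

```bash
# Install ts(1) from moreutils, then follow the Upstart-managed daemon log
# with a timestamp prepended to each new line.
sudo apt-get install -y moreutils
tail -f /var/log/upstart/docker.log | ts '[%Y-%m-%d %H:%M:%S]'
```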
From https://wiki.ubuntu.com/TrustyTahr/ReleaseNotes#Kernel: is that really Ubuntu 14.04.4? Thanks
@HackToday Ubuntu 14.04.4 ships with kernel 3.13.0, but it is indeed possible to use one of the kernels backported from later non-LTS releases. Is this what the Docker team recommends? I think this is what you intended to link: https://wiki.ubuntu.com/TrustyTahr/ReleaseNotes#Updated_Packages. Reading there: "Those running virtual or cloud images should not need this newer hardware enablement stack and thus it is recommended they remain on the original Trusty stack."
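For completeness, a hedged sketch of how one of those backported hardware-enablement kernels would be installed on 14.04, assuming the standard Ubuntu metapackage for the wily-based HWE stack; per the release notes quoted above, virtual and cloud images may be better off staying on the original Trusty kernel:

```bash
# Install the wily-based HWE kernel stack on Ubuntu 14.04 and reboot into it.
sudo apt-get update
sudo apt-get install -y linux-generic-lts-wily
sudo reboot
```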
No @rbjorklin, I just don't have Ubuntu 14.04.4 on hand. Maybe I can try later to check whether the kernel matches what you linked. Thanks
My Ubuntu 15.10 machine always came to a complete halt when starting 12 containers with docker-compose, most of which were Java processes. It turned out that my machine had simply run out of memory, which I learned when Eclipse crashed (not running in a container, obviously):
My machine has 16G of RAM and another 8G of swap, so this came as a bit of a surprise. The containers had no restrictions set on CPU or memory, and neither did the Java processes. I've now added a generous limit of 512m (Docker) and 256m (Java/-Xmx) to all Java containers, which solved the issue; a minimal example follows below. My gut feeling is that Java uses more memory inside an unbounded container than it is supposed to. Some more output:
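As a concrete illustration of the mitigation described above, here is a minimal sketch; the image name (my-java-app) and the JAVA_OPTS convention are illustrative assumptions, not the reporter's actual setup:

```bash
# Cap the container at 512m of memory and the JVM heap at 256m via -Xmx.
# Assumes the image's entrypoint passes JAVA_OPTS through to the JVM.
docker run -d \
  --memory 512m \
  -e JAVA_OPTS="-Xmx256m" \
  my-java-app
```

With the docker-compose v1 file format in use at the time, the equivalent per-service setting would be mem_limit: 512m.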
The problem hasn't recurred since we upgraded to Docker 1.10.3. I'll wait a few more days, but this may well have been resolved.
Thanks @rbjorklin, keep us posted
@rbjorklin wondering if upgrading to Docker 1.10.3 resolved this; are you still seeing the issue, or can we mark this resolved?
@thaJeztah sorry, completely forgot about this!
No problem, glad it's resolved!
Unfortunately, I met the same issue on Ubuntu 14.04.4 with Docker 1.10.3:
# uname -a
# docker info
kernel log:
@wangyumi I don't recall all the details, but I think you need to update your kernel. There were some fairly important AUFS fixes in 3.13.0-79, and it looks like you're running an old 3.13.0-24 kernel.
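A minimal sketch of that kernel update, assuming a stock Trusty install where the linux-image-generic metapackage tracks the latest 3.13.0 point release:

```bash
uname -r                           # e.g. 3.13.0-24-generic (old)
sudo apt-get update
sudo apt-get install -y linux-image-generic
sudo reboot
uname -r                           # should now report a newer 3.13.0 kernel
```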
Okay, I will try. Thanks!
This might be a duplicate of #10355, but I was asked by @thaJeztah to open a separate issue in this comment. My original report can be seen here.
Running Ubuntu 14.04.4, fully patched, with Docker 1.10.2, we had 6 out of 7 virtual machines (VMware) completely freeze at pretty much the same time in our dev environment, sometime between 15:00 and 16:00 CET on 2016-03-11. This happened again for 2 machines roughly 3 hours later. The console provided by VMware was completely unresponsive. We have hundreds of VMs running and I've never observed this behavior before. I'd like to blame Docker, but currently I have no hard proof.
Output of `uname -a`:
Output of `docker version`:
Output of `docker info`:
Additional environment details:
We are running Marathon on top of Mesos, so containers are started by the Mesos slave. All containers run the official tomcat image with a bash script as ENTRYPOINT that traps SIGTERM to handle signals nicely (a sketch of such a script is below). Inside the container we also run the zabbix-agent to poll JMX values and report back. Pretty much all logging is sent out of the container to Logstash via GELF. Tomcat is using this to get its logs out.
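The reporter's actual script is not shown in the thread; here is a minimal sketch of such a SIGTERM-trapping ENTRYPOINT, assuming the official tomcat image's catalina.sh launcher:

```bash
#!/bin/bash
# Start Tomcat in the background so this script (running as PID 1) stays
# free to receive and handle signals.
catalina.sh run &
child=$!

# Forward SIGTERM/SIGINT (docker stop sends SIGTERM) to the Tomcat process.
trap 'kill -TERM "$child" 2>/dev/null' TERM INT

# wait is interrupted when the trap fires; keep waiting until the child has
# actually exited so Tomcat gets time to shut down cleanly.
while kill -0 "$child" 2>/dev/null; do
  wait "$child"
  status=$?
done
exit "${status:-0}"
```

Running Tomcat as a background child matters because a bash ENTRYPOINT runs as PID 1, which does not get default signal dispositions; without a trap that forwards the signal, SIGTERM would effectively be ignored until Docker's stop timeout kills the container.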
The Mesos slaves are virtual machines in VMware. We were using Marathon 0.13.0, Mesos 0.27.1, and Docker 1.10.2 when this issue occurred, but have since upgraded to Mesos 0.27.2 and Docker 1.10.3.
Additional information you deem important (e.g. issue happens only occasionally):
Have seen this message logged a few times: