
Containers on overlay network cannot reach other containers #35807

Open
alexhexabeam opened this issue Dec 15, 2017 · 22 comments

Comments

@alexhexabeam

alexhexabeam commented Dec 15, 2017

Description

The overlay network randomly ends up with specific machines unreachable: containers on those nodes are unable to reach containers on other nodes, and vice versa. This happens much more frequently on larger clusters, but we've seen it on smaller ones as well.

We are using consul for our kv store.

Steps to reproduce the issue:

  1. Create a large overlay network spanning many nodes (a rough sketch of the commands follows this list). I can trigger this ~80% of the time with 25 nodes.
  2. Launch containers on multiple nodes, which need to communicate. In our case, mongodb, kafka, elasticsearch, and others.
  3. Watch them fail.
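
For context, a rough sketch of the kind of setup described here, assuming the daemons already point at the consul cluster store (see the daemon.json below). The network name, subnet, and image are illustrative placeholders:

# create the overlay once; it becomes visible on every daemon sharing the KV store
docker network create -d overlay --subnet 10.50.48.0/22 exa-overlay

# on each node, attach the application containers to it, e.g.
docker run -d --name mongodb-configsvr-host1 --network exa-overlay mongo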

Describe the results you received:
Containers on some nodes are unable to connect to containers on other nodes. They can connect to containers on the same node.

[exabeam@ip-10-10-2-39 ~]$ sudo docker exec -it mongodb-configsvr-host1 bash
(docker) [root@mongodb-configsvr-host1 /]# ping mongodb-configsvr-host22
PING mongodb-configsvr-host22 (10.50.48.23) 56(84) bytes of data.
From mongodb-configsvr-host1 (10.50.48.32) icmp_seq=1 Destination Host Unreachable
From mongodb-configsvr-host1 (10.50.48.32) icmp_seq=2 Destination Host Unreachable
From mongodb-configsvr-host1 (10.50.48.32) icmp_seq=3 Destination Host Unreachable
From mongodb-configsvr-host1 (10.50.48.32) icmp_seq=4 Destination Host Unreachable
^C
--- mongodb-configsvr-host22 ping statistics ---
5 packets transmitted, 0 received, +4 errors, 100% packet loss, time 4006ms

Journal logs:

-- Logs begin at Fri 2017-12-15 00:32:45 UTC, end at Fri 2017-12-15 01:00:37 UTC. --
Dec 15 00:40:12 ip-10-10-2-39 systemd[1]: Starting Docker Application Container Engine...
-- Subject: Unit docker.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit docker.service has begun starting up.
Dec 15 00:40:12 ip-10-10-2-39 dockerd[15797]: time="2017-12-15T00:40:12.125750601Z" level=info msg="libcontainerd: new containerd process, pid: 15804"
Dec 15 00:40:13 ip-10-10-2-39 dockerd[15797]: time="2017-12-15T00:40:13.168203097Z" level=warning msg="failed to rename /var/lib/docker/tmp for background deletion: rename /var/lib/docker/tmp /var/lib/docker/tmp-old: no such file or directory. Deleting synchronously"
Dec 15 00:40:13 ip-10-10-2-39 dockerd[15797]: time="2017-12-15T00:40:13.356143183Z" level=warning msg="devmapper: Usage of loopback devices is strongly discouraged for production use. Please use `--storage-opt dm.thinpooldev` or use `man dockerd` to refer to dm.thinpooldev section."
Dec 15 00:40:13 ip-10-10-2-39 dockerd[15797]: time="2017-12-15T00:40:13.415855355Z" level=info msg="devmapper: Creating filesystem xfs on device docker-202:1-352328870-base"
Dec 15 00:40:13 ip-10-10-2-39 dockerd[15797]: time="2017-12-15T00:40:13.551678692Z" level=info msg="devmapper: Successfully created filesystem xfs on device docker-202:1-352328870-base"
Dec 15 00:40:13 ip-10-10-2-39 dockerd[15797]: time="2017-12-15T00:40:13.674373157Z" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Dec 15 00:40:13 ip-10-10-2-39 dockerd[15797]: time="2017-12-15T00:40:13.674413916Z" level=info msg="Initializing discovery with TLS"
Dec 15 00:40:13 ip-10-10-2-39 dockerd[15797]: time="2017-12-15T00:40:13.722352220Z" level=info msg="Loading containers: start."
Dec 15 00:40:13 ip-10-10-2-39 dockerd[15797]: time="2017-12-15T00:40:13.722438367Z" level=info msg="Option Initializing KV with TLS"
Dec 15 00:40:14 ip-10-10-2-39 dockerd[15797]: time="2017-12-15T00:40:14.992434302Z" level=info msg="2017/12/15 00:40:14 [INFO] serf: EventMemberJoin: ip-10-10-2-39 10.10.2.39
Dec 15 00:40:14 ip-10-10-2-39 dockerd[15797]: "
Dec 15 00:40:14 ip-10-10-2-39 dockerd[15797]: time="2017-12-15T00:40:14.999344260Z" level=error msg="joining serf neighbor 10.10.2.39 failed: Failed to join the cluster at neigh IP 10.10.2.127: 1 error(s) occurred:
Dec 15 00:40:14 ip-10-10-2-39 dockerd[15797]: * Failed to join 10.10.2.127: dial tcp 10.10.2.127:7946: getsockopt: connection refused"
Dec 15 00:40:15 ip-10-10-2-39 dockerd[15797]: time="2017-12-15T00:40:15.000971589Z" level=info msg="2017/12/15 00:40:15 [INFO] serf: EventMemberJoin: ip-10-10-2-77 10.10.2.77
Dec 15 00:40:15 ip-10-10-2-39 dockerd[15797]: "
Dec 15 00:40:15 ip-10-10-2-39 dockerd[15797]: time="2017-12-15T00:40:15.001047174Z" level=info msg="2017/12/15 00:40:15 [INFO] serf: EventMemberJoin: ip-10-10-2-122 10.10.2.122
Dec 15 00:40:15 ip-10-10-2-39 dockerd[15797]: "
Dec 15 00:40:15 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:40:15.351785574Z" level=info msg="Loading containers: done."
Dec 15 00:40:15 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:40:15.351834341Z" level=info msg="2017/12/15 00:40:15 [INFO] serf: EventMemberJoin: ip-10-10-2-186 10.10.2.186
Dec 15 00:40:15 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: "
Dec 15 00:40:15 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:40:15.364524778Z" level=info msg="Docker daemon" commit=19e2cf6 graphdriver(s)=devicemapper version=17.09.1-ce
Dec 15 00:40:15 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:40:15.364643045Z" level=info msg="Daemon has completed initialization"
Dec 15 00:40:15 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:40:15.372985688Z" level=info msg="2017/12/15 00:40:15 [INFO] serf: EventMemberJoin: ip-10-10-2-127 10.10.2.127
Dec 15 00:40:15 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: "
Dec 15 00:40:15 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:40:15.374887725Z" level=info msg="API listen on [::]:2376"
Dec 15 00:40:15 ip-10-10-2-39.us-west-2.compute.internal systemd[1]: Started Docker Application Container Engine.
-- Subject: Unit docker.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit docker.service has finished starting up.
--
-- The start-up result is done.
Dec 15 00:40:15 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:40:15.374958646Z" level=info msg="API listen on /var/run/docker.sock"
Dec 15 00:43:40 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:43:40.581427400Z" level=error msg="could not resolve peer \"10.50.48.12\": timed out resolving peer by querying the cluster"
Dec 15 00:43:41 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:43:41.581709777Z" level=error msg="could not resolve peer \"10.50.48.12\": timed out resolving peer by querying the cluster"
Dec 15 00:43:42 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:43:42.582011510Z" level=error msg="could not resolve peer \"10.50.48.12\": timed out resolving peer by querying the cluster"
Dec 15 00:43:43 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:43:43.582278440Z" level=error msg="could not resolve peer \"10.50.48.12\": timed out resolving peer by querying the cluster"
Dec 15 00:43:44 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:43:44.582540380Z" level=error msg="could not resolve peer \"10.50.48.12\": timed out resolving peer by querying the cluster"
Dec 15 00:43:45 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:43:45.582809896Z" level=error msg="could not resolve peer \"10.50.48.12\": timed out resolving peer by querying the cluster"
Dec 15 00:43:46 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:43:46.584500194Z" level=error msg="could not resolve peer \"10.50.48.12\": timed out resolving peer by querying the cluster"
Dec 15 00:43:47 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:43:47.586258805Z" level=error msg="could not resolve peer \"10.50.48.12\": timed out resolving peer by querying the cluster"
Dec 15 00:43:47 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:43:47.586875346Z" level=warning msg="peerDbAdd transient condition - Key:10.50.48.26 02:42:0a:32:30:1a cardinality:2 db state:Set{{948b9200505f4f702d5c745b161c941537ff16d8735cced81ded356091209a1d 10.10.2.39 22 32 true}, {dummy 10.10.2.39 22 32 false}}"
Dec 15 00:43:48 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:43:48.588299858Z" level=error msg="could not resolve peer \"10.50.48.12\": timed out resolving peer by querying the cluster"
Dec 15 00:44:42 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:44:42.228925625Z" level=error msg="could not resolve peer \"10.50.48.9\": timed out resolving peer by querying the cluster"
Dec 15 00:44:43 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:44:43.235079630Z" level=error msg="could not resolve peer \"10.50.48.9\": timed out resolving peer by querying the cluster"
Dec 15 00:44:44 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:44:44.235249108Z" level=error msg="could not resolve peer \"10.50.48.9\": timed out resolving peer by querying the cluster"
Dec 15 00:44:45 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:44:45.235502623Z" level=error msg="could not resolve peer \"10.50.48.9\": timed out resolving peer by querying the cluster"
Dec 15 00:44:46 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:44:46.236353923Z" level=error msg="could not resolve peer \"10.50.48.9\": timed out resolving peer by querying the cluster"
Dec 15 00:44:46 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:44:46.237015773Z" level=warning msg="peerDbAdd transient condition - Key:10.50.48.51 02:42:0a:32:30:33 cardinality:2 db state:Set{{a83296771927045209a3f4986b8d54819f861b70ae9357f813bbd6beffb4a350 10.10.2.39 22 32 true}, {dummy 10.10.2.39 22 32 false}}"
Dec 15 00:44:47 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:44:47.238280566Z" level=error msg="could not resolve peer \"10.50.48.9\": timed out resolving peer by querying the cluster"
Dec 15 00:45:21 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:21.813035003Z" level=warning msg="peerDbDelete transient condition - Key:10.50.48.51 02:42:0a:32:30:33 cardinality:1 db state:Set{{dummy 10.10.2.39 22 32 false}}"
Dec 15 00:45:21 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:21.813113462Z" level=warning msg="Neighbor entry already present for IP 10.50.48.51, mac 02:42:0a:32:30:33 neighbor:&{dstIP:[10 50 48 51] dstMac:[2 66 10 50 48 51] linkName:vx-000100-dcf48 linkDst:vxlan0 family:0} forceUpdate:false"
Dec 15 00:45:21 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:21.813139381Z" level=warning msg="Peer operation failed:could not add neighbor entry for nid:dcf48923f107727f755869f05ab10ba24a56444737f981c898ec14b2032ea952 eid:dummy into the sandbox:Search neighbor failed for IP 10.50.48.51, mac 02:42:0a:32:30:33, present in db:true op:&{2 dcf48923f107727f755869f05ab10ba24a56444737f981c898ec14b2032ea952 a83296771927045209a3f4986b8d54819f861b70ae9357f813bbd6beffb4a350 [10 50 48 51] [255 255 252 0] [2 66 10 50 48 51] [0 0 0 0 0 0 0 0 0 0 255 255 10 10 2 39] false false true Leave}"
Dec 15 00:45:24 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:24.555935217Z" level=warning msg="peerDbDelete transient condition - Key:10.50.48.26 02:42:0a:32:30:1a cardinality:1 db state:Set{{dummy 10.10.2.39 22 32 false}}"
Dec 15 00:45:29 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:29.061280932Z" level=warning msg="peerDbAdd transient condition - Key:10.50.48.26 02:42:0a:32:30:1a cardinality:2 db state:Set{{8384c22dd90f28304cbff4b50d59a22ea1e89cef573e3f68328d9238b35b3362 10.10.2.186 22 32 false}, {dummy 10.10.2.39 22 32 false}}"
Dec 15 00:45:32 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:32.281175806Z" level=warning msg="peerDbAdd transient condition - Key:10.50.48.51 02:42:0a:32:30:33 cardinality:2 db state:Set{{16035419d19fdb10db6da6d90924f0c630ce279cc0fd7aa9003b3ce7df30e09a 10.10.2.122 22 32 false}, {dummy 10.10.2.39 22 32 false}}"
Dec 15 00:45:32 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:32.281235119Z" level=warning msg="Neighbor entry already present for IP 10.50.48.51, mac 02:42:0a:32:30:33 neighbor:&{dstIP:[0 0 0 0 0 0 0 0 0 0 255 255 10 50 48 51] dstMac:[2 66 10 50 48 51] linkName:vx-000100-dcf48 linkDst:vxlan0 family:0} forceUpdate:false"
Dec 15 00:45:36 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:36.037319650Z" level=error msg="could not resolve peer \"10.50.48.5\": timed out resolving peer by querying the cluster"
Dec 15 00:45:37 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:37.037589884Z" level=error msg="could not resolve peer \"10.50.48.13\": timed out resolving peer by querying the cluster"
Dec 15 00:45:38 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:38.037839646Z" level=error msg="could not resolve peer \"10.50.48.5\": timed out resolving peer by querying the cluster"
Dec 15 00:45:39 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:39.038093412Z" level=error msg="could not resolve peer \"10.50.48.13\": timed out resolving peer by querying the cluster"
Dec 15 00:45:40 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:40.038438277Z" level=error msg="could not resolve peer \"10.50.48.13\": timed out resolving peer by querying the cluster"
Dec 15 00:45:41 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:41.038748530Z" level=error msg="could not resolve peer \"10.50.48.5\": timed out resolving peer by querying the cluster"
Dec 15 00:45:42 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:42.039029344Z" level=error msg="could not resolve peer \"10.50.48.5\": timed out resolving peer by querying the cluster"
Dec 15 00:45:43 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:43.040062307Z" level=error msg="could not resolve peer \"10.50.48.13\": timed out resolving peer by querying the cluster"
Dec 15 00:45:44 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:44.040738594Z" level=error msg="could not resolve peer \"10.50.48.14\": timed out resolving peer by querying the cluster"
Dec 15 00:45:45 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:45.041500982Z" level=error msg="could not resolve peer \"10.50.48.20\": timed out resolving peer by querying the cluster"
Dec 15 00:45:46 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:46.041736518Z" level=error msg="could not resolve peer \"10.50.48.14\": timed out resolving peer by querying the cluster"
Dec 15 00:45:47 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:47.041944529Z" level=error msg="could not resolve peer \"10.50.48.5\": timed out resolving peer by querying the cluster"
Dec 15 00:45:48 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:48.042211415Z" level=error msg="could not resolve peer \"10.50.48.13\": timed out resolving peer by querying the cluster"
Dec 15 00:45:49 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:49.042602371Z" level=error msg="could not resolve peer \"10.50.48.20\": timed out resolving peer by querying the cluster"
Dec 15 00:45:50 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:50.042877823Z" level=error msg="could not resolve peer \"10.50.48.14\": timed out resolving peer by querying the cluster"
Dec 15 00:45:51 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:51.043060362Z" level=error msg="could not resolve peer \"10.50.48.5\": timed out resolving peer by querying the cluster"
Dec 15 00:45:52 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:52.043401234Z" level=error msg="could not resolve peer \"10.50.48.13\": timed out resolving peer by querying the cluster"
Dec 15 00:45:53 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:53.043678780Z" level=error msg="could not resolve peer \"10.50.48.20\": timed out resolving peer by querying the cluster"
Dec 15 00:45:54 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:54.044024680Z" level=error msg="could not resolve peer \"10.50.48.14\": timed out resolving peer by querying the cluster"
Dec 15 00:45:55 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:55.044331507Z" level=error msg="could not resolve peer \"10.50.48.20\": timed out resolving peer by querying the cluster"
Dec 15 00:45:56 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:56.044669511Z" level=error msg="could not resolve peer \"10.50.48.25\": timed out resolving peer by querying the cluster"
Dec 15 00:45:57 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:57.044938297Z" level=error msg="could not resolve peer \"10.50.48.24\": timed out resolving peer by querying the cluster"
Dec 15 00:45:58 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:58.045236075Z" level=error msg="could not resolve peer \"10.50.48.14\": timed out resolving peer by querying the cluster"
Dec 15 00:45:59 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:45:59.045511083Z" level=error msg="could not resolve peer \"10.50.48.20\": timed out resolving peer by querying the cluster"
Dec 15 00:46:00 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:00.045775091Z" level=error msg="could not resolve peer \"10.50.48.25\": timed out resolving peer by querying the cluster"
Dec 15 00:46:01 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:01.046082853Z" level=error msg="could not resolve peer \"10.50.48.24\": timed out resolving peer by querying the cluster"
Dec 15 00:46:02 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:02.046398348Z" level=error msg="could not resolve peer \"10.50.48.14\": timed out resolving peer by querying the cluster"
Dec 15 00:46:03 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:03.046680381Z" level=error msg="could not resolve peer \"10.50.48.20\": timed out resolving peer by querying the cluster"
Dec 15 00:46:04 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:04.046902046Z" level=error msg="could not resolve peer \"10.50.48.25\": timed out resolving peer by querying the cluster"
Dec 15 00:46:05 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:05.047049988Z" level=error msg="could not resolve peer \"10.50.48.24\": timed out resolving peer by querying the cluster"
Dec 15 00:46:05 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:05.047751298Z" level=warning msg="peerDbAdd transient condition - Key:10.50.48.32 02:42:0a:32:30:20 cardinality:2 db state:Set{{6e13374231d5792b3b16d8c551c9567b37b8e61744feb5f495fba7b66b368b6e 10.10.2.39 22 32 true}, {dummy 10.10.2.39 22 32 false}}"
Dec 15 00:46:06 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:06.047943322Z" level=error msg="could not resolve peer \"10.50.48.25\": timed out resolving peer by querying the cluster"
Dec 15 00:46:07 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:07.048273526Z" level=error msg="could not resolve peer \"10.50.48.24\": timed out resolving peer by querying the cluster"
Dec 15 00:46:08 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:08.048642394Z" level=error msg="could not resolve peer \"10.50.48.4\": timed out resolving peer by querying the cluster"
Dec 15 00:46:09 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:09.050425793Z" level=error msg="could not resolve peer \"10.50.48.17\": timed out resolving peer by querying the cluster"
Dec 15 00:46:10 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:10.053972757Z" level=error msg="could not resolve peer \"10.50.48.23\": timed out resolving peer by querying the cluster"
Dec 15 00:46:11 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:11.054348470Z" level=error msg="could not resolve peer \"10.50.48.25\": timed out resolving peer by querying the cluster"
Dec 15 00:46:12 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:12.054840861Z" level=error msg="could not resolve peer \"10.50.48.24\": timed out resolving peer by querying the cluster"
Dec 15 00:46:13 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:13.061006391Z" level=error msg="could not resolve peer \"10.50.48.4\": timed out resolving peer by querying the cluster"
Dec 15 00:46:14 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:14.061270824Z" level=error msg="could not resolve peer \"10.50.48.17\": timed out resolving peer by querying the cluster"
Dec 15 00:46:15 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:15.061522089Z" level=error msg="could not resolve peer \"10.50.48.23\": timed out resolving peer by querying the cluster"
Dec 15 00:46:16 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:16.061839113Z" level=error msg="could not resolve peer \"10.50.48.25\": timed out resolving peer by querying the cluster"
Dec 15 00:46:17 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:17.062076822Z" level=error msg="could not resolve peer \"10.50.48.24\": timed out resolving peer by querying the cluster"
Dec 15 00:46:18 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:18.062385094Z" level=error msg="could not resolve peer \"10.50.48.4\": timed out resolving peer by querying the cluster"
Dec 15 00:46:19 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:19.062696554Z" level=error msg="could not resolve peer \"10.50.48.17\": timed out resolving peer by querying the cluster"
Dec 15 00:46:20 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:20.064790638Z" level=error msg="could not resolve peer \"10.50.48.23\": timed out resolving peer by querying the cluster"
Dec 15 00:46:21 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:21.067539938Z" level=error msg="could not resolve peer \"10.50.48.4\": timed out resolving peer by querying the cluster"
Dec 15 00:46:21 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:21.068683195Z" level=warning msg="peerDbAdd transient condition - Key:10.50.48.32 02:42:0a:32:30:20 cardinality:2 db state:Set{{dummy 10.10.2.39 22 32 false}, {6e13374231d5792b3b16d8c551c9567b37b8e61744feb5f495fba7b66b368b6e 10.10.2.39 22 32 true}}"
Dec 15 00:46:21 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:21.068714041Z" level=warning msg="Entry already present in db: nid:dcf48923f107727f755869f05ab10ba24a56444737f981c898ec14b2032ea952 eid:dummy peerIP:10.50.48.32 peerMac:02:42:0a:32:30:20 isLocal:false vtep:10.10.2.39"
Dec 15 00:46:21 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:21.068752275Z" level=warning msg="Neighbor entry already present for IP 10.50.48.32, mac 02:42:0a:32:30:20 neighbor:&{dstIP:[10 50 48 32] dstMac:[2 66 10 50 48 32] linkName:vx-000100-dcf48 linkDst:vxlan0 family:0} forceUpdate:true"
Dec 15 00:46:21 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:21.068868897Z" level=warning msg="Neighbor entry already present for IP 10.10.2.39, mac 02:42:0a:32:30:20 neighbor:&{dstIP:[0 0 0 0 0 0 0 0 0 0 255 255 10 10 2 39] dstMac:[2 66 10 50 48 32] linkName:vx-000100-dcf48 linkDst:vxlan0 family:7} forceUpdate:false"
Dec 15 00:46:21 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:21.068903013Z" level=warning msg="Peer operation failed:could not add fdb entry for nid:dcf48923f107727f755869f05ab10ba24a56444737f981c898ec14b2032ea952 eid:dummy into the sandbox:Search neighbor failed for IP 10.10.2.39, mac 02:42:0a:32:30:20, present in db:true op:&{1 dcf48923f107727f755869f05ab10ba24a56444737f981c898ec14b2032ea952 dummy [10 50 48 32] [255 255 252 0] [2 66 10 50 48 32] [0 0 0 0 0 0 0 0 0 0 255 255 10 10 2 39] false true false watchMiss}"
Dec 15 00:46:22 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:22.070154351Z" level=error msg="could not resolve peer \"10.50.48.17\": timed out resolving peer by querying the cluster"
Dec 15 00:46:22 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:22.070956149Z" level=warning msg="peerDbAdd transient condition - Key:10.50.48.72 02:42:0a:32:30:48 cardinality:2 db state:Set{{1a3ef204892e76e1cbfcca7576f28bbfba022cd1e49c6e897a62e2a0738911a3 10.10.2.39 22 32 true}, {dummy 10.10.2.39 22 32 false}}"
Dec 15 00:46:23 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:23.086049272Z" level=error msg="could not resolve peer \"10.50.48.23\": timed out resolving peer by querying the cluster"
Dec 15 00:46:24 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:24.086365853Z" level=error msg="could not resolve peer \"10.50.48.4\": timed out resolving peer by querying the cluster"
Dec 15 00:46:25 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:25.086694403Z" level=error msg="could not resolve peer \"10.50.48.17\": timed out resolving peer by querying the cluster"
Dec 15 00:46:26 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:26.087032562Z" level=error msg="could not resolve peer \"10.50.48.23\": timed out resolving peer by querying the cluster"
Dec 15 00:46:27 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:27.090327610Z" level=error msg="could not resolve peer \"10.50.48.4\": timed out resolving peer by querying the cluster"
Dec 15 00:46:28 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:28.100048345Z" level=error msg="could not resolve peer \"10.50.48.17\": timed out resolving peer by querying the cluster"
Dec 15 00:46:29 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:46:29.112959868Z" level=error msg="could not resolve peer \"10.50.48.23\": timed out resolving peer by querying the cluster"
Dec 15 00:47:23 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:47:23.283265485Z" level=warning msg="peerDbAdd transient condition - Key:10.50.48.80 02:42:0a:32:30:50 cardinality:2 db state:Set{{9fe2a0870a6f14dac6751c83b62cce30c18082eaae11dea5e2d43d5a168ad776 10.10.2.127 22 32 false}, {76c8bf940ed1f0422345bcb9d37c8db2067a48161c8b6ff6689038cd6c05ebcb 10.10.2.127 22 32 false}}"
Dec 15 00:47:23 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:47:23.283330673Z" level=warning msg="Neighbor entry already present for IP 10.50.48.80, mac 02:42:0a:32:30:50 neighbor:&{dstIP:[0 0 0 0 0 0 0 0 0 0 255 255 10 50 48 80] dstMac:[2 66 10 50 48 80] linkName:vx-000100-dcf48 linkDst:vxlan0 family:0} forceUpdate:false"
Dec 15 00:47:25 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:47:25.677710734Z" level=warning msg="peerDbDelete transient condition - Key:10.50.48.80 02:42:0a:32:30:50 cardinality:1 db state:Set{{9fe2a0870a6f14dac6751c83b62cce30c18082eaae11dea5e2d43d5a168ad776 10.10.2.127 22 32 false}}"
Dec 15 00:47:28 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:47:28.854566986Z" level=warning msg="peerDbAdd transient condition - Key:10.50.48.80 02:42:0a:32:30:50 cardinality:2 db state:Set{{9fe2a0870a6f14dac6751c83b62cce30c18082eaae11dea5e2d43d5a168ad776 10.10.2.127 22 32 false}, {78c2e91119b59a60216c8aea38319bbf7d70ca5b7288f427dfc4cd319cfc8fe1 10.10.2.186 22 32 false}}"
Dec 15 00:47:28 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:47:28.854643458Z" level=warning msg="Neighbor entry already present for IP 10.50.48.80, mac 02:42:0a:32:30:50 neighbor:&{dstIP:[0 0 0 0 0 0 0 0 0 0 255 255 10 50 48 80] dstMac:[2 66 10 50 48 80] linkName:vx-000100-dcf48 linkDst:vxlan0 family:0} forceUpdate:false"
Dec 15 00:47:30 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:47:30.860523220Z" level=warning msg="peerDbDelete transient condition - Key:10.50.48.80 02:42:0a:32:30:50 cardinality:1 db state:Set{{9fe2a0870a6f14dac6751c83b62cce30c18082eaae11dea5e2d43d5a168ad776 10.10.2.127 22 32 false}}"
Dec 15 00:47:32 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:47:32.068871701Z" level=warning msg="peerDbAdd transient condition - Key:10.50.48.80 02:42:0a:32:30:50 cardinality:2 db state:Set{{fe6cefb9b974800e2917a350b6357aceed271a6dbbbf4b4da45d380196180380 10.10.2.127 22 32 false}, {9fe2a0870a6f14dac6751c83b62cce30c18082eaae11dea5e2d43d5a168ad776 10.10.2.127 22 32 false}}"
Dec 15 00:47:32 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:47:32.068921895Z" level=warning msg="Neighbor entry already present for IP 10.50.48.80, mac 02:42:0a:32:30:50 neighbor:&{dstIP:[0 0 0 0 0 0 0 0 0 0 255 255 10 50 48 80] dstMac:[2 66 10 50 48 80] linkName:vx-000100-dcf48 linkDst:vxlan0 family:0} forceUpdate:false"
Dec 15 00:47:32 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:47:32.494377690Z" level=warning msg="peerDbDelete transient condition - Key:10.50.48.80 02:42:0a:32:30:50 cardinality:1 db state:Set{{fe6cefb9b974800e2917a350b6357aceed271a6dbbbf4b4da45d380196180380 10.10.2.127 22 32 false}}"
Dec 15 00:55:32 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:55:32.937897501Z" level=warning msg="2017/12/15 00:55:32 [WARN] serf: received old event jl 10.10.2.77 dcf48923f107727f755869f05ab10ba24a56444737f981c898ec14b2032ea952 92afa9f48852c73e1d8c41b904e2eed40e561ac9ed51966d1d7062397548118a from time 373 (current: 894)
Dec 15 00:55:32 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: "
Dec 15 00:56:14 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: time="2017-12-15T00:56:14.118302987Z" level=error msg="Error running exec in container: oci runtime error: exec failed: container_linux.go:265: starting container process caused \"process_linux.go:84: executing setns process caused \\"exit status 15\\"\"
Dec 15 00:56:14 ip-10-10-2-39.us-west-2.compute.internal dockerd[15797]: "

Describe the results you expected:
Containers on all hosts should be able to connect to each other.

Additional information you deem important (e.g. issue happens only occasionally):
Intermittent/randomly occurs. Much more frequent on larger clusters.

Output of docker version:

Tested with 17.05, 17.09.1, and 17.11.0. I don't have a 17.05 host up at the moment to capture its docker version output.

Client:
 Version:      17.09.1-ce
 API version:  1.32
 Go version:   go1.8.3
 Git commit:   19e2cf6
 Built:        Thu Dec  7 22:23:40 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.09.1-ce
 API version:  1.32 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   19e2cf6
 Built:        Thu Dec  7 22:25:03 2017
 OS/Arch:      linux/amd64
 Experimental: false
Client:
 Version:      17.11.0-ce
 API version:  1.34
 Go version:   go1.8.3
 Git commit:   1caf76c
 Built:        Mon Nov 20 18:35:47 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.11.0-ce
 API version:  1.34 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   1caf76c
 Built:        Mon Nov 20 18:45:06 2017
 OS/Arch:      linux/amd64
 Experimental: false

Output of docker info:

Containers: 6
 Running: 5
 Paused: 0
 Stopped: 1
Images: 5
Server Version: 17.09.1-ce
Storage Driver: devicemapper
 Pool Name: docker-202:1-352328870-pool
 Pool Blocksize: 65.54kB
 Base Device Size: 10.74GB
 Backing Filesystem: xfs
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data Space Used: 1.729GB
 Data Space Total: 107.4GB
 Data Space Available: 105.6GB
 Metadata Space Used: 3.641MB
 Metadata Space Total: 2.147GB
 Metadata Space Available: 2.144GB
 Thin Pool Minimum Free Space: 10.74GB
 Udev Sync Supported: true
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 0
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.140-RHEL7 (2017-05-03)
Logging Driver: none
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 06b9cb35161009dcb7123345749fef02f7cea8e0
runc version: 3f2f8b84a77f73d38244dd690525642a72156c64
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-693.2.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 29.79GiB
Name: ip-10-10-2-39.us-west-2.compute.internal
ID: 73EN:WWPI:2QJ4:2GPV:4I4Z:BHMS:EZBX:BF75:HAJ7:M7CO:LLSA:FYJW
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Cluster Store: consul://10.10.2.39:8080
Cluster Advertise: 10.10.2.39:2376
Insecure Registries:
 docker-exabeam.repo.exabeam.com
 127.0.0.0/8
Live Restore Enabled: true

WARNING: devicemapper: usage of loopback devices is strongly discouraged for production use.
         Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
Containers: 3
 Running: 3
 Paused: 0
 Stopped: 0
Images: 2
Server Version: 17.11.0-ce
Storage Driver: devicemapper
 Pool Name: docker-202:1-486539717-pool
 Pool Blocksize: 65.54kB
 Base Device Size: 10.74GB
 Backing Filesystem: xfs
 Udev Sync Supported: true
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Data Space Used: 1.23GB
 Data Space Total: 107.4GB
 Data Space Available: 106.1GB
 Metadata Space Used: 2.036MB
 Metadata Space Total: 2.147GB
 Metadata Space Available: 2.145GB
 Thin Pool Minimum Free Space: 10.74GB
 Deferred Removal Enabled: true
 Deferred Deletion Enabled: true
 Deferred Deleted Device Count: 0
 Library Version: 1.02.140-RHEL7 (2017-05-03)
Logging Driver: none
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 992280e8e265f491f7a624ab82f3e238be086e49
runc version: 0351df1c5a66838d0c392b4ac4cf9450de844e2d
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-693.2.2.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 29.79GiB
Name: ip-10-10-2-23.us-west-2.compute.internal
ID: QTRG:VGMN:HOW5:4IJ6:RHTG:3GNF:SVP6:OF53:KI4M:H647:CV64:45VR
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Cluster Store: consul://10.10.2.23:8080
Cluster Advertise: 10.10.2.23:2376
Insecure Registries:
 docker-exabeam.repo.exabeam.com
 127.0.0.0/8
Live Restore Enabled: true

WARNING: devicemapper: usage of loopback devices is strongly discouraged for production use.
         Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.

Additional environment details (AWS, VirtualBox, physical, etc.):
AWS and physical hardware, running CentOS 7. Tested with consul 0.7.2 and 1.0.1.
Restarting the docker daemon on unreachable hosts sometimes fixes the problem for that host.
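
For reference, that workaround is just a daemon restart on the affected host (these are systemd-managed hosts):

sudo systemctl restart docker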

daemon.json

{
    "bip": "172.17.0.1/16",
    "cluster-advertise": "10.10.2.39:2376",
    "cluster-store": "consul://10.10.2.39:8080",
    "cluster-store-opts": {
          "kv.cacertfile": "/etc/docker/ssl/exa-ca.pem",
          "kv.certfile": "/etc/docker/ssl/docker-cert.pem",
          "kv.keyfile": "/etc/docker/ssl/docker-key.pem"
    },
    "disable-legacy-registry": true,
    "dns": [],
    "hosts": [
        "unix:///var/run/docker.sock",
        "tcp://0.0.0.0:2376"
    ],
    "insecure-registries": [
        "docker-exabeam.repo.exabeam.com"
    ],
    "live-restore": true,
    "log-driver": "none",
    "shutdown-timeout": 90,
    "storage-driver": "devicemapper",
    "tlsverify": true,
    "tlscacert": "/etc/docker/ssl/exa-ca.pem",
    "tlscert": "/etc/docker/ssl/docker-cert.pem",
    "tlskey": "/etc/docker/ssl/docker-key.pem"
}

consul config file:

{
 "data_dir":"/var/lib/consul",
 "enable_syslog":false,
 "server":true,
 "node_name":"host1",
 "advertise_addr":"10.10.2.39",
 "bind_addr":"10.10.2.39",
 "bootstrap_expect": 5,
 "retry_join": ["10.10.2.39","10.10.2.43","10.10.2.52","10.10.2.58","10.10.2.69"],
 "client_addr": "0.0.0.0",
 "verify_outgoing": true,
 "verify_incoming": true,
 "verify_server_hostname": false,
 "encrypt": "BuZSrsHSx1MCQ9j1kFuQmA==",
 "key_file": "/opt/consul/etc/consul-key.pem",
 "cert_file": "/opt/consul/etc/consul-cert.pem",
 "ca_file": "/opt/consul/etc/exa-ca.pem",
 "ports": {
  "https": 8080,
  "http": -1
 }
}

Possibly related to #32841, #32195, or #33721, but I'm not using swarm, so I figured it warranted its own ticket.

@alexhexabeam
Author

This seems to be related to ARP: if I query the neighbors in the container, I get FAILED and INCOMPLETE statuses.

(docker) [root@mongodb-configsvr-host9 /]# ip neigh
10.50.50.11 dev eth0  FAILED
10.50.50.29 dev eth0 lladdr 02:42:0a:32:32:1d STALE
10.50.50.6 dev eth0 lladdr 02:42:0a:32:32:06 STALE
10.50.50.8 dev eth0  INCOMPLETE
10.50.50.27 dev eth0 lladdr 02:42:0a:32:32:1b STALE
10.50.50.26 dev eth0 lladdr 02:42:0a:32:32:1a STALE
172.18.0.1 dev eth1 lladdr 02:42:09:e5:7e:45 STALE

I unfortunately didn't save the tcpdump of the ARP requests I had running at the time, but I remember only seeing requests going to individual container addresses rather than broadcasts to the whole subnet. My understanding of ARP and my tcpdump skills aren't great, though, so maybe that's how it's supposed to work.
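
For anyone who wants to repeat that capture, a minimal sketch of the commands (run inside an affected container; eth0 is assumed to be the overlay-attached interface):

# all ARP traffic on the overlay interface
tcpdump -ni eth0 arp
# only broadcast ARP requests, to check whether any broadcasts go out at all
tcpdump -ni eth0 'arp and ether dst ff:ff:ff:ff:ff:ff'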

I also confirmed that it still affects docker 17.12.0 and consul 1.0.2. The fact that the failure shows up at the ARP level makes me think this is squarely a docker issue.

@alexhexabeam
Author

I recreated it, and here's some tcpdump:

(docker) [root@mongodb-configsvr-host10 /]# tcpdump -ni eth0 arp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
03:20:26.296329 ARP, Request who-has 10.50.50.7 tell 10.50.50.23, length 28
03:20:27.298380 ARP, Request who-has 10.50.50.7 tell 10.50.50.23, length 28
03:20:28.300335 ARP, Request who-has 10.50.50.7 tell 10.50.50.23, length 28
03:20:29.302348 ARP, Request who-has 10.50.50.7 tell 10.50.50.23, length 28
03:20:30.304381 ARP, Request who-has 10.50.50.7 tell 10.50.50.23, length 28
03:20:31.306353 ARP, Request who-has 10.50.50.7 tell 10.50.50.23, length 28
03:20:32.308357 ARP, Request who-has 10.50.50.7 tell 10.50.50.23, length 28

The 10.50.50.23 address is another container on the same host. Containers on some hosts are connectable, but containers on others are not.

And some arp output:

(docker) [root@mongodb-configsvr-host10 /]# arp -nv
Address                  HWtype  HWaddress           Flags Mask            Iface
172.18.0.1               ether   02:42:0a:50:f3:8e   C                     eth1
10.50.50.7                       (incomplete)                              eth0
10.50.50.3               ether   02:42:0a:32:32:03   C                     eth0
10.50.50.26              ether   02:42:0a:32:32:1a   C                     eth0
10.50.50.23              ether   02:42:0a:32:32:17   C                     eth0
Entries: 5	Skipped: 0	Found: 5

And some ip neigh output for this current failure:

(docker) [root@mongodb-configsvr-host10 /]# ip neigh
172.18.0.1 dev eth1 lladdr 02:42:0a:50:f3:8e STALE
10.50.50.7 dev eth0  FAILED
10.50.50.3 dev eth0 lladdr 02:42:0a:32:32:03 STALE
10.50.50.26 dev eth0 lladdr 02:42:0a:32:32:1a REACHABLE
10.50.50.23 dev eth0 lladdr 02:42:0a:32:32:17 STALE

@alexhexabeam
Author

I've confirmed that this state seems to happen more frequently when launching many containers at once.

@thaJeztah
Member

ping @ddebroy @fcrisciani PTAL

@lin-zhao

We are seeing the same issue with large cluster deployments. It's especially bad with clusters of 10 nodes or more. Any advice?

@fcrisciani
Contributor

@lin-zhao @alexhexabeam are you still seeing this issue on the new stable releases?

@alexhexabeam
Author

@fcrisciani Yes, I was able to reproduce this with the latest stable 18.03.0-ce.

(docker) [root@mongodb-router-host24 /]# telnet mongodb-router-host2 27017
Trying 10.50.48.52...
telnet: connect to address 10.50.48.52: No route to host
(docker) [root@mongodb-router-host24 /]# ip neigh
10.50.48.2 dev eth0  FAILED
10.50.48.3 dev eth0  FAILED
10.50.48.12 dev eth0  FAILED
10.50.48.13 dev eth0  FAILED
10.50.48.14 dev eth0  FAILED
10.50.48.15 dev eth0  FAILED
10.50.48.8 dev eth0  FAILED
10.50.48.9 dev eth0  FAILED
10.50.48.10 dev eth0  FAILED
10.50.48.11 dev eth0  INCOMPLETE
10.50.48.20 dev eth0 lladdr 02:42:0a:32:30:14 REACHABLE
10.50.48.21 dev eth0  INCOMPLETE
10.50.48.22 dev eth0  FAILED
10.50.48.23 dev eth0  FAILED
10.50.48.16 dev eth0  FAILED
10.50.48.52 dev eth0  FAILED
10.50.48.17 dev eth0  FAILED
10.50.48.18 dev eth0  FAILED
10.50.48.19 dev eth0  FAILED
172.18.0.1 dev eth1 lladdr 02:42:57:01:5a:58 REACHABLE
10.50.48.24 dev eth0 lladdr 02:42:0a:32:30:18 REACHABLE
10.50.48.25 dev eth0  FAILED
10.50.48.26 dev eth0  FAILED
10.50.48.4 dev eth0  INCOMPLETE
10.50.48.5 dev eth0  FAILED
10.50.48.6 dev eth0  INCOMPLETE
10.50.48.7 dev eth0  FAILED

The three reachable addresses are the host itself and two other containers on the same host.

@antoinetran

antoinetran commented Jun 11, 2018

Dear all and @fcrisciani

I also reproduced the issue with:


- sudo docker version
Client:
 Version:       17.12.1-ce
 API version:   1.35
 Go version:    go1.9.4
 Git commit:    7390fc6
 Built: Tue Feb 27 22:15:20 2018
 OS/Arch:       linux/amd64
- CentOS Linux release 7.4.1708 (Core)
- Swarm Classic 1.2.8

I reproduced it with all containers of the overlay network on a single host.
I also see this kind of log in /var/log/messages:

Jun 11 17:48:26 el2pdck-s00 dockerd: time="2018-06-11T17:48:26.962492176+02:00" level=warning msg="peerDbDelete transient condition - Key:192.168.101.10 02:42:c0:a8:65:0a cardinality:1 db state:Set{{dummy 10.0.2.10 24 32 false}}"
Jun 11 17:48:27 el2pdck-s00 dockerd: time="2018-06-11T17:48:27.763528133+02:00" level=warning msg="peerDbAdd transient condition - Key:192.168.101.10 02:42:c0:a8:65:0a cardinality:2 db state:Set{{dda040575b40c9db595e4767a28155d386194b297f6dc0f8d2e548b12ae5d54d 10.0.2.20 24 32 false}, {dummy 10.0.2.10 24 32 false}}"
Jun 11 17:48:27 el2pdck-s00 dockerd: time="2018-06-11T17:48:27.763650782+02:00" level=warning msg="Neighbor entry already present for IP 192.168.101.10, mac 02:42:c0:a8:65:0a neighbor:&{dstIP:[192 168 101 10] dstMac:[2 66 192 168 101 10] linkName:vx-000100-dca46 linkDst:vxlan0 family:0} forceUpdate:false"
Jun 11 17:48:49 el2pdck-s00 dockerd: time="2018-06-11T17:48:49.575827061+02:00" level=warning msg="peerDbDelete transient condition - Key:192.168.101.10 02:42:c0:a8:65:0a cardinality:1 db state:Set{{dummy 10.0.2.10 24 32 false}}"
Jun 11 17:48:50 el2pdck-s00 dockerd: time="2018-06-11T17:48:50.161782123+02:00" level=warning msg="peerDbAdd transient condition - Key:192.168.101.10 02:42:c0:a8:65:0a cardinality:2 db state:Set{{dummy 10.0.2.10 24 32 false}, {438a166a874d4a74142a62c3c3306a8fad615047d74a527dc15daa907424e6f2 10.0.2.20 24 32 false}}"
Jun 11 17:48:50 el2pdck-s00 dockerd: time="2018-06-11T17:48:50.161888289+02:00" level=warning msg="Neighbor entry already present for IP 192.168.101.10, mac 02:42:c0:a8:65:0a neighbor:&{dstIP:[192 168 101 10] dstMac:[2 66 192 168 101 10] linkName:vx-000100-dca46 linkDst:vxlan0 family:0} forceUpdate:false"

Workaround: ping in the reverse direction (if C1 -> C2 does not work, ping from C2 -> C1; afterwards the ping works in both directions).
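
A sketch of that workaround, assuming two containers named C1 and C2 (the names are placeholders) and that ping is available inside them:

# C1 -> C2 fails, so ping the other way first
docker exec C2 ping -c 3 C1
# then the original direction usually works again
docker exec C1 ping -c 3 C2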

@antoinetran

These logs seem to be related, because I couldn't ping the IP 192.168.101.36:

Jun 11 15:56:49 el2pdck-s00 dockerd: time="2018-06-11T15:56:49.570517158+02:00" level=warning msg="peerDbDelete transient condition - Key:192.168.101.36 02:42:c0:a8:65:24 cardinality:1 db state:Set{{dummy 10.0.2.10 24 32 false}}"
Jun 11 15:56:49 el2pdck-s00 dockerd: time="2018-06-11T15:56:49.570750430+02:00" level=warning msg="Neighbor entry already present for IP 192.168.101.36, mac 02:42:c0:a8:65:24 neighbor:&{dstIP:[192 168 101 36] dstMac:[2 66 192 168 101 36] linkName:vx-000100-dca46 linkDst:vxlan0 family:0} forceUpdate:false"
Jun 11 15:56:49 el2pdck-s00 dockerd: time="2018-06-11T15:56:49.570792227+02:00" level=warning msg="Peer operation failed:could not add neighbor entry for nid:dca46c0689c146a5014ecf10fde61010a0f6746d7cea11159acebf2728e89281 eid:dummy into the sandbox:Search neighbor failed for IP 192.168.101.36, mac 02:42:c0:a8:65:24, present in db:true op:&{2 dca46c0689c146a5014ecf10fde61010a0f6746d7cea11159acebf2728e89281 d6acadce54d12f3e234d55301ce2441b3d91a5e36f52963e1283acfb2dd0b361 [192 168 101 36] [255 255 255 0] [2 66 192 168 101 36] [0 0 0 0 0 0 0 0 0 0 255 255 10 0 2 10] false false true Leave}"

@antoinetran

More info: this is related to the container's restart policy being set to restart on error. The endpoint keeps changing, hence the warning logs saying Swarm already sees the IP.
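
To be explicit about the kind of policy I mean, an illustrative run command (container name, image, and network name are placeholders):

docker run -d --name my-service --restart on-failure --network my-overlay my-image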

@lin-zhao

@antoinetran in your workaround, what do you mean by reverse ping? Do you mean actually pinging C1 -> C2 from inside the container?

@antoinetran

@lin-zhao yes. Pinging C2 from container C1 sometimes fixes the reverse ping from C2 to C1.

In this case, I think I managed to understand what happened. All of this is normal if we consider that the container keeps restarting, and Swarm finds it odd to see the same IP/MAC again. Maybe this warning should be dropped in that case.

@eubnara

eubnara commented Jul 16, 2018

I'm suffering from this issue as well. I use 18.03.1-ce.
I think #34957 is related to it.

It seems to happen more frequently when containers on the overlay network run on the same node.

@bluzioo

bluzioo commented Sep 13, 2018

I'm also suffering from this issue. Has anything been updated or fixed? @fcrisciani

Docker version 18.06.1-ce, build e68fc7a

@sam0737

sam0737 commented Jan 14, 2019

I also hit this from time to time; there's no exact way to reproduce it. We have 14 bare-metal hosts with 18.06 installed and one overlay network created across them. We are not using swarm; we use the overlay manually.

Description

I start a container on that overlay on host 10.2.1.67, with the IP auto-assigned by docker. Yet it can't ping any other container on the same overlay, nor can any other container ping it.

The syslog on ALL of the other hosts says:

Jan 14 10:50:37 lb1 dockerd[22247]: time="2019-01-14T10:50:37.636470476+08:00" level=warning msg="peerDbAdd transient condition - Key:10.9.228.33 02:42:0a:09:e4:21 cardinality:2 db state:Set{{dummy 10.2.0.64 21 32 false}, {4650024a28f5f2adeff62f386fa32c0b853c165d5deb7bd147f1f58d1bf5f0ba 10.2.1.67 21 32 false}}"
Jan 14 10:50:37 lb1 dockerd[22247]: time="2019-01-14T10:50:37.636543836+08:00" level=warning msg="Neighbor entry already present for IP 10.9.228.33, mac 02:42:0a:09:e4:21 neighbor:&{dstIP:[10 9 228 33] dstMac:[2 66 10 9 228 33] linkName:vx-000100-f785e linkDst:vxlan0 family:0} forceUpdate:false"

I believe cardinality:2 means a previous container with the same virtual IP 10.9.228.33 wasn't deleted correctly when the old host 10.2.0.64 shut it down, so when I create a new container on the new host 10.2.1.67, it detects a duplicated entry in /var/lib/docker/network/files/local-kv.db.

docker network inspect shows the new container's MAC and virtual IP. My online KV store is Consul; I have also pulled all the KV data from it, and it shows the correct host 10.2.1.67 owning the 10.9.228.33 container.

Actually, if I stop the docker daemon, remove /var/lib/docker/network/files/local-kv.db, and restart it, everything works as expected, but only on the host where I restarted the daemon. If I did the same on all 14 hosts I expect it would correct the problem, but doing that every time we hit this bug isn't really feasible.
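
For reference, the per-host cleanup I described looks roughly like this (assuming systemd-managed hosts); note that it wipes the host's local overlay network state:

sudo systemctl stop docker
sudo rm /var/lib/docker/network/files/local-kv.db
sudo systemctl start docker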

Suggestion

Maybe the old host didn't broadcast a peerDelete for some reason? The log says forceUpdate:false...

Is it possible to make the new container announcement use forceUpdate:true, so that it would still work even if the peerDelete was missed for any reason?

@akatiyar

@sam0737 I encountered the same issue. As a workaround, you can use static IPs for the containers. That worked for me. I hope it will be fixed by the docker team soon.
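
A sketch of that workaround (network name, address, and image are placeholders; the address must be an unused one inside the overlay's subnet):

docker run -d --name myapp --network my-overlay --ip 10.50.48.200 my-image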

@eubnara

eubnara commented Apr 9, 2019

Here's a workaround.

  1. Find the containers with the overlay network problem.
  2. Using the docker network disconnect (-f) command, take the container out of the overlay network (see the sketch after this list).
  3. Reattach the container to the overlay network using the docker network connect command.
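
A sketch of steps 2 and 3, assuming a container named C1 attached to a network named my-overlay (both placeholders):

docker network disconnect -f my-overlay C1
docker network connect my-overlay C1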

Update: I have since found some cases where this workaround doesn't work.

@retardtank

I can confirm that I've also encountered something that at least seems very similar, but on 18.09.

It's a swarm running on more than 10 bare-metal hosts, and the symptom was an unreachable container. I also saw a few Neighbor entry already present... and Peer operation failed... log entries on one of the hosts.

@ottes

ottes commented Sep 26, 2019

I've got the same problem, running a 12-node cluster with engine versions ranging from 18.09.1 to 18.09.6.

Some hosts seem to have the problem more often than others, but that impression could be wrong.

I start a service with replicas=6, and in 2 out of 3 cases (I currently do a force-update to "retry") at least one of the 6 containers returns connection refused on most attempts against the service's "shared/virtual" IP.

Pinging from inside the unreachable container to the container that is trying to reach it resolves the problem, but only if I ping the actual local IP that I see on eth0 or in docker inspect of the container; it does not work if I ping by service name. When the container that did not work pings the container that is trying to reach it, the problem is resolved most of the time, but not always; I couldn't find any pattern so far.

The host's syslog sometimes shows something like this when starting the service:

rmServiceBinding bc668378ac128682e2f85f89ba51deb063b883566affccbbda004867aa2c6432 possible transient state ok:false entries:0 set:false 
peerDbAdd transient condition - Key:10.0.0.38 02:42:0a:00:00:26 cardinality:2 db state:Set{{5c4e209ff64f69d3591c713cc5ecfb696105575fa79fc2b917b0144e83b8661a 192.168.201.43 24 32 false}, {1ecd7c078d8fc4e1dbb5d6bc83122a8a7554e384535e914bd7f0eb5c4c1a388f 192.168.201.61 24 32 false}}
Neighbor entry already present for IP 10.0.0.38, mac 02:42:0a:00:00:26 neighbor:&{dstIP:[0 0 0 0 0 0 0 0 0 0 255 255 10 0 0 38] dstMac:[2 66 10 0 0 38] linkName:vx-001001-xlg5g linkDst:vxlan0 family:0} forceUpdate:false
rmServiceBinding 29aad4aa396a5722eae7850a95bbe5b73cab7fd3e283d9392034136e56304678 possible transient state ok:false entries:0 set:false 
rmServiceBinding 1ecd7c078d8fc4e1dbb5d6bc83122a8a7554e384535e914bd7f0eb5c4c1a388f possible transient state ok:false entries:0 set:false 
peerDbDelete transient condition - Key:10.0.0.38 02:42:0a:00:00:26 cardinality:1 db state:Set{{5c4e209ff64f69d3591c713cc5ecfb696105575fa79fc2b917b0144e83b8661a 192.168.201.43 24 32 false}}

@sam0737

sam0737 commented Sep 27, 2019

FYI, I have converted my cluster (25+ hosts) from what is now called "Overlay (Legacy)", which uses a user-supplied KV service (I picked Consul), to the "Overlay Network (Attachable)" that comes with Swarm (which has its own built-in KV store).

i.e. I am not using the service orchestration features of Swarm, just its overlay layer: https://docs.docker.com/network/overlay/

Since then, everything has been running well so far.
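
Roughly, the migration looks like this (network name and image are placeholders): initialize swarm on one host, join the others, then create an attachable overlay that plain containers can connect to.

# on the first host
docker swarm init --advertise-addr 10.2.1.67
# on every other host, join with the token printed by swarm init
docker swarm join --token <token> 10.2.1.67:2377

# create the attachable overlay and use it from ordinary docker run
docker network create -d overlay --attachable my-attachable-overlay
docker run -d --name myapp --network my-attachable-overlay my-image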

@antoinetran

antoinetran commented Sep 27, 2019

@sam0737 So you mean you used the Swarm Classic image (https://hub.docker.com/_/swarm)? We used the overlay in that mode and found it did not work well at all.

@tplueer

tplueer commented Oct 23, 2020

I currently have the same problem. I am running three nodes with an overlay network without swarm, using consul as the K/V store.

Suddenly the network connection between two containers on different hosts (A->C) is no longer possible; other containers running on A and C can still communicate.

ip neigh shows the correct MAC address.

journalctl -u docker.service also shows some warnings, but the affected IP does not appear in them.

level=warning msg="peerDbDelete transient condition 
level=warning msg="Neighbor entry already present for IP 10.10.0.22, mac 02:42:0a:0a:00:16 neighbor:&{dstIP:[0 0 0 0 0 0 0 0 0 0 255 255 10 10 0 22] dstMac:[2 66 10 10 0 22] linkName:vx-000100

Docker version 19.03.12
