
Wrong behaviour of the DNS resolver within Swarm Mode overlay network #30134

Closed
YouriT opened this issue Jan 13, 2017 · 74 comments
Labels
area/networking area/swarm kind/bug status/more-info-needed version/1.12

Comments

@YouriT

YouriT commented Jan 13, 2017

I'm trying to set up a MongoDB sharded cluster with 2 shards, each shard being a replica set (size=2). I have one mongos router and one replica set (size=2) of config DBs.

I was getting plenty of errors about chunk migration, and after digging I figured out that the target host was sometimes reachable and sometimes not. But the containers had not crashed, which was strange.

After digging deeper I found that the IP addresses returned by name resolution were not right.

Please note that each service is running in dnsrr mode.

Steps to reproduce the issue:
It is hard to reproduce the exact behaviour; a rough command sketch follows the list.

  1. docker network create -d overlay public
  2. Create a service within the network on different machines
  3. Make the containers crash and let them restart; repeat multiple times
  4. Scale the service to 0
  5. Place a container within the same network and nslookup the service you created earlier
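
A rough shell sketch of the sequence above (the image, service name and the final test container are illustrative; --attachable is an assumption so that a plain docker run container can join the overlay for the lookup):

# sketch only: roughly reproduce the sequence described above
docker network create -d overlay --attachable public
docker service create --name mongo-shard-rs0-2 --network public --endpoint-mode dnsrr mongo
# step 3: repeatedly kill the task's container on its node and let swarm restart it
docker service scale mongo-shard-rs0-2=0
# step 5: look the service up from another container on the same network
docker run --rm --network public alpine nslookup mongo-shard-rs0-2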

Describe the results you received:

nslookup mongo-shard-rs0-2
Server:		127.0.0.11
Address:	127.0.0.11#53

Non-authoritative answer:
Name:	mongo-shard-rs0-2
Address: 10.0.0.2
Name:	mongo-shard-rs0-2
Address: 10.0.0.5
Name:	mongo-shard-rs0-2
Address: 10.0.0.4

Describe the results you expected:

** server can't find mongo-shard-rs0-2: NXDOMAIN

Additional information you deem important (e.g. issue happens only occasionally):
Random issue. If I restart the machine then it works again. Seems the cache is poisoned or something.

Output of docker version:

Client:
 Version:      1.12.6
 API version:  1.24
 Go version:   go1.6.4
 Git commit:   78d1802
 Built:        Tue Jan 10 20:38:45 2017
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.6
 API version:  1.24
 Go version:   go1.6.4
 Git commit:   78d1802
 Built:        Tue Jan 10 20:38:45 2017
 OS/Arch:      linux/amd64

Output of docker info:

Containers: 2
 Running: 1
 Paused: 0
 Stopped: 1
Images: 6
Server Version: 1.12.6
Storage Driver: overlay
 Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: overlay null bridge host
Swarm: active
 NodeID: 0466d9iww2nsv3crnj2yd6vk4
 Is Manager: false
 Node Address: 172.16.1.13
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 4.4.0-59-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 6.65 GiB
Name: worker2
ID: ZCIH:LHBN:PVQN:O75R:CVMI:R2PK:35O6:ENF6:UTBN:AY2K:IJZQ:DEQK
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Labels:
 provider=generic
Insecure Registries:
 127.0.0.0/8

Additional environment details (AWS, VirtualBox, physical, etc.):
4 machines running on Ubuntu within OpenStack.
1 manager DRAIN LEADER
2 for mongod shards and replicas
1 worker which has mongos

@sanimej

sanimej commented Jan 13, 2017

@YouriT
In step 3, "Make the containers crash and let them restart; repeat multiple times":

What exactly are you doing? Killing the container process from the host, or something else? If the service had replicas, were you killing one of those? If you don't scale it to 0, do you still see some inconsistency, or does it seem to happen only if you scale it down to 0? I can try to recreate the issue with the same sequence.

@YouriT
Author

YouriT commented Jan 14, 2017

@sanimej to be honest with you, I'm having difficulty finding an exact pattern to reproduce this. For example, for one of the services (shards) the nslookup gives me 3 IPs, then for another 2, then for another 1. For all of them I still get IP addresses, which shouldn't be the case. I'm going to try to find an exact way to reproduce the bug, but that's pretty hard. My feeling is that it's a bug in the overlay network for some unknown reason. If you could point me to some commands to dump the content of this cache or something like that, I might be able to give more details.

The "crash the container" was either scale=0 or memory exhausted in my case which resulted in the container being killed 0 or > 0 exit codes.

The services here didn't have any replicas apart from mongos (note the s). The replicas were separate services. Here are all my service names and their latest lookups; scale is still 0:

  • mongo-shard-rs0-1 replicas=1
    => nslookup => Address: 10.0.0.32
  • mongo-shard-rs0-2 replicas=1
    => nslookup => Address: 10.0.0.2 / 5 / 4
  • mongo-shard-rs1-1 replicas=1
    => nslookup => Address: 10.0.0.15 / 4
  • mongo-shard-rs1-2 replicas=1
    => nslookup => Address: 10.0.0.34
  • mongo-cfg1 replicas=1
    => nslookup => Address: 10.0.0.38 / 22
  • mongo-cfg2 replicas=1
    => nslookup => Address: 10.0.0.17 / 22
  • mongos1 replicas=2 (I don't really care about this service, it's just a router; the critical part is above, even though the issue should apply wherever it occurs)

About not scaling it to 0:
Just made the test:

  • mongo-shard-rs0-1 replicas=1
    => nslookup => Address: 10.0.0.32 / 5
  • mongo-shard-rs0-2 replicas=1
    => nslookup => Address: 10.0.0.2 / 5 / 4 / 3
  • mongo-shard-rs1-1 replicas=1
    => nslookup => Address: 10.0.0.15 / 4
  • mongo-shard-rs1-2 replicas=1
    => nslookup => Address: 10.0.0.34 / 6
  • mongo-cfg1 replicas=1
    => nslookup => Address: 10.0.0.38 / 22 / 11
  • mongo-cfg2 replicas=1
    => nslookup => Address: 10.0.0.17 / 22 / 7

As you can see, after scaling each of those back to 1, I'm getting new IPs in round-robin, which is completely wrong :/

I'm willing to provide more information, but that's really hard with this bug.

@YouriT
Author

YouriT commented Jan 16, 2017

Btw, restarting Docker doesn't change the result.
Even after deleting the service I keep getting IP addresses. Is it possible that a change from VIP to DNSRR is causing this?
I really have difficulties providing a way to reproduce...
I have the feeling it's somehow related to issue #26772
Is there a way to clear DNS entries or force reloading them?

Edit: restarting docker on the host hosting the service fixed the lookup
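
For reference, the workaround described here is just restarting the engine on the node running the affected service and re-checking the lookup (a systemd host is assumed):

# on the node hosting the service; assumes systemd
sudo systemctl restart docker
# then, from a container on the same overlay network:
nslookup mongo-shard-rs0-2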

@videege

videege commented Feb 24, 2017

I am experiencing this issue on a swarm hosting multiple service stacks. Occasionally, after removing stacks, containers crashing, or services being scaled down and then back up, DNS resolution inside a container for another service will return additional incorrect results. This completely hoses our setup when it happens to the service hosting our reverse proxy, as requests are proxied to incorrect addresses.

Our swarm is running 1.13.1. Each service has certain containers that connect to a "public" overlay network which also is what our proxy service is connected to. It's within this overlay network that I see this error occurring.

What I typically see is that a service is running at an IP address, say, 10.0.0.3, and then it gets moved (after being scaled or redeployed) to another IP address, like 10.0.0.12. However, DNS lookup on this service (nslookup stack_servicename) still returns the old IP address in addition to the new one.

@thaJeztah
Member

ping @sanimej

@velimir

velimir commented Apr 20, 2017

I see a similar issue. I'm running 4 nodes (1 manager and 3 workers).
When there's load on a node and for some reason services start crashing, it causes a chain reaction of other services restarting, which moves them from one node to other swarm nodes and back.
I noticed that new containers (just created ~3 mins ago) can't resolve other containers after those have been moved. I tried to resolve the name of a container which was not moved (or restarted) and has a placement constraint (so there's no way this container was restarted and the DNS record is missing because it was deleted during shutdown).
I can't reproduce it in a synthetic environment to provide an example, though I see this each time I try to run my services.

# docker version
Client:
 Version:      17.04.0-ce
 API version:  1.28
 Go version:   go1.7.5
 Git commit:   4845c56
 Built:        Mon Apr  3 17:45:49 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.04.0-ce
 API version:  1.28 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   4845c56
 Built:        Mon Apr  3 17:45:49 2017
 OS/Arch:      linux/amd64
 Experimental: true

UPD: It's not really hard to reproduce. Try to deploy using the following docker-compose file: https://gist.github.com/velimir0xff/28da8e16e01475b2a95f9ac74c069aa0

@soar

soar commented May 15, 2017

I have the same problem. After removing services, removing the network, and recreating the network with a new subnet and mask, I still see the old IP address via nslookup inside a container. Only a full reboot of the node running the problem service helps.
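
A sketch of the sequence described (service name, network name and subnet are illustrative):

# illustrative reconstruction of the steps above
docker service rm myservice
docker network rm mynet
docker network create -d overlay --subnet 10.10.0.0/24 mynet
docker service create --name myservice --network mynet nginx
# nslookup from a container on mynet still returned the old IP;
# only rebooting the node running the service cleared it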

@saabeilin

I can confirm the same issue with Swarm, Docker version 17.03.1-ce, build c6d412e and overlay networking.
P.S. Could not reproduce on my laptop with single-node swarm cluster though.

@thaJeztah
Member

ping @sanimej @fcrisciani

@fcrisciani
Contributor

we are aware of this issue, I'm actively working on a patch

@thaJeztah added the area/swarm and kind/bug labels on Jun 6, 2017
@activeperception

I seem to have a similar or the same issue.

When restoring a database (on 3 mongo containers in a replica set across 3 managers, for what it matters) the host/manager becomes unavailable (Docker for AWS with 3 t2.medium managers, no workers). While the restore is in progress I can barely SSH into the manager.

Problem persists with 17.06-rc4.

What I've noticed is that the problem seems to only happen when I deploy a second identical stack (obviously under a different name) and run a mongorestore on the second stack. Initially I thought it would be some kind of conflict between the two stacks, but my understanding is that they are completely isolated. Is that correct?

Possibly related to #32841

@fcrisciani
Contributor

@activeperception heavy load and AWS t2 instances are not a good combination.
Try to do the same test on a different instance type. The problem with t2 is that if your VM doesn't get scheduled, the distributed database starts having issues. Check this out: https://aws.amazon.com/ec2/instance-types/
Also, I would give RC5 (already available) a shot; it fixed another race in service discovery.

@saabeilin

I am facing a bug that could be related. docker network inspect <netid> shows a container that does not seem to exist on the swarm.

@sbrattla

We're facing an issue which sounds very much like what others have described here. We're running multiple stacks on the same swarm, and it appears that DNS entries get mixed up or go stale. As others have also mentioned, reproducing this issue in a predictable manner can be challenging.

We've got a swarm with 5 nodes. One of the stacks has two webservers and two databases:

shop_drupalfront (alias: drupalfront / 2 replicas)
shop_drupaldb (alias: drupaldb / 1 replica)
shop_api (alias: api / 2 replicas)
shop_apidb (alias: apidb / 1 replica)

We suddenly saw that drupalfront would resolve drupaldb to the IP of apidb. Scaling drupalfront down to 0 and then up to 1 would resolve the issue after a few attempts.
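
A quick way to check for this mix-up is to compare what a task actually resolves against the VIPs swarm assigned to the services (service names as listed above; the --format path is an assumption based on the current docker CLI):

# inside a drupalfront task: what does the embedded resolver return?
docker exec -it $(docker ps -qf name=shop_drupalfront) nslookup drupaldb
# on a manager: which VIPs did swarm actually assign?
docker service inspect --format '{{json .Endpoint.VirtualIPs}}' shop_drupaldb
docker service inspect --format '{{json .Endpoint.VirtualIPs}}' shop_apidb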

We have also seen this issue in other stacks.

A few observations:

  1. Even though I'm not completely certain, the issue seems to show up more the more load the swarm is under.
  2. I have only seen this issue for services with more than 1 replica.
  3. I have only seen this issue occurring for services that run in stacks where other services also run on the same port (not exposed ports).

@fcrisciani you mentioned that you were working on a patch for this issue. Are you making progress on that patch?

@thaJeztah
Member

@sbrattla I think some patches went into Docker 17.06.x; which version are you running?

@sbrattla

@thaJeztah we are running 17.06.02-ce. The issues we're seeing could certainly be something different from what's being described here, but what this thread describes aligns pretty well with what we're seeing. Is there anything more I can do to identify what's going on and what's going wrong?

@sbrattla

sbrattla commented Sep 20, 2017

@thaJeztah and @fcrisciani I see that progress has been made on moby/libnetwork#1934 which, judging from the description, "smells" a bit of what could be our issue. Basically, already "taken" IPs are being handed out resulting in multiple services (load balancers) with the same IP.

I see that this patch is "stuck" as it needs review. If this is the patch, how long should we expect before it goes into the CE edition?

The issue we're seeing is causing a lot of noise in our development and production environments. Services get their IPs mixed up on a daily/hourly basis, and the only remedy so far is to either scale services down to 0 and then up again (which works sometimes) or reboot entire Docker hosts.

I'm editing this post again as we're seeing this issue more and more. We're seeing duplicate IPs for load balancers representing services that run on the same port. That is, it's always load balancers on the same port within the same network that get mixed up. This seems to be the rule.

@iroller

iroller commented Sep 25, 2017

Seeing the same issue with the latest edge Docker:
Version 17.09.0-ce-rc2-mac29 (19275)
Channel: edge
967c7199b8

@sbrattla

sbrattla commented Oct 9, 2017

@thaJeztah or @fcrisciani I haven't heard from you in about 3 weeks. What's your take on this issue?

@fcrisciani
Contributor

@sbrattla the patch got merged to master. I think this fix will come with the next 17.10 RC2; it would be great to have feedback based on that image.

@thaJeztah
Member

Anyone still seeing this problem on docker 17.10 or above?

@sbrattla

sbrattla commented Nov 1, 2017

@thaJeztah we have unfortunately not been able to try out 17.10 RC2, so can't really say. Any chance this will be merged into the regular release if you don't receive any negative feedback?

@fcrisciani
Contributor

@sachnk to the best of my knowledge 18.06 GA is scheduled for next week; it will mainly depend on whether the testing phase passes with no hiccups.

@thaJeztah
Member

Looks like the fix is in 18.06.0-rc1; https://github.com/docker/docker-ce/blob/v18.06.0-ce-rc1/components/engine/vendor/github.com/docker/libnetwork/endpoint.go#L754-L758

18.06-rc1 is available for testing in the "test" channel on https://download.docker.com

or to use the install script; https://test.docker.com

@remingtonc

remingtonc commented Aug 1, 2018

I believe I too am experiencing this issue currently with 18.06.0-ce. Completely removed and reinstalled twice today (purge, /etc/docker/, /var/lib/docker/). Cleared networking configuration via iptables and ip to as basic as possible before reinstallation. Cleared arp and turned on/off on docker0/docker_gwbridge. Single docker swarm node. @thaJeztah is 18.06.0-rc1 > 18.06.0-ce?

EDIT: I don't quite understand how it all works, but I'm guessing this is a bug in libnetwork ResolveName or ResolveService given IP resolution works correctly.

Docker Version

remcampb@remcampb-dev:~/tdm$ docker version
Client:
 Version:           18.06.0-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        0ffa825
 Built:             Wed Jul 18 19:11:02 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.0-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       0ffa825
  Built:            Wed Jul 18 19:09:05 2018
  OS/Arch:          linux/amd64
  Experimental:     false

Docker Info

remcampb@remcampb-dev:~/tdm$ docker info
Containers: 5
 Running: 5
 Paused: 0
 Stopped: 0
Images: 28
Server Version: 18.06.0-ce
Storage Driver: overlay
 Backing Filesystem: extfs
 Supports d_type: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
 NodeID: zmem59xovfqgkfozol7ud93qg
 Is Manager: true
 ClusterID: 173q07k838c8itt757d226fle
 Managers: 1
 Nodes: 1
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 10
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: 172.31.100.102
 Manager Addresses:
  172.31.100.102:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: d64c661f1d51c48782c9cec8fda7604785f93587
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-131-generic
Operating System: Ubuntu 16.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 32
Total Memory: 94.16GiB
Name: remcampb-dev
ID: ETAB:QAG4:SJ35:N4A2:23E6:GPAJ:ITMK:E75A:VJVA:PE7N:BI42:ZXOZ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
HTTP Proxy: http://xxx.com:8080
HTTPS Proxy: http://xxx.com:8080
No Proxy: localhost,127.0.0.1,.xxx.com
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

Docker Network

remcampb@remcampb-dev:~/tdm$ docker network inspect tdm_backend
[
    {
        "Name": "tdm_backend",
        "Id": "55f0y5igh52g0qiuy9bi1i2uc",
        "Created": "2018-07-31T20:39:04.42637736-07:00",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "6293b0a79860886dbc4cf79b80ef7dae53401e0ee66dd55eee511a2afda4f688": {
                "Name": "tdm_dbms.1.1ixkfmb5x7uqpmw9j1q4k6kaz",
                "EndpointID": "0fddef7dc59df0ebf855bab1511956add5d0de8dd32301a5a5c389d1da3ef9cd",
                "MacAddress": "02:42:0a:00:00:0c",
                "IPv4Address": "10.0.0.12/24",
                "IPv6Address": ""
            },
            "6cd5177b03e596f672d2cf4091c98e6787814cb2fbac3b6cad27a2c01d9e30ae": {
                "Name": "tdm_goaccess.zmem59xovfqgkfozol7ud93qg.pf0yuyp09814ywh845k90j7qr",
                "EndpointID": "1393efe89962660a3427fd18a4218c14c70dbdfeeddda14dfd9330636ccd1bc0",
                "MacAddress": "02:42:0a:00:00:0a",
                "IPv4Address": "10.0.0.10/24",
                "IPv6Address": ""
            },
            "94d1d5b65ba4a33a95d91155ac5098655181ea3115c13df79b2910711ad6208b": {
                "Name": "tdm_web.zmem59xovfqgkfozol7ud93qg.feq785imcpath3wtttkcvreqa",
                "EndpointID": "c1050494902f6e97ac9e2395ed4ff180e7f2e14820d34ffe931980341ddec6cf",
                "MacAddress": "02:42:0a:00:00:06",
                "IPv4Address": "10.0.0.6/24",
                "IPv6Address": ""
            },
            "d48204783b3d133d615f7868fee8199be4f101121d78a2798bbd4f1fc0652efd": {
                "Name": "tdm_nginx.zmem59xovfqgkfozol7ud93qg.9ci0yiagx2sref8a3vbnltwh1",
                "EndpointID": "48ba7958dac835bf691cb5cb4c29d9f37a7861b902c8592d07bd0c97dec3f183",
                "MacAddress": "02:42:0a:00:00:08",
                "IPv4Address": "10.0.0.8/24",
                "IPv6Address": ""
            },
            "e79b4cdc1d3439c8133dcc761c977821969753376648179176f20704895fde4a": {
                "Name": "tdm_etl.zmem59xovfqgkfozol7ud93qg.qrnqt9qciv3s3tdd4ww10sy8f",
                "EndpointID": "946fae848fd9a846c72b0c955b3a847c027c5d1aee8e32012e190c9ffbd82764",
                "MacAddress": "02:42:0a:00:00:04",
                "IPv4Address": "10.0.0.4/24",
                "IPv6Address": ""
            },
            "lb-tdm_backend": {
                "Name": "tdm_backend-endpoint",
                "EndpointID": "3975cd22848a18888a7ba9bf56b3f7b4159a9a72f8bd0b9814060220f2228dca",
                "MacAddress": "02:42:0a:00:00:02",
                "IPv4Address": "10.0.0.2/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4097"
        },
        "Labels": {
            "com.docker.stack.namespace": "tdm"
        },
        "Peers": [
            {
                "Name": "96a36b765eb9",
                "IP": "172.31.100.102"
            }
        ]
    }
]

Ping PoC

From the output below it's clear that the resolver is resolving the hostname incorrectly, while resolving the IP (reverse lookup) is correct.

remcampb@remcampb-dev:~/tdm$ docker exec -it d48204783b3d sh
/ # nslookup web
nslookup: can't resolve '(null)': Name does not resolve

Name:      web
Address 1: 10.0.0.5
/ # ping web
PING web (10.0.0.5): 56 data bytes
^C
--- web ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss
/ # ping 10.0.0.6
PING 10.0.0.6 (10.0.0.6): 56 data bytes
64 bytes from 10.0.0.6: seq=0 ttl=64 time=0.232 ms
64 bytes from 10.0.0.6: seq=1 ttl=64 time=0.074 ms
64 bytes from 10.0.0.6: seq=2 ttl=64 time=0.147 ms
^C
--- 10.0.0.6 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.074/0.151/0.232 ms
/ # arp
tdm_web.zmem59xovfqgkfozol7ud93qg.feq785imcpath3wtttkcvreqa.tdm_backend (10.0.0.6) at 02:42:0a:00:00:06 [ether]  on eth1
? (172.18.0.1) at 02:42:b7:41:89:ad [ether]  on eth2
? (10.0.0.5) at <incomplete>  on eth1
/ # nslookup 10.0.0.6
nslookup: can't resolve '(null)': Name does not resolve

Name:      10.0.0.6
Address 1: 10.0.0.6 tdm_web.zmem59xovfqgkfozol7ud93qg.rzk5pefm0svdcrfzu57ttut69.tdm_backend
/ # nslookup 10.0.0.5
nslookup: can't resolve '(null)': Name does not resolve

Name:      10.0.0.5
Address 1: 10.0.0.5
/ # nslookup web
nslookup: can't resolve '(null)': Name does not resolve

Name:      web
Address 1: 10.0.0.5

EDIT: Including debug logs. Shows resolving as 10.0.0.5 per resolver instead of 10.0.0.6 per network.

Debug Logs

level=debug msg="Name To resolve: web."
level=debug msg="[resolver] lookup name web. present without IPv6 address"
level=debug msg="Name To resolve: web."
level=debug msg="[resolver] lookup for web.: IP [10.0.0.5]"
level=debug msg="Name To resolve: dbms."
level=debug msg="[resolver] lookup for dbms.: IP [10.0.0.11]"
level=debug msg="Name To resolve: web."
level=debug msg="IP To resolve 5.0.0.10"
level=debug msg="[resolver] query 5.0.0.10.in-addr.arpa. (PTR) from 172.18.0.6:46919, forwarding to udp:10.200.96.87"
level=debug msg="[resolver] external DNS udp:10.200.96.87 did not return any PTR records for \"5.0.0.10.in-addr.arpa.\""
level=debug msg="IP To resolve 6.0.0.10"
level=debug msg="[resolver] lookup for IP 6.0.0.10: name 51a94afdbe5a.tdm_backend"

libnetwork Diagnostic Tool

remcampb@remcampb-dev:~/tdm$ curl localhost:50015/help
OK
/getentry
/deleteentry
/help
/join
/clusterpeers
/updateentry
/networkstats
/
/gettable
/joinnetwork
/stackdump
/createentry
/ready
/leavenetwork
/networkpeers
remcampb@remcampb-dev:~/tdm$ curl localhost:50015/gettable?tname=endpoint_table\&nid=55f0y5igh52g0qiuy9bi1i2uc
OK
total entries: 5
0) k:`1960d8419b8830e5c5dea09edf91cb75cc47e5c42d069aefc7fb9313c5da0059` -> v:`Cjt0ZG1fd2ViLnptZW01OXhvdmZxZ2tmb3pvbDd1ZDkzcWcucnprNXBlZm0wc3ZkY3JmenU1N3R0dXQ2ORIHdGRtX3dlYhoZOWNjajdvYmlubm1tdGRiZXpleDFqcDdzdCIIMTAuMC4wLjUqCDEwLjAuMC42OgN3ZWJCDDUxYTk0YWZkYmU1YQ==` owner:`cc6ebbfd6253`
1) k:`2c833942674928e7de1e419a4a477758501c378ddd47fbcd506df194802eadb1` -> v:`Cj10ZG1fbmdpbnguem1lbTU5eG92ZnFna2Zvem9sN3VkOTNxZy50Y2s5MHVoZjlsMzM0ZmxteXNveWRxcmZwEgl0ZG1fbmdpbngaGXdmamQ3MDl6Z3R4MTFhcGpnZXFhZjIwY2UiCDEwLjAuMC43KggxMC4wLjAuODoFbmdpbnhCDDNkMWQ3MTVmNzEyMA==` owner:`cc6ebbfd6253`
2) k:`4c396a2eb253072c9c5a419f9fe8eb57ee6d370ad5c241dec013b871f005ad88` -> v:`CkB0ZG1fZ29hY2Nlc3Muem1lbTU5eG92ZnFna2Zvem9sN3VkOTNxZy53N2F4NmdnaHRhanQwcDN6ZHFvczJ2Y3dwEgx0ZG1fZ29hY2Nlc3MaGWx1bHdkN3dzOWc3ZmF5d2lldjdyNXVxZDciCDEwLjAuMC45KgkxMC4wLjAuMTI6CGdvYWNjZXNzQgxlN2VlMzgyNDdhYjc=` owner:`cc6ebbfd6253`
3) k:`b320043e5faf3f68e1594d915f85327a9b30dc9fb6db2a14928693f6f2f59c4b` -> v:`Cjt0ZG1fZXRsLnptZW01OXhvdmZxZ2tmb3pvbDd1ZDkzcWcucnV3dTRpZnI3cDhncnk5Z3cyM2JrMWJpNxIHdGRtX2V0bBoZamhweTlvejFia3E4aHV2MjNmdzBxcjZtbyIIMTAuMC4wLjMqCDEwLjAuMC40OgNldGxCDDgwMTcxYmVkNzdkYg==` owner:`cc6ebbfd6253`
4) k:`fcf1021badb5343704ffff86f26a115d20aad2738513b65c72941d59da5adf1e` -> v:`CiR0ZG1fZGJtcy4xLmFpbXJ3ZGRsMzM0N3d2aDNnZGo0MTczMjYSCHRkbV9kYm1zGhl3dXNnMHh6N3l6aDJrdDM2dXp3MTgzbnlhIgkxMC4wLjAuMTEqCTEwLjAuMC4xMDoEZGJtc0IMY2E1YjNlZmUwN2Jj` owner:`cc6ebbfd6253`
remcampb@remcampb-dev:~/tdm$ curl localhost:50015/gettable?tname=overlay_peer_table\&nid=55f0y5igh52g0qiuy9bi1i2uc
OK
total entries: 6
0) k:`1960d8419b8830e5c5dea09edf91cb75cc47e5c42d069aefc7fb9313c5da0059` -> v:`CgsxMC4wLjAuNi8yNBIRMDI6NDI6MGE6MDA6MDA6MDYaDjE3Mi4zMS4xMDAuMTAy` owner:`cc6ebbfd6253`
1) k:`2c833942674928e7de1e419a4a477758501c378ddd47fbcd506df194802eadb1` -> v:`CgsxMC4wLjAuOC8yNBIRMDI6NDI6MGE6MDA6MDA6MDgaDjE3Mi4zMS4xMDAuMTAy` owner:`cc6ebbfd6253`
2) k:`4c396a2eb253072c9c5a419f9fe8eb57ee6d370ad5c241dec013b871f005ad88` -> v:`CgwxMC4wLjAuMTIvMjQSETAyOjQyOjBhOjAwOjAwOjBjGg4xNzIuMzEuMTAwLjEwMg==` owner:`cc6ebbfd6253`
3) k:`9c48e23bfe3e80b32482b03da32d41f49e975834a6f09c61bb75581f19867df7` -> v:`CgsxMC4wLjAuMi8yNBIRMDI6NDI6MGE6MDA6MDA6MDIaDjE3Mi4zMS4xMDAuMTAy` owner:`cc6ebbfd6253`
4) k:`b320043e5faf3f68e1594d915f85327a9b30dc9fb6db2a14928693f6f2f59c4b` -> v:`CgsxMC4wLjAuNC8yNBIRMDI6NDI6MGE6MDA6MDA6MDQaDjE3Mi4zMS4xMDAuMTAy` owner:`cc6ebbfd6253`
5) k:`fcf1021badb5343704ffff86f26a115d20aad2738513b65c72941d59da5adf1e` -> v:`CgwxMC4wLjAuMTAvMjQSETAyOjQyOjBhOjAwOjAwOjBhGg4xNzIuMzEuMTAwLjEwMg==` owner:`cc6ebbfd6253`

Can update with stackdump if desired.

Host Interfaces

remcampb@remcampb-dev:~/tdm$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: lxcbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 00:16:3e:00:00:00 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.1/24 scope global lxcbr0
       valid_lft forever preferred_lft forever
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:a4:e7:b6:91 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
6: docker_gwbridge: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:b7:41:89:ad brd ff:ff:ff:ff:ff:ff
    inet 172.18.0.1/16 brd 172.18.255.255 scope global docker_gwbridge
       valid_lft forever preferred_lft forever
    inet6 fe80::42:b7ff:fe41:89ad/64 scope link
       valid_lft forever preferred_lft forever
12: eth1@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UNKNOWN group default qlen 10000
    link/ether d2:52:fa:eb:8e:d3 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.31.100.102/24 brd 172.31.100.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::d052:faff:feeb:8ed3/64 scope link
       valid_lft forever preferred_lft forever
13: eth2@if14: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:35:9f:da brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.254.95/24 brd 192.168.254.255 scope global eth2
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe35:9fda/64 scope link
       valid_lft forever preferred_lft forever
15: eth0@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 10000
    link/ether 3a:20:cd:54:ee:00 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.200.99.102/24 brd 10.200.99.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::3820:cdff:fe54:ee00/64 scope link
       valid_lft forever preferred_lft forever
77: veth934e66e@if76: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP group default
    link/ether 3e:d0:93:ca:30:b6 brd ff:ff:ff:ff:ff:ff link-netnsid 2
    inet6 fe80::3cd0:93ff:feca:30b6/64 scope link
       valid_lft forever preferred_lft forever
84: vethf72ec75@if83: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP group default
    link/ether da:ad:46:34:7a:11 brd ff:ff:ff:ff:ff:ff link-netnsid 4
    inet6 fe80::d8ad:46ff:fe34:7a11/64 scope link
       valid_lft forever preferred_lft forever
90: vethb6e012d@if89: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP group default
    link/ether 56:4a:67:d4:e9:9b brd ff:ff:ff:ff:ff:ff link-netnsid 5
    inet6 fe80::544a:67ff:fed4:e99b/64 scope link
       valid_lft forever preferred_lft forever
96: veth2ee600d@if95: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP group default
    link/ether 2a:85:a2:0e:57:50 brd ff:ff:ff:ff:ff:ff link-netnsid 6
    inet6 fe80::2885:a2ff:fe0e:5750/64 scope link
       valid_lft forever preferred_lft forever
98: vethdb26a0a@if97: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP group default
    link/ether 5a:38:dd:09:f6:e9 brd ff:ff:ff:ff:ff:ff link-netnsid 8
    inet6 fe80::5838:ddff:fe09:f6e9/64 scope link
       valid_lft forever preferred_lft forever
100: vethf0a1deb@if99: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker_gwbridge state UP group default
    link/ether 22:fc:07:72:c4:1e brd ff:ff:ff:ff:ff:ff link-netnsid 7
    inet6 fe80::20fc:7ff:fe72:c41e/64 scope link
       valid_lft forever preferred_lft forever

Swarm / docker-compose.yml

remcampb@remcampb-dev:~/tdm$ docker swarm init --advertise-addr eth1

Can post docker-compose.yml if desired.

I was debugging a separate issue where I couldn't reach any of the published ports from a host interface and was instead arping and then trying the ports and then setting up a proxy to whatever worked. It was a massive hack but worked out. I thought it was due to either 1) We're running inside an LXC or 2) Docker is confused with multiple interfaces on the host. I finally set the ports to publish in host mode which made the published ports correctly available on 0.0.0.0 and reachable externally. However, my containers were now unable to speak to each other. I figured at some point during leaving and recreating the swarm cluster over and over I had jacked up iptables or some other configuration somewhere and embarked on the upgrade/purge percussive maintenance path. Now it's looking more like the name resolution was simply incorrect.

Any way to debug the internals of the embedded resolver?
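
One way to see what the embedded resolver is doing (the [resolver] debug lines above are daemon debug output) is to enable daemon debug logging and watch for those lines while running lookups. The sketch below assumes a systemd host logging to journald and no other settings already present in daemon.json (merge by hand otherwise):

# write "debug": true to /etc/docker/daemon.json, then HUP dockerd so it re-reads the config
echo '{ "debug": true }' | sudo tee /etc/docker/daemon.json
sudo kill -HUP $(pidof dockerd)
# follow the resolver activity while doing nslookup from a container
sudo journalctl -fu docker.service | grep resolver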

@fcrisciani
Contributor

@remingtonc
the issue tracked here is a completely different one. In this issue, attachable containers were not being deleted from the DNS resolver.
I went through your message, and to me it is not clear what the expectation is from a resolver point of view.
10.0.0.5 does not correspond to any running container, so when the lookup fails at the local name server the query is propagated to the external one. 10.0.0.6, on the other hand, is resolved to one of the running containers.
If at the end of the diagnostic tool URL you add the keyword unsafe, like:
curl 'localhost:50015/gettable?tname=endpoint_table&nid=55f0y5igh52g0qiuy9bi1i2uc&unsafe' you will get a string version of the data structure so that you can look at the field values. It's called unsafe because it just prints the binary as a string, so some characters may not be printable. We also have another utility, clientDiagnostic, that does the proper deserialization of the data structure if needed.

@remingtonc

@fcrisciani Thanks for taking a look. The resolver is resolving the hostnames to invalid IPs, e.g. container web is being resolved as 10.0.0.5 when it is in fact 10.0.0.6. In the docker network inspect output the "web" container shows 10.0.0.6 as its IP. nslookup 10.0.0.6 resolves the container hostname correctly. In the debug logs we can also see dbms being resolved as 10.0.0.11 instead of 10.0.0.12. It appears to be a -1 offset between what the resolver returns and the actual IP in the network. Is this expected behavior?

@fcrisciani
Contributor

@remingtonc isn't it the VIP?

@fcrisciani
Contributor

if you want the list of task IPs you need to use tasks.<service_name>, so that will be nslookup tasks.web
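
For example, with the web service from the stack above (addresses reconstructed from the output earlier in this thread):

/ # nslookup web          # returns the service VIP
Name:      web
Address 1: 10.0.0.5
/ # nslookup tasks.web    # returns the IPs of the individual tasks
Name:      tasks.web
Address 1: 10.0.0.6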

@remingtonc

remingtonc commented Aug 1, 2018

@fcrisciani That works. I was unfamiliar with the VIP - do you have any recommendations on troubleshooting VIP connectivity issues between containers? Running effectively a stock Docker installation with "storage-driver"="overlay" as the only customization.

EDIT: Or should I, internally to the stack, use tasks.<service> instead of <service>?

@fcrisciani
Contributor

@remingtonc the advantage of using the VIP instead of the container IP is that with the VIP you don't care how many instances of the service are running behind it; it can be 1 or 10, but your application will reach the service using the same IP. You can also choose DNS RR mode instead of the VIP: every time you resolve the service name you will get the list of containers ordered in round-robin fashion. This means you have to be aware that if the container you are talking to goes down, you will need to do another DNS resolution and be sure that the previous results did not get cached.

From a debugging point of view, the tools to use are (a short sketch follows the list):

  1. tcpdump
  2. ipvsadm --> the load balancing is done with IPVS
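
A minimal sketch of poking at the IPVS load balancer (the paths and <sandbox-id> placeholder are assumptions; ipvsadm, nsenter and tcpdump must be installed on the host, and <sandbox-id> is one of the entries under /var/run/docker/netns):

# list the network namespaces docker created on this node
sudo ls /var/run/docker/netns
# dump the IPVS table inside a container's (or the lb-<network>) sandbox
sudo nsenter --net=/var/run/docker/netns/<sandbox-id> ipvsadm -ln
# capture overlay traffic inside the same namespace
sudo nsenter --net=/var/run/docker/netns/<sandbox-id> tcpdump -ni eth0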

@cnaslain

I had the same issue resolving container IP from other containers of the same overlay network.
Using tasks.<container_name> as name instead of <container_name> is a workaround that works for me.

@dankawka

dankawka commented Aug 24, 2018

Issue still present for me.
Calling curl from one of the containers inside the docker network against one service (A) returns a response from another service (B) from time to time:

curl STACKNAME_SERVICE_A:3001
-> (valid response)
{
  "name": "A_service",
  "version": {
    "number": "0.1.0",
    "build_commit_info": "XXX"
  },
  "hostname": "XXX"
}

curl STACKNAME_SERVICE_A:3001
-> (what?)
{
  "name": "B_service",
  "version": {
    "number": "1.0.0",
    "build_commit_info": "XXX"
  },
  "hostname": "XXX"
}
Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:24:56 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:23:21 2018
  OS/Arch:          linux/amd64
  Experimental:     false

@odelreym

Hi list

Can anyone confirm if the problem persists in 18.06.1-ce?

@markwylde

markwylde commented Nov 11, 2018

I can confirm this problem still exists on Docker version 18.09.0, build 4d60db4.

When I connect via stackname_containername it gives me the IP address of a different container.

But when I connect using tasks.stackname_containername it gives me the correct one.

(screenshot: shell output)

Edit: So after re-reading this thread I realise it is giving me a VIP (virtual IP) which load balances between all the containers. However, using the VIP (i.e. stackname_containername) doesn't actually connect me to any of the tasks. It just fails with: failed: Connection refused.

@fcrisciani
Contributor

@markwylde tasks.<service_name> returns the list of IPs of the backend containers behind the specific service, while resolving the service name returns the VIP; the load balancer then takes care of redirecting the traffic towards an active container backing that service.
I don't think I got the issue that you are describing, can you elaborate?

@thaJeztah
Member

doesn't actually connect me to any of the tasks. It just fails with failed: Connection refused.

May not be the cause, but make sure that whatever is running in your container is listening on 0.0.0.0, and not on localhost / 127.0.0.1 (as "localhost" is "localhost" of the container, not of the "host", which means that it's only accessible from within the container).
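
A quick way to check this from inside the container (assuming the image ships a shell and either netstat or ss; <container> is a placeholder):

docker exec -it <container> sh -c 'netstat -tlnp 2>/dev/null || ss -tlnp'
# the listener should show 0.0.0.0:<port> or :::<port>, not 127.0.0.1:<port>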

@FalkNisius

Sorry that I have to add a comment to this thread, because the problem is not solved yet.

I use Docker 18.09.0 in a swarm with 4 manager nodes on Ubuntu 16.04 on DigitalOcean droplets. In most cases the swarm behaves fine. Sometimes, after several docker service rm <service_name> and docker stack deploy ... commands, the internal DNS answers with two service IPs for one service. The services are connected to one overlay network.

One of the IPs belongs to an old, no longer existing service instance; the other one is the correct IP of the actual healthy service. The services are always available as VIP services. The reverse proxy, in an nginx container, tries to access both; one fails, one succeeds.

To resolve this situation, I found no way to reset the internal DNS other than removing the overlay network. I tried docker service update --force, docker service rm .., docker stack rm, docker service scale service=0, and so on.

Perhaps there is a race condition when updating the internal entries for service endpoints.
There should be a better solution than shutting down all stacks, removing and recreating the network, and starting the stacks again (sketched below).
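
For reference, the heavy-handed reset described above looks roughly like this (stack, compose file and network names are illustrative):

docker stack rm mystack
docker network rm my_overlay
docker network create -d overlay my_overlay
docker stack deploy -c docker-compose.yml mystack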

A docker network reinit-dns command would be one way of resolving this issue.
I had assumed that a docker stack rm ... would clean up all internal DNS entries.

Thanks

@leshik

leshik commented Dec 17, 2018

The problem is still not solved in 18.09 and is easily reproducible even on a single-node swarm.

@olljanat
Contributor

@leshik can you share how you are able to reproduce this on a single-node swarm? I just tried with the guide in #30134 (comment) but it looked to be working just fine.

@leshik

leshik commented Dec 18, 2018

@olljanat sure, here it is:

First, docker info:

Containers: 17
 Running: 14
 Paused: 0
 Stopped: 3
Images: 25
Server Version: 18.09.0
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: active
 NodeID: xbwvu8xd4eovyoaxjwnd1suv6
 Is Manager: true
 ClusterID: lobvck68wwkolom5c4xfa4vgi
 Managers: 1
 Nodes: 1
 Default Address Pool: 10.0.0.0/8
 SubnetSize: 24
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 10
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: 192.168.1.23
 Manager Addresses:
  192.168.1.23:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: c4446665cb9c30056f4998ed953e6d4ff22c7c39
runc version: 4fc53a81fb7c994640722ac585fa9ca548971871
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.15.0-39-generic
Operating System: Ubuntu 18.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.683GiB
Name: demo
ID: UKKF:U6NR:LIWI:ASCX:LL5L:2KBX:2ARZ:5MMX:M4RI:WQMO:R2UA:F67D
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

WARNING: No swap limit support

Stack compose file test.yml for tests:

version: '3.7'

services:
  first:
    image: alpine
    command: sleep 3600
    init: true

  second:
    image: alpine
    command: sleep 3600
    init: true

Then, docker stack deploy -c test.yml test, followed by docker network inspect test_default:

[
    {
        "Name": "test_default",
        "Id": "quisbu0cbrgz9acy9j808hrxu",
        "Created": "2018-12-18T10:00:10.846072788+07:00",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.5.0/24",
                    "Gateway": "10.0.5.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "39c200af9048f20e2ca50a5e59e2534329134bfd4b270c7b6dcd95119116e86b": {
                "Name": "test_first.1.qbwe3xsyuiaas50reme2ubtbr",
                "EndpointID": "3968618c4ac249f82fab75d1c6e68aa8893f3d9cf8279c95f18df924d0271b69",
                "MacAddress": "02:42:0a:00:05:03",
                "IPv4Address": "10.0.5.3/24",
                "IPv6Address": ""
            },
            "ccb93d48e2a7e8a6d4c0d0c60872b30f5c4cff43ef97aba4a5aec7118779e612": {
                "Name": "test_second.1.rermha7mo5sipex53qm6o0s9o",
                "EndpointID": "d37de1616bc297f9c9b047eeed66f75048f5d9ea2347bcf738fb92eb3623080e",
                "MacAddress": "02:42:0a:00:05:06",
                "IPv4Address": "10.0.5.6/24",
                "IPv6Address": ""
            },
            "lb-test_default": {
                "Name": "test_default-endpoint",
                "EndpointID": "90df88a051eaf6557bf22883a6659044826d288f41559728ba5b13d6b5efed36",
                "MacAddress": "02:42:0a:00:05:04",
                "IPv4Address": "10.0.5.4/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4102"
        },
        "Labels": {
            "com.docker.stack.namespace": "test"
        },
        "Peers": [
            {
                "Name": "eb787c8b22e8",
                "IP": "192.168.1.23"
            }
        ]
    }
]

Note the container addresses should be 10.0.5.3 and 10.0.5.6, let's check them.

docker exec -it test_first.1.qbwe3xsyuiaas50reme2ubtbr sh and then ifconfig:

eth0      Link encap:Ethernet  HWaddr 02:42:0A:00:05:03
          inet addr:10.0.5.3  Bcast:10.0.5.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

eth1      Link encap:Ethernet  HWaddr 02:42:AC:12:00:03
          inet addr:172.18.0.3  Bcast:172.18.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:188 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:8300 (8.1 KiB)  TX bytes:0 (0.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

docker exec -it test_second.1.rermha7mo5sipex53qm6o0s9o sh and then ifconfig:

eth0      Link encap:Ethernet  HWaddr 02:42:0A:00:05:06
          inet addr:10.0.5.6  Bcast:10.0.5.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
          RX packets:1 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:42 (42.0 B)  TX bytes:0 (0.0 B)

eth1      Link encap:Ethernet  HWaddr 02:42:AC:12:00:0A
          inet addr:172.18.0.10  Bcast:172.18.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:355 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:15342 (14.9 KiB)  TX bytes:0 (0.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

So far, so good, now let's ping by name and nslookup each other:

On first container ping second:

PING second (10.0.5.5): 56 data bytes
64 bytes from 10.0.5.5: seq=0 ttl=64 time=0.184 ms
64 bytes from 10.0.5.5: seq=1 ttl=64 time=0.113 ms
64 bytes from 10.0.5.5: seq=2 ttl=64 time=0.119 ms
^C
--- second ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.113/0.138/0.184 ms

nslookup second:

nslookup: can't resolve '(null)': Name does not resolve

Name:      second
Address 1: 10.0.5.5

The other way round, ping first:

PING first (10.0.5.2): 56 data bytes
64 bytes from 10.0.5.2: seq=0 ttl=64 time=0.075 ms
64 bytes from 10.0.5.2: seq=1 ttl=64 time=0.060 ms
64 bytes from 10.0.5.2: seq=2 ttl=64 time=0.064 ms
^C
--- first ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.060/0.066/0.075 ms

nslookup first:

nslookup: can't resolve '(null)': Name does not resolve

Name:      first
Address 1: 10.0.5.2

What? Why does ping to 10.0.5.2 and 10.0.5.5 work, and why does DNS resolve to them, when these addresses aren't present anywhere? Maybe some glitch? Let's see.

docker stop test_first.1.qbwe3xsyuiaas50reme2ubtbr test_second.1.rermha7mo5sipex53qm6o0s9o then docker network inspect test_default after the service recreates containers:

[
    {
        "Name": "test_default",
        "Id": "quisbu0cbrgz9acy9j808hrxu",
        "Created": "2018-12-18T10:00:10.846072788+07:00",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.5.0/24",
                    "Gateway": "10.0.5.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "15875ea4771e581cb63e27f80d97ab52d19d43f967f354e97550af8864d9e798": {
                "Name": "test_second.1.uzbttwyswm8t0ajt1z9ucmw4l",
                "EndpointID": "42927351b0b94a1acab1c567a959b5d45b36118031bf12571446bed3d7330820",
                "MacAddress": "02:42:0a:00:05:07",
                "IPv4Address": "10.0.5.7/24",
                "IPv6Address": ""
            },
            "3f904b9f904cba634e697a8db04938c77b1d0bf897a86ab2c020515c228fe9f3": {
                "Name": "test_first.1.ys7ul4vpkcpfjulcx5l11phtl",
                "EndpointID": "921745cb0d0b5f4e3687792b8fffa14ba088da1ebea1194f76818751bb2b0bde",
                "MacAddress": "02:42:0a:00:05:08",
                "IPv4Address": "10.0.5.8/24",
                "IPv6Address": ""
            },
            "lb-test_default": {
                "Name": "test_default-endpoint",
                "EndpointID": "90df88a051eaf6557bf22883a6659044826d288f41559728ba5b13d6b5efed36",
                "MacAddress": "02:42:0a:00:05:04",
                "IPv4Address": "10.0.5.4/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4102"
        },
        "Labels": {
            "com.docker.stack.namespace": "test"
        },
        "Peers": [
            {
                "Name": "eb787c8b22e8",
                "IP": "192.168.1.23"
            }
        ]
    }
]

Addresses have changed, that's good. What about DNS?

docker exec -it test_first.1.ys7ul4vpkcpfjulcx5l11phtl sh, then nslookup second:

nslookup: can't resolve '(null)': Name does not resolve

Name:      second
Address 1: 10.0.5.5

The other way round, docker exec -it test_second.1.uzbttwyswm8t0ajt1z9ucmw4l sh, followed by nslookup first:

nslookup: can't resolve '(null)': Name does not resolve

Name:      first
Address 1: 10.0.5.2

Nope, same thing. What kind of magic is happening here? NAT? Not a good thing; this breaks everything that holds on to a once-resolved IP address at the application level.

@olljanat
Contributor

@leshik swarm services are designed to work with n replicas of containers. That's why, by default, swarm creates a load-balancer (VIP) address for each service. You can see it with the command docker service inspect <service_name>
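
For example (the --format path is an assumption based on the current CLI; the output is illustrative, matching the addresses seen earlier in this thread):

docker service inspect --format '{{json .Endpoint.VirtualIPs}}' test_first
# [{"NetworkID":"quisbu0cbrgz9acy9j808hrxu","Addr":"10.0.5.2/24"}]  <- the VIP that "nslookup first" returned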

Anyway, if you want to use container addresses instead, just modify your stack to look like this:

version: '3.7'

services:
  first:
    image: alpine
    command: sleep 3600
    init: true
    deploy:
      endpoint_mode: dnsrr

  second:
    image: alpine
    command: sleep 3600
    init: true
    deploy:
      endpoint_mode: dnsrr

This is btw documented at https://docs.docker.com/engine/swarm/ingress/#configure-an-external-load-balancer
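
After redeploying you can verify that no VIP is assigned any more and that the service name resolves straight to the task IPs (format path assumed; output illustrative):

docker service inspect --format '{{.Spec.EndpointSpec.Mode}}' test_first
# dnsrr
docker exec -it $(docker ps -qf name=test_first) nslookup second
# should now return the task's own 10.0.5.x address instead of a VIP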

PS. This has nothing to do with the original issue, so please create a new one if you don't get it working with this guidance.

@leshik

leshik commented Dec 18, 2018

Wow, thanks @olljanat, that helped a lot.
May I request changing the compose documentation a bit so that this option is mentioned somewhere closer to the networking material? Seriously, I'm not someone who ignores documentation; I've read and googled almost everything regarding network configuration and DNS in Docker during the past several days, and frankly, I started pulling my hair out trying to find a solution. It wasn't obvious that I had to look in the deploy section. Maybe a small note in the networks section of the documentation, with a reference to this key, would help others save some time researching.

@olljanat
Contributor

@leshik sounds like a good idea. However, I'm not fully sure which document you are referring to. Can you create a new issue about it at https://github.com/docker/docker.github.io with links to the documents?

@markwylde

markwylde commented Dec 19, 2018

I have tried to recreate the problem. Bear in mind that I also had this breaking a few weeks ago. However, using the same version of Docker I can't seem to recreate it.

I have tried with both Ubuntu 16 and 18. Maybe it was a quirk with the operating system.

Using the steps below, everything works exactly as expected.

It still bugs me though that I can't recreate the bug. I've had to stop using the load balancer for all my services and start using tasks.service_name. I would now be tempted to move back, but it scares me in case this happens again.

It is possible that I wasn't creating my network with the scope set to swarm. Would that make a difference?

$ docker network create --attachable --driver overlay TESTNETA

Instead of:

$ docker network create --attachable --scope swarm --driver overlay TESTNETA

DNS Resolver in Swarm Mode

Setup environment

  1. Create an Ubuntu 16.04x64 VM in DigitalOcean

  2. SSH into the box

  3. Follow the steps from the official DigitalOcean documentation to install Docker
    https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-18-04

  4. Verify you are on the latest version of docker

$ docker --version
Docker version 18.09.0, build 4d60db4
  5. Initialise a new Docker Swarm instance
$ docker swarm init --advertise-addr

  6. Create a new network

$ docker network create --attachable --scope swarm --driver overlay TESTNETA

Manual service creation works

  1. Create two webservers as a service
$ docker service create --name first --network TESTNETA tutum/hello-world
$ docker service create --name second --network TESTNETA tutum/hello-world
  2. Let's attach to a task on one of the services
$ docker exec -it $(docker ps -qf name=first) ping second

Stack deploy didn't work, but now does

  1. Create two folders with a docker-compose.yml file.

aaa/docker-compose.yml:

version: '3.7'

services:
  first:
    image: tutum/hello-world

  second:
    image: tutum/hello-world

networks: 
  default:
    external: true
    name: TESTNETA

bbb/docker-compose.yml:

version: '3.7'

services:
  third:
    image: tutum/hello-world

  fourth:
    image: tutum/hello-world

networks: 
  default:
    external: true
    name: TESTNETA
  2. Deploy the stacks
$ cd aaa
$ docker stack deploy --compose-file docker-compose.yml aaa
$ cd ../bbb
$ docker stack deploy --compose-file docker-compose.yml bbb
  3. Try and connect from one stack to another
$ docker exec -it $(docker ps -qf name=aaa_first) wget -qO- bbb_fourth

@olljanat
Contributor

@markwylde

Stack deploy didn't work, but now does

has nothing to do with the original issue, which was about Swarm DNS working incorrectly in cases where a service/container crashed or was removed (and that has already been fixed in 18.09),

so if you still see this on the latest version, please create a new issue about it.

@sandys

sandys commented Aug 24, 2019

We are facing this issue as well. I think for some reason, if the health checks fail intermittently, swarm de-registers the service entirely.
And in situations where services are interdependent, this completely stops them from coming back up.

@thaJeztah
Member

Interdependency can definitely be difficult. Generally, when designing your containers/services, they should be designed/configured to be resilient against failures of the services they depend on: those may not be available yet (for example, when deploying your stack), and they should also take into account that those services may be (temporarily) unavailable during their whole lifecycle; a network connection can fail, a database may be in maintenance, ...

Having (e.g.) a retry loop to reconnect to those services could help in such situations.
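
A minimal sketch of such a retry loop in an entrypoint script (host name, port and timing are just examples):

#!/bin/sh
# wait for a dependency to accept connections before starting the main process
until nc -z mydb 5432; do
  echo "waiting for mydb..."
  sleep 2
done
exec "$@"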

@jabteles

Having the same issue with 18.09.5

@olljanat
Contributor

@jabteles please create a new issue and fill in all the requested details. This one has already been closed.
