
Swarm Service - IP resolves differently from within container vs other containers (same overlay) #30963

Open
ventz opened this issue Feb 13, 2017 · 18 comments

Comments


ventz commented Feb 13, 2017

Description

Service containers seem to get 2 IP addresses (let's say x.y.z.2 and x.y.z.3) -- one ends up being the DNS name, and the other is inserted in /etc/hosts. The problem this causes is that the container resolves its own name to x.y.z.3, while other containers resolve it (by name) to x.y.z.2. For services like MariaDB Galera, where you need to specify an IP or a name, this breaks the cluster, since the cluster advertises one IP by name while the other nodes actually see a different one.

Ex - Starting a simple service (only 1 container) with something like:

docker service create \
--name apache \
--hostname apache \ <-- NOTE: this seems to cause the issue!
--network some-overlay \
$image

Seems to assign 2 IP addresses to the container. Let's say the network is 10.0.0.0/24; it gives it:
10.0.0.2 and 10.0.0.3

Everything resolves "apache" to "10.0.0.2", except that /etc/hosts on the apache container says "10.0.0.3", so if you attach to the apache container and resolve "apache", it thinks it's 10.0.0.3.

Steps to reproduce the issue:

  1. Run a service; let's say the container is "$hostname".
  2. Find the host it's running on, and attach to it: "docker exec -it $container /bin/shell"
  3. Run "ip addr" to find the 2 IPs.
  4. ping $hostname, and note the IP.
  5. Run another service, or simply attach to the overlay (make sure it's attachable when created) and start an alpine container to test with: "docker run -it --rm --net=some-overlay alpine /bin/ash"
  6. ping $hostname again, and note the IP -- it will NOT match the IP from step 4 (see the command sketch below).
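
A concrete sketch of steps 2-6 (the container ID, the service name "apache", and the network "some-overlay" are placeholders carried over from the example above):

  # on the node where the task is running (container ID is illustrative):
  docker exec -it $container sh -c 'ip addr; ping -c 1 apache'

  # from another container attached to the same overlay:
  docker run -it --rm --net=some-overlay alpine ping -c 1 apache

The two pings should report different IPs, which is the mismatch described below.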

Describe the results you received:
The service's DNS name within the swarm/overlay and the container's internal hostname<->IP mapping do not match.

Describe the results you expected:
A single IP, or at least for the /etc/hosts to agree with the DNS name that's available.

Additional information you deem important (e.g. issue happens only occasionally):
N/A

Output of docker version:

Client:
 Version:      1.13.1
 API version:  1.26
 Go version:   go1.7.5
 Git commit:   092cba3
 Built:        Wed Feb  8 06:50:14 2017
 OS/Arch:      linux/amd64

Server:
 Version:      1.13.1
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   092cba3
 Built:        Wed Feb  8 06:50:14 2017
 OS/Arch:      linux/amd64
 Experimental: false

Output of docker info:

Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 12
Server Version: 1.13.1
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 50
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: active
 NodeID: mwxekr4tr4jhbhiv3s9at6ozp
 Is Manager: true
 ClusterID: vcwzg0mebqw4kp58pz8ynm0cn
 Managers: 3
 Nodes: 3
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
 Node Address: PUBIP#1
 Manager Addresses:
  PUBIP#1:2377
  PUBIP#2:2377
  PUBIP#3:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1
runc version: 9df8b306d01f59d3a8029be411de015b7304dd8f
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-62-generic
Operating System: Ubuntu 16.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 6
Total Memory: 31.42 GiB
Name: docker01
ID: SZUT:WA2N:TMPP:MAYQ:ZBZ3:VQS6:7N35:QTGU:NOCB:GJNQ:BUEO:UIQ2
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
 nfs=no
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):
Swarm cluster with 3 managers, over 3 public IPs (1-1 NAT). Encrypted overlay network.
update: tested with non-encrypted overlay, and same issue.
update#2: it seems that having --hostname when creating the service causes the issue, so maybe there is some sort of a bug around that? If you don't set it, the container resolves to the same IP internally and externally.


ventz commented Feb 13, 2017

Tested with non-encrypted overlay -- same issue.


ventz commented Feb 13, 2017

UPDATE: It seems that having "--hostname" when creating the service causes the issue, so maybe there is some sort of a bug around that? If you don't set it, the container resolves to the same IP internally and externally.


aboch commented Feb 13, 2017

@ventz

10.0.0.2 is the Virtual IP for the apache server, each apache backend task will be programmed with it. 10.0.0.3 is the real IP address of the backend apache task.

The --hostname option should not make a difference.
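
The difference aboch describes can be observed with something like the following sketch (the service name "apache" and network "some-overlay" are carried over from the example above; "tasks.<service>" is the internal DNS name that returns the individual task IPs rather than the VIP):

  # on a manager node: the virtual IP(s) assigned to the service
  docker service inspect apache --format '{{range .Endpoint.VirtualIPs}}{{.Addr}} {{end}}'

  # from a container on the same overlay: the real task IP(s)
  docker run -it --rm --net=some-overlay alpine nslookup tasks.apache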


ventz commented Feb 13, 2017

@aboch It seems that --hostname makes the difference because it matches the name of the service (--name).

I think the bug is around which IP gets inserted into /etc/hosts, though. When --hostname matches --name and the first IP is in the /etc/hosts file, that is what resolves from inside the container instead of the 2nd IP (the one the DNS name points to).

So essentially you get:
1.) (from the container) "ping $hostname" => 10.0.0.3
2.) (from the outside) "ping $hostname" => 10.0.0.2


aboch commented Feb 13, 2017

Yes, the --hostname you can set when running containers has never been populated in the internal DNS; it has only local meaning inside the container.
Now that it can be set for services, it ends up inside all of the service's backend tasks, but it is not visible to other containers on the network. I am not sure what the intended use is.


ventz commented Feb 13, 2017

I guess the issue is that it causes the collision. (You are right - with the dynamic --name to DNS mapping it almost seems redundant, other than the nice benefit of setting the container name.) Basically, in any cluster where the nodes have to agree on the IPs by name, this now causes a break. MariaDB's Galera cluster is actually a really good example: you start the nodes with gcomm://node1,node2,node3, and with the --hostname flag node2 + node3 will see a different IP for node1 than what node1 claims its IP is.


aboch commented Feb 13, 2017

It looks like we support templates for the newly introduced --hostname on the service cli.

docker service create ... --replicas <N> --hostname {{.Service.Name}}.{{.Task.Slot}} ...

This will make it match the task name that the internal DNS reports.

^ Hmm no. This one instead:

--hostname {{.Service.Name}}.{{.Task.Slot}}.{{.Task.ID}} ...
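
Plugged into the original example, that would look roughly like this (illustrative only; $image and the network name are placeholders):

  docker service create \
    --name apache \
    --hostname "{{.Service.Name}}.{{.Task.Slot}}.{{.Task.ID}}" \
    --network some-overlay \
    $image

Since the internal DNS registers each task under a name of that same form, a hostname built from this template should resolve to the task's own IP rather than the service VIP.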


jmzwcn commented Feb 13, 2017

Reproduced too; strange that there are two IPs for one container.


ventz commented Feb 13, 2017

@aboch thanks. Could a warning be added if --hostname is set to the same value as --name, since it produces undesirable results?

@jmzwcn - @aboch mentioned the reason ^

10.0.0.2 is the Virtual IP for the apache server, each apache backend task will be programmed with it. 10.0.0.3 is the real IP address of the backend apache task.

Having the two IPs is OK, but the issue is that the container/task does not resolve itself to the same IP that external nodes see.

Basically, this breaks any circumstance where multiple nodes need to agree on a "pool" based on name/DNS, since it can't be done by --ip (@thaJeztah - this is a great example of the need for a static IP here: #29816, which is the sub-issue for this: #25303 (comment)).


ventz commented Feb 13, 2017

Another issue that this causes (assuming you remove --hostname or make it different from --name to eliminate that part): the traffic that leaves the container uses the 2nd IP, but the IP advertised by name is the 1st IP.

So in a simple example, let's say you have 2 nodes:
A (10.0.0.2, and 10.0.0.3)
B (10.0.0.4 and 10.0.0.5)

from B: "ping A" will tell you 10.0.0.2
from A: "ping B" will tell you 10.0.0.4

But the nodes will communicate using 10.0.0.3 and 10.0.0.5

For things that require pre-configuration/exchange of IPs or hostnames -- this breaks it. You are now starting both nodes with something like "cluster-members: A, B", and they communicate using neither of those addresses.

And since there is no static --ip option, and you can't rely on names/DNS, this basically becomes impossible to do with a "docker service" setup, whereas it works perfectly with "docker run".
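
For comparison, with plain "docker run" on a user-defined network the address can be pinned up front, along these lines (a minimal sketch; network name, subnet, and addresses are illustrative):

  docker network create --subnet 10.0.0.0/24 some-net
  docker run -d --name node1 --net some-net --ip 10.0.0.10 $image
  docker run -d --name node2 --net some-net --ip 10.0.0.11 $image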

I think this really needs a re-examination, because many clustered applications require pre-distribution of an IP or hostname in order to form a cluster.

@rahulpalamuttam

Have there been any updates to this issue, or any workarounds?
It's been a blocker for me when setting up my clustered application.


ventz commented Aug 9, 2017

@rahulpalamuttam My current "solution" (more of a limited fix) is NOT to use --hostname, but instead just use --name, and then use those names everywhere. But yes, the real solution would be for this to be fixed.
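
In other words, something along these lines (a sketch; service, network, and image names are illustrative), and then referring to the nodes only by their service names:

  docker service create --name node1 --network some-overlay $image
  docker service create --name node2 --network some-overlay $image
  # no --hostname; other containers reach them as "node1" / "node2"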


jdelamar commented Aug 25, 2017

Oh, well that is a bit of a party pooper I guess :(

I am deploying a Cassandra cluster with docker-compose, and I can't really find a way to hack around this using docker swarm; we can't seem to be able to leverage docker service scale. It almost works.

This issue will likely break any clustering mechanism that relies on node identity to provide a seed or contact point, and it makes docker swarm unusable for all but the simplest use cases (scaling NGINX, I guess :).

Has anyone found a workaround that could work in conjunction with docker-compose v3? I will continue trying stuff, but having this issue prioritized would be a great enabler for a lot of use cases!

Of course, I could always revert to spelling out my deployment precisely, one entry per service.


aki-k commented Sep 10, 2019

I'm seeing this same behaviour with docker-ce-19.03.1-3.el7.x86_64.rpm in CentOS 7.
Docker's built-in DNS service returns a different IP address for the container than is configured in the container's /etc/hosts.

Edit: I'm using hostname: in Docker compose yaml. Maybe I'll remove it then.


tartemov commented Jun 11, 2020

I use redis-sentinel in docker swarm, and this bug also breaks the HA feature: when sentinel tries to promote a master, it breaks the whole cluster, because the IP returned by DNS doesn't match the container's IP.

Is there some workaround?

PS: I don't use hostname in my compose file.

Docker version: 19.03.8


pwFoo commented Feb 6, 2021

Got the same problem since the docker update to 20.3.x...
It's still present after downgrading to the previous version.

container ip

inet 10.0.10.6/24

dns resolution of service name

Address: 10.0.10.5

Found a simple solution by changing the endpoint mode to dnsrr. It's a single-replica (replicas: 1) service, and it should be fixed once the VIP is replaced:

    deploy:
      endpoint_mode: dnsrr

Was that changed with docker 20.x ?!
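
For reference, the CLI equivalent of that compose setting should be the --endpoint-mode flag (a sketch; service, network, and image names are illustrative):

  docker service create --name myservice --endpoint-mode dnsrr --network some-overlay $image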


Vanav commented Aug 16, 2021

Using swarm mode, if I set the container's hostname to the same as the service name and use the default endpoint_mode: vip, then I get multiple extra IPs when resolving the service name:
docker-compose.yml:

services:
  fpm:
    hostname: fpm
    deploy:
      replicas: 2

Within any container:

# getent hosts fpm
10.0.5.3        fpm
10.0.5.4        fpm
10.0.5.2        fpm

10.0.5.2 is the virtual IP; the others are the containers' IPs. This is a bug, because only the single virtual IP is expected here.
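
One way to cross-check the single VIP that is expected here (a sketch; "fpm" may actually be prefixed with the stack name when deployed from a compose file):

  # on a manager node: the VIP(s) assigned to the service
  docker service inspect fpm --format '{{range .Endpoint.VirtualIPs}}{{.Addr}} {{end}}'

  # inside a container on the same network: the individual task IPs
  getent hosts tasks.fpm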


GCSBOSS commented Nov 1, 2022

I'm using swarm with a default overlay network, and a few days ago some of the services received a second, buggy IP address. When I run dig on the hostname of such a service, I can see both VIPs. When I call them, though, only one of them works.
