docker service update messes up VIP tables and "tasks" DNS entries #26772

Closed

evanp opened this issue Sep 21, 2016 · 23 comments

@evanp commented Sep 21, 2016

Multiple docker service update calls make the VIP tables for the overlay network incorrect, and mess up the DNS lookups for tasks.<service name>.

Description

I noticed connectivity problems between services in my cluster. By launching a terminal and using curl and dig to examine the service names and "tasks" round-robin names, I realized that the map of IP addresses was incorrect.
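
For reference, opening a shell inside the terminal service's task container looks roughly like this (a sketch; it assumes the task happened to land on test20 — check docker service ps terminal for the actual node):

# Find the terminal task's container on the node it landed on.
docker $(docker-machine config test20) ps -q -f name=terminal
# Suppose it prints d3026eb35a1d; exec a shell inside it:
docker $(docker-machine config test20) exec -it d3026eb35a1d bash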

Steps to reproduce the issue:

  1. Create a 3-node swarm with a single manager and an encrypted overlay network testnet. I used docker-machine with the digitalocean driver.
parallel docker-machine create --driver digitalocean --digitalocean-access-token xxxxxxxxxx --digitalocean-image ubuntu-16-04-x64 --digitalocean-region sfo1 --digitalocean-private-networking --digitalocean-size 512mb test\{\} ::: 20 21 22
docker $(docker-machine config test20) swarm init --advertise-addr xx.xx.xx.xx
docker $(docker-machine config test21) swarm join --token SWMTKN-1-xxxxxxx xx.xx.xx.xx:2377
docker $(docker-machine config test22) swarm join --token SWMTKN-1-xxxxxxx xx.xx.xx.xx:2377
docker $(docker-machine config test20) node ls
# Output:
#ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
#4qfqe8ru5z5afme3003m4luk5 *  test20    Ready   Active        Leader
#ajaw2v30hidqgahivczw146lf    test21    Ready   Active        
#auf08eohiu8lbx210piriqi7w    test22    Ready   Active        
docker $(docker-machine config test20) network create --driver overlay --opt encrypted testnet
  2. Create three simple web servers that show HTML with a line of text defined in an env var, and a terminal to query them.
for i in `seq 11 13`; do docker $(docker-machine config test20) service create --network testnet --name web${i} --env LINE="Test server ${i}" --replicas 3 fuzzyio/show-line; done
docker $(docker-machine config test20) service create --network testnet --name terminal ubuntu sleep 365d
  3. Within the terminal, query the web11 and tasks.web11 (and 12 and 13) DNS entries with dig, and check the output from curl.
for i in `seq 11 13`; do curl http://web${i}/; done
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 11</p>
  </body>
</html><!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 12</p>
  </body>
</html><!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 13</p>
  </body>
</html>
root@d3026eb35a1d:/# dig tasks.web11

; <<>> DiG 9.10.3-P4-Ubuntu <<>> tasks.web11
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27789
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;tasks.web11.           IN  A

;; ANSWER SECTION:
tasks.web11.        600 IN  A   10.0.0.3
tasks.web11.        600 IN  A   10.0.0.4
tasks.web11.        600 IN  A   10.0.0.5

;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Wed Sep 21 08:09:07 UTC 2016
;; MSG SIZE  rcvd: 110

root@d3026eb35a1d:/# dig tasks.web12

; <<>> DiG 9.10.3-P4-Ubuntu <<>> tasks.web12
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15666
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;tasks.web12.           IN  A

;; ANSWER SECTION:
tasks.web12.        600 IN  A   10.0.0.9
tasks.web12.        600 IN  A   10.0.0.7
tasks.web12.        600 IN  A   10.0.0.8

;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Wed Sep 21 08:09:11 UTC 2016
;; MSG SIZE  rcvd: 110

root@d3026eb35a1d:/# dig tasks.web13

; <<>> DiG 9.10.3-P4-Ubuntu <<>> tasks.web13
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35157
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;tasks.web13.           IN  A

;; ANSWER SECTION:
tasks.web13.        600 IN  A   10.0.0.13
tasks.web13.        600 IN  A   10.0.0.11
tasks.web13.        600 IN  A   10.0.0.12

;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Wed Sep 21 08:09:13 UTC 2016
;; MSG SIZE  rcvd: 110
  4. Do multiple service update calls per service. I did 19 updates, just changing the LINE environment variable.
for j in `seq 2 20`; do for i in `seq 11 13`; do docker $(docker-machine config test20) service update --env-add LINE="Test server ${i} update ${j}" web${i}; done; done
  5. Within the terminal service task container, again use curl and dig to review the web11 and tasks.web11 DNS entries and the web output.

Describe the results you received:

Lookup on web11 remained correct, but tasks.web11 has far too many IP addresses for scale=3 service.

dig tasks.web11

; <<>> DiG 9.10.3-P4-Ubuntu <<>> tasks.web11
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23513
;; flags: qr rd ra; QUERY: 1, ANSWER: 8, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;tasks.web11.           IN  A

;; ANSWER SECTION:
tasks.web11.        600 IN  A   10.0.0.11
tasks.web11.        600 IN  A   10.0.0.8
tasks.web11.        600 IN  A   10.0.0.3
tasks.web11.        600 IN  A   10.0.0.16
tasks.web11.        600 IN  A   10.0.0.14
tasks.web11.        600 IN  A   10.0.0.15
tasks.web11.        600 IN  A   10.0.0.4
tasks.web11.        600 IN  A   10.0.0.5

;; Query time: 0 msec
;; SERVER: 127.0.0.11#53(127.0.0.11)
;; WHEN: Wed Sep 21 08:11:53 UTC 2016
;; MSG SIZE  rcvd: 245
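
A quick way to quantify the drift (a sketch; it assumes jq is available on the manager):

# Inside the terminal container: count the A records behind the tasks name.
dig +short tasks.web11 | wc -l     # 8 here
# On a manager: the declared scale, for comparison (3).
docker service inspect web11 | jq '.[0].Spec.Mode.Replicated.Replicas'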

curl sporadically failed to connect, or connected to web servers belonging to other services.

for i in `seq 1 10`; do curl --connect-timeout 1 http://web11/; done   
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 11 update 19</p>
  </body>
</html>curl: (28) Connection timed out after 1001 milliseconds
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 12 update 20</p>
  </body>
</html><!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 13 update 20</p>
  </body>
</html><!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 11 update 20</p>
  </body>
</html><!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 11 update 20</p>
  </body>
</html>curl: (28) Connection timed out after 1000 milliseconds
curl: (28) Connection timed out after 1001 milliseconds
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Show line</title>
  </head>
  <body>
  <h1>Show line</h1>
  <p>Test server 11 update 19</p>
  </body>
</html>curl: (28) Connection timed out after 1001 milliseconds

Describe the results you expected:

At scale=3, a lookup on tasks.web11 should return 3 IP addresses.

And the curl results (using the web11 name, which points to the VIP) should only return HTML from the server 11 service task containers.

Additional information you deem important (e.g. issue happens only occasionally):

The /proc/net/ip_vs output is attached.

I think this situation can arise with the ingress overlay network, too.

proc-net-ip_vs.txt
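
For anyone capturing the same data: the table can be read from inside the overlay network's namespace, roughly like this (a sketch; the namespace filename is per-host, typically 1- followed by the network ID — list /var/run/docker/netns to find it):

# On a node: locate the overlay network's namespace, then dump the IPVS table.
ls /var/run/docker/netns/
nsenter --net=/var/run/docker/netns/1-xxxxxxxxxx cat /proc/net/ip_vs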

Output of docker version:

Client:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        Thu Aug 18 05:33:38 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.1
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   23cf638
 Built:        Thu Aug 18 05:33:38 2016
 OS/Arch:      linux/amd64

Output of docker info:

Containers: 16
 Running: 3
 Paused: 0
 Stopped: 13
Images: 1
Server Version: 1.12.1
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 42
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: null host bridge overlay
Swarm: active
 NodeID: 4qfqe8ru5z5afme3003m4luk5
 Is Manager: true
 ClusterID: 5blzvlhz9l1f1uvpouwa9yzht
 Managers: 1
 Nodes: 3
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
 Node Address: 104.131.144.235
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 4.4.0-38-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 488.5 MiB
Name: test20
ID: DGRP:JBSI:O6PC:G2WE:ZGSI:EXKT:PWB6:WT5M:BVTQ:WYTY:BFWH:7UZH
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: evanp
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Labels:
 provider=digitalocean
Insecure Registries:
 127.0.0.0/8

Additional environment details (AWS, VirtualBox, physical, etc.):

@xiaods (Contributor) commented Sep 21, 2016

I found existing reports that look very close to the issue you're reporting:

#25394
#26480

@evanp (Author) commented Sep 21, 2016

@xiaods almost, but not quite. #26480 says the names aren't available on different hosts. That's not what's happening with this issue; the names are available on all hosts. #25394 says that the round-robin isn't routing to tasks on different hosts; I'm not checking that in this issue.

Neither of those issues do an update, so I think this issue is different from both.

One thing to note: I just tried updating my test servers to 1.13-dev, and doing a few dozen updates, and the problem does not occur! I see a small list of IPs in tasks.web11, and a bunch of curl calls come back correctly.

I'm going to try testing a little more to see if I can make this happen again in 1.13-dev, but if not I'll close this bug.

@xiaods (Contributor) commented Sep 21, 2016

@evanp I also ran into this annoying VIP-related issue, so I'll wait for your testing results.

@thaJeztah (Member) commented:

ping @mrjana ptal

@mrjana (Contributor) commented Sep 21, 2016

@evanp Thanks for taking the effort to test this with docker/docker master code. Many more fixes were added there; please try to reproduce the problem on master and let us know.

@evanp (Author) commented Sep 21, 2016

@mrjana We upgraded our 90-node cluster this morning and saw a great improvement.

However, later in the day after a few updates, we're again seeing this error. I'm going to see if I can get some more detail.

@evanp (Author) commented Sep 22, 2016

Also, as of right now the only way I see to repair this situation once it has arisen is to burn the cluster. It might be possible to just remove all the services, remove the network, and then re-add the network and all the services.

It would be nice if you could use ipvsadm to edit and tune the ip_vs network, but I haven't been able to make that work. Even if I manage to nsenter into the right network namespace on one node, and make changes to the ipvs tables, it doesn't propagate to other nodes.
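
For the record, the kind of per-node surgery that doesn't propagate looks roughly like this (a sketch; the namespace name, fwmark, and address are illustrative — swarm's IPVS virtual services are fwmark-based):

# List the IPVS virtual services and their real-server backends on one node.
nsenter --net=/var/run/docker/netns/1-xxxxxxxxxx ipvsadm -L -n
# Delete a stale backend from a fwmark-based virtual service (local effect only).
nsenter --net=/var/run/docker/netns/1-xxxxxxxxxx ipvsadm -d -f 257 -r 10.0.0.16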

Probably the most frustrating part of this situation is that the data is available and correct in requests like docker service inspect <serviceid> or docker inspect <taskid>. I wrote a quick script for extracting that data: https://gist.github.com/evanp/8be3e3536dcb27bf16f8d47cb4c93cf5 . It would be interesting to put together a script that extracts that data and then uses ipvsadm to sync it into the ip_vs tables.
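
In the same spirit as that gist, a minimal sketch of pulling the correct addresses out of the API (assumes jq is available; on older CLIs without docker service ps -q, parse the ID column instead):

# The service's VIP on each attached network.
docker service inspect web11 | jq -r '.[0].Endpoint.VirtualIPs[].Addr'
# The overlay address of every task -- the set tasks.web11 should resolve to.
for t in $(docker service ps -q web11); do
  docker inspect $t | jq -r '.[0].NetworksAttachments[].Addresses[]'
done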

I'm going to retry the test scenario outlined above with 1.13-dev and see if I can replicate it and possibly get some debug information from logs when it occurs.

@evanp (Author) commented Sep 22, 2016

So, I spent some time this morning trying to replicate the error, and I couldn't do it in a test environment as described above.

My next step is to set the debugging flag on a Docker node in our production environment and then do an update on a service in that node. If it causes the same problem, we'll at least have the logs necessary to explain it.

@xiaods (Contributor) commented Sep 26, 2016

@evanp Waiting for your update.

@aluzzardi (Member) commented:

@evanp Thanks for the detailed information!

We're actually in the process of building 1.12.2-rc1 today which contains the fixes found in 1.13-dev.

In a few hours you should be able to try that version (and the official 1.12.2 should be out in a matter of weeks).

@MichaelW-SD commented:

Don't know if it helps, but I am seeing this problem in a situation where docker service update fails. I have a service that consists of two containers, but after a failed update (due to errors in the image), dig tasks.<service> shows me 3 IP addresses. Removing and recreating the service does not fix the problem.
The most annoying part is that there is no manual way to fix an issue in the internal DNS.

@evanp (Author) commented Sep 29, 2016

@MichaelW-SD what do you mean by "fail"? How does the update fail when there are errors "in the image"?

@clhlc (Contributor) commented Sep 29, 2016

In my test environment I see the same error when updating a service. In my case I also recreated the service a few times, after which dig returned a different set of VIPs. Is this fixed in Docker 1.13?

@woyorus commented Sep 29, 2016

@clhlc looks like 1.12.2-rc1 fixed the problem in my particular case.

@thaJeztah (Member) commented:

If others are able to test whether 1.12.2-rc1 fixes this, that would be great (note, of course, that it's an RC, so we generally don't recommend testing it on critical / production systems): https://github.com/docker/docker/releases/tag/v1.12.2-rc1

@MichaelW-SD commented:

@evanp In my case I had a faulty image, so containers based on that image would not start. docker service update keeps trying forever to update the containers but never succeeds, so eventually you have to stop the update process. After that, the DNS was not cleaned up. After I fixed the image and started the service again, I had 2 containers but 3 IP addresses: 2 addresses were correct and one was a leftover from before the update.

I will test this with 1.12.2-rc1.

@MichaelW-SD commented:

1.12.2-rc1 fixed this for me.

@thaJeztah (Member) commented:

@evanp was this fixed for you as well on 1.12.2-rc?

@evanp (Author) commented Oct 1, 2016

So, we still saw this error with 1.13-dev as of this morning. We've been unable to get any purchase on the bug, and so we're regretfully moving to another clustering tool. I'm happy to help out with this bug if there's anything further I can do, but we no longer have a production cluster running Docker 1.12.x in swarm mode.

Also, feel free to close this bug if there aren't others seeing the same problem.

@aluzzardi (Member) commented:

/cc @mrjana

@mrjana (Contributor) commented Oct 1, 2016

@evanp When you tried this morning's 1.13-dev, can you tell me what failures you had? Did you get incorrect tasks.<svc> responses, connectivity problems, or both? Although you described your problem in detail when opening this issue, it may not be exactly the same issue you are experiencing now; that's why I am asking for the details. Can you give a simple set of repro steps for how you encountered issues with the latest 1.13-dev? Also, if you can, please attach daemon logs from one of the problem nodes. Thanks much for your help with this.

@garthk commented Oct 5, 2016

This issue smells like it's within cooee of #25266. My temperature is certainly within cooee of whoever on @evanp's team declared Swarm unfit for production. I'll try pounding docker service update. Meanwhile, I trust @mrjana is prowling the corridors looking for whoever wrote all this code without enough instrumentation to troubleshoot issues.

@mrjana (Contributor) commented Oct 12, 2016

@garthk The instrumentation is really in the daemon error logs, and, as you can see, every issue fixed in 1.12.2 was diagnosed from exactly that kind of instrumentation. That is why I am asking for daemon logs. Do you have any from problem nodes, so that we can confirm whether this is already fixed in 1.12.2?
