Swarm VIP stops working on a node #25693
Could you provide some more information, as requested in the issue template, and if possible, steps to reproduce?
Without this, it'll be hard to find if there's a bug here, or to resolve it. |
Sorry for the lack of info.
I'm trying to run a Swarm on four identical physical servers; they are on the same network, in the same DC. Maybe I could run some diagnostics? |
@vasily-kirichenko We fixed a bunch of issues in master. Could you please try 1.12.1-rc1 (https://github.com/docker/docker/releases/tag/v1.12.1-rc1) and let us know how it goes? |
If I try to join the swarm, I get the following error:
|
|
@mavenugo I installed 1.12.1-rc1 on all four of my nodes. Three of them formed a Swarm, but when I try to join the remaining node, I get this error:
|
Is the clock set correctly on all the nodes? On 14 Aug 2016 3:03 p.m., "Vasily Kirichenko" notifications@github.com
|
@justincormack ooooh. It's not. Great point. Will fix it and see if it helps. Thanks! |
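For anyone wanting to run the same check, here is a minimal Python sketch of comparing clocks across nodes. It assumes each node's current time has already been collected as a Unix timestamp at roughly the same moment (for example by running `date +%s` on each host). The function name `clock_skew`, the 60-second threshold and the sample timestamps are illustrative assumptions, not values taken from this issue or from Docker.

```python
def clock_skew(timestamps, max_skew_seconds=60.0):
    """Return (spread, outliers) for a mapping of node name -> Unix time.

    spread   : difference between the latest and earliest clock, in seconds
    outliers : nodes whose clock differs from the median by more than
               max_skew_seconds, mapped to their signed offset in seconds
    """
    if not timestamps:
        raise ValueError("no timestamps given")
    ordered = sorted(timestamps.values())
    # Use a middle value as the reference so one bad clock does not shift it.
    median = ordered[len(ordered) // 2]
    spread = ordered[-1] - ordered[0]
    outliers = {
        node: ts - median
        for node, ts in timestamps.items()
        if abs(ts - median) > max_skew_seconds
    }
    return spread, outliers


# Made-up sample data: h2 is about two hours behind the other nodes.
sample = {
    "h1": 1471183380.0,
    "h2": 1471183380.0 - 7200,
    "h3": 1471183381.0,
    "h4": 1471183382.0,
}
spread, outliers = clock_skew(sample)
print("spread (s):", spread)
print("outliers:", outliers)
```

The timestamps are only as good as the moment they were sampled, so a few seconds of apparent skew can come from collecting them one host at a time.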
Done. Swarm is up and running:
The service, which has a single replica, is running as well:
Trying to access the service via each node:
So I can successfully access the service via nodes #3 and #4, but not via nodes #1 and #2, even though Docker shows all the nodes are OK. |
firewalld is stopped and disabled on all the machines. What should I check next? |
ifconfig shows about 10 interfaces like this:
Is it normal? |
@vasily-kirichenko You seem to be getting some http body when you curl the hosts where the request is not working. Where is that coming from? |
@mrjana I believe it's from the corporate proxy server. I added all four node IPs to |
The service VIP port is not open on machines h1 and h2:
However, it is open on h3 and h4:
Current service state:
|
If I increase the number of containers such that a container is running on h1, then port 33030 is open on that node. If I decrease the number of containers so that the container running on h1 is shut down, then the port is immediately closed. However, this does not work for node h2: even if several containers are running on it, port 33030 is not open. |
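The per-node port check described above can be sketched in Python as follows. This is a hedged example, not a Docker tool: `port_is_open` and `probe_nodes` are hypothetical helper names, and the 2-second timeout is an arbitrary choice. It opens a plain TCP connection, so it only reports whether something accepts connections on that port from the machine where the script runs.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def probe_nodes(hosts, port, timeout=2.0):
    """Map each host to whether the given TCP port accepts connections."""
    return {host: port_is_open(host, port, timeout) for host in hosts}
```

With the host names used in this thread, the call would be `probe_nodes(["h1", "h2", "h3", "h4"], 33030)`. Because this is a raw socket rather than an HTTP client, an HTTP proxy configured in the environment should not influence the result.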
It turns out the DOCKER-INGRESS iptables chain does not exist on nodes h1 and h2 (the problematic ones).
Any ideas? |
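To repeat the chain check on several nodes, a small Python sketch like the one below can scan the text printed by `iptables -S` for a chain definition. The helper names `defined_chains` and `has_chain` are hypothetical, and both sample strings are hand-written illustrations, not output captured from the nodes in this issue. Which table to inspect (for example `iptables -t nat -S`) is also an assumption, since the thread does not say.

```python
def defined_chains(rules_text):
    """Collect chain names from `iptables -S` style output.

    In that format, built-in chains appear as `-P <chain> <policy>` and
    user-defined chains as `-N <chain>`.
    """
    chains = set()
    for line in rules_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] in ("-N", "-P"):
            chains.add(parts[1])
    return chains


def has_chain(rules_text, chain="DOCKER-INGRESS"):
    """Return True if the named chain is defined in the given rules text."""
    return chain in defined_chains(rules_text)


# Hand-written samples for illustration only.
SAMPLE_WITH_CHAIN = """\
-P PREROUTING ACCEPT
-P POSTROUTING ACCEPT
-N DOCKER
-N DOCKER-INGRESS
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER-INGRESS
"""

SAMPLE_WITHOUT_CHAIN = """\
-P PREROUTING ACCEPT
-P POSTROUTING ACCEPT
-N DOCKER
"""

print(has_chain(SAMPLE_WITH_CHAIN))
print(has_chain(SAMPLE_WITHOUT_CHAIN))
```

On a real node the input text could come from running the iptables command with root privileges and capturing its standard output, for example through `subprocess.run(..., capture_output=True, text=True)`.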
OK, I destroyed the swarm and recreated it from scratch, which seems to have helped: DOCKER-INGRESS appears on all the nodes and the service is available via any of them. |
Everything works OK. I think the problem was caused by unsynchronized clocks (~2 hours of divergence). Closing this.