
Static/Reserved IP addresses for swarm services #24170

Open
F21 opened this issue Jun 30, 2016 · 104 comments
Labels: area/networking, area/swarm, kind/feature

Comments

F21 (Contributor) commented Jun 30, 2016

There are some things that I want to run on Docker but that are not fully engineered for dynamic infrastructure. Ceph is an example. Unfortunately, its monitor nodes require a static IP address; otherwise, they break if restarted. See here for background: ceph/ceph-container#190

docker run has --ip and --ip6 flags to set a static IP for the container. It would be nice if we could do the same when creating a swarm service, to take advantage of rolling updates and restarts on failure.

For example, when we create a service, we could pass in --static-ip and --static-ip6 options. Docker would assign a static IP to each task for the life of the service. That is, as long as the service exists, those IP addresses would be reserved and mapped to each task. If the service scales up or down, more IP addresses are reserved or relinquished. The IP address is then passed into each task as an environment variable such as DOCKER_SWARM_TASK_IP and DOCKER_SWARM_TASK_IP6.
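For illustration only, here is the existing single-container form next to a hypothetical service-level equivalent (the --static-ip/--static-ip6 flags are the proposal above and do not exist today; network and image names are placeholders):

# works today: fixed address for a standalone container on a user-defined network
docker network create --subnet 172.25.0.0/16 app-net
docker run -d --net app-net --ip 172.25.0.10 my-image

# proposed (hypothetical): reserve fixed per-task addresses for a swarm service
docker service create --name my-service --network app-net \
  --static-ip 172.25.0.10 --static-ip6 fd00::10 my-image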


nktl commented Jul 5, 2016

Agreed, this is crucial functionality for some services. I suspect it might be a bit complicated to implement in the new swarm model, as for every 'service' at least two IP addresses exist: one virtual LB IP for the service itself and then N additional IPs, where N = number of replicas.

I think what we really need is an option to deploy a service without replicas and the LB layer, just with a simple static IP configured (but still managed by swarm, with clustering, HA, failover, etc.).

@thaJeztah added the kind/feature, area/networking, and area/swarm labels Jul 15, 2016

dminca commented Oct 17, 2016

This could be very useful, as I'm currently struggling with an ActiveMQ Network of Brokers, and the activemq.xml configuration file requires static IP addresses of the brokers in the case of static discovery ...

@azzeddinefaik

I am facing the same issue! Is there any update on setting a static IP for a container in swarm mode?

@thaJeztah (Member)

Please don't leave +1 comments on issues; you can use the 👍 emoji on the first
comment to let people know you're interested in this, and use the subscribe button
to keep informed on updates.

Implementing this feature is non-trivial for a number of reasons:

  • There are two possible feature requests here:
    • Allow a static IP for the service (Virtual IP)
    • Allow a static IP for the container (task)
  • When looking at static IP addresses for containers, things become complicated,
    because a service can be backed by multiple tasks (containers). Specifying a
    single IP address for that won't work; also, what to do when scaling, or updating
    a service (in which case new tasks are created to replace the old ones)?

Just +1's don't help get this implemented; explaining your use case, or
helping find a design to implement this, on the other hand, would be useful.

Also see #29816, which has some information for one use-case.

I removed the +1 comments, and copied the people that left a +1 below so that they're still "subscribed" to the issue;

@a-jung
@adiospeds
@darkstar42
@dzx912
@GuillaumeM69
@isanych
@jacksgt
@joseba
@martialblog
@nickweedon
@olivpass
@prapdm
@wubin1989


jacksgt commented May 11, 2017

Essentially, I'd like to use Docker Swarm's routing mesh as a load balancer. If you could assign a (public) IP to a Docker Swarm service (not an individual container!), you could simply add the IP at your DNS provider (e.g. CloudFlare). For example, I could then run my S3 service and web server with:

docker service create --hostname s3.docker.swarm --publish 80 --ip dead::beef minio/minio
docker service create --hostname www.docker.swarm --publish 80 --ip cafe::babe nginx

Swarm then tells the node that has the appropriate IP in its subnet to forward all requests for dead::beef to one of the s3.docker.swarm containers and cafe::babe to one of the www.docker.swarm containers (within the routing mesh).

This way, one could also run multiple services on the same port within the same Swarm cluster (assuming different IP addresses and different domains). This is currently only possible with an additional load balancer such as HAProxy (which, for example, only supports TCP).

See also: https://devops.stackexchange.com/questions/1130/assign-dns-name-to-docker-swarm-containers-with-public-ipv6

@blueskyjunkie

Some licensed applications require a static IP for the license. I have licensed applications that I would like to deploy to Docker Swarm that require either static IP or MAC address for licensing.

I think it would be initially acceptable to state a limitation that specifying either a static IP or MAC for a service implies that the scale must be one.

Perhaps when scaling, the new replicas would fail to start with a "static IP required" error message, informing the user that they need to go back and provide a static IP for the "static IP" service to start correctly?

So in docker service ls you would see 3/5 replicas started, and the remaining 2 replicas would show the "static IP" error message until resolved.


drnybble commented Jun 28, 2017

This may help some people: if you combine the hostname setting for a service with endpoint-mode set to dnsrr, you get a hostname that resolves to the container IP. This may make some software happy to run in the swarm.
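For anyone wanting to try this, a minimal sketch of that combination could look like the following (service, network, and image names are just examples):

docker network create -d overlay backend
docker service create --name zoo1 --hostname zoo1 \
  --endpoint-mode dnsrr --network backend zookeeper

With dnsrr there is no VIP; other containers on the backend network resolve zoo1 directly to the task IP(s).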

@gabrielfsousa

Any news? Can we assign a static IP to a service?


rohithpeddi commented Oct 25, 2017

I searched around and tried different possibilities, and I was able to assign static IPs to containers. I am pretty new to docker, so I don't know if this is the right way, though.

I created a swarm. On the manager, I created an attachable overlay network with a subnet.

docker network create -d overlay --subnet=10.0.9.0/22 --attachable overlay-network

My docker compose:

version: "3.2"

services:
  hadoop-master:
    image: devhadoop-master
    container_name: hadoop-master
    .
    .
    .
    networks:
      overlay-network:
        ipv4_address: 10.0.9.22

networks:
  overlay-network:
    external: true

Note the networks configuration where external is true; that is what made it work.
Though I created the overlay-network with the attachable flag, the service on my worker was not able to connect to it. So I ran docker run --rm --net overlay-network alpine sleep 1d to make overlay-network discoverable; then my worker service was able to connect to it.
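As a quick check of which address a container actually received, something along these lines should work (container name as in the compose file above):

docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' hadoop-master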

@simingweng

We recently ran into a situation where the ability to reserve a VIP for a service would really help.

We have a zookeeper service named "zoo1", and other services connect to it using "zoo1:2181". We found that sometimes when we shut down and restart the "zoo1" service, the IP address resolved for the host name "zoo1" changes. The reconnection mechanism in the zookeeper client library does not redo the DNS lookup when it enters its retry loop; instead, it holds onto the previous IP address. As a result, even after we bring the "zoo1" service back online, the other services are never able to re-establish their connection to zookeeper.

By the way, under what exact circumstances would the VIP of a service change?

@softcombiz

@cpclass your example doesn't work

tianon (Member) commented May 17, 2018

I don't see macvlan mentioned here anywhere, so here goes! I think this would be immensely useful for setting up the same macvlan network on all nodes in a cluster, then being able to give a service a static IP address that can then roam between machines in the cluster but keep the same IP address via macvlan (so --ip would have to be a per-network option, which is probably obvious but worth being explicit about).

(As it stands, services can't easily both talk to other services in the cluster and do multicast with other machines on the host network, which is what this sort of thing would enable.)

(Edit: there's a great post about how macvlan in a cluster might work that describes exactly the thoughts I had at #25303 (comment) -- using either an IP address range or an interface glob to auto-match the appropriate parent interface for each host)

macvlan may be a better solution than host networking in this case, but it's not supported by services. Also, a macvlan network needs to be set up on each host in the cluster as it is implemented right now. Maybe some cluster-wide setup could be achieved by giving the "docker network" command an interface match condition, like an IP network address or an interface name; then a macvlan network could be set up on all matching hosts in just one command. For the "docker service" command, the network specification could be handled as a constraint so that containers are not scheduled on a host that does not have the specified network defined.

@Julius2342

To add a use case: we want to add some services like

  • pypi-daemon for hosting own packages,
  • smtpd for incoming email,
  • dnsd,
  • ftpd (with some plugins which talk to our internal APIs)
  • munin
  • redmine
  • ...

as a dedicated services stack within our swarm. Some of the services contain patches or are even our own implementations. Ideally they are tested by AQA and deployed and torn down automatically. Some need to talk to other stacks, like the production stacks.

Some of these services need to be reachable with a fixed IP from internal and/or external networks.


RayBa82 commented May 22, 2018

another use case:

use a container as DNS server for other containers.

(this could be easier to achieve if you could use a docker container hostname as dns server entry via --dns)

@jsenecal

Another use case:
DualStack Web services with multiple replicas in Docker Swarm. Being able to assign static v4 and v6 IPs would be great for this :)

@MoominCat

Another use case:
I have an application (containerised) that reads from IP based sensor devices.
This application can only be configured with the IP address of the sensor devices.
For testing, I want to run simulated sensor devices (containerised) on fixed IP addresses.

@mateodelnorte

Use case: Running a STUN or TURN server, or other similar type of server used to help p2p clients find each other to accomplish p2p discovery and NAT Traversal. This is required for technologies like WebRTC.


ashak commented Aug 24, 2018

Use case: a Samba AD domain controller; with a changing IP it just makes a mess of its DNS zone file and is not usable.


khba commented Aug 29, 2018

Another use case: setting up a MySQL cluster (mgm + ndb + sql nodes) requires putting the nodes' IPs in the conf files.

@johny-mnemonic

@khba that applies to a lot of other clustered things. For example, the Vertica database also uses static IPs for all nodes in the cluster.
And the reason I found this issue is that I am playing with a distributed filesystem called LizardFS, which would also benefit from having static IPs for the nodes holding the data chunks...


FrostbyteGR commented Nov 27, 2020

Let me provide another example.
Please imagine:

1. The Google search host www.google.com gets resolved by the Google DNS servers to 100 IP addresses with a TTL of 600 seconds;

2. then Google decides to scale the HTTP servers down to 50 containers;

3. the Google DNS servers now respond to DNS requests with the new set of IP addresses for the HTTP servers.

From your point of view that is correct behavior. But not from mine.
Because all browsers in the world knew the IP addresses of the 100 HTTP containers before the scale-down, and they will keep trying to reach HTTP containers that are no longer available and get connection timeout errors.

From your point of view, "the bug is located in the browser client and all browsers in the world need to be fixed".
From my point of view, if DNS informed every host in the world that these IP addresses serve HTTP traffic for 600 seconds, then all of those IPs must keep working until the DNS cache expires on the clients.

I want to ask you:
if we cannot scale down the containers that serve the Google site until the cache expires in the browser clients, why do you think a SIP client must behave differently?

In the real world, DNS servers expose the IP addresses of load balancers. To be closer to the real world, please replace the term "HTTP containers" with "load balancer containers".

Also, to be closer to real-world usage, you can scale down from 2 containers to 1 container. In that case you will get 50% failed requests. So you simply cannot scale down any service, because that would require implementing connection-timeout handling on the client side.

Allow me to introduce you to the standard operating procedure for such things:

  1. First you remove the records
  2. Then you wait for the TTL
  3. Then you scale down / remove IPs (optionally you can point those IPs temporarily to another host)

If you leave your DNS alive and you scale down / remove IPs, everything is going to fail, no matter if it's docker or whatever.
If you can't remove/manage records that are critical to you, because they originate from docker's internal DNS, you shouldn't be using a DNS service you cannot fully control in the first place. For such things you need your own DNS solution.

Also setting TTL to 600 when dealing with containers is not very smart to say the least 

You can say this about all admins that use Kubernetes (EKS, Google and other clouds); all of these use a 600-second TTL for the internal Kubernetes DNS server.

Software that doesn't retry in such a case is really bad and should be fixed.

It looks as if you have no real service-redundancy experience, because errors like "connection timeout" can also break a service. You can try opening any website that is hosted on two IP addresses (servers) and then shut down one of the servers.

Then you get to explain to customers why the website does not work.

To me it looks like you do not have real administration experience in general. Before moving on to concepts like service redundancy, scaling and containerization, you have to understand how all of the platform's underlying services work and then you also have to understand the services you're going to deploy yourself. If you don't know how to manage those services individually, you cannot possibly hope to manage them in a more complex environment. It doesn't matter where the DNS is; it's still a DNS and the same principles apply.

Also setting TTL to 600 when dealing with containers is not very smart to say the least 

@johny-mnemonic please show me how you can change the default TTL value of the embedded docker swarm DNS server.
And then show how to change the TTL value only for the SIP server without touching the default TTL value of the embedded swarm DNS server.

Then I will decide whether your approach is smart or not.

Can you explain to me why you're on a holy war about DNS and its records' TTL, when these have nothing to do with my proposal of how static IP assignment could be implemented? The way I suggested it is purely static. It's not a range; it's a list of individual elements that can be whatever you want them to be (as long as it makes sense). The way they are assigned should be serial. That way you always know which replica gets which address on the list. And when you downscale or upscale, you still know which is which, because that's also done in a serial fashion. Downscaling doesn't take random replicas down: a downscale to 50 should take replicas 51-100 down. Upscaling doesn't create replicas in random order: an upscale by 25 should create replicas 51-75 in order. Then the matching (by index) array elements would be deassigned/assigned correspondingly. Everything should be predictable.

Stop derailing this topic.

@johny-mnemonic

Thanks @FrostbyteGR for clarifying it in a way that is hopefully clear even to someone who would think about using an internal Docker/Kube DNS for serving public records 😲
They also seem never to have heard of the "design for failure and nothing will fail" principle when defending a client that can't retry :-(

Anyway, as stated previously, I really like your design and would love to see it implemented in swarm.
I am just a bit afraid we are beating a dead horse here 😢


FrostbyteGR commented Nov 29, 2020

Anyway as stated previously I really like your design and would love to see it implemented in swarm.

Also, to avoid confusion and really showcase why this is supposed to be static/predictable, I believe I should explain in more detailed examples how my suggestion is supposed to work (I also added a range option to it, plus another temporary-solution suggestion):

Option 1. Require an amount of static IPs equal to that of the number of requested replicas

Individual assignment (aka: put in whatever you want, as long as it's valid):

replicas: 4

ipv4_address: { 192.168.1.17, 192.168.1.42, 192.168.1.23, 192.168.1.65 }

Case 1 - Deploying the stack
Replica 1 reserves and is assigned value on array index 0: 192.168.1.17
Replica 2 reserves and is assigned value on array index 1: 192.168.1.42
Replica 3 reserves and is assigned value on array index 2: 192.168.1.23
Replica 4 reserves and is assigned value on array index 3: 192.168.1.65

Case 2 - Downscaling to 2 replicas
Replica 4 is taken down, value on array index 3 remains reserved for Replica 4.
Replica 3 is taken down, value on array index 2 remains reserved for Replica 3.
Replica 2 remains as-is (up).
Replica 1 remains as-is (up).

Case 3 - Upscaling back to 3 replicas
Replica 1 remains as-is (up).
Replica 2 remains as-is (up).
Replica 3 is brought up, gets reassigned its reserved value on array index 2: 192.168.1.23
Replica 4 remains as-is (down), value on array index 3 still remains reserved for Replica 4.

Case 4 - One or more replicas crashes
Replica 1 remains as-is (up).
Replica 2 crashes; upon recovery (assuming new container name myservice.2.6ca62afb74dc0d7af370ff49c6), it gets reassigned its reserved value on array index 1: 192.168.1.42
Replica 3 remains as-is (up).
Replica 4 crashes; upon recovery (assuming new container name myservice.4.1ae8d532b78a48cab0f8a6fc2), it gets reassigned its reserved value on array index 3: 192.168.1.65

Range assignment:

I ultimately decided to also include this as an option in my suggestion, because it can operate the same way.
Plus, it saves writing hundreds of IP addresses on larger deployments.

replicas: 4

ipv4_address: { 192.168.1.21-192.168.1.24 }

The ipv4_address string gets translated into an array of 4 elements, but this time the values are in sequential order.
Resulting array from this range-input would be: { 192.168.1.21, 192.168.1.22, 192.168.1.23, 192.168.1.24 }

Case 1 - Deploying the stack
Replica 1 reserves and is assigned value on array index 0: 192.168.1.21
Replica 2 reserves and is assigned value on array index 1: 192.168.1.22
Replica 3 reserves and is assigned value on array index 2: 192.168.1.23
Replica 4 reserves and is assigned value on array index 3: 192.168.1.24

Case 2 - Downscaling to 2 replicas
Replica 4 is taken down, value on array index 3 remains reserved for Replica 4.
Replica 3 is taken down, value on array index 2 remains reserved for Replica 3.
Replica 2 remains as-is (up).
Replica 1 remains as-is (up).

Case 3 - Upscaling back to 3 replicas
Replica 1 remains as-is (up).
Replica 2 remains as-is (up).
Replica 3 is brought up, gets reassigned its reserved value on array index 2: 192.168.1.23
Replica 4 remains as-is (down), value on array index 3 still remains reserved for Replica 4.

Case 4 - One or more replicas crashes
Replica 1 remains as-is (up).
Replica 2 crashes; upon recovery (assuming new container name myservice.2.6ca62afb74dc0d7af370ff49c6), it gets reassigned its reserved value on array index 1: 192.168.1.22
Replica 3 remains as-is (up).
Replica 4 crashes; upon recovery (assuming new container name myservice.4.1ae8d532b78a48cab0f8a6fc2), it gets reassigned its reserved value on array index 3: 192.168.1.24

--ip6 (and its corresponding json option) should follow the same logic.

Option 2. Make --ip and --replicas arguments and corresponding json settings, mutually exclusive

This is best suited as a temporary solution to the problem, not a permanent one, until a proper solution arrives.

  • This effectively means that you can have one or the other.
  • You can't have both arguments/json-options together in the definition or deployment of a service/stack.
  • Defining --replicas (or its corresponding json option) should lock you out of the --ip argument (or its corresponding json option).
  • Defining --ip (or its corresponding json option) should lock you out of the --replicas argument (or its corresponding json option), and also enforces --replicas 1 (or replicas: 1 within the json).

Again, --ip6 (and its corresponding json option) should follow the same logic.
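Purely to visualize Option 2: none of these flags exist on docker service create today; this is only what the mutually exclusive form might look like (image and network names are placeholders):

# hypothetical: a single-task service pinned to fixed addresses (implies replicas=1)
docker service create --name myservice --network swarm_net \
  --ip 192.168.1.21 --ip6 fd00::21 my-image
# hypothetical: combining --replicas with --ip would be rejected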

@siddjellali

Any news here?

@hinorashi

Hi guys, any update? Is swarm going to reach its end of life despite what Mirantis said 🐶


MichaelBrenden commented Feb 5, 2021

Still evaluating how to best meet this aging need. Maybe others are facing this too.

Our simple need is:
(1) on a single host having multiple NICs, each NIC having multiple IPv4 addresses, and numerous services in complex arrangements -- despite its awesomeness, Docker is still a duckling in established systems
(2) make Docker Swarm answer/respond on only a few, specific IP addresses, not all IP addresses on the host; Traefik runs as a reverse proxy in its own swarmed container -- Traefik is the primary target of the specific IP addresses (trying to put Traefik in its own container outside of swarm causes its "internal" API and dashboard to become inaccessible)
(--) the problem seems related to Traefik's need to itself run swarmed for the api/dashboard to work
(--) the problem also seems related to Docker Swarm's difficulty (or outright inability?) to bind to specific IP addresses on a host with complex networking

Options
(A) Don't do it this way; instead, rent fresh host whose IP stack can be totally taken over by Docker Swarm
(B) Investigate macvlan which is claimed workable by those in a parallel universe ( see moby/libnetwork#2249 )
(C) See also https://stackoverflow.com/questions/27937185/assign-static-ip-to-docker-container
(D) Consider https://serverfault.com/questions/893647/publish-docker-swarm-services-on-specific-ip-addresses
(E) See also #29816
(F) See also https://success.mirantis.com/article/how-do-i-change-the-docker-gwbridge-address
(G) #25303
(H) https://stackoverflow.com/questions/52301256/traefik-ha-in-docker-swarm-mode
(I) https://tiangolo.medium.com/docker-swarm-mode-and-traefik-for-a-https-cluster-20328dba6232 (dockerrocks)
(J) https://tiangolo.medium.com/docker-swarm-mode-and-distributed-traefik-proxy-with-https-6df45d0c0fc0

We're seeing Docker Swarm 'take over' all the host's IP addresses via iptables (despite having defined a Docker Network thus:

#!/bin/bash
#! define custom docker network, intentionally limited to one public IP on host
docker network create \
  --subnet=172.20.20.0/24 \
  --gateway=172.20.20.1 \
  --driver overlay \
  --attachable \
  --opt "com.docker.network.bridge.name"="net2020" \
  --opt "com.docker.network.bridge.host_binding_ipv4"="1.2.3.4" \
  net2020
#! end

In other words, it's not the host IP binding; rather, Docker's iptables manipulation captures all port 80/443/etc. traffic on all IPs on the host. Therefore, this looks most immediately applicable (from D). Quoting user NewsNow1 -----

We had a need to publish separate docker swarm services on the same ports, but on separate specific IP addresses. Here's how we did it.

Docker adds rules to the DOCKER-INGRESS chain of the nat table for each published port. The rules it adds are not IP-specific, hence normally any published port will be accessible on all host IP addresses. Here's an example of the rule Docker will add for a service published on port 80:

iptables -t nat -A DOCKER-INGRESS -p tcp -m tcp --dport 80 -j DNAT --to-destination 172.18.0.2:80

(You can view these by running iptables-save -t nat | grep DOCKER-INGRESS).

Our solution is to publish our services on different ports, and use a script that intercepts dockerd's iptables commands to rewrite them so they match the correct IP address and public port pair.

For example:

service #1 is published on port 1080, but should listen on 1.2.3.4:80
service #2 is published on port 2080, but should listen on 1.2.3.5:80

We then configure our script accordingly:

###! cat /usr/local/sbin/iptables
#!/bin/bash

REGEX_INGRESS="^(.*DOCKER-INGRESS -p tcp) (--dport [0-9]+) (-j DNAT --to-destination .*)"
IPTABLES=/usr/sbin/iptables

SRV_1_IP=1.2.3.4
SRV_2_IP=1.2.3.5

ipt() {
echo "EXECUTING: $@" >>/tmp/iptables.log
$IPTABLES "$@"
}

if [[ "$*" =~ $REGEX_INGRESS ]]; then
START=${BASH_REMATCH[1]}
PORT=${BASH_REMATCH[2]}
END=${BASH_REMATCH[3]}

echo "REQUESTED: $@" >>/tmp/iptables.log

case "$PORT" in
'--dport 1080') ipt $START --dport 80 -d $SRV_1_IP $END; exit $?; ;;
'--dport 2080') ipt $START --dport 80 -d $SRV_2_IP $END; exit $?; ;;
*) ipt "$@"; exit $?; ;;
esac
fi

echo "PASSING-THROUGH: $@" >>/tmp/iptables.log

$IPTABLES "$@"

N.B. The script must be installed in dockerd's PATH ahead of your distribution's iptables command. On Debian Buster, iptables is installed to /usr/sbin/iptables, and dockerd's PATH has /usr/local/sbin ahead of /usr/sbin, so it makes sense to install the script at /usr/local/sbin/iptables. (You can check dockerd's PATH by running cat /proc/$(pgrep dockerd)/environ | tr '\0' '\012' | grep ^PATH).

Now, when these docker services are launched, the iptables rules will be rewritten as follows:

iptables -t nat -A DOCKER-INGRESS -d 1.2.3.4/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination 172.18.0.2:1080
iptables -t nat -A DOCKER-INGRESS -d 1.2.3.5/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination 172.18.0.2:2080

The result is that requests for http://1.2.3.4/ go to service #1, while requests for http://1.2.3.5/ go to service #2.

The script can be customised and extended according to your needs, and must be installed on all nodes to which you will be directing requests, and customised to that node's public IP addresses.
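For completeness (this is not part of the quoted answer): publishing the two services on those alternate ingress ports would just use the normal --publish form, roughly like this, with placeholder image names:

docker service create --name svc1 --publish published=1080,target=80 image1
docker service create --name svc2 --publish published=2080,target=80 image2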

More on this added here --
https://wp.brenden.com/2021/02/05/traefik-on-docker-swarm-with-fixed-ip-static-reserved-ip-addresses-for-swarm-services/

Some posts suggest removing the docker_gwbridge network, then recreating it with docker network create -o "com.docker.network.bridge.host_binding_ipv4"="192.168.1.151" docker_gwbridge, then running in host mode. Because docker_gwbridge is used by all services running on the node's engine (swarm or not), this "solution" precludes adjusting only some services and some IP addresses -- it's too much, but in the opposite direction.

Usually when a simple request encounters such an incredible problem, it indicates a fundamental, overwhelming difference -- like squirting a garden hose upstream into a river.


FrostbyteGR commented Feb 27, 2021

I figured out a workaround for those that absolutely cannot do without static IP addresses inside their swarm.
It's a very, very, very ugly workaround, but a working one nonetheless. It's what I resorted to until we see a proper solution from the team.

DISCLAIMER:
This requires that your services run with no more than 1 replica.
This requires that you have administrative access to your network equipment as well as your swarm hosts.
This example makes use of the macvlan driver, with a little bit of research and tinkering, I feel that you may be able to adapt it into other scenarios as well.

For this example we will assume the following topology:
[PC] ----- [MAIN ROUTER] ----- [ROUTER] ----- [SWARM HOSTS]

  1. What I am doing is simply creating a VLAN for every Service that I am going to make.
  2. The VLANs that I am making are /30. (Therefore 1x NetID, 1x Host, 1x Gateway, 1x Broadcast)
  3. But I am summarizing them backwards under a /24 network on another router in front. (In case you want to do the same, just make sure that the destination router itself has a blackhole/null0 route; otherwise a request for something from the /24 network that you haven't created yet ends up in a routing loop until the TTL expires.)
  4. For every VLAN/Service that I have created, I have assigned an address on the router's side, to act as the gateway.
  5. Every VLAN has also been configured (just the VLAN ID - no need to waste an address here) on each of the Swarm Hosts.

Then, before I launch my service, I create a network exclusively for it.
WARNING: This needs to be done on every Swarm Host which may run the Service, regardless if it's a manager or a worker.
docker network create --config-only --subnet <NetID>/30 --gateway <Gateway> --ip-range <Service IP>/32 --opt parent=eth0.<VLAN ID> local_myService

Afterwards, I go to any of my managers and expand my network's scope to the swarm.
docker network create --driver macvlan --scope swarm --config-from local_myService swarm_myService

This way we're forcing docker's internal IP allocation to assign the IP we want.
The only downside is that we end up with lots of VLAN interfaces and routes. (Plus you might not get to pick the exact addresses you want.)
This example won't work with just one interface/gateway/subnet, because docker will complain if some other container on the host is already using them. (And if all of your hosts have at least one service container running, your HA may not work either.)
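The final step is not shown above, but attaching the single-replica service to that swarm-scoped network would presumably be the usual form (the image name is a placeholder):

docker service create --name myService --replicas 1 --network swarm_myService my-image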

@si-kotic

I've been tearing my hair out trying to either figure this out or work around it.
In my case I have two services, one needs to send data to the other and the only option is an IP address, otherwise I would just reference the service name.
If this is not possible in the near future, being able to resolve the service name to an IP within the compose file would make things easier. For example ${service_name.ipv4_address}

@White-Raven

It's already tedious enough to have to run GlusterFS on each node of a swarm just so some containers don't reset to zero or roll back when the node they usually run on goes down for whatever reason.

Now imagine this with metric servers and databases. Here is my use case:

Let's say I want to collect metrics, logs and such. For maximum uptime I want to have these services run on the swarm, for high availability.
But if for that I punch in the IP of one of the nodes and tell my hosts that they have to send their syslogs there, or that InfluxDB is running there, or that they log via SNMPv3 there, and the node goes down... then what? It just fails. So much for uptime.

Wouldn't it be better if these services just spawned on the same real network all these hosts share, on an IP that doesn't change no matter which node is up or down? It would save so many headaches.

@johny-mnemonic

@White-Raven I am afraid you are mixing unrelated things together.
What you seemingly want is to have a service running in a swarm exposed to the rest of the network on a static IP.
That is a completely different thing from what this issue is about, and it can easily be done with a load balancer.
This issue is about being able to assign a static IP (internal to the swarm cluster, not visible from the outside network) to a service or a container.
BTW: what do you mean by the necessity to have Gluster running on each swarm node? Could you please describe the use case you have with Gluster and what forces you to have it on all nodes?

@White-Raven

@johny-mnemonic Oh well, funnily enough, just a week ago I was much 'greener' about docker swarm than I am now, and I had some misunderstandings about docker swarm's networking.
I pretty much 'solved' my issue through a combination of keepalived and HAProxy, but I ended up disliking the clunkiness of the method so much that I migrated to k3s + kube-vip, which just felt more practical.

I ran GlusterFS to sync up the containers' config files and data and have some persistent storage.
That was a means to have high availability in case some nodes of the swarm went down, so the affected containers would spin up on another node without rolling back or resetting to a blank slate, which is the kind of thing you don't want for things like metric and log servers.

It's a homelab/dev environment, meaning I'm trying stuff out, so hard resets or kernel panics aren't unheard of; having a sturdy piece of volume-syncing software for high availability was very welcome.

@johny-mnemonic

@White-Raven no worries, we all start green ;-)
I was asking about GlusterFS because I am searching for persistent storage for my homelab and am now debating between GlusterFS and MinIO. If you have good experience with GlusterFS, would you share the setup that works for you? We can move to Reddit if you're willing to chat about it.

@White-Raven

@johny-mnemonic you can directly message me on the Reddit account linked in my GitHub profile!
But to be honest I didn't do anything fancy or original; I just took the GlusterFS part of this tutorial and iterated from it with some extra steps that were specific to my setup (LXC container on a Proxmox host).

I managed to get the swarm working on this unsupported setup, but ended up giving up on LXC containers for swarm/k3s applications and going with full VMs instead, because of the amount of tweaking that has to be done to make it work with the Proxmox kernel and AppArmor, which can break on updates.


DANW999 commented Apr 30, 2022

Hi, may I humbly ask if there is any update on this and whether this feature is likely to be implemented? I too have had challenges with being unable to use static IP addresses for Docker Swarm containers, and have had to use workarounds like setting the IP range to a /32 subnet on docker networks. That is a pain to do with MACVLAN, because you can only have one docker network per node with the same gateway, which means some of my containers using MACVLAN have been constrained to one device, defeating the purpose of having docker in swarm mode in the first place. I truly appreciate the work and effort that goes into this project, as I love using docker! Thank you.
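For reference, the /32 ip-range workaround mentioned above looks roughly like this on a single node (interface, subnet, and addresses are examples):

docker network create -d macvlan \
  --subnet 192.168.1.0/24 --gateway 192.168.1.1 \
  --ip-range 192.168.1.50/32 \
  -o parent=eth0 macvlan_50

The container attached to macvlan_50 then always comes up as 192.168.1.50, at the cost of one such network (and effectively one node) per address.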


Stephan-Walter commented Jun 15, 2022

We solved the issue by using docker in docker. So we have a "proxy" service with replicas set to 1 for each kind of real service we want to create with a static IP (MACVLAN); it uses docker create and docker network attach to build a container the way we like it.

In the end it is a huge pain that the options of swarm are not a superset of the normal container commands, since you start developing something with just one container and then in the end it breaks when you want to add failover capability through Swarm.

In the end, I think the only thing that is missing is a proper plugin for the internal DHCP that provides static assignment... I have forgotten the name of this subcomponent, but when we faced the problem 4 or 5 years ago, there was no willingness to create this component, since such plugins already exist as fee-paying services and the need doesn't match the needs of the majority of users. That the situation, as it is, is broken from my point of view seems to be considered OK.

I therefore don't have much hope that things will change anytime in the future; in the end people will be forced to move to Kubernetes, and the project will be dead at some point.

EDIT:
I have just searched for the right component that could solve the issue. It would be an IPAM plugin like this: https://github.com/Stephan-Walter/docker-infoblox
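If such a plugin were available and working, wiring it in would presumably go through the existing --ipam-driver option of docker network create, something like this (the driver name and required options depend entirely on the plugin; this is only a sketch):

docker network create -d macvlan -o parent=eth0 \
  --ipam-driver infoblox --subnet 192.168.1.0/24 static-net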


dani commented Jul 8, 2022

Another use case I'm facing: I'm trying to run several FTP servers (~100) which must be accessible to clients through dedicated IPsec tunnels (IPsec is handled on my firewall, with P2 restricted to the /32 of the corresponding FTP server). This is not possible without a static IP. Each FTP server requires 1 TCP port for the control channel and 100 TCP ports for the passive data channels, so it's not manageable to expose them using the swarm mesh. By creating an ipvlan network, I can connect containers directly to my firewall. The only missing bit is the ability to set a static IP on my containers. I tried writing a custom IPAM service, but the problem is that there's nothing to identify the container in the POST request to /IpamDriver.RequestAddress. At best I can get the MAC address (if my driver sets RequiresMACAddress to true), but as we can't set a fixed MAC address for services either, all we get is a randomly generated MAC...


juampe commented Jul 10, 2022

In the five years from then until now, diverse use cases that need a static swarm IP have been presented. Nobody cares. I think it is a good exercise to forget docker swarm as a network-friendly container runtime. Good luck.


dani commented Jul 10, 2022

Indeed, and that's too bad... Swarm is really a simple and elegant solution which would fulfil a lot of use cases (if it weren't abandonware).


Stephan-Walter commented Jul 11, 2022

Yes, Swarm is quite simple to set up and use compared to Kubernetes. Nevertheless, it seems to me that there is not much progress or vision for its future.

As mentioned, 5 years - in words, FIVE years - since this problem was raised, and zero progress. It is to some degree not bad that Swarm doesn't move as fast as Kubernetes, but a bit faster wouldn't hurt.

I still have a ticket that has been open for more than a year and just gets no attention. Even a simple "wontfix" would be better than no reaction at all.

@rubinho76

Hi folks,

I basically love Docker and discover new things every day.
However, I am sure that network engineers are underrepresented among docker developers.

I am a network admin myself and I can only shake my head at many things in the Docker network implementation.

I've been playing with Swarm for two days and I'm already seeing the next network problem... static IP in Swarm mode :/

@ilyavaiser

Oh, guys. I run into this problem a couple of times a year. And I cry every time.


ieugen commented Oct 26, 2023

I think this application bug would be easy to fix via a static IP address: milvus-io/milvus#25032.


ewh0 commented Nov 21, 2023

This is a useful feature, but sadly it is still not implemented at the swarm level. I didn't expect such a feature disparity between docker compose and docker swarm. Is there any plan to implement it in the near future?

This proposal makes sense to me #24170 (comment)


ieugen commented Nov 21, 2023

don't know. try and please report back.
