Regression: docker network conflicts with host routes when creating a new network #46615

Closed
dtronche opened this issue Oct 11, 2023 · 5 comments · May be fixed by #46630
Labels
area/networking, kind/bug, status/0-triage, version/23.0, version/24.0

Comments

@dtronche

dtronche commented Oct 11, 2023

Description

Docker release 23 introduced a regression in how it chooses a subnet from its network pool: the chosen subnet can conflict with routes that already exist on the host.
Previous versions of docker (up to release 20) would not create a network route that overlaps with a host route.

Reproduce

  1. Start from a fresh installation:
  • Remove all docker networks and the internal network database file docker/network/files/local-kv.db
  • Stop the docker daemon
  • Remove the docker0 bridge: ip link del docker0
  2. Create a route that will conflict with one of docker's default subnets: ip route add 172.18.13.0/24 via xxx.xxx.xxx.xxx dev yyy
  3. Start the docker daemon. By default it should create the route 172.17.0.0/16 dev docker0
  4. Create a docker network: docker network create net1 => This command creates a route for network 172.18.0.0/16, which conflicts with the route from step 2. On previous versions of docker, the command creates a route for network 172.19.0.0/16
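
For anyone who wants to double-check the conflict, here is a minimal sketch using Python's standard `ipaddress` module. It only compares the prefixes quoted in this report; the helper name `conflicts` is made up for illustration and is not part of docker.

```python
import ipaddress

# Host route added with `ip route add` in the reproduction.
host_route = ipaddress.ip_network("172.18.13.0/24")

# Subnets docker picked for net1, as quoted in this report.
picked_by_24_0_6 = ipaddress.ip_network("172.18.0.0/16")
picked_by_20_10_14 = ipaddress.ip_network("172.19.0.0/16")


def conflicts(subnet, route):
    """Return True when the two prefixes share at least one address."""
    return subnet.overlaps(route)


print(conflicts(picked_by_24_0_6, host_route))    # True
print(conflicts(picked_by_20_10_14, host_route))  # False
```

The /16 chosen by 24.0.6 contains the whole /24 host route, so traffic for 172.18.13.x becomes ambiguous between the bridge and the gateway route.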

Expected behavior

Docker should return to its former behavior and not create a route that conflicts with an existing host route.
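
A rough sketch of the behavior being asked for, under the assumption that allocation is a simple first-fit over candidate subnets. The function `pick_subnet` and the candidate list are illustrative only; they use the three prefixes from this report and are not docker's actual allocator or its full default pool.

```python
import ipaddress


def pick_subnet(candidates, in_use, host_routes):
    """Return the first candidate that overlaps neither an already
    allocated subnet nor a host route, or None if nothing is free."""
    blocked = [ipaddress.ip_network(p) for p in list(in_use) + list(host_routes)]
    for text in candidates:
        subnet = ipaddress.ip_network(text)
        if not any(subnet.overlaps(p) for p in blocked):
            return subnet
    return None


candidates = ["172.17.0.0/16", "172.18.0.0/16", "172.19.0.0/16"]
in_use = ["172.17.0.0/16"]         # docker0
host_routes = ["172.18.13.0/24"]   # route added on the host

chosen = pick_subnet(candidates, in_use, host_routes)
print(chosen)  # 172.19.0.0/16
```

With an empty `host_routes` list the same call returns 172.18.0.0/16, which matches what 24.0.6 does in this report.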


Result on docker 20.10.14 (correct):

# ip route
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
172.18.13.0/27 via 192.168.2.221 dev eth0
**172.19.0.0/16 dev br-c1e73b4c4b98** proto kernel scope link src 172.19.0.1 linkdown
192.168.0.0/23 via 10.4.0.104 dev eth1
192.168.2.0/23 dev eth0 proto kernel scope link src 192.168.2.13

Result on docker 24.0.6 (incorrect):

# ip route
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
**172.18.0.0/16 dev br-5d8b3ebd612a** proto kernel scope link src 172.18.0.1 linkdown
172.18.13.0/24 via 192.168.2.221 dev eth0
192.168.2.0/24 dev eth0 proto kernel scope link src 192.168.2.231
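
To spot this kind of conflict on a host, something like the following sketch could be run against `ip route` output. It is a simplified parser written for this report: it assumes IPv4 unicast lines shaped like the ones above (no `blackhole`/`unreachable` routes) and assumes docker bridges are named `docker0` or `br-<id>`.

```python
import ipaddress

# `ip route` output from the 24.0.6 host, copied from this report.
ROUTES_24_0_6 = """\
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
172.18.0.0/16 dev br-5d8b3ebd612a proto kernel scope link src 172.18.0.1 linkdown
172.18.13.0/24 via 192.168.2.221 dev eth0
192.168.2.0/24 dev eth0 proto kernel scope link src 192.168.2.231
"""


def parse_routes(text):
    """Return (network, device) pairs from `ip route` style output."""
    routes = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        dest = "0.0.0.0/0" if fields[0] == "default" else fields[0]
        device = fields[fields.index("dev") + 1] if "dev" in fields else None
        routes.append((ipaddress.ip_network(dest, strict=False), device))
    return routes


def is_docker_bridge(device):
    # Assumption: bridge naming as seen in this report.
    return device is not None and (device == "docker0" or device.startswith("br-"))


def find_conflicts(routes):
    """Pair every docker bridge route with each overlapping non-docker route."""
    docker = [r for r in routes if is_docker_bridge(r[1])]
    host = [r for r in routes if not is_docker_bridge(r[1])]
    return [(d, h) for d in docker for h in host if d[0].overlaps(h[0])]


for (d_net, d_dev), (h_net, h_dev) in find_conflicts(parse_routes(ROUTES_24_0_6)):
    print(f"{d_net} on {d_dev} overlaps host route {h_net} on {h_dev}")
```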

docker version

# docker version
Client:
 Version:           24.0.6
 API version:       1.43
 Go version:        go1.19.13
 Git commit:        24.0.6
 Built:             unknown-buildtime
 OS/Arch:           linux/amd64
 Context:           default

Server:
 Engine:
  Version:          24.0.6
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.19.13
  Git commit:       buildroot
  Built:
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.22
  GitCommit:
 runc:
  Version:          1.1.7

docker info

Client:
 Version:    24.0.6
 Context:    default
 Debug Mode: false

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 24.0.6
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version:
 runc version:
 init version: N/A
 Security Options:
  seccomp
   Profile: builtin
 Kernel Version: 5.15.130-grsec
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 1.929GiB

Additional Info

Docker debug logs on 20.10.14:

docker network create net1:

DEBU[2023-10-11T05:21:25.820794534Z] Calling HEAD /_ping
DEBU[2023-10-11T05:21:25.821691142Z] Calling POST /v1.41/networks/create
DEBU[2023-10-11T05:21:25.821786007Z] form data: {"Attachable":false,"CheckDuplicate":true,"ConfigFrom":null,"ConfigOnly":false,"Driver":"bridge","EnableIPv6":false,"IPAM":{"Config":[],"Driver":"default","Options":{}},"Ingress":false,"Internal":false,"Labels":{},"Name":"net1","Options":{},"Scope":""}
DEBU[2023-10-11T05:21:25.822143163Z] Allocating IPv4 pools for network net1 (077c6bdda9fc2a5f54749c4e8c7187f278f2a28b0978dfddba1937b4a6104d6d)
DEBU[2023-10-11T05:21:25.822157053Z] RequestPool(LocalDefault, , , map[], false)
DEBU[2023-10-11T05:21:25.822319403Z] RequestPool(LocalDefault, , , map[], false)
DEBU[2023-10-11T05:21:25.822378464Z] ReleasePool(LocalDefault/172.18.0.0/16)
DEBU[2023-10-11T05:21:25.822392003Z] RequestAddress(LocalDefault/172.19.0.0/16, <nil>, map[RequestAddressType:com.docker.network.gateway])
DEBU[2023-10-11T05:21:25.822403238Z] Request address PoolID:172.19.0.0/16 App: ipam/default/data, ID: LocalDefault/172.19.0.0/16, DBIndex: 0x0, Bits: 65536, Unselected: 65534, Sequence: (0x80000000, 1)->(0x0, 2046)->(0x1, 1)->end Curr:0 Serial:false PrefAddress:<nil>
DEBU[2023-10-11T05:21:25.822667093Z] Did not find any interface with name br-077c6bdda9fc: Link not found
DEBU[2023-10-11T05:21:25.822712613Z] Setting bridge mac address to 02:42:68:61:e3:1d
DEBU[2023-10-11T05:21:25.826315409Z] Assigning address to bridge interface br-077c6bdda9fc: 172.19.0.1/16
DEBU[2023-10-11T05:21:25.830407574Z] /usr/sbin/iptables, [--wait -t nat -C POSTROUTING -s 172.19.0.0/16 ! -o br-077c6bdda9fc -j MASQUERADE]
DEBU[2023-10-11T05:21:25.835525181Z] /usr/sbin/iptables, [--wait -t nat -I POSTROUTING -s 172.19.0.0/16 ! -o br-077c6bdda9fc -j MASQUERADE]

Docker debug logs on 24.0.6:

docker network create net1:

DEBU[2023-10-11T05:16:21.771289969Z] Calling HEAD /_ping
DEBU[2023-10-11T05:16:21.772811402Z] Calling POST /v1.43/networks/create
DEBU[2023-10-11T05:16:21.772922134Z] form data: {"Attachable":false,"CheckDuplicate":true,"ConfigFrom":null,"ConfigOnly":false,"Driver":"bridge","EnableIPv6":false,"IPAM":{"Config":[],"Driver":"default","Options":{}},"Ingress":false,"Internal":false,"Labels":{},"Name":"net1","Options":{},"Scope":""}
DEBU[2023-10-11T05:16:21.773263501Z] Allocating IPv4 pools for network net1 (c1c835401b07ddced9005834e4defbf96b0d1d13dea03787a7a11e801d28113f)
DEBU[2023-10-11T05:16:21.773279797Z] RequestPool(LocalDefault, , , map[], false)
DEBU[2023-10-11T05:16:21.773456142Z] RequestAddress(LocalDefault/172.18.0.0/16, <nil>, map[RequestAddressType:com.docker.network.gateway])
DEBU[2023-10-11T05:16:21.773477392Z] Request address PoolID:172.18.0.0/16 Bits: 65536, Unselected: 65534, Sequence: (0x80000000, 1)->(0x0, 2046)->(0x1, 1)->end Curr:0 Serial:false PrefAddress:invalid IP
DEBU[2023-10-11T05:16:21.773527441Z] Did not find any interface with name br-c1c835401b07: Link not found
DEBU[2023-10-11T05:16:21.773543840Z] Setting bridge mac address to 02:42:96:ef:5c:6c
DEBU[2023-10-11T05:16:21.774850494Z] Assigning address to bridge interface br-c1c835401b07: 172.18.0.1/16
DEBU[2023-10-11T05:16:21.775950270Z] /usr/sbin/iptables, [--wait -t nat -C POSTROUTING -s 172.18.0.0/16 ! -o br-c1c835401b07 -j MASQUERADE]
DEBU[2023-10-11T05:16:21.779158811Z] /usr/sbin/iptables, [--wait -t nat -I POSTROUTING -s 172.18.0.0/16 ! -o br-c1c835401b07 -j MASQUERADE]
DEBU[2023-10-11T05:16:21.781776343Z] /usr/sbin/iptables, [--wait -t nat -C DOCKER -i br-c1c835401b07 -j RETURN]
DEBU[2023-10-11T05:16:21.783920686Z] /usr/sbin/iptables, [--wait -t nat -I DOCKER -i br-c1c835401b07 -j RETURN]
dtronche added the kind/bug and status/0-triage labels on Oct 11, 2023
@thaJeztah
Member

/cc @akerouanton

akerouanton added a commit to akerouanton/docker that referenced this issue Oct 12, 2023
This reverts commit ee9e526.

Route scope is used by the kernel to choose what source IP address
should be used when establishing an outbound connection. As such,
filtering routes based on their scope doesn't make sense.

Resolve moby#46615.

Signed-off-by: Albin Kerouanton <albinker@gmail.com>
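
A simplified model of what the commit message describes, for readers following along. This is an assumption-laden sketch, not the libnetwork code: it assumes the reverted commit limited the overlap check to link-scope routes, and it assumes the `via` route from this report has global scope (`ip route` does not print a scope for it). The `Route` class and `overlaps_host` are hypothetical names.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Route:
    dst: str    # destination prefix
    scope: str  # "link" for directly attached, "global" for routes via a gateway


def overlaps_host(subnet, routes, link_scope_only):
    """Report whether subnet overlaps any host route that is considered.

    link_scope_only=True models the scope filtering the commit message
    describes; False models a check that looks at every route.
    """
    net = ipaddress.ip_network(subnet)
    for route in routes:
        if link_scope_only and route.scope != "link":
            continue  # gateway routes are skipped entirely
        if net.overlaps(ipaddress.ip_network(route.dst)):
            return True
    return False


host_routes = [
    Route("172.18.13.0/24", "global"),  # "via 192.168.2.221 dev eth0"
    Route("192.168.2.0/24", "link"),    # "dev eth0 proto kernel scope link"
]

print(overlaps_host("172.18.0.0/16", host_routes, link_scope_only=True))   # False
print(overlaps_host("172.18.0.0/16", host_routes, link_scope_only=False))  # True
```

Under this model the scope filter hides the 172.18.13.0/24 route from the check, which would explain why 24.0.6 considers 172.18.0.0/16 free.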
@bf

bf commented Feb 20, 2024

@dtronche did you find a workaround? or just downgrade docker to the old version before this buildx stuff?

@dtronche
Author

dtronche commented Feb 20, 2024

> @dtronche did you find a workaround? or just downgrade docker to the old version before this buildx stuff?

Our platform is on a dedicated OS which we build using buildroot, so we have patched docker sources to get back to the old behavior.

@bf

bf commented Feb 20, 2024

I appreciate your quick reply!

> so we have patched docker sources to get back to the old behavior.

Okay, that's crazy. But docker is only a small indie company with measly $200M in funding so what can we do 🤣

@akerouanton
Member

Oh, I missed the 'recent' replies on this issue -- sorry!

We re-discussed the fix proposed in #46630 today during a libnet maintainers call. We all agree that there's no one-size-fits-all set of heuristics to determine whether a Docker subnet overlaps with some host config. We certainly don't want to re-introduce the issue(s) originally fixed by #42598 (e.g. dynamic IPAM allocation can't be used when OpenVPN is used in full tunnel mode). We would rather deliberately miss some cases and give users an escape hatch than 'over-detect' potential overlaps and leave users with no workaround.
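
To illustrate the over-detection side of that trade-off, here is a small sketch. It assumes a full-tunnel VPN that overrides the default route with two half-internet routes (0.0.0.0/1 and 128.0.0.0/1), which is what OpenVPN's `redirect-gateway def1` option installs; whether that is exactly the setup behind #42598 is an assumption. The candidate list and the `usable` helper are illustrative only.

```python
import ipaddress

# Assumed full-tunnel routes: together they cover the whole IPv4 space.
full_tunnel_routes = [
    ipaddress.ip_network("0.0.0.0/1"),
    ipaddress.ip_network("128.0.0.0/1"),
]

# Illustrative candidates, not docker's full default pool.
candidates = [
    ipaddress.ip_network(p)
    for p in ("172.17.0.0/16", "172.18.0.0/16", "172.19.0.0/16", "192.168.0.0/20")
]


def usable(candidates, routes):
    """Keep the candidates that overlap none of the given routes."""
    return [c for c in candidates if not any(c.overlaps(r) for r in routes)]


print(usable(candidates, full_tunnel_routes))  # []
```

If every host route counts as a conflict, such a host has no allocatable subnet at all, and the user has no way out short of specifying subnets by hand.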

We might still reconsider this set of heuristics, or the ability for users to influence them at a later time -- but nothing is planned right now.

In your case, the workaround is to configure the set of default-address-pools used for dynamic subnet allocation to exclude any prefix that would overlap with your host config. The default value for that parameter can be found on this page: https://docs.docker.com/config/daemon/ipv6/#dynamic-ipv6-subnet-allocation.
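
As a starting point for that workaround, the sketch below builds a `default-address-pools` value that leaves out anything overlapping a host route. The candidate pools here are an illustrative assumption, not an authoritative copy of the defaults, so check them against the page linked above. `safe_pools` is a hypothetical helper; the output uses the `base`/`size` fields of the daemon configuration file and would go into `daemon.json` (commonly `/etc/docker/daemon.json` on Linux).

```python
import ipaddress
import json


def safe_pools(candidate_pools, host_routes):
    """Split each pool into subnets of its allocation size and keep only
    the subnets that overlap no host route."""
    routes = [ipaddress.ip_network(r) for r in host_routes]
    pools = []
    for pool in candidate_pools:
        base = ipaddress.ip_network(pool["base"])
        for subnet in base.subnets(new_prefix=pool["size"]):
            if not any(subnet.overlaps(r) for r in routes):
                pools.append({"base": str(subnet), "size": pool["size"]})
    return pools


# Assumed candidate pools for illustration; verify against the docs.
candidate_pools = [
    {"base": "172.17.0.0/16", "size": 16},
    {"base": "172.18.0.0/16", "size": 16},
    {"base": "172.19.0.0/16", "size": 16},
    {"base": "172.20.0.0/14", "size": 16},
]
host_routes = ["172.18.13.0/24"]  # the conflicting route from this report

config = {"default-address-pools": safe_pools(candidate_pools, host_routes)}
print(json.dumps(config, indent=2))
```

Splitting each pool into subnets of its allocation size means a single conflicting route removes only the affected /16 rather than an entire /14. The daemon has to be restarted for a changed `daemon.json` to take effect.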

I'll close this issue as "won't fix" for the reasons explained above but feel free to continue the discussion.

akerouanton closed this as not planned on Apr 23, 2024