Kilo Incorrectly Chooses an eth0 IP Over Node's Configured Internal IP #367

Open
boedy opened this issue Oct 4, 2023 · 6 comments

Comments

@boedy

boedy commented Oct 4, 2023

When deploying Kilo on DigitalOcean, I've noticed that it fetches the wrong internal IP. Specifically, it seems to be selecting an internal IP bound to the eth0 interface.

All my Kubernetes nodes are correctly configured with their respective internal IPs. I've reviewed the Kilo codebase but couldn't pinpoint where the internal IP is determined outside of setting it through annotations.

Could you provide some insight into why Kilo doesn't use the default internal IP set for the nodes? It would help to understand the logic behind this and whether there's a way to ensure the correct IP is used. Using the K8S internal IP seems like a sane default.

@squat
Owner

squat commented Oct 4, 2023

Hi @boedy, the following file [0] contains all of the logic used for discovering IP addresses and includes some comments to describe the prioritization of interfaces and addresses. Would you mind providing some more details on the specific situation? What are the IPs in question? What IP is Kilo selecting? What IP did you expect to be selected? Also, what interfaces are these IPs assigned to?

[0] https://github.com/squat/kilo/blob/main/pkg/mesh/discoverips.go

@boedy
Author

boedy commented Oct 4, 2023

Hi @squat

This is the situation on one of my nodes. In the following example, Kilo resolves the internal IP to 10.19.0.23. The IP that should have been picked is 10.134.128.203.

root@do-ams3-3ff9:/# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 8e:41:9b:53:c6:fb brd ff:ff:ff:ff:ff:ff
    inet 161.35.92.50/20 brd 161.35.96.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 10.19.0.23/16 brd 10.19.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::8c41:9bff:fe53:c6fc/64 scope link 
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether b6:82:26:81:e4:99 brd ff:ff:ff:ff:ff:ff
    inet 10.134.128.203/16 brd 10.134.255.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::b482:26ff:fe81:e49a/64 scope link 
       valid_lft forever preferred_lft forever
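
To illustrate why an interface-based heuristic struggles here, the following minimal sketch classifies the addresses from the output above. It is not Kilo's actual discovery logic (that lives in discoverips.go); the function name `private_candidates` is hypothetical. It only shows that a "pick a private address" rule finds two equally plausible candidates on this node.

```python
import ipaddress

# Addresses copied from the `ip addr show` output above: (interface, CIDR).
ADDRESSES = [
    ("lo", "127.0.0.1/8"),
    ("eth0", "161.35.92.50/20"),
    ("eth0", "10.19.0.23/16"),
    ("eth1", "10.134.128.203/16"),
]


def private_candidates(addresses):
    """Return (interface, ip) pairs that are private and not loopback.

    Hypothetical helper for illustration only.
    """
    candidates = []
    for iface, cidr in addresses:
        ip = ipaddress.ip_interface(cidr).ip
        # is_private is also true for 127.0.0.0/8, so exclude loopback explicitly.
        if ip.is_private and not ip.is_loopback:
            candidates.append((iface, str(ip)))
    return candidates


print(private_candidates(ADDRESSES))
```

Both 10.19.0.23 (eth0) and 10.134.128.203 (eth1) pass the filter, so some extra signal is needed to choose between them.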

Leaving that aside, the Kubernetes API lists the correct internal IP, which I was referring to initially as the "K8S internal IP":

apiVersion: v1
kind: Node
metadata:
  name: do-ams3-3ff9
....
status:
  addresses:
    - type: InternalIP 
      address: 10.134.128.203  # <--- can't we just use this as default?
    - type: ExternalIP
      address: 161.35.92.50
    - type: Hostname
      address: do-ams3-3ff9

https://github.com/kubernetes/api/blob/9a776fe3a720323e4f706b3ebb462b3dd661634f/core/v1/types.go#L5512

@squat
Owner

squat commented Oct 4, 2023

Hmm, I agree that relying on the IPs registered on the node objects in the Kubernetes API would be a great default. However, it's not entirely trivial: internal IPs are not always available, so some discovery code will still be necessary. An easier implementation would be to copy the logic used by the kubelet for IP discovery during node registration.

Out of curiosity, what makes this IP on this interface "the correct IP"? I wonder if we can translate your needs into a heuristic during IP discovery.

@squat
Owner

squat commented Oct 4, 2023

Oh right, the internal IPs are registered by the controllers for each cloud, which are not always available depending on the flavor of Kubernetes and the cloud/on premises situation.
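
The default discussed in this thread (use the node's registered InternalIP when present, otherwise fall back to discovery) could be sketched roughly as follows. This is an illustration of the proposal, not how Kilo behaves today. `pick_internal_ip` and the `discover` callback are hypothetical names, and the node is represented as a plain dict mirroring the `status.addresses` structure shown earlier.

```python
from typing import Callable, Optional


def pick_internal_ip(node: dict, discover: Callable[[], Optional[str]]) -> Optional[str]:
    """Prefer the InternalIP registered on the node object; else run discovery.

    `discover` stands in for whatever interface-based discovery already exists
    (assumption: it returns an IP string or None).
    """
    for addr in node.get("status", {}).get("addresses", []):
        if addr.get("type") == "InternalIP" and addr.get("address"):
            return addr["address"]
    # No InternalIP registered (e.g. no cloud controller set one): fall back.
    return discover()


# Node object from the example earlier in the thread, reduced to the relevant fields.
node = {
    "status": {
        "addresses": [
            {"type": "InternalIP", "address": "10.134.128.203"},
            {"type": "ExternalIP", "address": "161.35.92.50"},
            {"type": "Hostname", "address": "do-ams3-3ff9"},
        ]
    }
}

print(pick_internal_ip(node, lambda: "10.19.0.23"))
```

With the node above, the registered address wins; a node without an InternalIP entry would get whatever the fallback discovery returns.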

@boedy
Author

boedy commented Oct 4, 2023

I'd argue that if node objects don't have an internal IP defined, it likely means there's no internal network for communication. As you pointed out, many cloud providers handle this automatically. In my case, I'm running K3s on bare metal plus multiple clouds (this is where Kilo comes in 😄) and I set the internal IP manually. For scenarios where someone might want to use a third network layer specifically for Kilo communication, the kilo.squat.ai/force-internal-ip annotation could still be used.

Relying on heuristics might not always pinpoint the right internal IP address, given the diverse ways machines can be configured. However, I get the need for them, especially for maintaining backward compatibility.

@boedy
Author

boedy commented Oct 4, 2023

Oh, and one thing I should add, which is why I opened this issue in the first place: with k3s, for example, I can set node labels when provisioning, but I can't specify annotations. This means that in my current setup I have to manually add the kilo.squat.ai/force-internal-ip annotation after the node comes online.

To give you a concrete example of how I provision my worker nodes through cloud-init:

# only contains the relevant parts
  - export NODE_NAME=$(hostname)
  - export EXTERNAL_IP=$(ifconfig eth0 | awk -F ' *|:' '/inet /{print $3}')
  - export INTERNAL_IP=$(ifconfig eth1 | awk -F ' *|:' '/inet /{print $3}')
  - |
    curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.25.7+k3s1" sh -s - agent \
      --server <ip of a master node> \
      --node-name $NODE_NAME \
      --node-ip $INTERNAL_IP \
      --node-external-ip $EXTERNAL_IP \
      --node-label provider=digitalocean \
      --node-label topology.kubernetes.io/region=eu-west \
      --node-label topology.kubernetes.io/zone=eu-west-ams3 \
      --node-label workload=compute \
      --node-label infra=cloud
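
The manual annotation step mentioned above could be scripted once the node has registered. As a rough sketch, the helper below builds a JSON merge patch that sets kilo.squat.ai/force-internal-ip. The helper name `force_internal_ip_patch` is hypothetical, and writing the value as a CIDR with a /32 prefix is an assumption that should be checked against the Kilo annotation documentation.

```python
import ipaddress
import json

ANNOTATION = "kilo.squat.ai/force-internal-ip"


def force_internal_ip_patch(ip: str, prefix_len: int = 32) -> str:
    """Build a JSON merge patch that sets the force-internal-ip annotation.

    Assumption: the annotation value is written as a CIDR (ip/prefix).
    """
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    patch = {"metadata": {"annotations": {ANNOTATION: f"{addr}/{prefix_len}"}}}
    return json.dumps(patch)


print(force_internal_ip_patch("10.134.128.203"))
```

The resulting string could then be applied with something like `kubectl patch node <node name> --type merge -p '<patch>'` from a machine that has cluster access.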


2 participants