Pod crashes when setting HCLOUD_NETWORK and network: false #630
Without setting
And for the sake of completeness, with
the hcloud-cloud-controller-manager starts and adds the metadata as expected. This is not a solution for us, since
Just to clarify: you mentioned "HelmChart version 3.3.0" in the original issue. We do not have a Helm chart with that version; the current version is
Sorry, that was a copy-and-paste error. I'm using
I am unable to reproduce this with hccm. While trying to reproduce it, I noticed that you also need to provide the k3s flag
I installed k3s with:
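A sketch of a comparable k3s install, assuming flags like those recorded in the `k3s.io/node-args` annotation on the Node object later in this thread (token, TLS SANs, and IPs omitted as placeholders):

```shell
# Hypothetical reconstruction; only the flags relevant to this issue are shown.
# The real install also set --token, --tls-san, --node-ip, and --node-external-ip.
curl -sfL https://get.k3s.io | sh -s - server \
  --disable-cloud-controller \
  --kubelet-arg "cloud-provider=external" \
  --disable traefik \
  --disable servicelb \
  --flannel-backend none \
  --disable-network-policy
```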
Then created a secret for hccm:
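The secret referenced throughout this thread is named `hcloud` with `token` and `network` keys; a hedged sketch of creating it (the literal values below are placeholders, not real credentials):

```shell
# Sketch: secret name and keys taken from the chart values in this thread.
kubectl -n kube-system create secret generic hcloud \
  --from-literal=token=REDACTED_HCLOUD_API_TOKEN \
  --from-literal=network=my-network
```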
And installed the chart the same way you did with the first. Could you post the output of the following two commands here?
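The exact commands were likely of this form (a guess, based on the Deployment and Node manifests posted in reply; namespace and node name are taken from those manifests):

```shell
kubectl -n kube-system get deployment hcloud-cloud-controller-manager -o yaml
kubectl get node k3s-controlplane1 -o yaml
```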
My bad. I must have gotten lost in the values. The described behaviour happens with:

```yaml
env:
  HCLOUD_TOKEN:
    valueFrom:
      secretKeyRef:
        name: hcloud
        key: token
  HCLOUD_NETWORK:
    valueFrom:
      secretKeyRef:
        name: hcloud
        key: network
networking:
  enabled: false
robot:
  enabled: false
```

Note: k3s is running with
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    meta.helm.sh/release-name: hccm
    meta.helm.sh/release-namespace: kube-system
  creationTimestamp: "2024-04-04T11:19:10Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: Helm
  name: hcloud-cloud-controller-manager
  namespace: kube-system
  resourceVersion: "3440"
  uid: 62e7b715-e99d-4878-8133-d01cd17a95be
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app.kubernetes.io/instance: hccm
      app.kubernetes.io/name: hcloud-cloud-controller-manager
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: hccm
        app.kubernetes.io/name: hcloud-cloud-controller-manager
    spec:
      containers:
      - command:
        - /bin/hcloud-cloud-controller-manager
        - --allow-untagged-cloud
        - --cloud-provider=hcloud
        - --route-reconciliation-period=30s
        - --webhook-secure-port=0
        - --leader-elect=false
        env:
        - name: HCLOUD_NETWORK
          valueFrom:
            secretKeyRef:
              key: network
              name: hcloud
        - name: HCLOUD_TOKEN
          valueFrom:
            secretKeyRef:
              key: token
              name: hcloud
        - name: ROBOT_PASSWORD
          valueFrom:
            secretKeyRef:
              key: robot-password
              name: hcloud
              optional: true
        - name: ROBOT_USER
          valueFrom:
            secretKeyRef:
              key: robot-user
              name: hcloud
              optional: true
        image: hetznercloud/hcloud-cloud-controller-manager:v1.19.0
        imagePullPolicy: IfNotPresent
        name: hcloud-cloud-controller-manager
        ports:
        - containerPort: 8233
          name: metrics
          protocol: TCP
        resources:
          requests:
            cpu: 100m
            memory: 50Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: Default
      priorityClassName: system-cluster-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: hcloud-cloud-controller-manager
      serviceAccountName: hcloud-cloud-controller-manager
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoSchedule
        key: node.cloudprovider.kubernetes.io/uninitialized
        value: "true"
      - key: CriticalAddonsOnly
        operator: Exists
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      - effect: NoSchedule
        key: node-role.kubernetes.io/control-plane
        operator: Exists
      - effect: NoExecute
        key: node.kubernetes.io/not-ready
status:
  conditions:
  - lastTransitionTime: "2024-04-04T11:19:10Z"
    lastUpdateTime: "2024-04-04T11:19:11Z"
    message: ReplicaSet "hcloud-cloud-controller-manager-6f454fcfbf" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  - lastTransitionTime: "2024-04-04T11:19:19Z"
    lastUpdateTime: "2024-04-04T11:19:19Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  observedGeneration: 1
  replicas: 1
  unavailableReplicas: 1
  updatedReplicas: 1
```
```yaml
apiVersion: v1
kind: Node
metadata:
  annotations:
    alpha.kubernetes.io/provided-node-ip: 10.0.0.2
    etcd.k3s.cattle.io/local-snapshots-timestamp: "2024-04-04T11:08:33Z"
    etcd.k3s.cattle.io/node-address: 10.0.0.2
    etcd.k3s.cattle.io/node-name: k3s-controlplane1-ba0bd5a4
    k3s.io/node-args: '["server","--data-dir","/var/lib/rancher/k3s","--disable","traefik","--disable","servicelb","--flannel-backend","none","--disable-network-policy","--embedded-registry","true","--write-kubeconfig-mode","0600","--tls-san","lbctrl.iquestria.cso.ninja","--disable-cloud-controller","--token","********","--tls-san","k3s-controlplane1","--tls-san","10.0.0.2","--node-ip","10.0.0.2","--node-external-ip","x.x.x.x","--kubelet-arg","cloud-provider=external"]'
    k3s.io/node-config-hash: QNU4YAKJZSOORINBMHYXXYIO754HSV5OGAWEWZC56NJR74RX56AQ====
    k3s.io/node-env: '{"K3S_DATA_DIR":"/var/lib/rancher/k3s/data/4344eae0657f7fc0c99af34fc51358389f500f18c9bb80f5a55c130de07565d2"}'
    node.alpha.kubernetes.io/ttl: "0"
    p2p.k3s.cattle.io/node-address: /ip4/10.0.0.2/tcp/5001/p2p/QmWjS45ca9RZuoMnavYUhNHH4wD7V4SXVHRhzcn1tCWNdi
    volumes.kubernetes.io/controller-managed-attach-detach: "true"
  creationTimestamp: "2024-04-04T11:07:10Z"
  finalizers:
  - wrangler.cattle.io/node
  - wrangler.cattle.io/managed-etcd-controller
  labels:
    beta.kubernetes.io/arch: arm64
    beta.kubernetes.io/os: linux
    kubernetes.io/arch: arm64
    kubernetes.io/hostname: k3s-controlplane1
    kubernetes.io/os: linux
    node-role.kubernetes.io/control-plane: "true"
    node-role.kubernetes.io/etcd: "true"
    node-role.kubernetes.io/master: "true"
    p2p.k3s.cattle.io/enabled: "true"
  name: k3s-controlplane1
  resourceVersion: "4135"
  uid: c1b6d78b-55dc-47f8-9ba0-557b81a452a7
spec:
  podCIDR: 10.42.0.0/24
  podCIDRs:
  - 10.42.0.0/24
  taints:
  - effect: NoSchedule
    key: node.cloudprovider.kubernetes.io/uninitialized
    value: "true"
status:
  addresses:
  - address: 10.0.0.2
    type: InternalIP
  - address: k3s-controlplane1
    type: Hostname
  allocatable:
    cpu: "4"
    ephemeral-storage: "55192664021"
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    hugepages-32Mi: "0"
    hugepages-64Ki: "0"
    memory: 7934528Ki
    pods: "110"
  capacity:
    cpu: "4"
    ephemeral-storage: 56735880Ki
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    hugepages-32Mi: "0"
    hugepages-64Ki: "0"
    memory: 7934528Ki
    pods: "110"
  conditions:
  - lastHeartbeatTime: "2024-04-04T11:10:25Z"
    lastTransitionTime: "2024-04-04T11:10:25Z"
    message: Cilium is running on this node
    reason: CiliumIsUp
    status: "False"
    type: NetworkUnavailable
  - lastHeartbeatTime: "2024-04-04T11:22:30Z"
    lastTransitionTime: "2024-04-04T11:07:22Z"
    message: Node is a voting member of the etcd cluster
    reason: MemberNotLearner
    status: "True"
    type: EtcdIsVoter
  - lastHeartbeatTime: "2024-04-04T11:20:46Z"
    lastTransitionTime: "2024-04-04T11:07:10Z"
    message: kubelet has sufficient memory available
    reason: KubeletHasSufficientMemory
    status: "False"
    type: MemoryPressure
  - lastHeartbeatTime: "2024-04-04T11:20:46Z"
    lastTransitionTime: "2024-04-04T11:07:10Z"
    message: kubelet has no disk pressure
    reason: KubeletHasNoDiskPressure
    status: "False"
    type: DiskPressure
  - lastHeartbeatTime: "2024-04-04T11:20:46Z"
    lastTransitionTime: "2024-04-04T11:07:10Z"
    message: kubelet has sufficient PID available
    reason: KubeletHasSufficientPID
    status: "False"
    type: PIDPressure
  - lastHeartbeatTime: "2024-04-04T11:20:46Z"
    lastTransitionTime: "2024-04-04T11:10:20Z"
    message: kubelet is posting ready status. AppArmor enabled
    reason: KubeletReady
    status: "True"
    type: Ready
  daemonEndpoints:
    kubeletEndpoint:
      Port: 10250
  images:
  - names:
    - quay.io/cilium/cilium@sha256:bfeb3f1034282444ae8c498dca94044df2b9c9c8e7ac678e0b43c849f0b31746
    sizeBytes: 195832613
  - names:
    - quay.io/cilium/operator-generic@sha256:4dd8f67630f45fcaf58145eb81780b677ef62d57632d7e4442905ad3226a9088
    sizeBytes: 24175419
  - names:
    - docker.io/rancher/mirrored-pause@sha256:74c4244427b7312c5b901fe0f67cbc53683d06f4f24c6faee65d4182bf0fa893
    - docker.io/rancher/mirrored-pause:3.6
    sizeBytes: 253243
  nodeInfo:
    architecture: arm64
    bootID: b44ffa8e-82e2-4740-b6ab-bf53631f8310
    containerRuntimeVersion: containerd://1.7.11-k3s2
    kernelVersion: 6.1.0-18-arm64
    kubeProxyVersion: v1.29.2+k3s1
    kubeletVersion: v1.29.2+k3s1
    machineID: e7c1065f9ccd42ce8d0c10c61a494f91
    operatingSystem: linux
    osImage: Debian GNU/Linux 12 (bookworm)
    systemUUID: 2376c8c9-a1c5-4485-8bea-efcfa76fb865
```

With

```yaml
networking:
  enabled: false
network:
  valueFrom:
    secretKeyRef:
      name: hcloud
      key: network
robot:
  enabled: false
```

there is no HCLOUD_NETWORK env var in the rendered Deployment:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    meta.helm.sh/release-name: hccm
    meta.helm.sh/release-namespace: kube-system
  creationTimestamp: "2024-04-04T11:10:32Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: Helm
  name: hcloud-cloud-controller-manager
  namespace: kube-system
  resourceVersion: "2171"
  uid: e97fe5ed-db35-4eaf-a290-371b87780a2c
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 2
  selector:
    matchLabels:
      app.kubernetes.io/instance: hccm
      app.kubernetes.io/name: hcloud-cloud-controller-manager
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/instance: hccm
        app.kubernetes.io/name: hcloud-cloud-controller-manager
    spec:
      containers:
      - command:
        - /bin/hcloud-cloud-controller-manager
        - --allow-untagged-cloud
        - --cloud-provider=hcloud
        - --route-reconciliation-period=30s
        - --webhook-secure-port=0
        - --leader-elect=false
        env:
        - name: HCLOUD_TOKEN
          valueFrom:
            secretKeyRef:
              key: token
              name: hcloud
        - name: ROBOT_PASSWORD
          valueFrom:
            secretKeyRef:
              key: robot-password
              name: hcloud
              optional: true
        - name: ROBOT_USER
          valueFrom:
            secretKeyRef:
              key: robot-user
              name: hcloud
              optional: true
        image: hetznercloud/hcloud-cloud-controller-manager:v1.19.0
        imagePullPolicy: IfNotPresent
        name: hcloud-cloud-controller-manager
        ports:
        - containerPort: 8233
          name: metrics
          protocol: TCP
        resources:
          requests:
            cpu: 100m
            memory: 50Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: Default
      priorityClassName: system-cluster-critical
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: hcloud-cloud-controller-manager
      serviceAccountName: hcloud-cloud-controller-manager
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoSchedule
        key: node.cloudprovider.kubernetes.io/uninitialized
        value: "true"
      - key: CriticalAddonsOnly
        operator: Exists
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      - effect: NoSchedule
        key: node-role.kubernetes.io/control-plane
        operator: Exists
      - effect: NoExecute
        key: node.kubernetes.io/not-ready
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2024-04-04T11:10:33Z"
    lastUpdateTime: "2024-04-04T11:10:37Z"
    message: ReplicaSet "hcloud-cloud-controller-manager-584f6fc4f4" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  - lastTransitionTime: "2024-04-04T11:13:22Z"
    lastUpdateTime: "2024-04-04T11:13:22Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
```
I appreciate the help. For the sake of completeness, without
Thanks for the detailed responses :) I can reproduce the issue with these values from your comment yesterday:

```yaml
env:
  HCLOUD_TOKEN:
    valueFrom:
      secretKeyRef:
        name: hcloud
        key: token
  HCLOUD_NETWORK:
    valueFrom:
      secretKeyRef:
        name: hcloud
        key: network
networking:
  enabled: false
robot:
  enabled: false
```

The core issue is that hccm and the Helm chart always assume that users with Networks also want to use the routing functionality. This is not always true, and there are cases where you want the Network attachment without the routes. You can set the env variable HCLOUD_NETWORK_ROUTES_ENABLED to "false" to disable it. These values should work (or just yours with the env variable added):

```yaml
env:
  HCLOUD_NETWORK_ROUTES_ENABLED:
    value: "false"
networking:
  enabled: true
```
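Applying such values could look like the following sketch (release name hccm and namespace kube-system are taken from the manifests above; the repo alias and values filename are assumptions):

```shell
# Sketch: assumes the Hetzner chart repo is added under the alias "hcloud"
# and the values above are saved as hccm-values.yaml.
helm repo add hcloud https://charts.hetzner.cloud
helm upgrade --install hccm hcloud/hcloud-cloud-controller-manager \
  --namespace kube-system \
  -f hccm-values.yaml
```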
Thank you, I will test that. With
Yes, that should work 👍 You will have to do some magic to get the private IPs for the Robot servers in, as that is not automatically supported in HCCM right now.
TL;DR

Despite `network: false`, the hcloud-cloud-controller-manager tries to start the node-route-controller. The node-route-controller then fails due to the missing CIDR.

Expected behavior

hcloud-cloud-controller-manager starts up and configures the nodes' metadata.

Observed behavior

The hcloud-cloud-controller-manager pod crashes with
Minimal working example

command:

hccm-values.yaml:

Remark: The same happens when configuring
as described in the README.md.
Log output
Additional information
`--kubelet-arg="cloud-provider=external"`