Hanging Cilium Operator #12923

Open
toschneck opened this issue Dec 14, 2023 · 2 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/app-management Denotes a PR or issue as being assigned to SIG App Management. sig/networking Denotes a PR or issue as being assigned to SIG Networking.

Comments

@toschneck
Member

What happened?

After certain changes to the Cilium values of the system application (in my case, adding the ingress config below), the cilium-operator deployment hangs and the Helm deployment never finishes. Running k rollout restart deployment cilium-operator fixes the problem. Since this is quite non-transparent for users, we should fix the behavior.
[Screenshot: cluster 'hardcore-hodgkin-cilium-ingress-test-vt69m6bnrn' in project 'Tobi's Demo Project 🚀', 2023-12-14 16-11-53]

### added values
ingressController:
  enabled: true
  loadbalancerMode: shared
  default: true
  enforceHttps: false
envoy:
  enabled: true

Possible Solution
Maybe we can add a label to the operator deployment that ensures the operator gets restarted, e.g. last-update:date.
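
One possible shape for this idea is the common Helm pattern of stamping the pod template with a checksum of the values, so that any config change alters the pod template and triggers a rollout. The sketch below is only an illustration under assumptions: the annotation key checksum/cilium-values and both function names are hypothetical, and nothing here reflects how KKP or the Cilium chart currently behaves. It uses a pod-template annotation rather than a Deployment label because a Deployment only rolls when its pod template changes.

```python
import hashlib
import json


def values_checksum(values: dict) -> str:
    """Return a stable SHA-256 hex digest of a values mapping."""
    # sort_keys makes the digest independent of key order.
    canonical = json.dumps(values, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def rollout_patch(values: dict) -> dict:
    """Build a patch body that stamps the Deployment's pod template.

    The annotation key is hypothetical; any key works as long as its
    value changes whenever the values change.
    """
    return {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {
                        "checksum/cilium-values": values_checksum(values)
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # The values added in this issue, expressed as a Python mapping.
    added = {
        "ingressController": {
            "enabled": True,
            "loadbalancerMode": "shared",
            "default": True,
            "enforceHttps": False,
        },
        "envoy": {"enabled": True},
    }
    print(json.dumps(rollout_patch(added), indent=2))
```

Compared with a last-update date, a content hash changes only when the values actually change, so unrelated reconciliations would not restart the operator.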

Expected behavior

Cilium config changes get rolled out properly.

How to reproduce the issue?

Create a Cilium CNI based cluster, edit the values of the Cilium system application, and add the above YAML.
[Screenshot: cluster 'hardcore-hodgkin-cilium-ingress-test-b2vxk8vnm5' in project 'Tobi's Demo Project 🚀', 2023-12-14 16-28-06]

After the change, check the kube-system namespace for hanging pods:

kube-system            cilium-7ljnp                                    0/1     Running   0               69s
kube-system            cilium-envoy-fhxzq                              1/1     Running   0               70s
kube-system            cilium-envoy-l7fnk                              1/1     Running   0               70s
kube-system            cilium-operator-79fdc88454-fw58x                1/1     Running   0               8m14s
kube-system            cilium-pxrb4                                    0/1     Running   0               69s
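
In the listing above, both cilium agent pods sit at 0/1 while the operator pod, at 8m14s, is older than the change. As a small aid for spotting this state, the following sketch filters such output for pods that are not fully ready. It assumes the column order shown above (namespace, name, READY as ready/total); the function name is hypothetical.

```python
def not_ready_pods(listing: str) -> list[str]:
    """Return names of pods whose READY column shows ready < total."""
    hanging = []
    for line in listing.strip().splitlines():
        fields = line.split()
        # Skip header rows and anything without a ready/total column.
        if len(fields) < 3 or "/" not in fields[2]:
            continue
        ready, total = fields[2].split("/", 1)
        if not (ready.isdigit() and total.isdigit()):
            continue
        if int(ready) < int(total):
            hanging.append(fields[1])
    return hanging


if __name__ == "__main__":
    listing = """
kube-system  cilium-7ljnp                      0/1  Running  0  69s
kube-system  cilium-envoy-fhxzq                1/1  Running  0  70s
kube-system  cilium-envoy-l7fnk                1/1  Running  0  70s
kube-system  cilium-operator-79fdc88454-fw58x  1/1  Running  0  8m14s
kube-system  cilium-pxrb4                      0/1  Running  0  69s
"""
    print(not_ready_pods(listing))  # ['cilium-7ljnp', 'cilium-pxrb4']
```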

Restarting the operator fixes the problem.

How is your environment configured?

  • KKP version: 2.24.0
  • Shared or separate master/seed clusters?: shared

Provide your KKP manifest here (if applicable)

https://github.com/kubermatic/sig-cs-infra/tree/main/lab.kubermatic.io

What cloud provider are you running on?

tested on AWS

What operating system are you running in your user cluster?

Ubuntu

Additional information

@toschneck toschneck added the kind/bug Categorizes issue or PR as related to a bug. label Dec 14, 2023
@csengerszabo csengerszabo added sig/networking Denotes a PR or issue as being assigned to SIG Networking. sig/app-management Denotes a PR or issue as being assigned to SIG App Management. labels Dec 15, 2023
@xrstf
Contributor

xrstf commented Mar 8, 2024

When I try to reproduce this, adding the YAML snippet you provided yields an error:

admission webhook "applicationinstallations.apps.kubermatic.k8c.io" denied the request: ApplicationInstallation validation request 55773472-939f-4071-b8e2-462febf01b0e denied: [spec.values.cni: Invalid value: "null": value is immutable spec.values.ipam: Invalid value: "null": value is immutable spec.values.kubeProxyReplacement: Not found: "null" spec.values.operator: Invalid value: "null": value missing or incorrect spec.values.operator.securityContext: Invalid value: "null": value is immutable]

I am using Cilium 1.14.3.

@cnvergence
Member

cnvergence commented Apr 10, 2024

I found the same issue. It seems the dashboard sends only the updated valuesBlock while values is left empty, which causes the validation failure.
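
To illustrate why an empty values field could produce the "null ... value is immutable" errors quoted earlier, here is a toy model of an immutability check. It is not the KKP webhook code: the function name, the stored placeholder values, and the restriction to the top-level keys cni and ipam (taken from the error message) are all assumptions made for the sketch.

```python
import json

# Top-level keys reported as immutable in the webhook error in this thread.
IMMUTABLE_KEYS = ("cni", "ipam")


def immutable_violations(old_values: dict, new_values: dict) -> list[str]:
    """Toy check: report immutable keys whose value differs from the stored one."""
    errors = []
    for key in IMMUTABLE_KEYS:
        new = new_values.get(key)  # None when the key is absent
        if new != old_values.get(key):
            errors.append(
                f'spec.values.{key}: Invalid value: "{json.dumps(new)}": '
                "value is immutable"
            )
    return errors


if __name__ == "__main__":
    # Placeholder stored values; the real contents are not shown in the issue.
    stored = {"cni": "placeholder-cni", "ipam": "placeholder-ipam"}

    # Request where only valuesBlock was filled in and values stayed empty.
    for err in immutable_violations(stored, {}):
        print(err)

    # Request that carries the stored values unchanged passes the check.
    print(immutable_violations(stored, dict(stored)))  # []
```

Under this model, a request that leaves values empty looks like an attempt to set every immutable key to null, which matches the shape of the reported error.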
