
fix(cilium): cilium-operator must be able to patch K8S nodes resources #15470

Open
wants to merge 2 commits into master

Conversation


@maximumG maximumG commented Jun 5, 2023

Fixes #15464.

As per the Cilium recommended installation, the cilium-operator pod is in charge of removing the `node.cilium.io/agent-not-ready` taint that can be put on nodes upon startup. As such, the operator needs the PATCH permission on the node resource.
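
For context, a minimal sketch of the kind of RBAC rules under discussion, assuming the usual cilium-operator ClusterRole layout (illustrative, not the exact kOps addon template):

```yaml
# Illustrative ClusterRole fragment for cilium-operator; not the exact kOps addon template.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cilium-operator
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch", "patch"]   # "patch" is what allows removing the taint
  - apiGroups: [""]
    resources: ["nodes/status"]
    verbs: ["patch"]                           # for the NodeNetworkUnavailable condition, questioned later in this thread
```
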
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 5, 2023
@k8s-ci-robot
Contributor

Hi @maximumG. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign johngmyers for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@hakman
Member

hakman commented Jun 5, 2023

/ok-to-test
/cc @olemarkus

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 5, 2023
@maximumG
Author

maximumG commented Jun 5, 2023

/retest-required

@hakman
Member

hakman commented Jun 5, 2023

/test pull-kops-e2e-aws-upgrade-126-ko126-to-klatest-kolatest-many-addons

@hakman
Member

hakman commented Jun 5, 2023

@maximumG To fix the failing tests, you need to run hack/update-expected.sh and commit the changes.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 6, 2023
@maximumG
Author

maximumG commented Jun 6, 2023

/retest

@olemarkus
Member

But that taint is not used in our setup. Does it then need this permission?

@maximumG
Author

maximumG commented Jun 7, 2023

But that taint is not used in our setup. Does it then need this permission?

I must agree that by default the Cilium taint is not configured. However, I would argue that users are able to customize taints per kOps instance group, leaving them the possibility to add this specific taint.

Actually, this is exactly what we tried out 😃: we put this taint on our nodes to ensure no workload got scheduled before the cilium-agent was ready, and we were surprised that it was not removed after all.
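
For illustration, such a customization would look roughly like this on a kOps instance group (the name, sizes, and machine type are placeholders):

```yaml
# Hypothetical kOps InstanceGroup carrying the Cilium startup taint; names and sizes are placeholders.
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes-us-east-1a
  labels:
    kops.k8s.io/cluster: example.k8s.local
spec:
  role: Node
  machineType: t3.medium
  minSize: 1
  maxSize: 3
  taints:
    - node.cilium.io/agent-not-ready=true:NoExecute   # expected to be removed by cilium-operator once the agent is ready
```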

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 21, 2023
@k8s-ci-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@maximumG
Author

maximumG commented Jul 7, 2023

@olemarkus Do you need additional info to validate this PR on your side?

@h3poteto
Contributor

I also have the same issue, and I can understand what @olemarkus says.

The node.cilium.io/agent-not-ready taint is only used when we use ENI as IPAM in Cilium, and for now we need to add the taint manually. Nevertheless, is it correct for kOps to prepare only the permissions?
I'm thinking of 3 options:

  1. Just write a document about the taint and permissions.
    • This is an optional usage.
    • So users need to add the taint and create ClusterRole for the cilium-operator manually.
  2. The permission and taint are automatically added by kOps when users specify ENI as IPAM (see the sketch below).
    • The taint is automatically added by kOps only when users specify networking.cilium.ipam = eni.
    • At that time, permissions are also added by kOps.
  3. Same as this PR.
    • Users add the taint manually.
    • But the permission required at that time is prepared by kOps.

What do you think?
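
As a sketch of option 2, the relevant kOps cluster spec fragment would look roughly like this (the cluster name is a placeholder):

```yaml
# Hypothetical kOps Cluster spec fragment selecting Cilium ENI IPAM; metadata is a placeholder.
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  name: example.k8s.local
spec:
  networking:
    cilium:
      ipam: eni   # under option 2, kOps would add the taint and permissions when this is set
```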

@hakman
Member

hakman commented Jul 30, 2023

@h3poteto @maximumG Added to the topics list for kOps Office Hours next week.

@hakman
Member

hakman commented Jul 30, 2023

@johngmyers Any thoughts?

@johngmyers
Member

Why is nodes/status needed?

The referenced documentation describes a situation where there is a second CNI that Cilium has to wrest control from, but that shouldn't apply to kOps.

If an extra taint were useful in some common configuration, I would prefer that kOps automatically apply it, in addition to giving cilium-operator the permission to remove it.

@zadjadr
Contributor

zadjadr commented Aug 4, 2023

I believe this is needed long term anyway, because this change was introduced in v1.14:

The Cilium operator now taints nodes where Cilium is scheduled to run but is not running.
This prevents pods from being scheduled on nodes without Cilium.
The CNI configuration file is no longer removed on agent shutdown.
This means that pod deletion will always succeed; previously it would fail if Cilium was down for an upgrade.
This should help prevent nodes accidentally entering an unmanageable state.
It also means that nodes are not removed from cloud LoadBalancer backends during Cilium upgrades.

cilium/cilium#23486

@johngmyers
Member

When Cilium requires it, we will need to add the permission.

@maximumG
Author

maximumG commented Aug 7, 2023

Why is nodes/status needed?

On our production K8s clusters, we had some weird scheduling on newly joining nodes. Basically, business pods were sometimes scheduled before the Cilium DaemonSet, exhausting the node resources (CPU/memory). As a result, the cilium-agent was not able to get scheduled on the node, triggering another node scale-up.

With this taint added and removed by the cilium-operator, we would avoid such a situation.

@olemarkus
Member

If your business pods get scheduled before the Cilium DS, you must have specific tolerations for that. If you, e.g., tolerate any taint, adding more taints won't work.

@johngmyers
Member

With this taint added and removed by the cilium-operator, we would avoid such situation.

Taints are not under nodes/status. So I ask again: why is nodes/status needed?
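
For orientation, on a Node object the taint lives under spec.taints (patched via the nodes resource), while conditions such as NetworkUnavailable live under status.conditions (patched via nodes/status). An abbreviated, illustrative Node:

```yaml
# Abbreviated Node object, for orientation only.
apiVersion: v1
kind: Node
metadata:
  name: ip-10-0-0-1.example.internal   # placeholder
spec:
  taints:                              # removing an entry here is a patch on the "nodes" resource
    - key: node.cilium.io/agent-not-ready
      value: "true"
      effect: NoExecute
status:
  conditions:                          # updating an entry here is a patch on "nodes/status"
    - type: NetworkUnavailable
      status: "False"
      reason: CiliumIsUp               # illustrative reason string
```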

@zadjadr
Contributor

zadjadr commented Aug 8, 2023

@johngmyers node/status is used to annotate the node. But I don't know if this is needed 100%.

@maximumG
Author

maximumG commented Aug 8, 2023

node/status is used to annotate the node. But I don't know if this is needed 100%.

Looking at the latest Cilium Helm documentation for v1.12 here, I would say that this permission is indeed useless. The annotateK8sNode value is set to false by default, and AFAIK there is no way to modify it using the kOps Cilium integration.

@h3poteto
Contributor

h3poteto commented Aug 9, 2023

I think the nodes/status permission is requested here: https://github.com/cilium/cilium/blob/5b92dce1b003c64fa5ba36cc2f0fc355632f0cf5/operator/watchers/node_taint.go#L298-L327

And the removal of agent-not-ready is here: https://github.com/cilium/cilium/blob/5b92dce1b003c64fa5ba36cc2f0fc355632f0cf5/operator/watchers/node_taint.go#L345-L398

So maybe nodes/status is not necessary to manage the agent-not-ready taint. But they wrote the reason why NodeNetworkUnavailable is needed:

// setNodeNetworkUnavailableFalse sets Kubernetes NodeNetworkUnavailable to
// false as Cilium is managing the network connectivity.
// https://kubernetes.io/docs/concepts/architecture/nodes/#condition
// This is because some clusters (notably GCP) come up with a NodeNetworkUnavailable condition set
// and the network provider is expected to remove this manually.

@maximumG
Author

Based on @h3poteto's comment, I would still say that this permission is actually useless (and sorry for pushing this PR).

Basically, if we want to avoid scheduling pods on nodes where Cilium is not yet ready, we just have to put the taint node.kubernetes.io/network-unavailable:NoSchedule on the kOps instance group. As soon as Cilium is ready on the node, the cilium-operator will notice it and update the NodeNetworkUnavailable condition to false. Kubernetes will then remove the node.kubernetes.io/network-unavailable:NoSchedule taint by itself and pods can be scheduled properly.

As already stated by some people here, the node.cilium.io/agent-not-ready taint is only useful in case of CNI chaining, where the first CNI (handling the IPAM) updates the NodeNetworkUnavailable condition to false without waiting for Cilium to be up and running.

Does it make sense?
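
A minimal sketch of the instance group configuration described above, with placeholder names:

```yaml
# Hypothetical InstanceGroup relying on the condition-backed taint instead of the Cilium-specific one.
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes
  labels:
    kops.k8s.io/cluster: example.k8s.local
spec:
  role: Node
  taints:
    # Expected to be cleared once cilium-operator sets the NodeNetworkUnavailable condition to false.
    - node.kubernetes.io/network-unavailable:NoSchedule
```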

@k8s-ci-robot
Contributor

@maximumG: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                                   Commit    Required  Rerun command
pull-kops-kubernetes-e2e-ubuntu-gce-build   00e6914   true      /test pull-kops-kubernetes-e2e-ubuntu-gce-build
pull-kops-verify-hashes                     00e6914   true      /test pull-kops-verify-hashes

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 17, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 17, 2024
Labels

  • area/addons
  • cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
  • kind/office-hours
  • lifecycle/rotten (Denotes an issue or PR that has aged beyond stale and will be auto-closed.)
  • needs-rebase (Indicates a PR cannot be merged because it has merge conflicts with HEAD.)
  • ok-to-test (Indicates a non-member PR verified by an org member that is safe to test.)
  • size/M (Denotes a PR that changes 30-99 lines, ignoring generated files.)
Development

Successfully merging this pull request may close these issues.

cilium-operator is missing RBAC permission to remove node.cilium.io/agent-not-ready taint
8 participants