Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Install the kube-proxy main container without privileged mode #2948

Open
uablrek opened this issue Oct 26, 2023 · 18 comments
Open

Install the kube-proxy main container without privileged mode #2948

uablrek opened this issue Oct 26, 2023 · 18 comments
Labels
area/addons kind/design Categorizes issue or PR as related to design. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.
Milestone

Comments

@uablrek
Copy link

uablrek commented Oct 26, 2023

A new flag --init-only was added to kube-proxy that executes all configuration steps that required privileged mode and then exits. It is intended to be used in an initContainer to allow the main container to run with NET_ADMIN rights only.

Please read the (not yet published) blog post for more info (temporary direct-link).

Kubeadm should support this installation. It will increase the startup time for kube-proxy, so IMO it should be optional, but perhaps be the default.

/kind feature

Versions

kubeadm version (use kubeadm version): $\ge$ v1.29

Environment:

  • Kubernetes version (use kubectl version): $\ge$ v1.29
  • Cloud provider or hardware configuration: N/A
  • OS (e.g. from /etc/os-release): N/A
  • Kernel (e.g. uname -a): N/A
  • Container runtime (CRI) (e.g. containerd, cri-o): N/A
  • Container networking plugin (CNI) (e.g. Calico, Cilium): N/A
  • Others:

Anything else we need to know?

Regression tests are OK (of course), but I suppose the real e2e test will be with this feature enabled, which mean with an updated kubeadm if I understand correctly. I have however executed e2e tests in KinD with:

FOCUS='\[sig-network\].*ervice'
SKIP='Disruptive|Serial|ESIPP|DNS|GCE|finalizer'

which actually tests more of kube-proxy than the conformance suite.

Upgrade is not explicitly tested, but as a part of testing in KinD I have updated the kube-proxy manifest.

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 26, 2023
@uablrek
Copy link
Author

uablrek commented Oct 26, 2023

PR kubernetes/kubernetes#120864

@uablrek
Copy link
Author

uablrek commented Oct 26, 2023

/cc @danwinship @aojea @sftim

@neolit123
Copy link
Member

neolit123 commented Oct 26, 2023

thanks for logging this issue and sending that PR.
a de-privilidge solution for kube-proxy was required for a long time.

A new flag --init-only was added to kube-proxy that executes all configuration steps that required privileged mode and then exits. It is intended to be used in an initContainer to allow the main container to run with NET_ADMIN rights only.

i see a Microsoft reviewer on the PR, so i'm going to assume it also works on Windows.
is that correct?

Kubeadm should support this installation. It will increase the startup time for kube-proxy, so IMO it should be optional, but perhaps be the default.

do you have estimates for that? - e.g. 20% increase on a desktop PC. and what window are we measuring - e.g. kube-proxy DS
creation, up until main container is running on a single node?

some consumers of kubeadm are quite sensitive to that, inc minikube, kind, others. my preference would be to make this the default, non-configurable path.

in the incoming releases, we are about to release a new API v1beta4, and we are just about to add a new sub-struct for "proxy" configuration there. so now it would be a good time for a knob for this kube-proxy "startup-mode".
if we do add this with a knob though, that would mean the change will be available when v1beta4 is out.

if we use something common like a feature-gate we can add it in 1.29.

...some decision making is required here.

@neolit123 neolit123 added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. area/addons labels Oct 26, 2023
@neolit123 neolit123 added this to the v1.29 milestone Oct 26, 2023
@neolit123
Copy link
Member

neolit123 commented Oct 26, 2023

Regression tests are OK (of course), but I suppose the real e2e test will be with this feature enabled, which mean with an updated kubeadm if I understand correctly.

we have a variety of tests jobs that can exercise this functionality (inc upgrade). if we add a knob it would be slightly more complex, since we have to add new targeted jobs.

FOCUS='[sig-network].*ervice'

we only run conformance tests in kubeadm e2e jobs.
what specific tests outside of the conf suite would be required to ensure this feature works properly?

to my understanding it's all about the startup sequence and setup. if it fails...it fails. from there on the conformance suite should suffice when testing the runtime aspect?

@uablrek
Copy link
Author

uablrek commented Oct 26, 2023

i see a Microsoft reviewer on the PR, so i'm going to assume it also works on Windows.
is that correct?

No, the function will not work on Windows. The --init-only should do nothing, and exit with an error code, and the kube-proxy must still be privileged, please see kubernetes/kubernetes#120864 (comment)

@uablrek
Copy link
Author

uablrek commented Oct 26, 2023

the conformance suite should suffice when testing the runtime aspect?

I don't expect more than the conformance tests to test this feature. But there are some e2e tests that tests aspects of "services" (i.e. kube-proxy) that are not in the conformance suite. I have explicitly executed those I know of.

@neolit123
Copy link
Member

neolit123 commented Oct 26, 2023

i see a Microsoft reviewer on the PR, so i'm going to assume it also works on Windows.
is that correct?

No, the function will not work on Windows. The --init-only should do nothing, and exit with an error code, and the kube-proxy must still be privileged, please see kubernetes/kubernetes#120864 (comment)

that's a problem, because the kube-proxy manifest has no OS branching in kubeadm, currently.
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/phases/addons/proxy/manifests.go#L54
why wasn't the flag made no-op instead of throwing an error on Windows?

if we add this change there has to be a _windows.go specific gobuild tag based file that injects the init container part for !Windows.

@neolit123
Copy link
Member

the conformance suite should suffice when testing the runtime aspect?

I don't expect more than the conformance tests to test this feature. But there are some e2e tests that tests aspects of "services" (i.e. kube-proxy) that are not in the conformance suite. I have explicitly executed those I know of.

ok, i think we can go without extending what we test in our jobs for this feature.

@neolit123
Copy link
Member

i see a Microsoft reviewer on the PR, so i'm going to assume it also works on Windows.
is that correct?

No, the function will not work on Windows. The --init-only should do nothing, and exit with an error code, and the kube-proxy must still be privileged, please see kubernetes/kubernetes#120864 (comment)

that's a problem, because the kube-proxy manifest has no OS branching in kubeadm, currently. https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/phases/addons/proxy/manifests.go#L54 why wasn't the flag made no-op instead of throwing an error on Windows?

if we add this change there has to be a _windows.go specific gobuild tag based file that injects the init container part for !Windows.

commented on the PR about using a warning instead of error:
kubernetes/kubernetes#120864 (comment)

@uablrek
Copy link
Author

uablrek commented Oct 26, 2023

do you have estimates for that? - e.g. 20% increase on a desktop PC. and what window are we measuring - e.g. kube-proxy DS
creation, up until main container is running on a single node?

I actually expect a very small increase. I believe that the bulk of the startup time is to load the image, and it is only done once. Then there is an overhead to create/start the init-container of course. The actual init code should run fast, but there is an unnecessary connection setup to the API-server that may be removed in the future.

It is hard to measure since it is highly environment dependent. But I can try in KinD on my PC with local image registry (which is kind of unfair).

@neolit123
Copy link
Member

do you have estimates for that? - e.g. 20% increase on a desktop PC. and what window are we measuring - e.g. kube-proxy DS
creation, up until main container is running on a single node?

I actually expect a very small increase. I believe that the bulk of the startup time is to load the image, and it is only done once. Then there is an overhead to create/start the init-container of course. The actual init code should run fast, but there is an unnecessary connection setup to the API-server that may be removed in the future.

It is hard to measure since it is highly environment dependent. But I can try in KinD on my PC with local image registry (which is kind of unfair).

kubeadm does a image pre-pull so we can exclude that from the calculation. a local registry is suitable.
having some average time increase would help justify this change in front of kubeadm consumers.

@danwinship
Copy link

I assume the added startup time would be trivial: we don't have to pull a second image, and kube-proxy barely does anything when you specify --init-only.

Already commented on the other two PRs, but we need distinct configs for Linux and Windows; just making Windows accept --init-only without error wouldn't work because then we'd run the main kube-proxy without privileged and it would fail.

@neolit123
Copy link
Member

will comment on the initial PR where everyone else is.

@neolit123 neolit123 added the kind/design Categorizes issue or PR as related to design. label Oct 26, 2023
@neolit123
Copy link
Member

neolit123 commented Oct 30, 2023

as discussed on the other issues / PRs we need two separate daemon sets to manage this:
kubernetes/kubernetes#120864 (comment)

i also had a short discussion with #sig-windows, and the two DS pattern has already been required for other features.
we can proceed forward, but it arguably needs a proposal doc (maybe a full KEP too...) - do we rename the kube-proxy DS to kube-proxy-linux and add kube-proxy-windows, or just add a separate kube-proxy-windows and only break windows users...in any case this is a breaking change in kubeadm.

for the time being we can put a hold on this and decide what to do...in particular i'm looking for feedback from the rest of the kubeadm OWNERS.

also we have this ticket from 2021, i will close it:
#2410

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 31, 2024
@neolit123 neolit123 removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 31, 2024
@neolit123 neolit123 modified the milestones: v1.30, Next Jan 31, 2024
@neolit123
Copy link
Member

removing stale and moving to next milestone. but i don't think we have a code AI unless there is a more OS portable approach with a single DS.

@uablrek
Copy link
Author

uablrek commented Jan 31, 2024

If the oom-score is set, then SYS_RESOURCE is also needed (beside NET_ADMIN), please see gardener/gardener#9000 (comment)

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 30, 2024
@neolit123 neolit123 added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/addons kind/design Categorizes issue or PR as related to design. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.
Projects
None yet
Development

No branches or pull requests

5 participants