RequestHeader authentication: add UID to recognized request headers #115834

Open
wants to merge 7 commits into
base: master

Conversation

stlaz
Member

@stlaz stlaz commented Feb 16, 2023

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR adds the ability to configure a UID header that RequestHeader authentication recognizes. It publishes the new header configuration in the kube-system/extension-apiserver-authentication ConfigMap, and makes the client-go AuthProxyRoundTripper add the user's UID to the requests it handles.

Which issue(s) this PR fixes:

Fixes #93699

Special notes for your reviewer:

Does this PR introduce a user-facing change?

A new option is being added to services that are capable of using the request header authentication - `requestheader-uid-headers`.

This new option allows you to configure which headers should be used in order to recognize the authenticating user's UID.

The suggested value for the new option is `X-Remote-Uid`.

The `kube-system/extension-apiserver-authentication` config map, which is commonly used for cluster trust discovery, will reflect this new option in its `.Data` under the `requestheader-uid-headers` key.
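
As a rough sketch of how a consumer might read the new key, the snippet below decodes `requestheader-uid-headers` from a config map's data. It assumes the value is stored as a JSON-encoded list of strings, mirroring how the existing `requestheader-*` keys are commonly published; that encoding and the helper name `uidHeadersFromConfigMapData` are assumptions, not something this PR description states.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// uidHeadersKey is the config map key named in the release note.
const uidHeadersKey = "requestheader-uid-headers"

// uidHeadersFromConfigMapData is a hypothetical helper. It assumes the value
// is a JSON-encoded list of header names. A missing or empty key yields a nil
// slice, meaning "no UID headers configured".
func uidHeadersFromConfigMapData(data map[string]string) ([]string, error) {
	raw, ok := data[uidHeadersKey]
	if !ok || raw == "" {
		return nil, nil
	}
	var headers []string
	if err := json.Unmarshal([]byte(raw), &headers); err != nil {
		return nil, fmt.Errorf("decoding %s: %w", uidHeadersKey, err)
	}
	return headers, nil
}

func main() {
	// Stand-in for the .Data of kube-system/extension-apiserver-authentication.
	data := map[string]string{
		"requestheader-username-headers": `["X-Remote-User"]`,
		uidHeadersKey:                    `["X-Remote-Uid"]`,
	}
	headers, err := uidHeadersFromConfigMapData(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(headers)
}
```

Treating a missing key as "not configured" keeps the helper usable against a config map published by an older kube-apiserver.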

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot added release-note, size/L, kind/feature, cncf-cla: yes, do-not-merge/needs-sig, needs-triage, needs-priority, area/apiserver, sig/api-machinery, sig/auth and removed do-not-merge/needs-sig labels Feb 16, 2023
@stlaz stlaz force-pushed the remote-uid branch 3 times, most recently from 1ce1154 to aef1832 Compare February 16, 2023 14:54
@k8s-ci-robot k8s-ci-robot added area/cloudprovider sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. labels Feb 16, 2023
@stlaz
Member Author

stlaz commented Feb 16, 2023

/test verify
infra issues it seems

@k8s-ci-robot
Contributor

@stlaz: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test pull-kubernetes-conformance-kind-ga-only-parallel
  • /test pull-kubernetes-coverage-unit
  • /test pull-kubernetes-dependencies
  • /test pull-kubernetes-dependencies-go-canary
  • /test pull-kubernetes-e2e-gce
  • /test pull-kubernetes-e2e-gce-100-performance
  • /test pull-kubernetes-e2e-gce-big-performance
  • /test pull-kubernetes-e2e-gce-canary
  • /test pull-kubernetes-e2e-gce-cos
  • /test pull-kubernetes-e2e-gce-cos-canary
  • /test pull-kubernetes-e2e-gce-cos-no-stage
  • /test pull-kubernetes-e2e-gce-network-proxy-http-connect
  • /test pull-kubernetes-e2e-gce-scale-performance-manual
  • /test pull-kubernetes-e2e-kind
  • /test pull-kubernetes-e2e-kind-ipv6
  • /test pull-kubernetes-integration
  • /test pull-kubernetes-integration-go-canary
  • /test pull-kubernetes-kubemark-e2e-gce-scale
  • /test pull-kubernetes-node-e2e-containerd
  • /test pull-kubernetes-typecheck
  • /test pull-kubernetes-unit
  • /test pull-kubernetes-unit-go-canary
  • /test pull-kubernetes-update
  • /test pull-kubernetes-verify
  • /test pull-kubernetes-verify-go-canary
  • /test pull-kubernetes-verify-govet-levee

The following commands are available to trigger optional jobs:

  • /test check-dependency-stats
  • /test pull-ci-kubernetes-unit-windows
  • /test pull-e2e-gce-cloud-provider-disabled
  • /test pull-kubernetes-conformance-image-test
  • /test pull-kubernetes-conformance-kind-ga-only
  • /test pull-kubernetes-conformance-kind-ipv6-parallel
  • /test pull-kubernetes-cos-cgroupv1-containerd-node-e2e
  • /test pull-kubernetes-cos-cgroupv1-containerd-node-e2e-features
  • /test pull-kubernetes-cos-cgroupv2-containerd-node-e2e
  • /test pull-kubernetes-cos-cgroupv2-containerd-node-e2e-features
  • /test pull-kubernetes-cos-cgroupv2-containerd-node-e2e-serial
  • /test pull-kubernetes-cross
  • /test pull-kubernetes-e2e-autoscaling-hpa-cm
  • /test pull-kubernetes-e2e-autoscaling-hpa-cpu
  • /test pull-kubernetes-e2e-capz-azure-disk
  • /test pull-kubernetes-e2e-capz-azure-disk-vmss
  • /test pull-kubernetes-e2e-capz-azure-file
  • /test pull-kubernetes-e2e-capz-azure-file-vmss
  • /test pull-kubernetes-e2e-capz-conformance
  • /test pull-kubernetes-e2e-capz-ha-control-plane
  • /test pull-kubernetes-e2e-capz-windows-containerd
  • /test pull-kubernetes-e2e-containerd-gce
  • /test pull-kubernetes-e2e-gce-correctness
  • /test pull-kubernetes-e2e-gce-cos-alpha-features
  • /test pull-kubernetes-e2e-gce-cos-kubetest2
  • /test pull-kubernetes-e2e-gce-csi-serial
  • /test pull-kubernetes-e2e-gce-device-plugin-gpu
  • /test pull-kubernetes-e2e-gce-network-proxy-grpc
  • /test pull-kubernetes-e2e-gce-serial
  • /test pull-kubernetes-e2e-gce-storage-disruptive
  • /test pull-kubernetes-e2e-gce-storage-slow
  • /test pull-kubernetes-e2e-gce-storage-snapshot
  • /test pull-kubernetes-e2e-gci-gce-autoscaling
  • /test pull-kubernetes-e2e-gci-gce-ingress
  • /test pull-kubernetes-e2e-gci-gce-ipvs
  • /test pull-kubernetes-e2e-inplace-pod-resize-containerd-main-v2
  • /test pull-kubernetes-e2e-kind-canary
  • /test pull-kubernetes-e2e-kind-dual-canary
  • /test pull-kubernetes-e2e-kind-ipv6-canary
  • /test pull-kubernetes-e2e-kind-ipvs-dual-canary
  • /test pull-kubernetes-e2e-kind-kms
  • /test pull-kubernetes-e2e-kind-multizone
  • /test pull-kubernetes-e2e-kops-aws
  • /test pull-kubernetes-e2e-kubelet-credential-provider
  • /test pull-kubernetes-e2e-ubuntu-gce-network-policies
  • /test pull-kubernetes-integration-go-compatibility
  • /test pull-kubernetes-kind-dra
  • /test pull-kubernetes-kubemark-e2e-gce-big
  • /test pull-kubernetes-local-e2e
  • /test pull-kubernetes-node-crio-cgrpv1-evented-pleg-e2e
  • /test pull-kubernetes-node-crio-cgrpv2-e2e
  • /test pull-kubernetes-node-crio-cgrpv2-e2e-kubetest2
  • /test pull-kubernetes-node-crio-e2e
  • /test pull-kubernetes-node-crio-e2e-kubetest2
  • /test pull-kubernetes-node-e2e-containerd-alpha-features
  • /test pull-kubernetes-node-e2e-containerd-features
  • /test pull-kubernetes-node-e2e-containerd-features-kubetest2
  • /test pull-kubernetes-node-e2e-containerd-kubetest2
  • /test pull-kubernetes-node-kubelet-credential-provider
  • /test pull-kubernetes-node-kubelet-serial-containerd
  • /test pull-kubernetes-node-kubelet-serial-containerd-kubetest2
  • /test pull-kubernetes-node-kubelet-serial-cpu-manager
  • /test pull-kubernetes-node-kubelet-serial-cpu-manager-kubetest2
  • /test pull-kubernetes-node-kubelet-serial-crio-cgroupv1
  • /test pull-kubernetes-node-kubelet-serial-crio-cgroupv2
  • /test pull-kubernetes-node-kubelet-serial-hugepages
  • /test pull-kubernetes-node-kubelet-serial-memory-manager
  • /test pull-kubernetes-node-kubelet-serial-pod-disruption-conditions
  • /test pull-kubernetes-node-kubelet-serial-topology-manager
  • /test pull-kubernetes-node-kubelet-serial-topology-manager-kubetest2
  • /test pull-kubernetes-node-memoryqos-cgrpv2
  • /test pull-kubernetes-node-swap-fedora
  • /test pull-kubernetes-node-swap-fedora-serial
  • /test pull-kubernetes-node-swap-ubuntu-serial
  • /test pull-kubernetes-unit-experimental
  • /test pull-kubernetes-unit-go-compatibility
  • /test pull-publishing-bot-validate

Use /test all to run the following jobs that were automatically triggered:

  • pull-kubernetes-conformance-kind-ga-only-parallel
  • pull-kubernetes-dependencies
  • pull-kubernetes-e2e-gce
  • pull-kubernetes-e2e-gce-100-performance
  • pull-kubernetes-e2e-kind
  • pull-kubernetes-e2e-kind-ipv6
  • pull-kubernetes-integration
  • pull-kubernetes-node-e2e-containerd
  • pull-kubernetes-typecheck
  • pull-kubernetes-unit
  • pull-kubernetes-verify
  • pull-kubernetes-verify-govet-levee

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@stlaz
Member Author

stlaz commented Feb 16, 2023

/test pull-kubernetes-verify

@enj
Member

enj commented Feb 17, 2023

@stlaz I haven't looked at the diff yet but remember that you need to add an integration test that shows this works correctly with aggregated API servers. I would recommend using a service account to call an aggregated API and maybe intercept the authorization checks made by the aggregated API server to see what identity it saw.

@stlaz
Member Author

stlaz commented Feb 20, 2023

Yes, an integration test is currently missing. I suppose I could write some kind of an aggregated API server similar to how the admission webhooks are being tested:
https://github.com/kubernetes/kubernetes/blob/aef1832fbf9e584b5fefa5d90717d585a3340643/test/integration/apiserver/admissionwebhook/admission_test.go#L473

@k8s-ci-robot k8s-ci-robot added area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Feb 20, 2023
@stlaz stlaz force-pushed the remote-uid branch 2 times, most recently from b7c77b3 to 10aeed2 Compare February 20, 2023 16:32
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 20, 2023
@stlaz
Member Author

stlaz commented Feb 20, 2023

The front-proxy config test now lives separately from the other, rather big aggregated apiserver test.

edit: I split the addition of the test and a minor cleanup of the file into two commits, so when reviewing, look at the first test commit first, since it has a much smaller diff.

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 12, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Reopen this PR with /reopen
  • Mark this PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closed this PR.


@stlaz
Member Author

stlaz commented Apr 17, 2024

/reopen
/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot reopened this Apr 17, 2024
@k8s-ci-robot
Contributor

@stlaz: Reopened this PR.


@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Apr 17, 2024
@k8s-ci-robot
Contributor

Please note that we're already in Test Freeze for the release-1.30 branch. This means every merged PR will be automatically fast-forwarded via the periodic ci-fast-forward job to the release branch of the upcoming v1.30.0 release.

Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Wed Apr 17 07:56:14 UTC 2024.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 28, 2024
Member

@enj enj left a comment

Comments from today's review meeting:

request header authn

proxy --(RH)--> KAS

this proxy is telling KAS who the user is
EKS -> request signing

KAS --(RH)--> aggregated API server

the proxy is KAS, and it is telling the aggregated server what the user is

bug:

david forgot that UID is a thing

username
UID
groups
extra fields (metadata)

for example, authz knows about UID

proxy --(RH)--> server
mTLS
you can constrain to a single CN

It is unclear how to make a change like this safe on upgrade if we want to update the defaults.

Member

Do we have a test case for the upgrade scenario where the existing CM has no UID config, but the API server now does have said config, meaning it should update the config map?

Member Author

I added a unit test that verifies that the controller should be capable of this step.

Looking at the controller, we might still run into trouble during the upgrade period, when kube-apiserver instances of different versions coexist: the controller doesn't seem to be guarded by leader election, which might lead to update hotloops.

The same issue would occur today if somebody wanted to configure additional headers, although that's probably not a very typical action compared to cluster upgrades.

I suspect we may want to address this. Not sure if adding leader election to KAS controllers is an OK practice?

@enj
Member

enj commented Apr 29, 2024

It is unclear how to make a change like this safe on upgrade if we want to update the defaults.

Maybe we just have to leave KAS with no default value here, but can update the defaults for anything "behind" KAS (i.e. aggregated API servers) because the only actor that can assert that mTLS connection is KAS, and we can rely on KAS deleting the UID headers from any incoming request before it forwards the request onwards.

Comment on lines +123 to +127
fs.StringSliceVar(&s.UIDHeaders, "requestheader-uid-headers", s.UIDHeaders, ""+
"List of request headers to inspect for UIDs. X-Remote-Uid is suggested.")
Member

Per triage meeting review from today:

This flag must not be wired up in the first release that contains this feature; otherwise an old KAS could proxy a request to a new aggregated API server without dropping the UID header. Everything else is safe to do in this PR. To be able to test this code path in integration tests, we can use an approach similar to SetServiceResolverForTests to let us set the UIDHeaders field. A future PR in a later release would simply drop that global in favor of the actual flag wiring.
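
One way such a test-only hook could look is sketched below. The names `uidHeadersForTests`, `SetUIDHeadersForTests` and `effectiveUIDHeaders` are hypothetical, and the shape (a package-level override plus a restore function) is an assumption about the general pattern, not the actual signature of SetServiceResolverForTests.

```go
package main

import "fmt"

// uidHeadersForTests is a hypothetical package-level override. While the CLI
// flag is not wired up, only tests can populate the UID headers through it.
var uidHeadersForTests []string

// SetUIDHeadersForTests installs the override and returns a function that
// restores the previous value, so tests can defer the cleanup.
func SetUIDHeadersForTests(headers []string) (restore func()) {
	prev := uidHeadersForTests
	uidHeadersForTests = headers
	return func() { uidHeadersForTests = prev }
}

// effectiveUIDHeaders prefers the test override when one is set and falls
// back to the value that would come from flag wiring otherwise.
func effectiveUIDHeaders(fromFlags []string) []string {
	if uidHeadersForTests != nil {
		return uidHeadersForTests
	}
	return fromFlags
}

func main() {
	restore := SetUIDHeadersForTests([]string{"X-Remote-Uid"})
	fmt.Println(effectiveUIDHeaders(nil))
	restore()
	fmt.Println(effectiveUIDHeaders(nil))
}
```

Once the real flag exists, the global and its setter can be deleted and callers of `effectiveUIDHeaders` can read the flag value directly.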

I believe it is safe to update NewDelegatingAuthenticationOptions with the X-Remote-Uid as a default because no matter what that relies on the front proxy dropping headers that the client should not set. And that code path isn't really used because the vast majority of cases should be using the config map, which will only get updated once KAS is updated and the --requestheader-uid-headers CLI flag is set. @liggitt any thoughts on the upgrade story here?

Once we let people configure this flag on KAS, we should require that it contains at least X-Remote-Uid so that API aggregation will work correctly (we can be stricter here because the config is new).
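
A validation along those lines might look like the sketch below. The function name `validateUIDHeaders` is hypothetical, and treating an empty list as valid (flag left unset) is an assumption; the comparison is case-insensitive because HTTP header names are.

```go
package main

import (
	"fmt"
	"strings"
)

// standardUIDHeader is the header name suggested in this PR.
const standardUIDHeader = "X-Remote-Uid"

// validateUIDHeaders is a hypothetical check: if any UID headers are
// configured, the standard one must be among them. An empty list is treated
// as "flag not set" and accepted, which is an assumption of this sketch.
func validateUIDHeaders(headers []string) error {
	if len(headers) == 0 {
		return nil
	}
	for _, h := range headers {
		if strings.EqualFold(h, standardUIDHeader) {
			return nil
		}
	}
	return fmt.Errorf("--requestheader-uid-headers must include %s, got %v", standardUIDHeader, headers)
}

func main() {
	fmt.Println(validateUIDHeaders([]string{"X-Custom-Uid", "x-remote-uid"}))
	fmt.Println(validateUIDHeaders([]string{"X-Custom-Uid"}))
}
```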

Member

Just got here, need to think about the skew implications.

There was just a lengthy discussion about what skew between kube-apiserver and aggregated servers is allowed / recommended / supported (kubernetes/website#46109, #124533, #124655, api-machinery meeting https://www.youtube.com/watch?v=0TXm-DGcK1k&t=31m). We don't really dictate whether aggregated API servers' use of the k8s.io/apiserver library can / should be newer than kube-apiserver today. cc @deads2k as well, who might have thoughts.

I agree our defaults (in kube-apiserver and k8s.io/apiserver library) should not lead to aggregated API servers trusting headers before kube-apiserver sets / guards / protects them.

I think the following defaults work:

  • kube-apiserver 1.31+ does not default the uid CLI flag (just like the other requestheader flags)
  • kube-apiserver 1.31+ (as kube-aggregator) unconditionally clears the standard uid header when proxying (just like it clears the other standard headers)
  • kube-apiserver 1.31+ (as kube-aggregator) unconditionally sets the standard uid header when proxying (just like it sets the other standard headers)
  • k8s.io/apiserver library v0.31.0+ wires the uid CLI flag
  • k8s.io/apiserver library v0.31.0+ NewDelegatingAuthenticationOptions() does not default the uid header; this keeps the CLI flags safe against much older kube-apiservers which aren't protecting the header; in practice, most aggregated API servers pull their auth header config from the configmap published by kube-apiserver rather than the CLI flags, so if/when their kube-apiserver publishes uid header config, they'll start using it

In the future (kube-apiserver 1.32+) we can consider making kube-apiserver also include the standard username/groups/extra/uid headers the aggregator uses (which are not configurable) in the configmap it publishes containing auth config, so that aggregation works even if kube-apiserver only sets non-standard requestheader flag options.

Member Author

I removed the defaulting from NewDelegatingAuthenticationOptions(), which I think was the only thing missing to get us to the desired state as described by Jordan.

We'll probably still have to think about how to update the cluster trust CM - #115834 (comment).

@stlaz stlaz force-pushed the remote-uid branch 2 times, most recently from 603a7ab to 6d7b8c8 Compare May 20, 2024 12:16
@stlaz stlaz force-pushed the remote-uid branch 2 times, most recently from cda7dc1 to ade608a Compare May 22, 2024 13:46
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 22, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: stlaz
Once this PR has been reviewed and has the lgtm label, please assign deads2k, mikedanese for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

k8s-ci-robot commented May 22, 2024

@stlaz: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubernetes-e2e-gce-providerless | 8922cd2 | link | false | /test pull-kubernetes-e2e-gce-providerless |
| pull-kubernetes-linter-hints | adea2e6 | link | false | /test pull-kubernetes-linter-hints |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@stlaz
Member Author

stlaz commented May 23, 2024

The latest commit removes the NewDelegatingAuthenticationOptions() defaulting for the UID headers.

Labels
area/apiserver, area/cloudprovider, area/test, cncf-cla: yes, kind/bug, kind/feature, needs-priority, release-note, sig/api-machinery, sig/auth, sig/cloud-provider, sig/testing, size/XL, triage/accepted
Projects
Status: In Review
Status: Tracked
Development

Successfully merging this pull request may close these issues.

UserInfo.UID is not passed by client-go
10 participants