Drop PodSecurityPolicy usage / move to PodSecurity #5250

Closed
23 tasks done
rfranzke opened this issue Jan 12, 2022 · 11 comments
Labels
area/open-source Open Source (community, enablement, contributions, conferences, CNCF, etc.) related kind/enhancement Enhancement, improvement, extension priority/2 Priority (lower number equals higher priority)

Comments

@rfranzke
Member

rfranzke commented Jan 12, 2022

How to categorize this issue?

/area open-source
/kind enhancement

What would you like to be added:
Drop usage of PodSecurityPolicies and potentially move to PodSecurity.
Read more here: PodSecurityPolicy Deprecation: Past, Present, and Future

Steps:

Why is this needed:
PodSecurityPolicies are deprecated and will be removed in v1.25. With v1.23, a new feature called PodSecurity was promoted to beta (ref).

@rfranzke rfranzke added the kind/enhancement Enhancement, improvement, extension label Jan 12, 2022
@gardener-robot gardener-robot added the area/open-source Open Source (community, enablement, contributions, conferences, CNCF, etc.) related label Jan 12, 2022
@gardener-ci-robot
Contributor

The Gardener project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close

/lifecycle stale

@gardener-prow gardener-prow bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 12, 2022
@acumino acumino added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label May 5, 2022
@acumino
Member

acumino commented May 5, 2022

/remove-lifecycle stale

@gardener-prow gardener-prow bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 5, 2022
@rfranzke rfranzke added the priority/2 Priority (lower number equals higher priority) label Jun 7, 2022
@rfranzke
Member Author

/remove lifecycle/frozen
/assign @shafeeqes

@shafeeqes
Contributor

shafeeqes commented Jul 6, 2022

I looked into this issue. I will try to summarize what I understood:

  1. Pod Security Admission is controlled by labels on namespaces, e.g. pod-security.kubernetes.io/MODE: LEVEL (see the namespace sketch after this list).
    There are 3 modes:
  • enforce: Policy violations will cause the pod to be rejected.
  • audit: Policy violations will trigger the addition of an audit annotation to the event recorded in the audit log, but are otherwise allowed.
  • warn: Policy violations will trigger a user-facing warning, but are otherwise allowed.

and 3 levels:

  • Privileged: Unrestricted policy.
  • Baseline: Allows the default (minimally specified) Pod configuration.
  • Restricted: Heavily restricted policy.
  2. It is purely validating, so we have to take care of the current mutating and mutating&validating fields. We also have to introduce validating webhooks for the fields which are not part of the new PodSecurity but which we still want to handle. The whole mapping is given here.

In short, the fields we have to take care of in our case are:

  • .spec.allowedHostPaths
  • .spec.supplementalGroups
  • .spec.fsGroup
  • .spec.seLinux
  • .spec.runAsUser
  • .spec.requiredDropCapabilities
  • .spec.allowPrivilegeEscalation
  3. The suggested migration method is to create a new PSP without these fields (handling everything else with our own admission webhooks), grant RBAC to use this PSP, and, once all the pods are migrated, delete the old PSP (detailed steps are here).
  4. Then we update the namespaces with the desired levels and modes. For new namespaces there is a default setting as well (see the admission configuration sketch after this list): https://kubernetes.io/docs/tasks/configure-pod-container/enforce-standards-admission-controller/#configure-the-admission-controller and for exemptions: https://kubernetes.io/docs/concepts/security/pod-security-admission/#exemptions
  5. Currently, doing a dry run of the baseline policy on a seed gives these warnings, which means we have to give these namespaces privileged permissions or exempt them:
kubectl label --dry-run=server --overwrite ns --all pod-security.kubernetes.io/enforce=baseline
namespace/default labeled
namespace/extension-dns-external-99c7f labeled
namespace/extension-networking-calico-njsht labeled
namespace/extension-os-gardenlinux-9h7lj labeled
Warning: existing pods in namespace "extension-provider-aws-4kwh2" violate the new PodSecurity enforce level "baseline:latest"
Warning: mtu-customizer-2qlwd (and 5 other pods): non-default capabilities, host namespaces
namespace/extension-provider-aws-4kwh2 labeled
Warning: existing pods in namespace "garden" violate the new PodSecurity enforce level "baseline:latest"
Warning: fluent-bit-946g8 (and 5 other pods): hostPath volumes
namespace/garden labeled
namespace/istio-ingress labeled
namespace/istio-system labeled
namespace/kube-node-lease labeled
namespace/kube-public labeled
Warning: existing pods in namespace "kube-system" violate the new PodSecurity enforce level "baseline:latest"
Warning: apiserver-proxy-jpxzz (and 5 other pods): non-default capabilities, host namespaces, hostPort
Warning: calico-kube-controllers-6f444cdf45-prhdq: privileged
Warning: calico-node-76xrz (and 11 other pods): host namespaces, hostPath volumes, hostPort, privileged
Warning: calico-typha-deploy-7c7455f5c7-lzf2k: host namespaces, hostPort
Warning: egress-filter-applier-55lwd (and 5 other pods): non-default capabilities, host namespaces
Warning: kube-proxy-worker-tj5yb-v1.23.6-2mhjp (and 5 other pods): non-default capabilities, host namespaces, hostPath volumes, hostPort, privileged
Warning: network-problem-detector-host-9bjvg (and 11 other pods): host namespaces, hostPath volumes, hostPort
Warning: network-problem-detector-pod-8w9v7 (and 5 other pods): hostPath volumes
Warning: node-problem-detector-5k44s (and 5 other pods): hostPath volumes, privileged
Warning: vpn-shoot-578d5dcd9b-kp584: non-default capabilities, privileged
namespace/kube-system labeled
Warning: existing pods in namespace "shoot--i545724dev--i545724-1" violate the new PodSecurity enforce level "baseline:latest"
Warning: etcd-events-0 (and 1 other pod): non-default capabilities
Warning: kube-apiserver-864475f96-ns8qd: hostPath volumes
Warning: loki-0 (and 1 other pod): non-default capabilities, privileged
namespace/shoot--i545724dev--i545724-1 labeled
  6. We have to roll out the new PSPs and webhooks as mentioned in one release and then handle the logic after no pods are using the old ones. We have to do this only for clusters >= v1.22.
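For illustration, a minimal sketch of the label mechanism from point 1, with a hypothetical namespace name; the chosen modes and levels are only examples:

apiVersion: v1
kind: Namespace
metadata:
  name: example-namespace            # hypothetical name, for illustration only
  labels:
    # enforce: violating pods are rejected in this namespace
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/enforce-version: latest
    # warn/audit: violations of the stricter level are only reported, not rejected
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted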
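And a rough sketch of the cluster-wide default setting mentioned in point 4, passed to the kube-apiserver via --admission-control-config-file (the config API version shown is the v1beta1 available with k8s v1.23; the levels are only examples):

apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1beta1
    kind: PodSecurityConfiguration
    # defaults apply to every namespace that does not set the corresponding label itself
    defaults:
      enforce: "baseline"
      enforce-version: "latest"
      audit: "restricted"
      audit-version: "latest"
      warn: "restricted"
      warn-version: "latest"
    exemptions:
      usernames: []
      runtimeClasses: []
      namespaces: []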

@shafeeqes
Contributor

shafeeqes commented Jul 14, 2022

Summarizing the meeting on "Migration to PodSecurity" on 12/07/22, the next steps on this issue would be:

  1. For clusters with k8s v1.23+, adapt components to use the gardener.privileged PSP instead of dedicated PSPs.
  2. Adapt extensions as well to use the gardener.privileged PSP for clusters with k8s v1.23+.
  3. Add an option to disable kube-apiserver admission plugins in the ShootSpec.
  4. Add handling to enforce that the PodSecurityPolicy admission plugin is disabled in the ShootSpec if users want to upgrade their clusters to k8s v1.25. Add documentation to follow the migration steps mentioned here and to clean up the PSPs deployed by them, because otherwise in v1.25 there won't be any API serving PodSecurityPolicy and therefore the resources can't be cleaned up.
  5. If the PodSecurityPolicy plugin is disabled in the ShootSpec, clean up the gardener.privileged PSP as well.
  6. For new clusters starting from v1.25, if spec.kubernetes.allowPrivilegedContainers is set to false in the shoot YAML, we apply the restricted level for all the namespaces by default (see the sketch after this list).
    If spec.kubernetes.allowPrivilegedContainers is set to true, then all the namespaces would have the privileged level by default (currently, if this field is set to true, the gardener.privileged PSP, which has all the permissions, is used).
    Resources deployed by gardener will be exempted in all cases.
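For illustration, a minimal Shoot snippet for point 6; the defaulting described there is a proposal at this point, not implemented behavior, and the shoot name is hypothetical:

apiVersion: core.gardener.cloud/v1beta1
kind: Shoot
metadata:
  name: my-shoot                      # hypothetical name
spec:
  kubernetes:
    version: "1.25.0"
    # per the proposal above: false would default all namespaces to the "restricted" level,
    # true would default them to "privileged"; gardener-deployed resources are exempted either way
    allowPrivilegedContainers: false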

Update: The progress is now tracked in the issue description itself and has modified steps.

@rfranzke
Member Author

/remove-lifecycle frozen

@gardener-prow gardener-prow bot removed the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Jul 15, 2022
@rfranzke
Member Author

With @ary1992 we looked at the description above and were a bit confused about the following points:

  1. The step

    Disable PodSecurityPolicy admission controller for these clusters after migration.

    is mentioned before the step

    For safe migration of shoot clusters for end users, we will introduce a field [...]

    However, it should be connected with each other, right? Only when the end-user confirms that the PSP migration was done from his side (by setting this disablePSP=true) then we should disable the PodSecurityPolicy admission plugin in the kube-apiserver and delete all PodSecurityPolicy API objects that we created.

  2. Instead of introducing the new disablePSP field, can we re-use

    // AdmissionPlugins contains the list of user-defined admission plugins (additional to those managed by Gardener), and, if desired, the corresponding
    // configuration.
    // +patchMergeKey=name
    // +patchStrategy=merge
    // +optional
    AdmissionPlugins []AdmissionPlugin `json:"admissionPlugins,omitempty" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,2,rep,name=admissionPlugins"`
    and extend it with the option to disable admission plugins? Right now, this API only allows enabling admission plugins and/or providing configuration for them, but we don't support disabling admission plugins yet. This way, it would be (a) cleaner (a field disablePSP is quite "ugly" in the spec) and (b) it would have the advantage that this feature can also be used by end-users for other use-cases.

  3. Instead of doing this:

    The gardener managed namespaces should have privileged level by default.

    Can we simply exempt the gardener-resource-manager (by username, ref https://kubernetes.io/docs/concepts/security/pod-security-admission/#exemptions) instead of exempting the whole namespace? This way, end-users could still decide to make kube-system restricted if they prefer, without breaking us.
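For illustration, such a username-based exemption could be expressed in the PodSecurity admission configuration roughly like this (a sketch; the username shown is only a placeholder, the actual user gardener-resource-manager authenticates with against the shoot may differ):

apiVersion: pod-security.admission.config.k8s.io/v1beta1
kind: PodSecurityConfiguration
defaults:
  enforce: "restricted"
  enforce-version: "latest"
exemptions:
  usernames:
  # placeholder; pods created by this user are exempted from the enforced level
  - "gardener-resource-manager"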

@shafeeqes
Contributor

shafeeqes commented Jul 26, 2022

  1. The step

    Disable PodSecurityPolicy admission controller for these clusters after migration.

    is mentioned before the step

    For safe migration of shoot clusters for end users, we will introduce a field [...]

    However, it should be connected with each other, right? Only when the end-user confirms that the PSP migration was done from his side (by setting this disablePSP=true) then we should disable the PodSecurityPolicy admission plugin in the kube-apiserver and delete all PodSecurityPolicy API objects that we created.

Sorry for the confusion, I just meant it as steps, that's why I said "for these clusters after migration."

2. Instead of introducing the new disablePSP field, can we re-use

// AdmissionPlugins contains the list of user-defined admission plugins (additional to those managed by Gardener), and, if desired, the corresponding
// configuration.
// +patchMergeKey=name
// +patchStrategy=merge
// +optional
AdmissionPlugins []AdmissionPlugin `json:"admissionPlugins,omitempty" patchStrategy:"merge" patchMergeKey:"name" protobuf:"bytes,2,rep,name=admissionPlugins"`

and extend it with the option to disable admission plugins? Right now, this API only allows enabling admission plugins and/or providing configuration for them, but we don't support disabling admission plugins yet. This way, it would be (a) cleaner (a field disablePSP is quite "ugly" in the spec) and (b) it would have the advantage that this feature can also be used by end-users for other use-cases.

Good suggestion. Thanks.

How about a Disabled field?

// AdmissionPlugin contains information about a specific admission plugin and its corresponding configuration.
type AdmissionPlugin struct {
	// Name is the name of the plugin.
	Name string
	// Disabled describes whether this plugin should be disabled in the kube-apiserver
	Disabled *bool
	// Config is the configuration of the plugin.
	Config *runtime.RawExtension
}

3. Instead of doing this:
The gardener managed namespaces should have privileged level by default.

Can we simply exempt the gardener-resource-manager (by username, ref https://kubernetes.io/docs/concepts/security/pod-security-admission/#exemptions) instead of exempting the whole namespace? This way, end-users could still decide to make kube-system restricted if they prefer, without breaking us.

Sure.

Will update the issue comment soon.
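For illustration, assuming the Disabled field proposed above gets added, disabling the PodSecurityPolicy plugin in the Shoot spec might then look roughly like this (a hypothetical sketch, not an implemented API at the time of this comment):

spec:
  kubernetes:
    kubeAPIServer:
      admissionPlugins:
      - name: PodSecurityPolicy
        disabled: true   # proposed field; would tell Gardener to turn this plugin off in the kube-apiserver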

@rfranzke
Member Author

Thanks!

How about a Disabled field?

Sounds reasonable, I think we already have this for globally enabled extensions as well, so it "fits":

// Disabled allows to disable extensions that were marked as 'globally enabled' by Gardener administrators.
// +optional
Disabled *bool `json:"disabled,omitempty" protobuf:"varint,3,opt,name=disabled"`

@shafeeqes
Contributor

/close
All tasks are completed.

@gardener-prow gardener-prow bot closed this as completed Oct 19, 2022
@gardener-prow
Contributor

gardener-prow bot commented Oct 19, 2022

@shafeeqes: Closing this issue.

In response to this:

/close
All tasks are completed.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
