Update SeccompDefault KEP upgrade strategy #2773

saschagrunert · 2021-06-02T10:17:26Z

This patch reflects the latest changes from the implementation PR:

Adding the kubelet configuration and CLI flag.
Making the rollout strategy more verbose and add mitigations.

Refers to: #2413
Triggered by: kubernetes/kubernetes#101943

This patch reflects the latest changes from the implementation PR: - Adding the kubelet configuration and CLI flag. - Making the rollout strategy more verbose and add mitigations. Signed-off-by: Sascha Grunert <sgrunert@redhat.com>

saschagrunert · 2021-06-14T06:52:38Z

@liggitt @mmdriley @mrunalp PTAL

liggitt · 2021-06-14T16:24:08Z

lgtm, default behavior is unchanged from previous releases; kubelet gains a configuration option so the node owner can define what seccomp profile should be used for pods/containers which do not specify a seccomp profile

saschagrunert · 2021-06-15T07:34:29Z

@derekwaynecarr please take a look too and approve if this change is fine

mrunalp · 2021-06-15T16:43:27Z

/lgtm

mrunalp · 2021-06-15T16:43:34Z

/assign @derekwaynecarr

mmdriley · 2021-06-15T17:34:18Z

@saschagrunert thank you for the changes! I think the new approach is a good balance that lets cluster operators implement more aggressive security features without posing an acute threat to compatibility.

I'm still a bit worried we're deferring a lot of the complexity of real-world rollouts based on the idea this is "off by default". Once this feature is in, there will be pressure to turn it on by default in a future release, and we'll face the same problems around deployment and compatibility. We can avoid future headaches by thinking through this stuff now.

Since this is implemented as a node flag, responsibility for rolling it out is likely to fall to platform teams (cluster admins) rather than application owners. It will need to be enabled for all applications in the cluster at once unless admins play scheduling tricks to put only "blessed" applications on new nodes with the flag enabled.

The existence of even one non-compliant application within a cluster will pose a significant rollout hurdle.

The current rollout guidance assumes that cluster admins have access to full and comprehensive test suites for all deployed applications, including high coverage for all third-party dependencies, and can run them on demand. That doesn't match the reality I've seen. So most cluster admins will have to go down the path we call "optional" where they run applications under a permissive/complain seccomp filter for a while -- but, for that, they have to write their own report-mode seccomp filter. Maybe we can write that filter for them?

There's also a ton of cluster-specific complexity hiding behind "[i]f it's possible to monitor audit logs within the cluster". Do we think that is possible in many/most customer clusters? If so, we should gesture at how it can be done. And if not, we shouldn't be relying on it as a rollout risk mitigation.

I'm sorry I'm joining this conversation late -- for the original KEP, was there any discussion of applying these policies to workloads rather than nodes? Something like upgrading the seccompProfile on admitted pods to RuntimeDefault if unspecified. That way the rollout could be done (or avoided) on a per-application basis, which reduces rollout risk and increases the overall likelihood that more cluster workloads are made subject to a basic seccomp profile.

saschagrunert · 2021-06-16T07:17:28Z

… but, for that, they have to write their own report-mode seccomp filter. Maybe we can write that filter for them?

We followed the idea of having container runtimes define the profiles and not kubernetes. Sure, we could add documentation about how to modify the default profile to "log instead of block". We could also add more docs about how to investigate the audit logs on a node.

There's also a ton of cluster-specific complexity hiding behind "[i]f it's possible to monitor audit logs within the cluster". Do we think that is possible in many/most customer clusters? If so, we should gesture at how it can be done. And if not, we shouldn't be relying on it as a rollout risk mitigation.

I definitely see your point. I think we also can enhance the docs (or write a blog post and link it to the docs) about how this can be done in technical detail. Sure, we will not cover every platform/setup, but for example most systems ship auditd which is a good starting point.

I'm sorry I'm joining this conversation late -- for the original KEP, was there any discussion of applying these policies to workloads rather than nodes? Something like upgrading the seccompProfile on admitted pods to RuntimeDefault if unspecified. That way the rollout could be done (or avoided) on a per-application basis, which reduces rollout risk and increases the overall likelihood that more cluster workloads are made subject to a basic seccomp profile.

I could imagine doing this by using a selector, but how would that be specified? We could add it to the kubelet configuration, but I'm not in favor of mixing application and cluster logic into a single config. We could also add a new annotation/field, but this would require API changes. I think we still can elaborate on adding such a feature on top of the graduation. For example:

alpha: everything is off by default, we have a kubelet flag as well as feature gate
beta: everything is on by default, but we now have an additional API
stable: everything is on, maybe the flag get's deprecated

The point of not changing the API is already part of the KEP, but I see no blocker in revisiting on graduation.

derekwaynecarr · 2021-06-22T14:46:22Z

/lgtm
/approve

k8s-ci-robot · 2021-06-22T14:46:32Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: derekwaynecarr, saschagrunert

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~keps/sig-node/OWNERS~~ [derekwaynecarr]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

derekwaynecarr · 2021-06-22T14:51:17Z

@mmdriley i appreciate your summary on the distinction between platform owner and application owner with the outlined approach. i agree with @saschagrunert that a workload level choice can be explored later as well.

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 2, 2021

k8s-ci-robot requested review from dchen1107 and derekwaynecarr June 2, 2021 10:17

k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Jun 2, 2021

saschagrunert mentioned this pull request Jun 2, 2021

Add kubelet SeccompDefault alpha feature kubernetes/kubernetes#101943

Merged

k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jun 2, 2021

saschagrunert force-pushed the kep-update branch from bab0adc to c1249fe Compare June 2, 2021 10:19

Update SeccompDefault KEP upgrade strategy

a70cc18

This patch reflects the latest changes from the implementation PR: - Adding the kubelet configuration and CLI flag. - Making the rollout strategy more verbose and add mitigations. Signed-off-by: Sascha Grunert <sgrunert@redhat.com>

saschagrunert force-pushed the kep-update branch from c1249fe to a70cc18 Compare June 2, 2021 10:21

saschagrunert mentioned this pull request Jun 14, 2021

Add documentation about SeccompDefault feature kubernetes/website#27957

Merged

k8s-ci-robot assigned mrunalp Jun 15, 2021

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 15, 2021

k8s-ci-robot assigned derekwaynecarr Jun 15, 2021

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 22, 2021

k8s-ci-robot merged commit b44af54 into kubernetes:master Jun 22, 2021

k8s-ci-robot added this to the v1.22 milestone Jun 22, 2021

saschagrunert deleted the kep-update branch June 22, 2021 14:51

saschagrunert mentioned this pull request Jun 22, 2021

Kubelet option to enable seccomp by default #2413

Closed

12 tasks

mmdriley mentioned this pull request Aug 17, 2021

Add seccomp default feature blog post kubernetes/website#28951

Merged

1 task

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update SeccompDefault KEP upgrade strategy #2773

Update SeccompDefault KEP upgrade strategy #2773

saschagrunert commented Jun 2, 2021

saschagrunert commented Jun 14, 2021

liggitt commented Jun 14, 2021

saschagrunert commented Jun 15, 2021

mrunalp commented Jun 15, 2021

mrunalp commented Jun 15, 2021

mmdriley commented Jun 15, 2021

saschagrunert commented Jun 16, 2021

derekwaynecarr commented Jun 22, 2021

k8s-ci-robot commented Jun 22, 2021

derekwaynecarr commented Jun 22, 2021

Update SeccompDefault KEP upgrade strategy #2773

Update SeccompDefault KEP upgrade strategy #2773

Conversation

saschagrunert commented Jun 2, 2021

saschagrunert commented Jun 14, 2021

liggitt commented Jun 14, 2021

saschagrunert commented Jun 15, 2021

mrunalp commented Jun 15, 2021

mrunalp commented Jun 15, 2021

mmdriley commented Jun 15, 2021

saschagrunert commented Jun 16, 2021

derekwaynecarr commented Jun 22, 2021

k8s-ci-robot commented Jun 22, 2021

derekwaynecarr commented Jun 22, 2021