Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update apparmor to allow confined runc to kill containers #10123

Merged
merged 1 commit into from Apr 24, 2024

Conversation

woky
Copy link
Contributor

@woky woky commented Apr 23, 2024

/usr/sbin/runc is confined with "runc" profile[1] introduced in AppArmor v4.0.0. This change breaks stopping of containers, because the profile assigned to containers doesn't accept signals from the "runc" peer. AppArmor >= v4.0.0 is currently part of Ubuntu Mantic (23.10) and later.

The issue is reproducible both with nerdctl and ctr clients. In the case of ctr, the --apparmor-default-profile flag has to be specified, otherwise the container processes would inherit the runc profile, which behaves as unconfined, and so the subsequent runc process invoked to stop it would be able to signal it.

Test commands:

root@cloudimg:~# nerdctl run -d --name foo nginx:latest
3d1e74bfe6e7b2912d9223050ae8a81a8f4b73de0846e6d9c956c1e411cdd95a
root@cloudimg:~# nerdctl stop foo
FATA[0000] 1 errors:
unknown error after kill: runc did not terminate successfully: exit status 1: unable to signal init: permission denied
: unknown

or

root@cloudimg:~# ctr pull docker.io/library/nginx:latest
...
root@cloudimg:~# ctr run -d --apparmor-default-profile ctr-default docker.io/library/nginx:latest foo
root@cloudimg:~# ctr task kill foo
ctr: unknown error after kill: runc did not terminate successfully: exit status 1: unable to signal init: permission denied
: unknown

Relevant syslog messages (with long lines wrapped):

Apr 23 22:03:12 cloudimg kernel: audit:
  type=1400 audit(1713909792.064:262): apparmor="DENIED"
  operation="signal" class="signal" profile="nerdctl-default"
  pid=13483 comm="runc" requested_mask="receive"
  denied_mask="receive" signal=quit peer="runc"

or

Apr 23 22:05:32 cloudimg kernel: audit:
  type=1400 audit(1713909932.106:263): apparmor="DENIED"
  operation="signal" class="signal" profile="ctr-default"
  pid=13574 comm="runc" requested_mask="receive"
  denied_mask="receive" signal=quit peer="runc"

This change extends the default profile with rules that allow receiving signals from processes that run confined with either runc or crun profile (crun[2] is an alternative OCI runtime that's also confined in AppArmor >= v4.0.0, see [1]). It is backward compatible because the peer value is a regular expression (AARE) so the referenced profile doesn't have to exist for this profile to successfully compile and load.

[1] https://gitlab.com/apparmor/apparmor/-/commit/2594d936
[2] https://github.com/containers/crun

@k8s-ci-robot
Copy link

Hi @woky. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@AkihiroSuda
Copy link
Member

/ok-to-test

# runc may send signals to container processes.
signal (receive) peer=runc,
# crun may send signals to container processes.
signal (receive) peer=crun,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be set only when the corresponding profile exist?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what about youki, gvisor, kata, etc ?

Copy link
Contributor Author

@woky woky Apr 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be set only when the corresponding profile exist?

I will update the template to check for known profiles.

Also, what about youki, gvisor, kata, etc ?

These currently don't have any default profiles, see https://gitlab.com/apparmor/apparmor/-/tree/master/profiles/apparmor.d.

I don't know much about gVisor and kata runtimes, but I think those run containers in virtual machines, so their kill shouldn't involve sending signals (to container processes).

Copy link
Contributor Author

@woky woky Apr 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AkihiroSuda What is the deadline for this change to be approved and merged to make it into v1.7.16?

I'm not sure what method would you recommend to conditionally add the signal receive rules. I can think of 4 ways:

  1. Abuse macroExists(profile) to check if /etc/apparmor.d/$profile exists for each profile in a hardcoded list of OCI runtime profiles known to exist in latest AppArmor
  2. Use isLoaded(profile) to check if the profile is loaded for each profile as above.
  3. Bring back parseVersion that was removed here contrib/apparmor: remove code related to apparmor_parser version #8069 and conditionally add profiles based on known OCI runtime profiles in the parsed AppArmor version.
  4. Keep it as it is, rely on the fact that it's regular expression and it won't match profiles that are not loaded. This is almost like option 2.

Which option would you go with or do you know of a better way?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can confirm that signal (receive) peer=crun, does not cause an error when the crun profile is missing, I think we can just leave it as is and call it for a day.

If it causes an error, checking /etc/apparmor.d/$profile seems good

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can confirm that signal (receive) peer=crun, does not cause an error when the crun profile is missing, I think we can just leave it as is and call it for a day.

You can call it a day ;-) - the rule will just sit there and do nothing if the crun profile doesn't exist.

/usr/sbin/runc is confined with "runc" profile[1] introduced in AppArmor
v4.0.0. This change breaks stopping of containers, because the profile
assigned to containers doesn't accept signals from the "runc" peer.
AppArmor >= v4.0.0 is currently part of Ubuntu Mantic (23.10) and later.

The issue is reproducible both with nerdctl and ctr clients. In the case
of ctr, the --apparmor-default-profile flag has to be specified,
otherwise the container processes would inherit the runc profile, which
behaves as unconfined, and so the subsequent runc process invoked to
stop it would be able to signal it.

  Test commands:

    root@cloudimg:~# nerdctl run -d --name foo nginx:latest
    3d1e74bfe6e7b2912d9223050ae8a81a8f4b73de0846e6d9c956c1e411cdd95a
    root@cloudimg:~# nerdctl stop foo
    FATA[0000] 1 errors:
    unknown error after kill: runc did not terminate successfully: exit status 1: unable to signal init: permission denied
    : unknown

    or

    root@cloudimg:~# ctr pull docker.io/library/nginx:latest
    ...
    root@cloudimg:~# ctr run -d --apparmor-default-profile ctr-default docker.io/library/nginx:latest foo
    root@cloudimg:~# ctr task kill foo
    ctr: unknown error after kill: runc did not terminate successfully: exit status 1: unable to signal init: permission denied
    : unknown

  Relevant syslog messages (with long lines wrapped):

    Apr 23 22:03:12 cloudimg kernel: audit:
      type=1400 audit(1713909792.064:262): apparmor="DENIED"
      operation="signal" class="signal" profile="nerdctl-default"
      pid=13483 comm="runc" requested_mask="receive"
      denied_mask="receive" signal=quit peer="runc"

    or

    Apr 23 22:05:32 cloudimg kernel: audit:
      type=1400 audit(1713909932.106:263): apparmor="DENIED"
      operation="signal" class="signal" profile="ctr-default"
      pid=13574 comm="runc" requested_mask="receive"
      denied_mask="receive" signal=quit peer="runc"

This change extends the default profile with rules that allow receiving
signals from processes that run confined with either runc or crun
profile (crun[2] is an alternative OCI runtime that's also confined in
AppArmor >= v4.0.0, see [1]). It is backward compatible because the peer
value is a regular expression (AARE) so the referenced profile doesn't
have to exist for this profile to successfully compile and load.

[1] https://gitlab.com/apparmor/apparmor/-/commit/2594d936
[2] https://github.com/containers/crun

Signed-off-by: Tomáš Virtus <nechtom@gmail.com>
@AkihiroSuda AkihiroSuda added cherry-pick/1.6.x Change to be cherry picked to release/1.6 branch cherry-pick/1.7.x Change to be cherry picked to release/1.7 branch impact/changelog labels Apr 23, 2024
@samuelkarp
Copy link
Member

I can't reproduce this failure with ctr on a Mantic system. Is there something I need to do to enable the new AppArmor profile?

@woky
Copy link
Contributor Author

woky commented Apr 24, 2024

I can't reproduce this failure with ctr on a Mantic system. Is there something I need to do to enable the new AppArmor profile?

Did you run ctr as I mentioned in the commit message? By default, ctr does not apply any AppArmor profile. You can apply the default one shipped with containerd with --apparmor-default-profile ANY_PROFILE_NAME.

@samuelkarp
Copy link
Member

Coming back from moby/moby#47749 (comment):

It seems like this is either a bug in AppArmor or in the packaging that Ubuntu is doing for runc, rather than a containerd-specific bug. I don't object to taking this here, but it should also be fixed in either the upstream AppArmor repo or in Ubuntu's runc package.

@samuelkarp samuelkarp added this pull request to the merge queue Apr 24, 2024
@samuelkarp
Copy link
Member

/cherrypick release/1.7
/cherrypick release/1.6

@k8s-infra-cherrypick-robot

@samuelkarp: once the present PR merges, I will cherry-pick it on top of release/1.7 in a new PR and assign it to you.

In response to this:

/cherrypick release/1.7
/cherrypick release/1.6

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@samuelkarp
Copy link
Member

/cherrypick release/1.6

@k8s-infra-cherrypick-robot

@samuelkarp: once the present PR merges, I will cherry-pick it on top of release/1.6 in a new PR and assign it to you.

In response to this:

/cherrypick release/1.6

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@thaJeztah
Copy link
Member

Yes, I have no strong objections, if this is urgently needed, but ultimately this would only affect containerd packages as packaged by Ubuntu / Debian, so could be a patch on their side.

I've written a longer comment on the Moby ticket; moby/moby#47749 (comment) (quote below)

So, perhaps a very silly question, as I don't know what the common approach is for this, but those profiles are now added to the apparmor package, on behalf of other packages (like runc and crun). Are such changes considered to be temporary solutions (i.e., runc itself doesn't install a profile, so the apparmor package creates a stub), and should the "northern star" here be for runc itself to be installing the profile?

Also asking that, because the profile added is packaging-specific; canonical packaging it separate and using a different location (sbin vs bin), which to some extent would make it feel more natural to have the profile also included in the runc (and crun) package; allowing it to define the profile with the correct location of the binary (if installed).

In that situation, Docker (and other packagers) can ship the appropriate profile for their packages (with the location they choose to use for the binary), without apparmor trying to guess / detect if it's installed.

To some extent it could even be considered for the canonical packages of docker to have the profile adjusted to match the other changes, but mostly curious what common procedure is for these

More generally, I wonder if a generic peer is something that would be possible (e.g. oci-runtime); based on the linked changes, it looks like it's possible that "other" runtimes will also get assigned their own, and I think we also want to avoid having to adjust the profile in this repository with an unbounded list of "possible" runtimes that exist.

I guess (if possible);

  • Perhaps the profile should be included in the runc and crun packages (instead of the AppArmor package trying to detect presence of binaries); this would also take the guesswork out, and make sure crun or runc (or others)
  • Still curious if these binaries should / could have a more generic way (i.e. peer=oci-runtime instead of peer=runc|crun); can they have both? I'm not deeply familiar with what's possible here, but curious if there could be a "generic" way (any oci-runtime can do this), but also keep the more specific ones (if specific rules must be applied for specific ones)
  • If a oci-runtime one could be used, this could (potentially?) allow for the profile to be tightened up; considering that the unconfined can be removed, and an oci-runtime profile must be present for any runtime that's to be used?

@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 24, 2024
@samuelkarp samuelkarp added this pull request to the merge queue Apr 24, 2024
Merged via the queue into containerd:main with commit 01ed3ff Apr 24, 2024
47 checks passed
@k8s-infra-cherrypick-robot

@samuelkarp: new pull request created: #10129

In response to this:

/cherrypick release/1.7
/cherrypick release/1.6

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-infra-cherrypick-robot

@samuelkarp: new pull request created: #10130

In response to this:

/cherrypick release/1.6

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kiashok
Copy link
Contributor

kiashok commented Apr 25, 2024

@AkihiroSuda @woky is this needed for containerd/1.7 ? If so, @woky could you cherry-pick and send a PR out?

@AkihiroSuda
Copy link
Member

@kiashok
Copy link
Contributor

kiashok commented Apr 25, 2024

Yes, the bot has already opened cherry-picks:

sounds good thanks! I think its good two sign offs - could we merge them?

@jrjohansen
Copy link

Yes, I have no strong objections, if this is urgently needed, but ultimately this would only affect containerd packages as packaged by Ubuntu / Debian, so could be a patch on their side.

I've written a longer comment on the Moby ticket; moby/moby#47749 (comment) (quote below)

I guess (if possible);

* Perhaps the profile should be included in the `runc` and `crun` packages (instead of the AppArmor package trying to detect presence of binaries); this would also take the guesswork out, and make sure `crun` or `runc` (or others)

having packaging ship and maintain profiles is nice but doesn't actually fix the issue here. Some rules to communicate with those applications in the profile they ship still would be needed. Note having these run unconfined is no longer a viable option with the user namespace restrictions.

* Still curious if these binaries should / could have a more generic way (i.e. `peer=oci-runtime` instead of `peer=runc|crun`); can they have _both_? I'm not deeply familiar with what's possible here, but curious if there could be a "generic" way (any `oci-runtime` can do this), but also keep the more specific ones (if specific rules must be applied for specific ones)

Yes a more generic way is possible. They could each define the oci-runtime profile, but this would cause issues if they were both installed at the same time. There could also be a variable defined

@{oci_runtime}=

which could be updated locally. It can even support having multiple values. eg crun, and runc

@{oci_runtime}={crun,runc,unconfined}

however the best way for packaging to update it would be to setup the definition of oci_runtime to be extended by files in a subdir, and have the packaging drop a file into the policy with the extension

$ cat /etc/apparmor.d/tunnables/oci_runtime
@{oci_runtime}=

include if exists "oci_extensions"

$ cat /etc/apparmor.d/tunnables/oci_runtime/oci_extensions/crun
@{oci_runtime}+=crun

and the generated profile would need to include the variable define, then proposed rule could use that instead.

include <tunnables/oci_runtime>

# ...


  signal (receive) peer=@{oci_runtime},

Ubuntu/the upstream apparmor other packaging would need to carry the base variable define

* If a `oci-runtime` one could be used, this could (potentially?) allow for the profile to be tightened up; considering that the `unconfined` can be removed, and an `oci-runtime` profile must be present for any runtime that's to be used?

yes removing unconfined would be good, or at least hiding it behind a variable when it is used so at least the semantics of who you want to allow are preserved.

On systems with the user namespace restrictions unconfined isn't even a viable peer any more. Support for unconfined in some form has to stay atm for systems that are still using it for the peer.

Short term I am fine with the idea of Ubuntu distro patching their fix in. I am more concerned right now with getting this right so containerd et al, only have to be patched once and it works going forward.

woky added a commit to woky/common that referenced this pull request Apr 28, 2024
…op, kill)")

This makes projects using AppArmor bits from
golang-github-containers-common (notably podman) work with AppArmor
v4.0.0.

There is a similar issue with containerd clients and docker. The fix to
containerd was merged in upstream[1]. The fix to moby (docker) was
submitted but seems to have stalled[2]. Upstream notes we should fix
regressions we introduced in Ubuntu or perhaps at least introduce a
generic way to refer to OCI runtimes under a single peer name. I suspect
we would get similar objections in containers/common. That's why I
haven't yet submitted a patch to upstream.

In the meantime, patch this library so that podman can work with OCI
runtimes we currently confine.

[1] containerd/containerd#10123
[2] moby/moby#47749

Bug: https://bugs.launchpad.net/ubuntu/+source/libpod/+bug/2040483
woky added a commit to woky/common that referenced this pull request Apr 28, 2024
…op, kill)")

This makes projects using AppArmor bits from
golang-github-containers-common (notably podman) work with AppArmor
v4.0.0.

There is a similar issue with containerd clients and docker. The fix was
merged to the containerd upstream[1]. The fix to moby (docker) was
submitted but seems to have stalled[2]. Upstream notes we should fix
regressions we introduced in Ubuntu or perhaps at least introduce a
generic way to refer to OCI runtimes under a single peer name. I suspect
we would get similar objections in containers/common. That's why I
haven't yet submitted the patch to the upstream.

In the meantime, patch this library so that podman can work with OCI
runtimes we currently confine.

[1] containerd/containerd#10123
[2] moby/moby#47749

Bug: https://bugs.launchpad.net/ubuntu/+source/libpod/+bug/2040483
woky added a commit to woky/common that referenced this pull request Apr 28, 2024
…op, kill)")

This makes projects using AppArmor bits from
golang-github-containers-common (notably podman) work with AppArmor
v4.0.0.

There is a similar issue with containerd clients and docker. The fix was
merged to the containerd upstream[1]. The fix to moby (docker) was
submitted but seems to have stalled[2]. Upstream notes we should fix
regressions we introduced in Ubuntu or perhaps at least introduce a
generic way to refer to OCI runtimes under a single peer name. I suspect
we would get similar objections in containers/common. That's why I
haven't yet submitted the patch to the upstream.

In the meantime, patch this library so that podman can work with OCI
runtimes we currently confine.

[1] containerd/containerd#10123
[2] moby/moby#47749

Bug: https://bugs.launchpad.net/ubuntu/+source/libpod/+bug/2040483
@dmcgowan dmcgowan changed the title apparmor: Allow confined runc to kill containers Update Apparmor profile to allow confined runc to kill containers May 17, 2024
@dmcgowan dmcgowan changed the title Update Apparmor profile to allow confined runc to kill containers Update apparmor to allow confined runc to kill containers May 17, 2024
@dmcgowan dmcgowan added the area/runtime Runtime label May 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/runtime Runtime cherry-pick/1.6.x Change to be cherry picked to release/1.6 branch cherry-pick/1.7.x Change to be cherry picked to release/1.7 branch impact/changelog ok-to-test size/XS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants