# The Policy Modes and Built-in Rules


## The Policy Modes

The modes can be specified through the spec.policy.mode field of VarmorPolicy/VarmorClusterPolicy objects. The modes supported by different enforcers are shown in the following table.

| Policy Mode | AppArmor | BPF | Seccomp | Description |
|---|---|---|---|---|
| AlwaysAllow | [x] | [x] | - | No mandatory access control rules are imposed on the container. |
| RuntimeDefault | [x] | [x] | - | Basic protection is provided using the same default policy as the container runtime components (such as containerd's cri-containerd.apparmor.d). |
| EnhanceProtect | [x] | [x] | [x] | - It offers 5 types of built-in rules and custom interfaces to meet various protection requirements.<br />- Enhanced protection is based on the RuntimeDefault mode by default (the spec.policy.privileged field is nil or false).<br />- Enhanced protection on the basis of the AlwaysAllow mode is also supported (the spec.policy.privileged field is true). |
| BehaviorModeling | [x] | [ ] | [x] | - Utilize BPF and Audit technologies to perform behavior modeling on multiple workloads.<br />- The behavior model will be stored in the corresponding ArmorProfileModel object.<br />- Dynamic mode switching is not supported.<br />- Please refer to the BehaviorModeling Mode for more details. |
| DefenseInDepth | [x] | - | [x] | Protect the workloads based on the ArmorProfileModel object. |

Note: vArmor policy supports dynamic switching of running modes (limited to AlwaysAllow, EnhanceProtect, RuntimeDefault, DefenseInDepth) and updating sandbox rules without having to restart the workloads. However, when using the Seccomp enforcer, the workload must be restarted for changes to the Seccomp Profile to take effect.
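
The support matrix and the switching rule in the note can be expressed as a small lookup. The following Python sketch is illustrative only: the dictionary is a hand transcription of the mode table above (it assumes `[x]` means supported and that `[ ]` and `-` both mean not available), and `MODE_SUPPORT`, `supports`, `can_switch`, and `needs_restart` are hypothetical names, not part of vArmor.

```python
# Illustrative only: a hand transcription of the mode table, not vArmor's API.
# Assumption: "[x]" means supported; "[ ]" and "-" are treated as unavailable.
MODE_SUPPORT = {
    "AlwaysAllow": {"AppArmor", "BPF"},
    "RuntimeDefault": {"AppArmor", "BPF"},
    "EnhanceProtect": {"AppArmor", "BPF", "Seccomp"},
    "BehaviorModeling": {"AppArmor", "Seccomp"},
    "DefenseInDepth": {"AppArmor", "Seccomp"},
}

# Modes the note lists as dynamically switchable.
SWITCHABLE = {"AlwaysAllow", "EnhanceProtect", "RuntimeDefault", "DefenseInDepth"}


def supports(mode, enforcer):
    """Return True if the table marks `enforcer` as supported for `mode`."""
    return enforcer in MODE_SUPPORT.get(mode, set())


def can_switch(old_mode, new_mode):
    """Return True if both modes are in the dynamically switchable set."""
    return old_mode in SWITCHABLE and new_mode in SWITCHABLE


def needs_restart(enforcers):
    """Per the note, Seccomp profile changes require a workload restart."""
    return "Seccomp" in set(enforcers)
```

For example, `can_switch("BehaviorModeling", "EnhanceProtect")` is false because BehaviorModeling is excluded from dynamic switching, and `needs_restart({"AppArmor", "Seccomp"})` is true because the Seccomp enforcer is enabled.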

## The Built-in Rules

vArmor supports defining protection policies (VarmorPolicy/VarmorClusterPolicy) using built-in rules and custom interfaces in EnhanceProtect mode. The currently supported built-in rules and categories are shown in the following table.

Note:
- The built-in rules and syntax supported by different enforcers are still under development.
- There are some limitations in the rules and syntax supported by different enforcers. For example, the AppArmor enforcer does not support fine-grained network access control, and BPF does not support access control for specified executables.
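
Because of these limitations, a rule only takes effect if the policy enables an enforcer that supports it. A minimal Python sketch of that check follows. It assumes a hand-copied subset of the Supported Enforcer column from the table below; `RULE_ENFORCERS` and `unenforced_rules` are hypothetical names used for illustration, not vArmor's API.

```python
# Illustrative subset of the "Supported Enforcer" column; not vArmor's API.
RULE_ENFORCERS = {
    "disallow-write-core-pattern": {"AppArmor", "BPF"},
    "mitigate-sa-leak": {"AppArmor", "BPF"},
    "disallow-metadata-service": {"BPF"},
    "disallow-create-user-ns": {"Seccomp"},
    "disable-chmod-x-bit": {"Seccomp"},
    "dirty-pipe-mitigation": {"Seccomp"},
}


def unenforced_rules(rules, enabled_enforcers):
    """Return the rules that no enabled enforcer can apply, in input order.

    Rules missing from RULE_ENFORCERS are reported too, since this sketch
    has no support information for them.
    """
    enabled = set(enabled_enforcers)
    return [r for r in rules if not (RULE_ENFORCERS.get(r, set()) & enabled)]
```

With only the AppArmor enforcer enabled, `unenforced_rules(["mitigate-sa-leak", "dirty-pipe-mitigation"], ["AppArmor"])` reports `dirty-pipe-mitigation`, which the table lists as Seccomp-only.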

| Category | Subcategory | Rule Name & ID | Applicable Container | Description | Principle & Impact | Supported Enforcer |
|---|---|---|---|---|---|---|
| Hardening | Securing Privileged Containers | Prohibit modifying procfs' core_pattern<br /><br />`disallow-write-core-pattern` | Privileged | Attackers may attempt container escape by modifying the procfs core_pattern in a privileged container. In a container with CAP_SYS_ADMIN, they may also unmount specific mount points and then modify the procfs core_pattern to escape. | Disallow writing to the procfs core_pattern file. | AppArmor<br />BPF |
| | | Prohibit mounting securityfs<br /><br />`disallow-mount-securityfs` | Privileged | Attackers may attempt container escape in containers with CAP_SYS_ADMIN by mounting securityfs with read-write permissions and subsequently modifying it. | Disallow mounting new securityfs file systems. | AppArmor<br />BPF |
| | | Prohibit remounting procfs<br /><br />`disallow-mount-procfs` | Privileged | Attackers may attempt container escape in containers with CAP_SYS_ADMIN by remounting procfs with read-write permissions and subsequently modifying the core_pattern, among other things. | 1. Disallow mounting new proc file systems.<br /><br />2. Prohibit using the bind, rbind, move, and remount options to remount `/proc**`.<br /><br />3. When using the BPF enforcer, it also prevents unmounting `/proc**`. | AppArmor<br />BPF |
| | | Prohibit modifying cgroupfs' release_agent<br /><br />`disallow-write-release-agent` | Privileged | Attackers may attempt container escape within a privileged container by directly modifying the cgroupfs release_agent. | Disallow writing to the cgroupfs release_agent file. | AppArmor<br />BPF |
| | | Prohibit remounting cgroupfs<br /><br />`disallow-mount-cgroupfs` | Privileged | Attackers may attempt to escape from containers with CAP_SYS_ADMIN by remounting cgroupfs with read-write permissions. They can then modify release_agent and device access permissions, among other things. | 1. Disallow mounting new cgroup file systems.<br /><br />2. Prohibit using the bind, rbind, move, and remount options to remount `/sys/fs/cgroup**`.<br /><br />3. Prohibit using the rbind option to remount `/sys**`.<br /><br />4. When using the BPF enforcer, it also prevents unmounting `/sys**`. | AppArmor<br />BPF |
| | | Prohibit debugging of disk devices<br /><br />`disallow-debug-disk-device` | Privileged | Attackers may attempt to read and write host files by debugging host disk devices within a privileged container.<br /><br />It is recommended to use this rule in conjunction with `disable-cap-mknod` to prevent attackers from bypassing the rule with mknod. | Dynamically obtain the host's disk devices and deny containers read-write access to them. | AppArmor<br />BPF |
| | | Prohibit mounting of host's disk devices<br /><br />`disallow-mount-disk-device` | Privileged | Attackers may attempt to mount host disk devices within a privileged container, thereby gaining read-write access to host files.<br /><br />It is recommended to use this rule in conjunction with `disable-cap-mknod` to prevent attackers from bypassing the rule with mknod. | Dynamically obtain the host's disk device files and prevent them from being mounted within containers. | AppArmor<br />BPF |
| | | Disable the mount system call<br /><br />`disallow-mount` | Privileged | mount(2) is often used for privilege escalation, container escapes, and other attacks. Most microservice applications do not require mount operations, so it is recommended to use this rule to restrict container processes from using the mount() system call.<br /><br />Note: The mount system call is disabled by default if the spec.policy.privileged field is false. | Disable the mount system call. | AppArmor<br />BPF |
| | | Disable the umount system call<br /><br />`disallow-umount` | ALL | umount(2) can be used to detach topmost mount points (such as maskedPaths), leading to privilege escalation and information disclosure. Most microservice applications do not require umount operations, so it is recommended to use this rule to restrict container processes from using the umount() system call. | Disable the umount system call. | AppArmor<br />BPF |
| | | Prohibit loading kernel modules<br /><br />`disallow-insmod` | Privileged | Attackers may attempt to inject code into the kernel from a container with CAP_SYS_MODULE by running a kernel module loading command. | Disable CAP_SYS_MODULE. | AppArmor<br />BPF |
| | | Prohibit loading eBPF programs<br /><br />`disallow-load-ebpf` | ALL | Attackers may load eBPF programs within a container (with CAP_SYS_ADMIN & CAP_BPF) to steal data or install rootkits.<br /><br />Note: CAP_BPF was introduced in Linux 5.8. | Disable CAP_SYS_ADMIN & CAP_BPF. | AppArmor<br />BPF |
| | | Prohibit accessing process's root directory<br /><br />`disallow-access-procfs-root` | ALL | This rule prohibits processes within containers from accessing a process's root directory through procfs (i.e., `/proc/[PID]/root`), preventing attackers from exploiting shared PID namespaces to launch attacks.<br /><br />In environments where the PID namespace is shared with the host or other containers, attackers may attempt to access the file systems of processes outside the container by reading and writing to `/proc/*/root`. This could lead to information disclosure, privilege escalation, lateral movement, and other attacks. | Disable the PTRACE_MODE_READ permission. | AppArmor<br />BPF |
| | | Prohibit accessing kernel exported symbol<br /><br />`disallow-access-kallsyms` | ALL | Attackers may attempt to leak the base address of kernel modules from containers with CAP_SYSLOG by reading the kernel's exported symbol definitions file. This helps attackers bypass KASLR protection and exploit kernel vulnerabilities more easily. | Disallow reading the `/proc/kallsyms` file. | AppArmor<br />BPF |
| | Disable Capabilities | Disable all capabilities<br /><br />`disable-cap-all` | ALL | Disable all capabilities. | - | AppArmor<br />BPF |
| | | Disable all capabilities except for NET_BIND_SERVICE<br /><br />`disable-cap-all-except-net-bind-service` | ALL | Disable all capabilities except for NET_BIND_SERVICE.<br /><br />This rule complies with the Restricted Policy of the Pod Security Standards. | - | AppArmor<br />BPF |
| | | Disable privileged capabilities<br /><br />`disable-cap-privileged` | ALL | Disable all privileged capabilities (those that can directly lead to escapes or affect host availability). Only the default capabilities are allowed.<br /><br />This rule complies with the Baseline Policy of the Pod Security Standards, except for the net_raw capability. | - | AppArmor<br />BPF |
| | | Disable specified capability<br /><br />`disable-cap-XXXX` | ALL | Disable any specified capability, replacing XXXX with a value from capabilities(7), for example, `disable-cap-net-raw`. | - | AppArmor<br />BPF |
| | Blocking Exploit Vectors | Prohibit abusing user namespaces<br /><br />`disallow-abuse-user-ns` | ALL | User namespaces can be used to enhance container isolation. However, they also increase the kernel's attack surface, making certain kernel vulnerabilities easier to exploit. Attackers can use a container to create a user namespace, gaining full privileges and thereby expanding the kernel's attack surface.<br /><br />Disallowing container processes from abusing CAP_SYS_ADMIN privileges via user namespaces can reduce the kernel's attack surface and block certain exploitation paths for kernel vulnerabilities.<br /><br />This rule can be used to harden containers on systems where kernel.unprivileged_userns_clone=0 or user.max_user_namespaces=0 is not set. | Disable CAP_SYS_ADMIN. | AppArmor<br />BPF |
| | | Prohibit creating user namespace<br /><br />`disallow-create-user-ns` | ALL | User namespaces can be used to enhance container isolation. However, they also increase the kernel's attack surface, making certain kernel vulnerabilities easier to exploit. Attackers can use a container to create a user namespace, gaining full privileges and thereby expanding the kernel's attack surface.<br /><br />Disallowing container processes from creating new user namespaces can reduce the kernel's attack surface and block certain exploitation paths for kernel vulnerabilities.<br /><br />This rule can be used to harden containers on systems where kernel.unprivileged_userns_clone=0 or user.max_user_namespaces=0 is not set. | Disallow creating user namespaces. | Seccomp |
| Attack Protection | Mitigating Information Leakage | Mitigating ServiceAccount token leakage<br /><br />`mitigate-sa-leak` | ALL | This rule prohibits container processes from reading sensitive ServiceAccount-related information, including tokens, namespaces, and CA certificates. It helps prevent security risks arising from the leakage of the default ServiceAccount or a misconfigured ServiceAccount. When attackers gain access to a container through an RCE vulnerability, they often try to penetrate further by stealing ServiceAccount information.<br /><br />In most scenarios, Pods do not need to communicate with the API Server using ServiceAccounts. However, Kubernetes still sets up default ServiceAccounts for Pods that do not require communication with the API Server. | Disallow reading ServiceAccount-related files. | AppArmor<br />BPF |
| | | Mitigating host disk device number leakage<br /><br />`mitigate-disk-device-number-leak` | ALL | Attackers may attempt to obtain host disk device numbers for subsequent container escape by reading the container process's mount information. | Disallow reading the `/proc/[PID]/mountinfo` and `/proc/partitions` files. | AppArmor<br />BPF |
| | | Mitigating container overlayfs path leakage<br /><br />`mitigate-overlayfs-leak` | ALL | Attackers may attempt to obtain the overlayfs path of the container's rootfs on the host by accessing the container process's mount information, which could be used for subsequent container escape. | Disallow reading the `/proc/mounts`, `/proc/[PID]/mounts`, and `/proc/[PID]/mountinfo` files.<br /><br />This rule may impact some functionality of the mount command or syscall within containers. | AppArmor<br />BPF |
| | | Mitigating host IP leakage<br /><br />`mitigate-host-ip-leak` | ALL | After gaining access to a container through an RCE vulnerability, attackers often attempt further network penetration. Preventing them from obtaining sensitive information such as the host IP, MAC address, and network segments through this vector increases the difficulty and cost of their network penetration activities. | Disallow reading ARP address resolution tables (`/proc/net/arp`, `/proc/[PID]/net/arp`, etc.). | AppArmor<br />BPF |
| | | Disallow access to the metadata service<br /><br />`disallow-metadata-service` | ALL | This rule prohibits container processes from accessing the cloud server's Instance Metadata Service, including two reserved local addresses: 100.96.0.96 and 169.254.169.254.<br /><br />Attackers who gain code execution within a container may attempt to access the cloud server's Metadata Service for information disclosure. In certain scenarios, they may obtain sensitive information, leading to privilege escalation and lateral movement. | Prohibit connections to the Instance Metadata Services' IP addresses. | BPF |
| | Disable Sensitive Operations | Prohibit writing to the /etc directory<br /><br />`disable-write-etc` | ALL | Attackers may attempt privilege escalation by modifying sensitive files in the /etc directory, such as altering /etc/bash.bashrc for watering hole attacks, editing /etc/passwd and /etc/shadow to add users for persistence, or modifying nginx.conf or /etc/ssh/ssh_config for persistence. | Disallow writing to the /etc directory. | AppArmor<br />BPF |
| | | Prohibit the execution of busybox command<br /><br />`disable-busybox` | ALL | Some application services are packaged using base images like busybox or Alpine. This is also convenient for attackers, who can use busybox to execute commands and assist in their attacks. | Prohibit the execution of busybox.<br /><br />If containerized services rely on busybox or related shell commands, enabling this policy may lead to runtime errors. | AppArmor<br />BPF |
| | | Prohibit the creation of Unix shells<br /><br />`disable-shell` | ALL | After gaining remote code execution through an RCE vulnerability, attackers may use a reverse shell to gain arbitrary command execution capabilities within the container.<br /><br />This rule prohibits container processes from creating new Unix shells, thus defending against reverse shells. | Prohibit the creation of Unix shells.<br /><br />Some base images may symlink sh to /bin/busybox. In this scenario, it is also necessary to prohibit the execution of busybox. | AppArmor<br />BPF |
| | | Prohibit the execution of wget command<br /><br />`disable-wget` | ALL | Attackers may use the wget command to download malicious programs for subsequent attacks, such as persistence, privilege escalation, network scanning, cryptocurrency mining, and more.<br /><br />This rule limits file downloads by prohibiting the execution of the wget command. | Prohibit the execution of wget.<br /><br />Some base images may symlink wget to /bin/busybox. In this scenario, it is also necessary to prohibit the execution of busybox. | AppArmor<br />BPF |
| | | Prohibit the execution of curl command<br /><br />`disable-curl` | ALL | Attackers may use the curl command to initiate network access and download malicious programs from external sources for subsequent attacks, such as persistence, privilege escalation, network scanning, cryptocurrency mining, and more.<br /><br />This rule limits network access by prohibiting the execution of the curl command. | Prohibit the execution of the curl command. | AppArmor<br />BPF |
| | | Prohibit the execution of chmod command<br /><br />`disable-chmod` | ALL | When attackers gain control over a container through vulnerabilities, they typically attempt to download additional attack code or tools into the container for further attacks, such as privilege escalation, lateral movement, cryptocurrency mining, and more. In this attack chain, attackers often use the chmod command to make files executable. | Prohibit the execution of the chmod command.<br /><br />Some base images may symlink chmod to /bin/busybox. In this scenario, it is also necessary to prohibit the execution of the busybox command. | AppArmor<br />BPF |
| | | Prohibit setting the execute/search bit of a file<br /><br />`disable-chmod-x-bit` | ALL | When attackers gain control over a container through vulnerabilities, they typically attempt to download additional attack code or tools into the container for further attacks, such as privilege escalation, lateral movement, cryptocurrency mining, and more. In this attack chain, attackers might use the chmod syscalls to make files executable. | Prohibit setting the execute/search bit of a file with the chmod/fchmod/fchmodat/fchmodat2 syscalls. | Seccomp |
| | | Prohibit setting the SUID/SGID bit of a file<br /><br />`disable-chmod-s-bit` | ALL | In some scenarios, attackers may attempt to invoke chmod syscalls to perform privilege escalation attacks by setting the file's s-bit (set-user-ID, set-group-ID). | Prohibit setting the set-user-ID/set-group-ID bit of a file with the chmod/fchmod/fchmodat/fchmodat2 syscalls. | Seccomp |
| | | Prohibit the execution of su/sudo command<br /><br />`disable-su-sudo` | ALL | When processes within a container run as non-root users, attackers often need to escalate privileges to the root user for further attacks. The sudo/su commands are common local privilege escalation avenues. | Prohibit the execution of the su/sudo commands.<br /><br />Some base images may symlink su to /bin/busybox. In this scenario, it is also necessary to prohibit the execution of the busybox command. | AppArmor<br />BPF |
| | Restrict Specific Executable | - | ALL | This rule extends the use cases of 'Mitigating Information Leakage' and 'Disable Sensitive Operations'. It allows users to apply restrictions only to specific executable programs within containers.<br /><br />Restricting specified executable programs serves two purposes:<br />1) Preventing sandbox policies from affecting the execution of application services within containers.<br />2) Increasing the cost and difficulty for attackers by restricting specified executable programs within containers.<br /><br />For example, this feature can be used to restrict programs like busybox, bash, sh, and curl within containers, preventing attackers from using them to execute sensitive operations. Meanwhile, application services are unaffected by sandbox policies and can continue to access ServiceAccount tokens and perform other tasks normally.<br /><br />Note: Due to the implementation principles of BPF LSM, this feature cannot be provided by the BPF enforcer. | Enable sandbox restrictions for specified executable programs. | AppArmor |
| Vulnerability Mitigation | - | Mitigate cgroups & lxcfs escape<br /><br />`cgroups-lxcfs-escape-mitigation` | ALL | If users mount the host's cgroupfs into a container or use lxcfs to provide a resource view for the container, there may be a risk of container escape in both scenarios. Attackers could manipulate cgroupfs from within the container to achieve container escape.<br /><br />This rule can also be used to defend against exploitation of the CVE-2022-0492 vulnerability. | AppArmor enforcer prevents writing to:<br />`/**/release_agent`,<br />`/**/devices/device.allow`,<br />`/**/devices/**/device.allow`,<br />`/**/devices/cgroup.procs`,<br />`/**/devices/**/cgroup.procs`,<br />`/**/devices/task`,<br />`/**/devices/**/task`<br /><br />BPF enforcer prevents writing to:<br />`/**/release_agent`<br />`/**/devices.allow`<br />`/**/cgroup.procs`<br />`/**/devices/tasks` | AppArmor<br />BPF |
| | - | Mitigate the ability to override runc to escape<br /><br />`runc-override-mitigation` | ALL | This rule is designed to mitigate vulnerabilities such as CVE-2019-5736, which achieve container escape by tampering with the host's runc. | Disallow writing to `/**/runc` files. | AppArmor<br />BPF |
| | - | Mitigate the 'Dirty Pipe' exploit to escape<br /><br />`dirty-pipe-mitigation` | ALL | This rule is designed to defend against attacks that exploit the CVE-2022-0847 (Dirty Pipe) vulnerability for container escape. You can use it to harden containers before upgrading or patching the kernel.<br /><br />Note: While this rule may cause issues in some software packages, blocking the syscall usually has no effect on legitimate applications, since use of this syscall is relatively rare. | Disallow calling the splice syscall. | Seccomp |
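
The two Seccomp-only chmod rules, `disable-chmod-x-bit` and `disable-chmod-s-bit`, differ only in which permission bits of the requested mode they target. The Python sketch below shows that bit test using the standard `stat` constants. It is an illustration of the idea, not vArmor's seccomp profile; `X_BITS`, `S_BITS`, `sets_x_bit`, and `sets_s_bit` are hypothetical names.

```python
import stat

# Execute/search bits for user, group, and other (octal 0111).
X_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
# set-user-ID and set-group-ID bits (octal 6000).
S_BITS = stat.S_ISUID | stat.S_ISGID


def sets_x_bit(mode):
    """True if the requested mode turns on any execute/search bit."""
    return bool(mode & X_BITS)


def sets_s_bit(mode):
    """True if the requested mode turns on the SUID or SGID bit."""
    return bool(mode & S_BITS)
```

Under this test, a request such as `chmod 0755` would trip the x-bit check but not the s-bit check, while `chmod 4755` would trip both and `chmod 0644` would trip neither.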