Minimal kernel version plan: probably 4.18/4.19+ #116799

Open
9 of 22 tasks
pacoxu opened this issue Mar 21, 2023 · 18 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/node Categorizes an issue or PR as relevant to SIG Node. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@pacoxu
Member

pacoxu commented Mar 21, 2023

Summary

Checked items have reached EOL (end of life).

  • 3.10, current System Validator check
  • 3.16, namespaced IP local reserved ports (safe-sysctl)
  • 3.18, containerd suggestion when using btrfs (containerd, btrfs)
  • 4.x, containerd suggested min kernel version (containerd, core code and snapshotters)
  • 4.1, connection reuse min supported kernel version (kube-proxy)
  • 4.5, namespaced net.ipv4.tcp_keepalive_time (safe-sysctl)
  • 4.9 LTS, EOL in Jan 2023 (linux, LTS)
  • ~~4.11 EOL~~
    • Calico Felix can reduce CPU usage by raising iptablesRefreshInterval (calico, cpu)
    • net.ipv4.ip_unprivileged_port_start (safe-sysctl)
  • 4.14 LTS, EOL in Jan 2024 (linux, LTS)
  • 4.19 / 4.18 RHEL 🌟🌟🌟
    • Cilium raised its minimum in v1.4 for eBPF support (cilium, ebpf)
    • RSS problem with docker (fixed in 4.19.77, 5.2.19+ and 5.3.4+) (docker)
    • 4.19 LTS, EOL in Dec 2024 (linux, LTS)
  • 5.4 LTS, EOL in Dec 2025
  • 5.8+, cgroup v2 suggestions (cgroup v2) 🌟🌟
  • 5.10 LTS, EOL in Dec 2026
  • 5.12
    • Recursively Read-only (RRO) hostPath mounts (RRO)
    • Support for idmapped mounts (UserNamespace)
  • 5.15 LTS, EOL in Oct 2026
  • 6.1 LTS, EOL in Dec 2026

What would you like to be added?

Currently, there is a kernel version check in the system validator that is used by kubeadm. (// Requires 3.10+, or newer)

  • This issue tracks which version the minimum kernel version should be raised to, along with the current status and the plan.

Versions: []string{`^3\.[1-9][0-9].*$`, `^([4-9]|[1-9][0-9]+)\.([0-9]+)\.([0-9]+).*$`}, // Requires 3.10+, or newer
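As a rough illustration of what those two expressions accept, the sketch below compiles them and runs a few release strings through them. The helper name `kernelAccepted` and the sample release strings are assumptions made for this example; they are not taken from the system validator.

```go
package main

import (
	"fmt"
	"regexp"
)

// kernelPatterns holds the two expressions quoted in the Versions field above.
var kernelPatterns = []*regexp.Regexp{
	regexp.MustCompile(`^3\.[1-9][0-9].*$`),
	regexp.MustCompile(`^([4-9]|[1-9][0-9]+)\.([0-9]+)\.([0-9]+).*$`),
}

// kernelAccepted reports whether release matches at least one pattern.
// Hypothetical helper for illustration only.
func kernelAccepted(release string) bool {
	for _, p := range kernelPatterns {
		if p.MatchString(release) {
			return true
		}
	}
	return false
}

func main() {
	// Sample release strings are illustrative.
	samples := []string{
		"2.6.32-754.el6",
		"3.9.11",
		"3.10.0-1160.el7.x86_64",
		"4.18.0-513.el8.x86_64",
		"6.1.0",
	}
	for _, r := range samples {
		fmt.Printf("%-26s accepted=%v\n", r, kernelAccepted(r))
	}
}
```

Note that the second expression expects three numeric fields, so a bare "4.19" without a patch level would not match it.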

During the weekly sig-node meeting, there was a discussion about adding a kernel-version-sensitive safe sysctl to the kubelet. It was noted that 3.16 is a very low kernel version, and it was suggested that we introduce a minimal kernel version for Kubernetes. Alternatively, we could warn on low kernel versions (I think we should add a warning if the kernel version is lower than 3.18 or 4.0).

const ipLocalReservedPortsMinNamespacedKernelVersion = "3.16"
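A minimal sketch of how a constant like this could gate a sysctl on the running kernel, assuming the release string comes from something like `uname -r`. The helpers `leadingInt`, `majorMinor` and `kernelAtLeast` are hypothetical and only compare the major and minor fields; this is not the kubelet's actual code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Value taken from the constant quoted above.
const ipLocalReservedPortsMinNamespacedKernelVersion = "3.16"

// leadingInt parses the run of digits at the start of s ("16-rc1" -> 16).
func leadingInt(s string) (int, error) {
	end := 0
	for end < len(s) && s[end] >= '0' && s[end] <= '9' {
		end++
	}
	return strconv.Atoi(s[:end])
}

// majorMinor extracts the first two numeric fields of a kernel release.
func majorMinor(release string) (int, int, error) {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("cannot parse kernel release %q", release)
	}
	major, err := leadingInt(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("cannot parse kernel release %q: %w", release, err)
	}
	minor, err := leadingInt(parts[1])
	if err != nil {
		return 0, 0, fmt.Errorf("cannot parse kernel release %q: %w", release, err)
	}
	return major, minor, nil
}

// kernelAtLeast reports whether release is at or above minimum (major.minor).
func kernelAtLeast(release, minimum string) (bool, error) {
	major, minor, err := majorMinor(release)
	if err != nil {
		return false, err
	}
	minMajor, minMinor, err := majorMinor(minimum)
	if err != nil {
		return false, err
	}
	if major != minMajor {
		return major > minMajor, nil
	}
	return minor >= minMinor, nil
}

func main() {
	// Sample release strings are illustrative.
	for _, r := range []string{"3.10.0-1160.el7.x86_64", "4.18.0-513.el8.x86_64"} {
		ok, err := kernelAtLeast(r, ipLocalReservedPortsMinNamespacedKernelVersion)
		fmt.Println(r, ok, err)
	}
}
```

A comparison like this only looks at the nominal version, so it says nothing about what a distribution may have backported.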

Why is this needed?

There are some related historical issues, and some CNCF projects such as Cilium have their own kernel requirements: #30706

connReuseMinSupportedKernelVersion = "4.1"

refresh. Note: the default for this value is lower than the other
refresh intervals as a workaround for a Linux kernel bug that was
fixed in kernel version 4.11. If you are using v4.11 or greater
you may want to set this to, a higher value to reduce Felix CPU
usage. [Default: 10s]'

https://github.com/containerd/containerd#runtime-requirements

There are specific features used by containerd core code and snapshotters that will require a minimum kernel version on Linux. With the understood caveat of distro kernel versioning, a reasonable starting point for Linux is a minimum 4.x kernel version.

@pacoxu pacoxu added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 21, 2023
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 21, 2023
@pacoxu
Member Author

pacoxu commented Mar 21, 2023

/sig node
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 21, 2023
@pacoxu
Member Author

pacoxu commented May 23, 2023

A new use case is #117873 (comment): a sysctl that has been namespaced since kernel 4.5.

@pacoxu
Member Author

pacoxu commented May 23, 2023

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 23, 2023
@pacoxu pacoxu added this to Triaged in SIG Node Bugs May 23, 2023
@pacoxu
Member Author

pacoxu commented May 24, 2023

"KEP-3857: Recursively Read-only (RRO) hostPath mounts"

The "rro" bind mount option is implemented by calling mount_setattr(2)
with MOUNT_ATTR_RDONLY and AT_RECURSIVE.

Requires runc >= 1.1 && kernel >= 5.12.

See https://github.com/kubernetes/enhancements/pull/3858/files#diff-a5e3a174567e889120ab32af5f159c2bcddd459cc36abdf56f3eceb5ad86d6c5R183 for more details.

@pacoxu
Member Author

pacoxu commented Jul 14, 2023

net.ipv4.ip_unprivileged_port_start requires kernel version 4.11 or higher.

xref #56374 and #102612 for more details.

@pacoxu
Member Author

pacoxu commented Jul 26, 2023

https://endoflife.date/linux

Some status from the Linux community:

  • 4.9 (LTS) | 6 years ago(11 Dec 2016) | Ended 6 months and 2 weeks ago(07 Jan 2023)

@pacoxu
Member Author

pacoxu commented Sep 6, 2023

Containerd update: containerd/containerd#5890 Support idmapped mounts (kernel 5.12)

@danwinship
Contributor

#120895 discusses moving the existing GetKernelVersion API out of pkg/proxy/ipvs (since it's also being used by kubelet); it seems like wherever we move that to could easily become the place where we document the required/preferred/optional kernel versions for different components...

@tzneal
Contributor

tzneal commented Oct 9, 2023

/cc

@pacoxu
Member Author

pacoxu commented Nov 13, 2023

docker/for-linux#693 (an RSS problem with docker on old kernels; I am not sure whether it also affects other container runtimes).

I then found torvalds/linux@f9c6456, which is included in the 4.19 kernel from 4.19.77 onward (and in newer kernels from 5.2.19 and 5.3.4 onward).
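A fix that lands at different patch levels on different stable branches cannot be expressed as a single minimum version. The sketch below encodes the three patch levels mentioned above as a per-branch table. The `version` type and `hasFix` helper are hypothetical, and the handling of branches not named in the comment (treated as fixed if newer than 5.3, unfixed otherwise) is an assumption, not something verified against the kernel tree.

```go
package main

import "fmt"

// version is a plain major.minor.patch triple (hypothetical type).
type version struct{ major, minor, patch int }

// fixedSince maps a stable branch {major, minor} to the first patch
// release on that branch that carries the fix, as listed in the comment above.
var fixedSince = map[[2]int]int{
	{4, 19}: 77,
	{5, 2}:  19,
	{5, 3}:  4,
}

// hasFix reports whether v is expected to contain the fix.
func hasFix(v version) bool {
	if first, ok := fixedSince[[2]int{v.major, v.minor}]; ok {
		return v.patch >= first
	}
	// Assumption: branches newer than 5.3 carry the fix from their first
	// release; any other unlisted branch is treated as unfixed.
	return v.major > 5 || (v.major == 5 && v.minor > 3)
}

func main() {
	for _, v := range []version{{4, 19, 76}, {4, 19, 77}, {5, 2, 18}, {5, 4, 0}} {
		fmt.Printf("%d.%d.%d fixed=%v\n", v.major, v.minor, v.patch, hasFix(v))
	}
}
```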

@pacoxu pacoxu changed the title Minimal kernel version plan Minimal kernel version plan: probably 4.18/4.19+ Nov 14, 2023
@pacoxu
Member Author

pacoxu commented Nov 14, 2023

https://endoflife.date/linux

Key kernel versions:

  • ✅ 3.10, current System Validator check
  • 4.x, containerd suggested min kernel version (containerd, core code, and snapshotters)
  • 4.9 LTS, EOL in Jan 2023 (Linux, LTS)
  • 4.14 LTS, EOL in Jan 2024 (Linux, LTS)
  • 🎯 4.19 LTS, EOL in Dec 2024 (Linux, LTS)
  • 5.4 LTS, EOL in Dec 2025
  • 5.8+, cgroup v2 suggestions (cgroup v2) 🌟🌟
  • 5.10 LTS, EOL in Dec 2026
  • 5.15 LTS, EOL in Oct 2026
  • 6.1 LTS, EOL in Dec 2026

Based on these key points and the Linux LTS status, I suggest raising the suggested minimal kernel version to 4.19 in Kubernetes v1.30, to follow the Linux EOL dates. Any input?

@danwinship
Contributor

The RHEL 8 kernel is 4.18 plus backports of way more than what's in 4.19...

Presumably the current minimum kernel is 3.10 because that's the nominal version of the RHEL 7 kernel, so having the new minimum version be the nominal version of the RHEL 8 kernel seems like it makes sense...

@pacoxu
Member Author

pacoxu commented Dec 11, 2023

Low-level ARM specific problems (example: Instant::now() was 70x slower on ARM before this Pull Request to Rust: rust-lang/rust#88652)

@pacoxu
Member Author

pacoxu commented Dec 22, 2023

The cgroup v1 deprecation in 1.30 was discussed in a recent SIG Node meeting.

@pacoxu
Member Author

pacoxu commented Jan 25, 2024

#118996 (kubernetes/release#3076) will land in Kubernetes 1.29, containing a switch to debian:bookworm. This means we now require glibc 2.36 (compared to 2.31 on bullseye) for non-static binaries such as the kubelet.

kubernetes/release#3246 is about glibc.

@kannon92
Contributor

@pacoxu we have #124060, where we are exploring requiring the tmpfs noswap option (with which tmpfs mounts are not allowed to use swap memory).

@haircommander
Contributor

A limitation of this approach is that downstream kernels often backport the features we need, so they would technically be compliant despite reporting an older version. I think it's worth defining the features we need, and maybe guessing which kernels support them, but it's more robust to attempt to use the feature and smartly not use it if it's not available

@danwinship
Contributor

it's more robust to attempt to use the feature and smartly not use it if it's not available

That can work for features, but bugfixes are often impossible to autodetect without essentially doing an e2e test.

When we added pkg/util/kernel I had a vague idea that maybe there could eventually be an API like utilkernel.Require(utilkernel.IPVSConnReuseModeFixed), where the argument is an array of regexps and minimum versions so you can say "if the kernel version matches /.*\.el9\..*/ then the version must be greater than 5.14.0-322 or else if it matches /.*/ then the version must be greater than 6.1".
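A rough sketch of what such a rule table might look like, assuming the first rule whose regexp matches the release decides the outcome and that the minimum is inclusive. The `rule` type, the `satisfies` and `leadingNumbers` helpers, and the sample release strings are all hypothetical; nothing here exists in pkg/util/kernel, and the version numbers are the illustrative ones from the comment above.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// rule pairs a release pattern with the minimum version required
// when the pattern matches (hypothetical type).
type rule struct {
	pattern *regexp.Regexp
	min     []int
}

// ipvsConnReuseModeFixed mirrors the example rules from the comment above.
var ipvsConnReuseModeFixed = []rule{
	{regexp.MustCompile(`.*\.el9\..*`), []int{5, 14, 0, 322}},
	{regexp.MustCompile(`.*`), []int{6, 1}},
}

var numberRun = regexp.MustCompile(`[0-9]+`)

// leadingNumbers returns up to n numeric fields from the start of release.
func leadingNumbers(release string, n int) []int {
	out := make([]int, 0, n)
	for _, s := range numberRun.FindAllString(release, n) {
		v, err := strconv.Atoi(s)
		if err != nil {
			break // give up on fields that do not fit in an int
		}
		out = append(out, v)
	}
	return out
}

// satisfies applies the first matching rule; missing fields count as 0.
func satisfies(release string, rules []rule) bool {
	for _, r := range rules {
		if !r.pattern.MatchString(release) {
			continue
		}
		got := leadingNumbers(release, len(r.min))
		for i, want := range r.min {
			have := 0
			if i < len(got) {
				have = got[i]
			}
			if have != want {
				return have > want
			}
		}
		return true // exactly the minimum version
	}
	return false
}

func main() {
	for _, r := range []string{"5.14.0-284.el9.x86_64", "5.14.0-362.el9.x86_64", "6.1.0-17-amd64"} {
		fmt.Println(r, satisfies(r, ipvsConnReuseModeFixed))
	}
}
```

Ordering the distro-specific rule before the catch-all lets a backported fix override the nominal upstream version.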


6 participants