[release/1.5 backport] cgroup2: monitor OOMKill instead of OOM to prevent missing container events #6735

Merged 1 commit on Mar 24, 2022

Commits on Mar 24, 2022

  1. cgroup2: monitor OOMKill instead of OOM to prevent missing container OOM events
    
    With the cgroupv2 configuration employed by Kubernetes, the pod cgroup (slice)
    and container cgroup (scope) will both have the same memory limit applied. In
    that situation, the kernel will consider an OOM event to be triggered by the
    parent cgroup (slice), and increment 'oom' there. The child cgroup (scope) only
    sees an oom_kill increment. Since we monitor child cgroups for oom events,
    check the OOMKill field so that we don't miss events.
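
The difference between the two counters can be sketched in Go as follows. This is a minimal illustration, not the code changed by this commit: it parses the text of a cgroup v2 `memory.events` file and detects a kill from `oom_kill` rather than `oom`. The names `memoryEvents`, `parseMemoryEvents`, and `oomKilledSince` are hypothetical, and the sample counter values are made up.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// memoryEvents holds the two counters of interest from a cgroup v2
// memory.events file. The type and its field names are illustrative only.
type memoryEvents struct {
	OOM     uint64
	OOMKill uint64
}

// parseMemoryEvents parses the "key value" lines of memory.events content
// and keeps the oom and oom_kill counters; other keys are ignored.
func parseMemoryEvents(content string) (memoryEvents, error) {
	var ev memoryEvents
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue // skip blank or malformed lines
		}
		n, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			return ev, fmt.Errorf("parse %q: %w", sc.Text(), err)
		}
		switch fields[0] {
		case "oom":
			ev.OOM = n
		case "oom_kill":
			ev.OOMKill = n
		}
	}
	return ev, sc.Err()
}

// oomKilledSince reports whether oom_kill grew between two snapshots.
// Comparing oom instead would miss the case described above, where only
// the parent slice sees the oom increment.
func oomKilledSince(prev, cur memoryEvents) bool {
	return cur.OOMKill > prev.OOMKill
}

func main() {
	// Made-up sample for a container scope whose parent slice hit the limit:
	// oom stays at 0 while oom_kill is incremented.
	scope := "low 0\nhigh 0\nmax 3\noom 0\noom_kill 1\n"
	ev, err := parseMemoryEvents(scope)
	if err != nil {
		panic(err)
	}
	fmt.Println(ev.OOM > 0, oomKilledSince(memoryEvents{}, ev)) // prints "false true"
}
```

With this sample, a check based on `oom` reports nothing, while the `oom_kill` comparison reports the kill.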
    
    This is not visible when running containers through docker or ctr, because they
    set the limits differently (only container level). An alternative would be to
    not configure limits at the pod level - that way the container limit will be
    hit and the OOM will be correctly generated. An interesting consequence is that
    when spawning a pod with multiple containers, the oom events also work
    correctly, because:
    
    a) if one of the containers has no limit, the pod has no limit so OOM events in
       another container report correctly.
    b) if all of the containers have limits then the pod limit will be the sum of
       the container limits, so a container will be able to hit its limit first.
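
Cases a) and b) can be sketched as a small Go function. It only mirrors the reasoning in this message and is not taken from the kubelet or containerd; `podMemoryLimit` is a hypothetical name, and treating a limit of 0 as "no limit" is an assumption made for the example.

```go
package main

import "fmt"

// podMemoryLimit derives a pod-level memory limit in bytes from the
// container limits, following cases a) and b) above. A container limit
// of 0 means "no limit" (an assumption of this sketch).
func podMemoryLimit(containerLimits []uint64) (limit uint64, limited bool) {
	for _, l := range containerLimits {
		if l == 0 {
			// Case a): one unlimited container leaves the pod unlimited.
			return 0, false
		}
		// Case b): the pod limit is the sum of the container limits.
		limit += l
	}
	return limit, len(containerLimits) > 0
}

func main() {
	const mib = 1 << 20
	fmt.Println(podMemoryLimit([]uint64{64 * mib, 0}))         // prints "0 false"
	fmt.Println(podMemoryLimit([]uint64{64 * mib, 128 * mib})) // prints "201326592 true"
	fmt.Println(podMemoryLimit([]uint64{64 * mib}))            // prints "67108864 true"
}
```

The last call is the single-container case: the pod limit equals the container limit, which is the situation where only `oom_kill` is incremented on the container cgroup.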
    
    Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
    (cherry picked from commit 7275411)
    Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
    jepio authored and AkihiroSuda committed Mar 24, 2022
    1c68f50