Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression for systemd cgroups following last vendoring #2868

Closed
c3d opened this issue Oct 20, 2021 · 7 comments · Fixed by #2959
Closed

Regression for systemd cgroups following last vendoring #2868

c3d opened this issue Oct 20, 2021 · 7 comments · Fixed by #2959
Assignees

Comments

@c3d
Copy link
Member

c3d commented Oct 20, 2021

As very briefly documented in a comment attached to PR #2332 and in a question asked on containerd/cgroups, it looks like there was a regression introduced with the last vendored version of containerd/cgroups regarding the systemd case.

@fgiudici is presently investigating more in depth, as it appears to have broken metrics at least in the OpenShift case.

This is distantly related to #2539 insofar as it seems to be impacted by the same vendored code.

@c3d
Copy link
Member Author

c3d commented Oct 20, 2021

From @fgiudici:

The cgroups created by kata shimv2 2.3.x when the kubernetes systemd driver in place looks broken: path is not honored, a single dir at root level is created.
Example cgroup dir created:

kata_kubepods-besteffort-pod7eb9f39a_2c73_486a_b988_e48f7f961bd2.slice:crio:d6cff3b316418beb69a26fc2214f7d486b3e438eeb962014788cc4ae77b2872d

This will not allow the cgroup path to be found by cAdvisor, so both the metrics and the resource usage monitoring in kubernetes will be affected.

Once fixed the cgroups issue with the systemd driver, we wil still miss the metrics collection as the kata_ prefix will skip cAdvisor detection (by purpose, see the code). We may need to add a new parameter in the toml kata configuration file to enable skipping the (new) kata_ prefix for environments where the metrics are collected by cAdvisor (as in OCP).

@c3d
Copy link
Member Author

c3d commented Oct 20, 2021

From @fgiudici:

The cgroups creation complying with the systemd cgroups management has been dropped .
See kata code

What we need seems to be to create the cgroups as explained in the containerd/cgroups README

@c3d
Copy link
Member Author

c3d commented Oct 20, 2021

From private exchanges between @fgiudici and myself, it looks like a regression introduced by revendoring:

The cgroups creation (through containerd/cgroup code) seems now cgroupfs compliant only (coming to this statement following the links in the above comment)).

Yep, it is a regression. @sameo changed the cgroups library we used this is why not trivial to spot.
Previous library had a boolean flag to control the creation of the cgroup in cgroupfs vs systemd way, managing all of this transparently using an interface and different back-ends implementing the interface.
Now, with the containerd cgroup code, we have to pass the right hierarchy type (i.e. pass cgroups.Systemd) and optionally pre-parse the cgroup path (to comply with the systemd slice format) based on the cgroup driver in use (cgroupfs vs systemd) as showed in the links in the above comment. Samuel just set it statically the cgroupfs way.

@c3d
Copy link
Member Author

c3d commented Oct 20, 2021

It's unclear if we can autodetect this. There is a Mode() function in containerd/cgroups that tells us if it's Unified or Legacy or Hybrid, but there does not appear to be a clear way to auto-select Systemd. There is a Systemd hierarchy, but I believe that we need to figure out how to auto-detect (or make it a configuration option).

As an aside, there is a difference in the API between V1 and V2, e.g. the Resources structure replaces the old LinuxReources that was passed to functions like cgroups.New().

I asked the authors of the cgroups module about the intent, but did not get a response so far.

@c3d c3d assigned c3d and snir911 and unassigned c3d and snir911 Oct 20, 2021
@egernst
Copy link
Member

egernst commented Oct 21, 2021

Francesco, Christophe -- is there an e2e / integration test in place that should've caught this? (I expect there wasn't?) Can you share a test for reproducing the failure?

@c3d
Copy link
Member Author

c3d commented Oct 21, 2021

@egernst As far as I know, @wainersm has been working on a test to catch that, but last I read, it correctly detects issues in the 2.2 branch, but incorrectly reports success on main.

@snir911
Copy link
Member

snir911 commented Oct 28, 2021

Once fixed the cgroups issue with the systemd driver, we wil still miss the metrics collection as the kata_ prefix will skip cAdvisor detection (by purpose, see the code). We may need to add a new parameter in the toml kata configuration file to enable skipping the (new) kata_ prefix for environments where the metrics are collected by cAdvisor (as in OCP).

Apparently we used to skip using this prefix for the systemd-cgroup driver

Now, with the containerd cgroup code, we have to pass the right hierarchy type (i.e. pass cgroups.Systemd) and optionally

Other than that, it seems to me containerd's cgroup is immature when it comes to support in systemd cgroups driver; in runc docs it is mentioned either .scope or .slice can be used when asking systemd (using dbus) to create transient unit, however, it seems containerd's cgroups implementation supports only .slice with systemd cgroups driver, which apparently it is not compatible with cAdvisor supported [*.scope] scheme and breaks metrics collection (and TBH according to systemd's doc sounds like .scope is the only valid option)

snir911 added a commit to snir911/kata-containers that referenced this issue Nov 3, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 3, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 3, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
@katacontainersbot katacontainersbot moved this from To do to In progress in Issue backlog Nov 3, 2021
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 3, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 4, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 4, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 4, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 4, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 4, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 4, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 4, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 4, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 4, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 10, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 10, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api
This is currently done only when there's single sandbox
cgroup (sandbox_cgroup_only=true), otherwise we set it
as static cgroup path as it used to be (until a proper
soultion for overhead cgroup under systemd will be
suggested)

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 11, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api
This is currently done only when there's single sandbox
cgroup (sandbox_cgroup_only=true), otherwise we set it
as static cgroup path as it used to be (until a proper
soultion for overhead cgroup under systemd will be
suggested)

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 11, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api
This is currently done only when there's single sandbox
cgroup (sandbox_cgroup_only=true), otherwise we set it
as static cgroup path as it used to be (until a proper
soultion for overhead cgroup under systemd will be
suggested)

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
snir911 added a commit to snir911/kata-containers that referenced this issue Nov 14, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api
This is currently done only when there's single sandbox
cgroup (sandbox_cgroup_only=true), otherwise we set it
as static cgroup path as it used to be (until a proper
soultion for overhead cgroup under systemd will be
suggested)

Backport-from: kata-containers#2959
Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
Issue backlog automation moved this from In progress to Done Nov 15, 2021
egernst pushed a commit to egernst/kata-containers that referenced this issue Nov 17, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api
This is currently done only when there's single sandbox
cgroup (sandbox_cgroup_only=true), otherwise we set it
as static cgroup path as it used to be (until a proper
soultion for overhead cgroup under systemd will be
suggested)

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
egernst pushed a commit to egernst/kata-containers that referenced this issue Nov 17, 2021
As github.com/containerd/cgroups doesn't support scope
units which are essential in some cases lets create
the cgroups manually and load it trough the cgroups
api
This is currently done only when there's single sandbox
cgroup (sandbox_cgroup_only=true), otherwise we set it
as static cgroup path as it used to be (until a proper
soultion for overhead cgroup under systemd will be
suggested)

Fixes: kata-containers#2868
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
@katacontainersbot katacontainersbot moved this from Done to In progress in Issue backlog Nov 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Issue backlog
  
In progress
3 participants