New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
only configure swap if swap is enabled #120784
only configure swap if swap is enabled #120784
Conversation
Hi @elezar. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/cc @iholder101 @pacoxu |
/cc |
/triage accepted |
/ok-to-test |
6e6273c
to
73107ae
Compare
/retest |
2 similar comments
/retest |
/retest |
Thank you @elezar! |
LGTM label has been added. Git tree hash: ee094ed71c873dc446ef4634e38341d6c210e2d5
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comment otherwise LGTM
if swapControllerAvailable() { | ||
if swapConfigurationHelper := newSwapConfigurationHelper(*m.machineInfo); utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwap) { | ||
// NOTE(ehashman): Behaviour is defined in the opencontainers runtime spec: | ||
// https://github.com/opencontainers/runtime-spec/blob/1c3f411f041711bbeecf35ff7e93461ea6789220/config-linux.md#memory | ||
switch m.memorySwapBehavior { | ||
case kubelettypes.LimitedSwap: | ||
swapConfigurationHelper.ConfigureLimitedSwap(lcr, pod, container) | ||
default: | ||
swapConfigurationHelper.ConfigureUnlimitedSwap(lcr) | ||
} | ||
} else { | ||
swapConfigurationHelper.ConfigureNoSwap(lcr) | ||
} | ||
} else { | ||
swapConfigurationHelper.ConfigureNoSwap(lcr) | ||
} else if utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwap) { | ||
klog.ErrorS(errors.New("no cgroup swap controller present"), "ignoring NodeSwap feature", "pod", klog.KObj(pod), "containerName", container.Name) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some way to cleanup these if/elses to read more nicely. In partifular, this is harad to look at:
if swapConfigurationHelper := newSwapConfigurationHelper(*m.machineInfo); utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwap) {
What about:
if !swapControllerAvailable() && !utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwap) {
// Nothing to do
}
if !swapControllerAvailable() && utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwap) {
klog.ErrorS(errors.New("no cgroup swap controller present"), "ignoring NodeSwap feature", "pod", klog.KObj(pod), "containerName", container.Name)
}
if swapControllerAvailable() && !utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwap) {
swapConfigurationHelper := newSwapConfigurationHelper(*m.machineInfo)
swapConfigurationHelper.ConfigureNoSwap(lcr)
}
if swapControllerAvailable() && utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwap) {
// NOTE(ehashman): Behaviour is defined in the opencontainers runtime spec:
// https://github.com/opencontainers/runtime-spec/blob/1c3f411f041711bbeecf35ff7e93461ea6789220/config-linux.md#memory
swapConfigurationHelper := newSwapConfigurationHelper(*m.machineInfo)
switch m.memorySwapBehavior {
case kubelettypes.LimitedSwap:
swapConfigurationHelper.ConfigureLimitedSwap(lcr, pod, container)
default:
swapConfigurationHelper.ConfigureUnlimitedSwap(lcr)
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering about the nesting too. What about adding an addSwapResources
method that we call here instead.
func (m *kubeGenericRuntimeManager) addSwapResources(...) {
if !swapControllerAvailable() && !utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwap) {
return
}
if !swapControllerAvailable() {
klog.ErrorS(errors.New("no cgroup swap controller present"), "ignoring NodeSwap feature", "pod", klog.KObj(pod), "containerName", container.Name)
return
}
swapConfigurationHelper := newSwapConfigurationHelper(*m.machineInfo)
if !utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwap) {
swapConfigurationHelper.ConfigureNoSwap(lcr)
return
}
// NOTE(ehashman): Behaviour is defined in the opencontainers runtime spec:
// https://github.com/opencontainers/runtime-spec/blob/1c3f411f041711bbeecf35ff7e93461ea6789220/config-linux.md#memory
swapConfigurationHelper := newSwapConfigurationHelper(*m.machineInfo)
switch m.memorySwapBehavior {
case kubelettypes.LimitedSwap:
swapConfigurationHelper.ConfigureLimitedSwap(lcr, pod, container)
default:
swapConfigurationHelper.ConfigureUnlimitedSwap(lcr)
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
definitely seems cleaner to me, though I'd still prefer including some variant of utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwap) on all if statements to make it clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@klueska I reworked this a bit with quick-returns where applicable. If you really want me to check the feature gate on all if statements, I can do so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some feedback - hope it helps
// swap is only configured if a swap controller is available. | ||
if swapControllerAvailable() { | ||
if swapConfigurationHelper := newSwapConfigurationHelper(*m.machineInfo); utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwap) { | ||
// NOTE(ehashman): Behaviour is defined in the opencontainers runtime spec: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// NOTE(ehashman): Behaviour is defined in the opencontainers runtime spec: | |
// NOTE(ehashman): Behavior is defined in the opencontainers runtime spec: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nit) In tests, please write “QoS” not “Qos”
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
} else { | ||
swapConfigurationHelper.ConfigureNoSwap(lcr) | ||
} else if utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwap) { | ||
klog.ErrorS(errors.New("no cgroup swap controller present"), "ignoring NodeSwap feature", "pod", klog.KObj(pod), "containerName", container.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try to write the error logging as if the feature is already stable. The value of the feature gate doesn't seem very relevant here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sftim. I elected to always log (at InfoS
) that we're not configuring swap but to include the SwapBehavior
value as well. This should be enough to give users a signal that its being skipped without making the logic in the function overly complex. Hope this is sufficient from your side.
swapConfigurationHelper.ConfigureLimitedSwap(lcr, pod, container) | ||
default: | ||
swapConfigurationHelper.ConfigureUnlimitedSwap(lcr) | ||
// swap is only configured if a swap controller is available. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// swap is only configured if a swap controller is available. | |
// swap is only configured if a swap cgroup controller is available. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sftim. Should be addressed in latest revision. Note that I factored out this logic into a method, so the comment is now included as part of its docstring.
This change bypasses all logic to set swap in the linux container resources if a swap controller is not available on node. Failing to do so may cause errors in runc when starting a container with a swap configuration -- even if this is set to 0. Signed-off-by: Evan Lezar <elezar@nvidia.com>
73107ae
to
394bcaf
Compare
@elezar thanks for tracking down this bug and fixing it! /lgtm |
LGTM label has been added. Git tree hash: 55212c517565fed5dae6a3642f336905ac1c4b7d
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: elezar, klueska The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…784-upstream-release-1.28 Automated cherry pick of #120784: Use local isCgroup2UnifiedMode consistently
What type of PR is this?
/kind bug
/kind regression
What this PR does / why we need it:
This PR fixes the startup of kubernetes on systems using cgroupv2 where swap is not enabled. This can be demonstrated on such as system using kind and any kubernetes version of 1.28.0, 1.28.1, or 1.28.2.
Since the
memory.swap.max
value is set, this causes containerd and runc to try to write a value for the cgroup causing containers to fail when starting with messages such as:This is independent of whether
NodeSwap
is enabled or not.Note that the issue seemed to have been introduced in #118764 with the fix extending the logic added in #119486 to also be applicable to cgroupv2 systems with swap disabled.
Special notes for your reviewer:
Assuming:
/sys/fs/cgroup/cgroup.controllers
is present)swapon -s
shows nothing)This should be reproducible in kind with the following:
which will create a single-node kind cluster using k8s
v1.28.0
. This will cause an error when starting the control-plane node.Running:
Will ensure that the kubelet logs are available. These can be inspected using
Where this particular example shows the etcd container failing to start.
Furthermore, we can confirm that this error is not present in an earlier k8s version:
Does this PR introduce a user-facing change?