Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configure MemoryRequest for InPlace pod resize in cgroupv2 systems #121218

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Karthik-K-N
Copy link
Contributor

@Karthik-K-N Karthik-K-N commented Oct 13, 2023

What type of PR is this?

/kind feature

What this PR does / why we need it:

cgroupv2 systems provides the ability to minimum memory values with memory.min , This PR adds ability configure minimum memory request for container

Which issue(s) this PR fixes:

Fixes #114202

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

KEP: kubernetes/enhancements#1287
Previous PR: #102884

In place pod resize now provides the ability to set memory request

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/feature Categorizes issue or PR as related to a new feature. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Oct 13, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Karthik-K-N
Once this PR has been reviewed and has the lgtm label, please assign derekwaynecarr for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 13, 2023
@@ -227,6 +230,11 @@ func (m *kubeGenericRuntimeManager) calculateLinuxResources(cpuRequest, cpuLimit
// more info.
"memory.oom.group": "1",
}
if utilfeature.DefaultFeatureGate.Enabled(kubefeatures.InPlacePodVerticalScaling) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a positive and negative test for this feature? You can set features during unit tests so we should verify that this function behaves as expected with feature gate on/off.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure will add, Just wanted to know opinion about the overall approach so just differed it.

@Karthik-K-N
Copy link
Contributor Author

/test pull-kubernetes-unit

@Karthik-K-N
Copy link
Contributor Author

/cc @vinaykul

@Jeffwan
Copy link
Contributor

Jeffwan commented Oct 17, 2023

/cc @kubernetes/sig-node-pr-reviews

@k8s-ci-robot
Copy link
Contributor

@Jeffwan: GitHub didn't allow me to request PR reviews from the following users: kubernetes/sig-node-pr-reviews.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @kubernetes/sig-node-pr-reviews

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bart0sh
Copy link
Contributor

bart0sh commented Oct 20, 2023

/triage accepted
/priority important-longterm
/assign @vinaykul

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Oct 20, 2023
@bart0sh bart0sh moved this from Triage to Needs Reviewer in SIG Node PR Triage Oct 20, 2023
@ndixita
Copy link
Contributor

ndixita commented Nov 3, 2023

@Karthik-K-N @vinaykul I have been looking into the cgroup v2 memory controllers as a part of https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2570-memory-qos

From what I understand, setting memory.min will ensure the cgroup retains the configured minimum memory which kind of makes it unreclaimable. If there's no unprotected reclaimable memory available, OOM killer is invoked. So, IMO setting memory.min could lead to more OOM killer invocations since it makes memory unreclaimable in cgroups. I would recommend testing different configurations before we set memory.min value.

@ndixita
Copy link
Contributor

ndixita commented Nov 3, 2023

/cc @ndixita

@Karthik-K-N
Copy link
Contributor Author

@Karthik-K-N @vinaykul I have been looking into the cgroup v2 memory controllers as a part of https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2570-memory-qos

From what I understand, setting memory.min will ensure the cgroup retains the configured minimum memory which kind of makes it unreclaimable. If there's no unprotected reclaimable memory available, OOM killer is invoked. So, IMO setting memory.min could lead to more OOM killer invocations since it makes memory unreclaimable in cgroups. I would recommend testing different configurations before we set memory.min value.

@ndixita Thanks for your feedback, I can check the behaviors on various configurations. Please let me know if you have any specific configurations to be tested.

@ndixita
Copy link
Contributor

ndixita commented Nov 6, 2023

@Karthik-K-N Sure, let's touch base again after kubecon next week. Please reach out on slack and we can discuss.

@Karthik-K-N
Copy link
Contributor Author

@Karthik-K-N Sure, let's touch base again after kubecon next week. Please reach out on slack and we can discuss.

sure thank you.

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 4, 2024
@Jeffwan
Copy link
Contributor

Jeffwan commented Feb 6, 2024

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 6, 2024
@ndixita
Copy link
Contributor

ndixita commented Mar 22, 2024

@Jeffwan I'd not recommend setting memory.min equal to the requested memory. I have tried the cgroupv2 controller, and it results in more OOM kills since kernel will reserve memory.min amount of memory even when the usage by Pod is below that value. The extra memory not being used will result in that unused memory being unavailable to other workloads.

@tallclair
Copy link
Member

/assign

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note-none Denotes a PR that doesn't merit a release note. sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Status: Needs Reviewer
SIG Node PR Triage
Needs Reviewer
Development

Successfully merging this pull request may close these issues.

[FG:InPlacePodVerticalScaling] Get MemoryRequest from ContainerStatus for cgroupv2
9 participants