Quota scopes cannot handle the transition case from one scope to another when the target object is updated. #124436

Open
carlory opened this issue Apr 22, 2024 · 2 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.

Comments

carlory (Member) commented Apr 22, 2024

What happened?

I create 2 quotas with different scopes, one Terminating and the other NotTerminating. I create a pod that matches the NotTerminating scope, then update the pod's spec.activeDeadlineSeconds to 5, after which the pod should match the Terminating scope. However, the quotas are not updated.

Every 2.0s: kubectl get pod,quota -A

NAMESPACE     NAME                          READY   STATUS    RESTARTS   AGE
default       pod/test-pod                  0/1     Pending   0          28s
kube-system   pod/coredns-58cd89f5d-wtpxv   0/1     Pending   0          5m45s

NAMESPACE   NAME                              AGE     REQUEST           LIMIT
default     resourcequota/not-terminating     5m16s   count/pods: 1/1
default     resourcequota/terminating         5m16s   count/pods: 0/1

What did you expect to happen?

Quota scopes should handle the transition from one scope to another when the target object is updated.
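Concretely, after the update the pod should count against the terminating quota and no longer against the not-terminating one, so the expected usage would be (illustrative, mirroring the output format above, not captured output):

NAMESPACE   NAME                              AGE     REQUEST           LIMIT
default     resourcequota/not-terminating     5m16s   count/pods: 0/1
default     resourcequota/terminating         5m16s   count/pods: 1/1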

How can we reproduce it (as minimally and precisely as possible)?

1. Create 2 quotas:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: terminating
spec:
  hard:
    count/pods: "1"
  scopes:
  - Terminating
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: not-terminating
spec:
  hard:
    count/pods: "1"
  scopes:
  - NotTerminating
2. Create a pod which belongs to not-terminating:
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: test
    image: ubuntu
    command: ["/bin/sh"]
    args: ["-c", "while true; do echo hello; sleep 10; done"]
3. Update the pod's spec.activeDeadlineSeconds to 5; the pod should now belong to terminating:
kubectl patch pod test-pod -p '{"spec":{"activeDeadlineSeconds":5}}'
4. Get all quotas; you will see that the terminating quota has not been updated:
kubectl get quota -A

Anything else we need to know?

The root cause: when the pod is updated, the delta is calculated from the old object and the new object only; it does not take into account that the update moves the object into a different quota scope, so the final delta is incorrect.

In this case the final delta is zero, so the terminating quota is not updated.
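To make the root cause concrete, here is a minimal, self-contained sketch of the delta arithmetic. The map type and helpers below are simplified stand-ins for the real quota helpers quoted further down (quota.SubtractWithNonNegativeResult, quota.RemoveZeros), not actual k/k code:

package main

import "fmt"

// ResourceList is a simplified stand-in for corev1.ResourceList:
// resource name -> quantity (int64 here instead of resource.Quantity).
type ResourceList map[string]int64

// subtractNonNegative mirrors what quota.SubtractWithNonNegativeResult does
// conceptually: delta = new usage - old usage, clamped at zero.
func subtractNonNegative(newUsage, oldUsage ResourceList) ResourceList {
	delta := ResourceList{}
	for name, newVal := range newUsage {
		d := newVal - oldUsage[name]
		if d < 0 {
			d = 0
		}
		delta[name] = d
	}
	return delta
}

// removeZeros mirrors quota.RemoveZeros: drop entries with zero usage.
func removeZeros(usage ResourceList) ResourceList {
	out := ResourceList{}
	for name, val := range usage {
		if val != 0 {
			out[name] = val
		}
	}
	return out
}

func main() {
	// The pod counts as one count/pods both before and after the update;
	// only its scope (NotTerminating -> Terminating) changed.
	oldUsage := ResourceList{"count/pods": 1}
	newUsage := ResourceList{"count/pods": 1}

	delta := removeZeros(subtractNonNegative(newUsage, oldUsage))
	fmt.Println(delta) // map[] -> empty delta, so the admission plugin
	// short-circuits and never charges the Terminating quota.
}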

Related k/k code:

deltaUsage, err := evaluator.Usage(inputObject)
if err != nil {
	return quotas, err
}

// ensure that usage for input object is never negative (this would mean a resource made a negative resource requirement)
if negativeUsage := quota.IsNegative(deltaUsage); len(negativeUsage) > 0 {
	return nil, admission.NewForbidden(a, fmt.Errorf("quota usage is negative for resource(s): %s", prettyPrintResourceNames(negativeUsage)))
}

if admission.Update == a.GetOperation() {
	prevItem := a.GetOldObject()
	if prevItem == nil {
		return nil, admission.NewForbidden(a, fmt.Errorf("unable to get previous usage since prior version of object was not found"))
	}

	// if we can definitively determine that this is not a case of "create on update",
	// then charge based on the delta. Otherwise, bill the maximum
	metadata, err := meta.Accessor(prevItem)
	if err == nil && len(metadata.GetResourceVersion()) > 0 {
		prevUsage, innerErr := evaluator.Usage(prevItem)
		if innerErr != nil {
			return quotas, innerErr
		}
		deltaUsage = quota.SubtractWithNonNegativeResult(deltaUsage, prevUsage)
	}
}

// ignore items in deltaUsage with zero usage
deltaUsage = quota.RemoveZeros(deltaUsage)

// if there is no remaining non-zero usage, short-circuit and return
if len(deltaUsage) == 0 {
	return quotas, nil
}

// verify that for every resource that had limited by default consumption
// enabled that there was a corresponding quota that covered its use.
// if not, we reject the request.
hasNoCoveringQuota := limitedResourceNamesSet.Difference(restrictedResourcesSet)
if len(hasNoCoveringQuota) > 0 {
	return quotas, admission.NewForbidden(a, fmt.Errorf("insufficient quota to consume: %v", strings.Join(hasNoCoveringQuota.List(), ",")))
}

// verify that for every scope that had limited access enabled
// that there was a corresponding quota that covered it.
// if not, we reject the request.
scopesHasNoCoveringQuota, err := evaluator.UncoveredQuotaScopes(limitedScopes, restrictedScopes)
if err != nil {
	return quotas, err
}
if len(scopesHasNoCoveringQuota) > 0 {
	return quotas, fmt.Errorf("insufficient quota to match these scopes: %v", scopesHasNoCoveringQuota)
}

if len(interestingQuotaIndexes) == 0 {
	return quotas, nil
}

outQuotas, err := copyQuotas(quotas)
if err != nil {
	return nil, err
}

for _, index := range interestingQuotaIndexes {
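
Since the delta is computed once from old vs. new usage, a scope transition (NotTerminating -> Terminating) produces an empty delta and neither quota is adjusted. For illustration only, here is a self-contained sketch of charging that takes the scope of both the old and the new object into account; everything in it (quotaState, matchesScope, charge) is invented for this example and is not the real quota plugin API or an actual fix:

package main

import "fmt"

type usage map[string]int64

// quotaState is a simplified stand-in for a ResourceQuota with a single scope.
type quotaState struct {
	name  string
	scope string // "Terminating" or "NotTerminating"
	used  usage
}

// podState captures only what matters for these two scopes.
type podState struct {
	terminating bool // true once activeDeadlineSeconds is set
}

func matchesScope(q quotaState, p podState) bool {
	if q.scope == "Terminating" {
		return p.terminating
	}
	return !p.terminating
}

// charge adjusts each quota based on whether the old and new object match its scope.
func charge(quotas []quotaState, oldPod, newPod podState) {
	for i := range quotas {
		q := &quotas[i]
		oldIn, newIn := matchesScope(*q, oldPod), matchesScope(*q, newPod)
		switch {
		case newIn && !oldIn: // object moved into this quota's scope
			q.used["count/pods"]++
		case oldIn && !newIn: // object moved out of this quota's scope
			q.used["count/pods"]--
		}
		// if the object stays in scope, only the (here zero) delta would apply
	}
}

func main() {
	quotas := []quotaState{
		{name: "not-terminating", scope: "NotTerminating", used: usage{"count/pods": 1}},
		{name: "terminating", scope: "Terminating", used: usage{"count/pods": 0}},
	}
	oldPod := podState{terminating: false}
	newPod := podState{terminating: true} // after patching activeDeadlineSeconds: 5

	charge(quotas, oldPod, newPod)
	for _, q := range quotas {
		fmt.Printf("%s: count/pods %d\n", q.name, q.used["count/pods"])
	}
	// Output:
	// not-terminating: count/pods 0
	// terminating: count/pods 1
}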

I found this issue while implementing #124360.

Kubernetes version

(base) (⎈|local-up-cluster:N/A)➜  __testdata git:(kep-3751-quota-2) ✗ kubectl version
Client Version: v1.30.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.31.0-alpha.0.47+5a31a46d4bfe1f-dirty

Cloud provider

N/A

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@carlory carlory added the kind/bug Categorizes issue or PR as related to a bug. label Apr 22, 2024
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 22, 2024
k8s-ci-robot (Contributor):

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

carlory (Member, Author) commented Apr 22, 2024

/sig api-machinery
/assign @liggitt @deads2k @derekwaynecarr

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 22, 2024