Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DeleteCollection API fails if request body is non-empty #111985

Closed
Arnavion opened this issue Aug 24, 2022 · 13 comments
Closed

DeleteCollection API fails if request body is non-empty #111985

Arnavion opened this issue Aug 24, 2022 · 13 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@Arnavion
Copy link

Arnavion commented Aug 24, 2022

What happened?

In v1.25.0, invoking the DeleteCollection API with a non-empty request body fails with HTTP 500 - the error message is no kind "DeleteOptions" is registered for version "meta.k8s.io/v1" in scheme "pkg/api/legacyscheme/scheme.go:30"

All DeleteCollection APIs are specced (in the OpenAPI spec, eg deleteCoreV1CollectionNamespacedPod for the repro below) to take a DeleteOptions in the request body, and the simplest such value is {}. I maintain the Rust API client library, and my library's generated code always sets the request body to {} when all the delete options are default (the serialized form of a DeleteOptions with no set fields), which is how I noticed this bug.

What did you expect to happen?

This works with v1.24 and earlier.

How can we reproduce it (as minimally and precisely as possible)?

(This repro uses curl for full control of the request headers and body.)

  1. Create kind cluster of v1.25.0

  2. Create client-cert.pem containing the private key and cert extracted from kubectl config. (curl's --cert expects PEM-encoded private key followed by PEM-encoded cert in the file.)

    (kubectl config view -o json --raw | jq '.users[0].user["client-certificate-data"]' -r | base64 -d; kubectl config view -o json --raw | jq '.users[0].user["client-key-data"]' -r | base64 -d) >client-cert.pem
  3. Try to delete all pods with label foo=bar

    $ base="$(kubectl config view -o json --raw | jq '.clusters[0].cluster.server' -r)"
    
    $ curl -kD - -XDELETE --cert client-cert.pem -H 'content-type:application/json' -H 'content-length:2' --data-raw '{}' "${base}/api/v1/namespaces/default/pods?labelSelector=foo%3Dbar"
    
    HTTP/2 500 
    audit-id: b3c6ed2c-1e53-48a7-8a7c-324706c86ae3
    cache-control: no-cache, private
    content-type: application/json
    x-kubernetes-pf-flowschema-uid: 9fb6dbeb-e6fc-4635-aa6a-7f719dfb1580
    x-kubernetes-pf-prioritylevel-uid: cf4b1a36-f72a-4231-acd7-8e133364d160
    content-length: 235
    date: Wed, 24 Aug 2022 03:11:33 GMT
    
    {
      "kind": "Status",
      "apiVersion": "v1",
      "metadata": {},
      "status": "Failure",
      "message": "no kind \"DeleteOptions\" is registered for version \"meta.k8s.io/v1\" in scheme \"pkg/api/legacyscheme/scheme.go:30\"",
      "code": 500
    }

    The same request without a request body succeeds:

    $ curl -kD - -XDELETE --cert client-cert.pem -H 'content-type:application/json' "${base}/api/v1/namespaces/default/pods?labelSelector=foo%3Dbar"
    
    HTTP/2 200 
    audit-id: 896f90da-7aa1-415d-9c0e-e94b4bd3bea1
    cache-control: no-cache, private
    content-type: application/json
    x-kubernetes-pf-flowschema-uid: 9fb6dbeb-e6fc-4635-aa6a-7f719dfb1580
    x-kubernetes-pf-prioritylevel-uid: cf4b1a36-f72a-4231-acd7-8e133364d160
    content-length: 111
    date: Wed, 24 Aug 2022 03:12:53 GMT
    
    {
      "kind": "PodList",
      "apiVersion": "v1",
      "metadata": {
        "resourceVersion": "5203"
      },
      "items": []
    }

Anything else we need to know?

No response

Kubernetes version

$ kubectl version

Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.3", GitCommit:"aef86a93758dc3cb2c658dd9657ab4ad4afc21cb", GitTreeState:"clean", BuildDate:"2022-07-19T00:00:00Z", GoVersion:"go1.18.4", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.0", GitCommit:"a866cbe2e5bbaa01cfd5e969aa3e033f3282a8a2", GitTreeState:"clean", BuildDate:"2022-08-24T01:36:05Z", GoVersion:"go1.19", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

N/A

OS version

$ cat /etc/os-release

NAME="openSUSE Tumbleweed"
# VERSION="20220818"
ID="opensuse-tumbleweed"
ID_LIKE="opensuse suse"
VERSION_ID="20220818"
PRETTY_NAME="openSUSE Tumbleweed"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:tumbleweed:20220818"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://www.opensuse.org/"
DOCUMENTATION_URL="https://en.opensuse.org/Portal:Tumbleweed"
LOGO="distributor-logo-Tumbleweed"

$ uname -a

Linux futaba 5.19.1-1-default #1 SMP PREEMPT_DYNAMIC Thu Aug 11 11:32:52 UTC 2022 (a5bf6c0) x86_64 x86_64 x86_64 GNU/Linux

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)


Edit: Still an issue as of v1.25.1

@Arnavion Arnavion added the kind/bug Categorizes issue or PR as related to a bug. label Aug 24, 2022
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 24, 2022
@pacoxu
Copy link
Member

pacoxu commented Aug 24, 2022

/sig api-machinery

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 24, 2022
@leilajal
Copy link
Contributor

/cc @Jefftree
/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 25, 2022
Arnavion added a commit to Arnavion/k8s-openapi that referenced this issue Aug 31, 2022
Also change the request bodies of Delete and DeleteCollection requests to be
empty if the DeleteOptional has no fields set.

Ref: kubernetes/kubernetes#111985
Arnavion added a commit to Arnavion/k8s-openapi that referenced this issue Aug 31, 2022
Also change the request bodies of Delete and DeleteCollection requests to be
empty if the DeleteOptional has no fields set.

Ref: kubernetes/kubernetes#111985

Fixes #125
@clux
Copy link

clux commented Sep 17, 2022

This is pretty bad because it's not even a backwards-compatible change to serializers, it's a breaking change to at exactly 1.25:

Serializing kind/apiVersion breaks on Kubernetes < 1.25

delete_collections returns:

{ status: "Failure", message: "no kind \"DeleteOptions\" is registered for version \"meta.k8s.io/v1\" in scheme \"k8s.io/apimachinery@v1.24.1-k3s1/pkg/runtime/scheme.go:100\"", reason: "", code: 500 }

Not serializing kind/apiVersion breaks on Kubernetes >= 1.25

delete_collection returns:

{ status: "Failure", message: "Object 'Kind' is missing in '{}'", reason: "", code: 500 }

So now we have to detect versions in non-IO/protocol level modules to do this safely, and it forces the user on a zero version skew.

@Jefftree
Copy link
Member

#110110: This seems to be the culprit, though I need to do some more investigation to understand the problem.

cc @sxllwx

@sxllwx
Copy link
Member

sxllwx commented Oct 13, 2022

#110110: This seems to be the culprit, though I need to do some more investigation to understand the problem.

cc @sxllwx

Thanks to @Jefftree for tracking down. It is currently confirmed that this incompatible change is due to PR-110110. The root cause of the problem is being traced.

@sxllwx
Copy link
Member

sxllwx commented Oct 18, 2022

Update: the problem has been identified, and the specific reasons are as follows:

The reason for the issue is that the apiserver uses the Scheme in the global variable pkg/api/legacyscheme/scheme.go, and registers the DeleteOptions corresponding to each APIGroup in the Scheme. But DeleteOptions in meta.k8s.io/v1 is not registered, resulting in a notRegisteredErr.

@sxllwx
Copy link
Member

sxllwx commented Oct 24, 2022

@Arnavion @clux @Jefftree This issue has been fixed in PR-113133. Currently I have submitted PR-113286 to cherry-pick the fix to release-1.25.

@sxllwx
Copy link
Member

sxllwx commented Nov 1, 2022

@Arnavion @clux @Jefftree

PR #113133 PR #113286 have all been merged. Thanks for your feedback.😄

@clux
Copy link

clux commented Nov 1, 2022

Thank you! Does that mean we can expect it in 1.25.4?

@sxllwx
Copy link
Member

sxllwx commented Nov 1, 2022

Yes, subsequent releases will carry this bugfix. And, I have cherry-pick to 1.25.0.

ps: I don't find release-1.25.x released in k/k codebase so there is no way to cherry-pick to released release-1.25.x @xmudrii Can you give me some guidance?

@xmudrii
Copy link
Member

xmudrii commented Nov 1, 2022

@sxllwx With #113286 being merged, the fix will be (automatically) included in the upcoming 1.25.4 release (scheduled for November 9). There's nothing else that you need to do about this.

Just as a note to make it clear, it's not possible to cherry pick this fix to existing patch releases, such as 1.25.0 — only 1.25.4 and newer releases will have the fix.

@sxllwx
Copy link
Member

sxllwx commented Nov 2, 2022

@sxllwx With #113286 being merged, the fix will be (automatically) included in the upcoming 1.25.4 release (scheduled for November 9). There's nothing else that you need to do about this.

Just as a note to make it clear, it's not possible to cherry pick this fix to existing patch releases, such as 1.25.0 — only 1.25.4 and newer releases will have the fix.

Thanks for your explanation~

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

No branches or pull requests

9 participants