1.29-apiserver based aggregated APIServer does not work against 1.28 kube-apiserver #124533

Open
dgrisonnet opened this issue Apr 25, 2024 · 11 comments
Labels
kind/documentation Categorizes issue or PR as related to documentation. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.

Comments

@dgrisonnet
Member

dgrisonnet commented Apr 25, 2024

What happened?

We bumped an aggregated apiserver library built on the generic apiserver library to 1.29 in kubernetes-sigs/custom-metrics-apiserver#170. Users reported that after pulling the new dependencies, their aggregated apiservers started failing on Kubernetes 1.28 clusters with the following errors:

W0425 14:25:27.110980       1 reflector.go:539] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: failed to list *v1.PriorityLevelConfiguration: the server could not find the requested resource
E0425 14:25:27.111035       1 reflector.go:147] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: Failed to watch *v1.PriorityLevelConfiguration: failed to list *v1.PriorityLevelConfiguration: the server could not find the requested resource
W0425 14:25:34.571189       1 reflector.go:539] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: failed to list *v1.FlowSchema: the server could not find the requested resource
E0425 14:25:34.571265       1 reflector.go:147] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: Failed to watch *v1.FlowSchema: failed to list *v1.FlowSchema: the server could not find the requested resource
W0425 14:25:39.776616       1 reflector.go:539] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: failed to list *v1.PriorityLevelConfiguration: the server could not find the requested resource
E0425 14:25:39.776683       1 reflector.go:147] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: Failed to watch *v1.PriorityLevelConfiguration: failed to list *v1.PriorityLevelConfiguration: the server could not find the requested resource
W0425 14:25:58.606055       1 reflector.go:539] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: failed to list *v1.FlowSchema: the server could not find the requested resource
E0425 14:25:58.606080       1 reflector.go:147] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: Failed to watch *v1.FlowSchema: failed to list *v1.FlowSchema: the server could not find the requested resource
W0425 14:26:01.846754       1 reflector.go:539] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: failed to list *v1.PriorityLevelConfiguration: the server could not find the requested resource
E0425 14:26:01.846786       1 reflector.go:147] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: Failed to watch *v1.PriorityLevelConfiguration: failed to list *v1.PriorityLevelConfiguration: the server could not find the requested resource
W0425 14:25:27.211376       1 reflector.go:539] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: failed to list *v1.PriorityLevelConfiguration: the server could not find the requested resource
E0425 14:25:27.211394       1 reflector.go:147] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: Failed to watch *v1.PriorityLevelConfiguration: failed to list *v1.PriorityLevelConfiguration: the server could not find the requested resource
W0425 14:25:34.498183       1 reflector.go:539] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: failed to list *v1.FlowSchema: the server could not find the requested resource
E0425 14:25:34.498228       1 reflector.go:147] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: Failed to watch *v1.FlowSchema: failed to list *v1.FlowSchema: the server could not find the requested resource
W0425 14:25:35.545134       1 reflector.go:539] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: failed to list *v1.PriorityLevelConfiguration: the server could not find the requested resource
E0425 14:25:35.545287       1 reflector.go:147] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: Failed to watch *v1.PriorityLevelConfiguration: failed to list *v1.PriorityLevelConfiguration: the server could not find the requested resource
W0425 14:25:49.677632       1 reflector.go:539] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: failed to list *v1.PriorityLevelConfiguration: the server could not find the requested resource
E0425 14:25:49.677657       1 reflector.go:147] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: Failed to watch *v1.PriorityLevelConfiguration: failed to list *v1.PriorityLevelConfiguration: the server could not find the requested resource
W0425 14:25:51.325292       1 reflector.go:539] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: failed to list *v1.FlowSchema: the server could not find the requested resource
E0425 14:25:51.325325       1 reflector.go:147] pkg/mod/k8s.io/client-go@v0.29.4/tools/cache/reflector.go:229: Failed to watch *v1.FlowSchema: failed to list *v1.FlowSchema: the server could not find the requested resource

What did you expect to happen?

I expected the 1.29 libraries to remain compatible with Kubernetes 1.28, based on the kube-apiserver version skew policy.

How can we reproduce it (as minimally and precisely as possible)?

I haven't tested it with the sample-apiserver yet, but I assume that running its 1.29 version on a 1.28 kind cluster will result in the same failure.

If not, feel free to reach out to me; I can share a reproducer with prometheus-adapter, an aggregated apiserver where the problem exists.

Anything else we need to know?

The error appeared with the graduation of API Priority and Fairness (P&F) to stable in 1.29. I chatted briefly with @tkashem about it, and he mentioned that the controller logic was moved from v1beta3 to v1 in 1.29, which is also the release that introduced the v1 API. This means backward compatibility with 1.28 was broken.
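To make the failure mode concrete, here is a minimal sketch (not the actual k8s.io/apiserver logic) of checking whether a cluster serves a given version of the `flowcontrol.apiserver.k8s.io` group. It parses an APIGroup discovery document of the kind returned from `/apis/flowcontrol.apiserver.k8s.io`. The helper `servesVersion` and the constant `sampleDiscovery128` are hypothetical names, and the JSON is a hand-written illustration of what a 1.28 cluster is assumed to return, not captured output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiGroup mirrors the subset of the APIGroup discovery document
// that this sketch needs.
type apiGroup struct {
	Name     string `json:"name"`
	Versions []struct {
		GroupVersion string `json:"groupVersion"`
		Version      string `json:"version"`
	} `json:"versions"`
}

// servesVersion (hypothetical helper) reports whether the discovery
// document lists the given version for its API group.
func servesVersion(doc []byte, version string) (bool, error) {
	var g apiGroup
	if err := json.Unmarshal(doc, &g); err != nil {
		return false, err
	}
	for _, v := range g.Versions {
		if v.Version == version {
			return true, nil
		}
	}
	return false, nil
}

// sampleDiscovery128 is an illustrative, hand-written document standing in
// for a 1.28 cluster's response (assumption: only beta versions are served).
const sampleDiscovery128 = `{
  "kind": "APIGroup",
  "apiVersion": "v1",
  "name": "flowcontrol.apiserver.k8s.io",
  "versions": [
    {"groupVersion": "flowcontrol.apiserver.k8s.io/v1beta3", "version": "v1beta3"},
    {"groupVersion": "flowcontrol.apiserver.k8s.io/v1beta2", "version": "v1beta2"}
  ],
  "preferredVersion": {"groupVersion": "flowcontrol.apiserver.k8s.io/v1beta3", "version": "v1beta3"}
}`

func main() {
	for _, version := range []string{"v1", "v1beta3"} {
		ok, err := servesVersion([]byte(sampleDiscovery128), version)
		if err != nil {
			panic(err)
		}
		fmt.Printf("flowcontrol %s served: %t\n", version, ok)
	}
}
```

A client whose informers only list and watch the v1 types would find nothing to talk to on such a cluster, which is consistent with the "could not find the requested resource" errors in the logs above.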

@dgrisonnet added the kind/bug label (Categorizes issue or PR as related to a bug.) on Apr 25, 2024
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 25, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dgrisonnet
Member Author

/sig api-machinery

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 25, 2024
@dgrisonnet
Member Author

cc @tkashem @MikeSpreitzer

@benluddy
Contributor

Has this skew direction (aggregated apiserver newer than kube-apiserver) been supported historically?

@dgrisonnet
Member Author

That's a good point. When I looked at the official docs, this wasn't mentioned anywhere, so I guess it might just not be supported? Thinking about it again, it would make more sense to me for us not to support this.

Feel free to close if we don't support this skew direction.

@benluddy
Contributor

IIUC, at least for Kube components, nothing can be newer than the kube-apiserver (or newer than the oldest kube-apiserver, if there is skew between kube-apiserver instances): https://kubernetes.io/releases/version-skew-policy/. I would feel better about closing if someone else can +1 my understanding.

@sftim
Contributor

sftim commented Apr 26, 2024

We can tweak the docs to clarify the situation, whichever way we decide or confirm it goes.

@sftim
Contributor

sftim commented Apr 26, 2024

> nothing can be newer than the kube-apiserver (or newer than the oldest kube-apiserver, if there is skew between kube-apiserver instances)

I think you mean “no component other than the API server can be newer than the API server, and API servers that are not the oldest API server version can be at most one minor version newer than the oldest API server version”. Does that sound right @benluddy?

@benluddy
Contributor

I'm not sure. The language needs to distinguish between the kube-apiserver and aggregated apiservers. The existing docs are clear about skew between kube-apiserver instances:

> In highly-available (HA) clusters, the newest and oldest kube-apiserver instances must be within one minor version.

And the supported skew between kube-apiserver and the named components in the doc is relative to the version of the oldest kube-apiserver instance.

The ambiguity here is with the supported skew between kube-apiserver and any aggregated apiserver based on a particular version of k8s.io/apiserver.

@liggitt changed the title from "The generic apiserver version skew is broken in 1.29" to "1.29-apiserver based aggregated APIServer does not work against 1.28 kube-apiserver" on May 1, 2024
@liggitt
Member

liggitt commented May 1, 2024

Controller clients outside kube-apiserver can't be newer than the kube-apiserver they are talking to (true for kubelet, kube-controller-manager, and kube-scheduler, and true for aggregated apiservers as well).

kube-apiserver supports -1 skew because it talks to itself for APIs it needs for running admission stuff like flowcontrol and webhook/CEL admission, so it is guaranteed to have access to the latest API version.

Controllers that talk to kube-apiserver have to wait until kube-apiserver is upgraded to have the same guarantee.

We should describe that better in the skew docs.
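The rule described above can be sketched as a small check, assuming only minor versions matter and that every component shares the same Kubernetes major version. The helpers `minorOf` and `clientNotNewer` are hypothetical names, not part of any Kubernetes library, and the patch numbers in the example are made up for illustration. Only the upper bound discussed in this thread is checked; how far a client may lag behind the apiserver is out of scope here.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// minorOf extracts the minor number from a "major.minor[.patch]" string,
// with an optional leading "v". The major number is ignored, so the
// client-go module version "v0.29.4" and the Kubernetes release "1.29.4"
// both yield 29.
func minorOf(version string) (int, error) {
	parts := strings.Split(strings.TrimPrefix(version, "v"), ".")
	if len(parts) < 2 {
		return 0, fmt.Errorf("cannot parse version %q", version)
	}
	return strconv.Atoi(parts[1])
}

// clientNotNewer reports whether a client (for example an aggregated
// apiserver built on a given library version) is no newer than the oldest
// kube-apiserver instance it may talk to.
func clientNotNewer(clientVersion string, apiserverVersions []string) (bool, error) {
	if len(apiserverVersions) == 0 {
		return false, fmt.Errorf("no kube-apiserver versions given")
	}
	client, err := minorOf(clientVersion)
	if err != nil {
		return false, err
	}
	oldest := -1
	for _, v := range apiserverVersions {
		minor, err := minorOf(v)
		if err != nil {
			return false, err
		}
		if oldest == -1 || minor < oldest {
			oldest = minor
		}
	}
	return client <= oldest, nil
}

func main() {
	// The situation from this issue: 1.29-based libraries against a 1.28 cluster.
	ok, err := clientNotNewer("v0.29.4", []string{"1.28.9"})
	if err != nil {
		panic(err)
	}
	fmt.Println("1.29 client against 1.28 kube-apiserver supported:", ok)
}
```

Comparing against the oldest instance reflects the point made earlier in the thread: during an HA upgrade, a client may reach any kube-apiserver, including one that has not been upgraded yet.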

/remove-kind bug
/kind documentation

@fedebongio
Contributor

This was discussed at length today in the SIG meeting, starting at 33:00: https://www.youtube.com/watch?v=0TXm-DGcK1k


6 participants