Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade controller to Kustomize v4 #343

Merged
merged 13 commits into from
Jun 9, 2021
Merged

Upgrade controller to Kustomize v4 #343

merged 13 commits into from
Jun 9, 2021

Conversation

stefanprodan
Copy link
Member

@stefanprodan stefanprodan commented May 11, 2021

This PR brings kustomize-controller on a par with Kustomize v4:

Breaking changes

The major breaking change in v4 is that the set of URLs accepted by kustomize in the resources filed is reduced to only file system paths or values compatible with git clone. This means you can no longer use resources from archives (zip, tgz, etc) nor S3, GCS, Minio, Mercurial, etc.

More breaking changes here: fluxcd/flux2#918

Kustomize panic workaround

This PR comes with a workaround to a kyaml bug (fix: #341, fix: #310) by serialising the kustomize build runs to avoid the OpenAPI concurrent map read/write panic kubernetes-sigs/kustomize#3659.

Note that this workaround comes with a severe performance penalty.

Known Kustomize v4 bugs

After this upgrade the controller will be affected by these Kustomize bugs:

Test image

To test this PR please use the following image:

ghcr.io/fluxcd/kustomize-controller:kustomize-v4-rc-2ada9f21

To upgrade kustomize-controller on an cluster bootstrapped with flux, add the following images section to your flux-system/kustomization.yaml:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - gotk-components.yaml
  - gotk-sync.yaml
images:
- name: ghcr.io/fluxcd/kustomize-controller
  newName: ghcr.io/fluxcd/kustomize-controller
  newTag: kustomize-v4-rc-2ada9f21

@hiddeco
Copy link
Member

hiddeco commented May 11, 2021

The dependency tree that results in the replace requirements can be explained as follows:

github.com/fluxcd/kustomize-controller
+-> sigs.k8s.io/cli-utils@v0.25.0
	+-> k8s.io/kubectl@v0.20.4
		+-> k8s.io/kube-openapi@v0.0.0-20201113171705-d219536bb9fd
		+-> sigs.k8s.io/kustomize@v2.0.3+incompatible
	+-> sigs.k8s.io/kustomize/kyaml@v0.10.16
		+-> k8s.io/kube-openapi@v0.0.0-20210421082810-95288971da7e
+-> sigs.k8s.io/kustomize/api@v0.8.9
    +-> k8s.io/kube-openapi@v0.0.0-20210421082810-95288971da7e
    +-> sigs.k8s.io/kustomize/kyaml@v0.10.18

The problem here is that k8s.io/kube-openapi@v0.0.0-20210421082810-95288971da7e of sigs.k8s.io/kustomize/api@v0.8.9 is not compatible with sigs.k8s.io/kustomize@v2.0.3+incompatible some K8s core components depend on (i.e. k8s.io/kubectl and k8s.io/cli-runtime).

This does however raise the question how we can deal with this in the future, as we will be unable to make any upgrades of some packages till we can cut loose from the sigs.k8s.io/kustomize@v2.0.3+incompatible dependency, or they magically have made it compatible again.

@kingdonb
Copy link
Member

I ran into kubernetes-sigs/kustomize#3446 and echo the sentiment in kubernetes-sigs/kustomize#3446 (comment)

It's basically the same use case as described in your report there, integer key for use in a HelmRelease triggers this issue -- my nginx ingress opens a TCP port 2222 to accommodate an extra service in the load balancer, ssh for deis-builder service, it does this by adding a section in values that looks like this:

  values:
    tcp:
      2222: "deis/deis-builder:2222"

The output looks like this:

80s         Normal   error               kustomization/flux-system                        kustomize build failed: map[string]interface {}{"apiVersion":"helm.toolkit.fluxcd.io/v2beta1", "kind":"HelmRelease", "metadata":map[string]interface {}{"name":"ingress-nginx", "namespace":"ingress-nginx"}, "spec":map[string]interface {}{"chart":map[string]interface {}{"spec":map[string]interface {}{"chart":"ingress-nginx", "interval":"1m", "sourceRef":map[string]interface {}{"kind":"HelmRepository", "name":"ingress-nginx", "namespace":"flux-system"}, "version":"3.23.0"}}, "interval":"5m", "values":map[string]interface {}{"controller":map[string]interface {}{"extraArgs":map[string]interface {}{"enable-ssl-passthrough":""}, "kind":"DaemonSet", "service":map[string]interface {}{"nodePorts":map[string]interface {}{"http":32082, "https":32443}, "type":"NodePort"}}, "tcp":map[interface {}]interface {}{2222:"deis/deis-builder:2222"}}}}: json: unsupported type: map[interface {}]interface {}

Basically no way I would have understood that error message without following the links, and I'm not sure of any way to work around it. This would block me from upgrading.

@stefanprodan
Copy link
Member Author

@kingdonb try:

values:
    tcp:
      "2222": "deis/deis-builder:2222"

@kingdonb
Copy link
Member

That sorted it for me, perfect 👍 no other issues at this time

@stefanprodan
Copy link
Member Author

@kingdonb Given the nginx ingress popularity, I imagine this will impact many Flux users. Indeed that error doesn't say much, but now the Kubernetes object is embedded in the error message so at least you know exactly which object caused the issue.

@nitinpatil1992
Copy link

nitinpatil1992 commented May 23, 2021

Thank you for the PR.

Bootstrapping new cluster with this PR gives kustomize build failed error as raised under issue -308

$ kustomize-controller git:(kustomize-v4) kg deploy -n flux-system kustomize-controller -o jsonpath="{.spec.template.spec.containers[0].image}"
ghcr.io/fluxcd/kustomize-controller:kustomize-v4-rc-de0310e7

$ kubectl logs -lapp=kustomize-controller -n flux-system -f
...
{"level":"info","ts":"2021-05-23T23:36:03.151Z","logger":"controller.kustomization","msg":"Starting workers","reconciler group":"kustomize.toolkit.fluxcd.io","reconciler kind":"Kustomization","worker count":4}
{"level":"info","ts":"2021-05-23T23:36:03.161Z","logger":"controller.kustomization","msg":"Source is not ready, artifact not found","reconciler group":"kustomize.toolkit.fluxcd.io","reconciler kind":"Kustomization","name":"sandboxuk-spot","namespace":"flux-system"}
{"level":"error","ts":"2021-05-23T23:36:04.160Z","logger":"controller.kustomization","msg":"Reconciliation failed after 1.008442786s, next try in 10m0s","reconciler group":"kustomize.toolkit.fluxcd.io","reconciler kind":"Kustomization","name":"flux-system","namespace":"flux-system","revision":"main/a3b95a02fe75bb477a01355b7cbc0ea23ac77723","error":"kustomize build failed: accumulating resources: accumulation err='accumulating resources from './clusters/sandboxuk-spot': read /tmp/flux-system534296925/clusters/sandboxuk-spot: is a directory': recursed merging from path '/tmp/flux-system534296925/clusters/sandboxuk-spot': may not add resource with an already registered id: ~G_v1_Namespace|~X|flux-system"}

Tried removing GitRepository object as per logs but didn't help out!

@stefanprodan
Copy link
Member Author

stefanprodan commented May 24, 2021

@nitinpatil1992 do you use $patch: delete to remove the flux-system namespace? I don't see how that error is related to #308, can you provide an example repo please?

@nitinpatil1992
Copy link

@stefanprodan I didn't patch/delete flux-system namespace.
Here is my repo structure:

bootstrap-repo
├── README.md
├── bases
│   ├── gotk-components.yaml
│   ├── gotk-sync.yaml
│   └── kustomization.yaml
└── clusters
    └── sandboxuk-spot
        ├── infrastructure.yaml
        └── kustomization.yaml

Files from bases are from flux v2 bootstrap document(flux namespace and all CRDs).
My custom files are as follows

#clusters/sandboxuk-spot/kustomization.yaml
kind: Kustomization
bases:
- ./../../bases
resources:
- infrastructure.yaml

# clusters/sandboxuk-spot/infrastructure.yaml
---
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: sandboxuk-spot
  namespace: flux-system
spec:
  interval: 2m0s
  ref:
    branch: kustomize
  url: ssh://git@github.com/<org>/<infrastructure-repo>
---
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: sandboxuk-spot
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./kustomize/sandboxuk-spot.yaml
  prune: true
  sourceRef:
    kind: GitRepository
    name: sandboxuk-spot
  validation: client

The thing is bootstrap worked very well for the first time installation.
Then I switched the directory structure and these duplicate resources errors came into picture.

@stefanprodan
Copy link
Member Author

stefanprodan commented May 26, 2021

Then I switched the directory structure and these duplicate resources errors came into picture.

So those errors have nothing to do with this PR, why comment here if you're testing out some repo structure?

Also this is not a valid path:

apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: sandboxuk-spot
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./kustomize/sandboxuk-spot.yaml

Please check out the docs on how Flux Kustomization works: https://fluxcd.io/docs/components/kustomize/kustomization/

@nitinpatil1992
Copy link

nitinpatil1992 commented May 26, 2021

Apologies! But new structure lead to the similar issue of resource duplication which (I assume)was fixed in kyaml and hence in your new image.

./kustomize/sandboxuk-spot.yaml path is from infrastructure repo. Isn't that path refer to GitRepository Objects repo?

  sourceRef:
    kind: GitRepository
    name: sandboxuk-spot

@stefanprodan
Copy link
Member Author

@nitinpatil1992 Flux Kustomization does not accept a file as path, but a dir, please see the docs I posted above.

@ash2k
Copy link

ash2k commented May 26, 2021

Re. cli-utils - I'm trying to bump the dependencies here kubernetes-sigs/cli-utils#361. I wonder if this would fix the issues in this PR.

@stefanprodan
Copy link
Member Author

stefanprodan commented May 26, 2021

@ash2k thank you so much for this, I've switched to your PR and all tests are passing 🍻

@ash2k
Copy link

ash2k commented Jun 3, 2021

FYI @stefanprodan the PR has been merged, so this one should be unblocked now. Cheers.

@stefanprodan
Copy link
Member Author

stefanprodan commented Jun 3, 2021

@ash2k I'm using the master branch and works great, I'll keep an eye on for the next cli-utils semver release. Thanks again for helping out with this, it would've been a major pain for us to maintain a fork of kstatus.

@bbriggs
Copy link

bbriggs commented Jun 7, 2021

@stefanprodan I had the same issue and this patch resolved it for me as well. Any ETA on when it might land in prod?

@stefanprodan
Copy link
Member Author

@bbriggs what issue are you referring to?

Any ETA on when it might land in prod?

I'm waiting for controller-runtime v0.9 release. As documented in the PR I had to pin controller-runtime to v0.9.0-beta due to breaking changes in Kubernetes client-go.

@bbriggs
Copy link

bbriggs commented Jun 7, 2021

@bbriggs what issue are you referring to?

the Kustomize panic when applying things with $patch: delete

@bbriggs
Copy link

bbriggs commented Jun 7, 2021

I'm waiting for controller-runtime v0.9 release. As documented in the PR I had to pin controller-runtime to v0.9.0-beta due to breaking changes in Kubernetes client-go.

Thank you for your hard work on this! It was about to be a showstopper for us and publishing the workaround image in a PR allowed us to verify the bug and move past it so we could work on other problems. Not many folks do that, so we're very grateful 👍 🎉

- Upgrade sigs.k8s.io/kustomize/api from v0.7.4 to v0.8.9
- Upgrade sigs.k8s.io/cli-utils from v0.22.4 to v0.25.0
- Pin sigs.k8s.io/kustomize/kyaml to v0.10.17 (cli-utils compat)
- Pin k8s.io/kube-openapi v0.0.0-20201113171705-d219536bb9fd (cli-utils compat)

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
Serialize kustomize build runs to avoid kyaml OpenAPI concurrent map read/write panic
kubernetes-sigs/kustomize#3659

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
- make kstatus work with the latest version of kyaml by using a forked version of cli-utils fluxcd/cli-utils#1
- update Kubernetes packages to v0.21.1
- update controller-runtime to v0.9.0-beta.5 due to breaking changes in client-go v0.21.1

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
- Upgrade sigs.k8s.io/kustomize/api from v0.7.4 to v0.8.9
- Upgrade sigs.k8s.io/cli-utils from v0.22.4 to v0.25.0
- Pin sigs.k8s.io/kustomize/kyaml to v0.10.17 (cli-utils compat)
- Pin k8s.io/kube-openapi v0.0.0-20201113171705-d219536bb9fd (cli-utils compat)

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
- make kstatus work with the latest version of kyaml by using a forked version of cli-utils fluxcd/cli-utils#1
- update Kubernetes packages to v0.21.1
- update controller-runtime to v0.9.0-beta.5 due to breaking changes in client-go v0.21.1

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
- Bump controller-runtime to v0.9.0
- Bump controller-gen to v0.5.0
- Use Environment.AddUser to generate the envtest cluster admin kubeconfig

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
Copy link
Member

@hiddeco hiddeco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you @stefanprodan 💯

@hiddeco hiddeco added area/kustomize Kustomize related issues and pull requests enhancement New feature or request labels Jun 9, 2021
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ci CI related issues and pull requests area/kustomize Kustomize related issues and pull requests enhancement New feature or request
Projects
None yet
6 participants