nameReference breaks when adding outer Kustomization (multiple identical base imports) #4880

Open
paul-bormans opened this issue Nov 21, 2022 · 8 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@paul-bormans

I need to include a base manifest multiple times. Since their names would obviously clash, I use a Kustomization to make the names and namespaces unique: I apply a nameSuffix plus a namespace, and then use a nameReference configuration to fix up all references to the namespace name. This technique works fine when I apply it to only one included overlay, but as soon as I include another, I get no errors, yet the nameReferences no longer work.

The issue can be reproduced with the following minimized setup:

./base/prometheus/kustomization.yaml

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- prometheus.yaml

./base/prometheus/prometheus.yaml

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
  namespace: my-placeholder
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          namespaces:
          - my-placeholder

./overlay/a/kustomization.yaml

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - ../../base/prometheus

nameSuffix: -suffix-a
namespace: namespace-a
configurations:
  - namespace-reference.yaml

./overlay/a/namespace-reference.yaml

nameReference:
  - kind: Namespace
    apiVersion: v1
    fieldSpecs:
      - kind: Prometheus
        path: spec/affinity/podAntiAffinity/preferredDuringSchedulingIgnoredDuringExecution[]/podAffinityTerm/namespaces[]

./overlay/a/namespace.yaml

---
apiVersion: v1
kind: Namespace
metadata:
  name: my-placeholder

./overlay/b/kustomization.yaml

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml
  - ../../base/prometheus

nameSuffix: -suffix-b
namespace: namespace-b
configurations:
  - namespace-reference.yaml

./overlay/b/namespace-reference.yaml

nameReference:
  - kind: Namespace
    apiVersion: v1
    fieldSpecs:
      - kind: Prometheus
        path: spec/affinity/podAntiAffinity/preferredDuringSchedulingIgnoredDuringExecution[]/podAffinityTerm/namespaces[]

./overlay/b/namespace.yaml

---
apiVersion: v1
kind: Namespace
metadata:
  name: my-placeholder

./overlay/kustomization.yaml

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
- a
- b

Expected output

apiVersion: v1
kind: Namespace
metadata:
  name: namespace-a
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s-suffix-a
  namespace: namespace-a
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          namespaces:
          - namespace-a
---
apiVersion: v1
kind: Namespace
metadata:
  name: namespace-b
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s-suffix-b
  namespace: namespace-b
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          namespaces:
          - namespace-b

Actual output

apiVersion: v1
kind: Namespace
metadata:
  name: namespace-a
---
apiVersion: v1
kind: Namespace
metadata:
  name: namespace-b
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s-suffix-a
  namespace: namespace-a
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          namespaces:
          - my-placeholder
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s-suffix-b
  namespace: namespace-b
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          namespaces:
          - my-placeholder

As can be seen, the nameSuffix and namespace specifiers both behave as expected, but the nameReference no longer updates the placeholders.

Kustomize version

{Version:kustomize/v4.5.7 GitCommit:56d82a8378dfc8dc3b3b1085e5a6e67b82966bd7 BuildDate:2022-08-02T16:35:54Z GoOs:linux GoArch:amd64}

Platform

Linux

@paul-bormans paul-bormans added the kind/bug Categorizes issue or PR as related to a bug. label Nov 21, 2022
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Nov 21, 2022
@paul-bormans paul-bormans changed the title nameReference breaks when adding outer Kustomization nameReference breaks when adding outer Kustomization (multiple identical base imports) Nov 22, 2022
@paul-bormans (Author)

We found a solution by using replacements instead of nameReference, effectively changing:

configurations:
  - namespace-reference.yaml

into:

replacements:
- path: replacements.yaml

with replacements.yaml containing, for instance:

- source: 
    kind: Namespace
    version: v1
  targets:
  - select: 
      kind: Prometheus
    fieldPaths:
    - spec.affinity.podAntiAffinity.preferredDuringSchedulingIgnoredDuringExecution.0.podAffinityTerm.namespaces.0
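
This works because replacements are applied within each overlay's own build, before the outer kustomization aggregates the results, whereas (as the maintainer comments below explain) nameReference resolution is deferred to the very end of the overall build.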

@brianpursley (Member)

I find it curious that, when built individually, each overlay replaces my-placeholder correctly:

Just overlay/a

$ kustomize build overlay/a
apiVersion: v1
kind: Namespace
metadata:
  name: namespace-a
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s-suffix-a
  namespace: namespace-a
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          namespaces:
          - namespace-a

Just overlay/b

$ kustomize build overlay/b
apiVersion: v1
kind: Namespace
metadata:
  name: namespace-b
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s-suffix-b
  namespace: namespace-b
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          namespaces:
          - namespace-b

overlay

$ kustomize build overlay
apiVersion: v1
kind: Namespace
metadata:
  name: namespace-a
---
apiVersion: v1
kind: Namespace
metadata:
  name: namespace-b
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s-suffix-a
  namespace: namespace-a
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          namespaces:
          - my-placeholder
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s-suffix-b
  namespace: namespace-b
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          namespaces:
          - my-placeholder

@KnVerey (Contributor) commented Nov 23, 2022

This reminds me a lot of a bug I fixed earlier this year in #4700.
I suspect what is happening is that Kustomize ends up finding the name reference ambiguous, but is ignoring it instead of throwing an error (not what we want!).

Unlike most transformers, NameReference only runs once, at the very end of the build, acting on internal name-tracking information that has been added to the resources during the evaluation of the bases and overlays. In the final setup, we will see that a name reference in spec/affinity/podAntiAffinity/preferredDuringSchedulingIgnoredDuringExecution/podAffinityTerm/namespaces needs to be resolved from "my-placeholder" to something else, but when we look at the Namespace candidates, we see that both of them originally had that name, and we don't know which to pick.
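
To illustrate the ambiguity (a conceptual sketch; these annotation keys are kustomize build internals, and the exact names may vary between versions), by the time NameReference runs, both Namespace candidates carry the same original name, so a reference to "my-placeholder" cannot be resolved to a single target:

apiVersion: v1
kind: Namespace
metadata:
  name: namespace-a
  annotations:
    # internal name tracking added during the build (stripped from final output)
    internal.config.kubernetes.io/previousNames: my-placeholder
---
apiVersion: v1
kind: Namespace
metadata:
  name: namespace-b
  annotations:
    internal.config.kubernetes.io/previousNames: my-placeholder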

This theory is corroborated by the fact that the following workaround is effective: use a different placeholder value in each base. E.g. if I switch the placeholder to "other-placeholder" in overlay/b, building overlay works correctly.
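
For example (a sketch; the file layout is assumed, and it implies maintaining a per-overlay copy of the base), overlay/b would point at a base copy that uses a distinct placeholder:

./base/prometheus-b/prometheus.yaml

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
  namespace: other-placeholder
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          namespaces:
          - other-placeholder

./overlay/b/namespace.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: other-placeholder

With distinct placeholders, each name reference has exactly one original-name match, so resolution is unambiguous.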

Would you be willing to submit a test demonstrating this situation as a starting point for fixing it? #4683 is a good example of how to do this. A PR with a fix would also be welcomed, of course. 😄

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 23, 2022
@paul-bormans
Copy link
Author

@KnVerey I'm not sure what you mean by submitting a test case, since I already isolated the issue and provided all the files. If it needs to be in a specific format, let me know and I can do this.

Secondly, a different placeholder isn't going to work, since the use case is to import the same base; in our case we're trying to deploy a Prometheus instance multiple times.

Furthermore, your hypothesis sounds plausible to me. Is there any reason why nameReference isn't throwing an error? I would rather face an error than get silently unexpected results.

Paul

@brianpursley (Member)

OK, so I investigated this to see what is happening, and below is what I discovered.

There are two possible behaviors that occur when you have multiple overlays that each rename a namespace having the same placeholder:

  1. If the resource has a prefix or suffix, then kustomize simply does not attempt to update the name reference at all. This is the bug identified in this issue (#4880): it leaves my-placeholder in the output without emitting any error or warning, and the exit code will be 0.

  2. If the resource does NOT have a prefix or suffix, the user will see the "Too many possible referral targets" error, along with the candidates that matched, and the exit code will be 1.

I just opened a PR adding some unit test cases covering the above behaviors.

I think we want Behavior 2 in both cases, as it would show an error and exit with 1 instead of seeming like it worked when it did not.

I looked at what this would take to fix, and it looks like the problem occurs in nameref.prefixSuffixEquals because it is unaware that PrefixTransformer and SuffixTransformer skip certain GVKs as shown here:

var suffixFieldSpecsToSkip = types.FsSlice{
	{Gvk: resid.Gvk{Kind: "CustomResourceDefinition"}},
	{Gvk: resid.Gvk{Group: "apiregistration.k8s.io", Kind: "APIService"}},
	{Gvk: resid.Gvk{Kind: "Namespace"}},
}

I think we need to somehow make nameref aware of which GVKs are skipped by the PrefixTransformer and SuffixTransformer, so that it doesn't eliminate them. I'm not sure at this point what the best way to do this is.

The easiest way would be to hard-code this list in nameref (since it is unexported anyway and can't change). But there is a TODO comment about making the skip list configurable, and hard-coding the list in nameref would seem to make that TODO more difficult to accomplish.


Alternatively, if we wanted to, I think it would be possible to implement the capability to replace the same placeholder in two different overlays without failing, but it would take more effort and I'm not sure if it would be worth it or not.

To do this, I think we would need to add another build annotation so that kustomize can know the "context" of where the resource came from (i.e. overlay1 or overlay2). Then we could sieve based on the context matching in nameref.selectReferral to narrow the candidates down to the single, correct one. This assumes you should not be able to rename across contexts. I haven't tried this out, but it seems like the concept might work. It does add more complexity, though, and I'm not sure the benefit is worth it for what is something of an edge case.
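
A sketch of what such an annotation might look like (the key and value here are invented purely for illustration; no such annotation exists in kustomize today):

apiVersion: v1
kind: Namespace
metadata:
  name: namespace-a
  annotations:
    # hypothetical: records which kustomization emitted this resource,
    # so selectReferral could discard candidates from other contexts
    internal.config.kubernetes.io/buildContext: overlay/a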

@k8s-triage-robot

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. and removed triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Jan 19, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 18, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 18, 2024