Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make verify-crds intermittently produces different output #1928

Closed
randomvariable opened this issue May 31, 2023 · 12 comments · Fixed by #2004
Closed

make verify-crds intermittently produces different output #1928

randomvariable opened this issue May 31, 2023 · 12 comments · Fixed by #2004
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@randomvariable
Copy link
Member

/kind bug

What steps did you take and what happened:
[A clear and concise description of what the bug is.]

If you run make verify-crds repeatedly, about 20% of the time, the generated CRDs are actually different, usually adding "cluster-api" as a category or setting an atomic map type.

What did you expect to happen:

Consistent CRD generation

Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]

Environment:

  • Cluster-api-provider-vsphere version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label May 31, 2023
@chrischdi
Copy link
Member

I can confirm that this is still the case on main.

@chrischdi
Copy link
Member

And also happens at updated controller-gen / controller-tools.

@sbueringer
Copy link
Member

We had a similar bug like this in a previous controller-tools release (but that one was fixed). Basically the problem was some non-determinism.

Unfortunately I can't find the old issue/PR anymore. But overall probably requires some deep-dive controller-gen debugging

@sbueringer
Copy link
Member

Probably was this one: kubernetes-sigs/controller-tools#680

@sbueringer
Copy link
Member

Can you try controller-tools < v0.9.0 if it happens there?

@chrischdi
Copy link
Member

With v0.8.0, I only hit the categories added/not added case. The atomic case is gone.

@chrischdi
Copy link
Member

Note: I will debug into controller-gen to get down on this.

@chrischdi
Copy link
Member

chrischdi commented Jul 5, 2023

I did take a further look. I can see three cases:

  • case A: having categories: cluster-api set / not set on a top-level CRD object
  • case B: having x-kubernetes-map-type: atomic set / not set at a property of type object. Often this is some kind of object reference (e.g. bootstrapRef).
  • case C: adding/removing that the property apiGroup is required, this also seems to happen on object references, e.g. addressFromPools. At the same time, this may also have the case B.

Some more observations for the Case A:

  • if one of the api versions does not define the categories=cluster-api marker, this flakyness happens. Controller-gen seems to be flaky/non-deterministic in this case, maybe because of the random order of reading parsing the files (this is a guess)?
    • Fixable either by taking a deeper look into controller-gen to make the output deterministic
    • Or by adding the categories=cluster-api marker to every api version.

One more observation for all the cases:

  • Output seems to be deterministic when only generating for v1beta1.

TODO: inspect more into the other cases, a workaround does not look as simple as for case A (by debugging controller-gen).

@chrischdi
Copy link
Member

chrischdi commented Jul 6, 2023

Simplified reproduction script:

function test:generate() {
  rm -rf tmp tmp2
  mkdir -p tmp tmp2
  eval $CONTROLLER_GEN ${1}
  for i in {1..100}; do
    printf "> %02d\n" "$i";
    eval $CONTROLLER_GEN ${1}2
    diff tmp${2} tmp2${2} || break
  done
}

CONTROLLER_GEN="$(pwd)/hack/tools/bin/controller-gen"
GENARGS="paths=./apis/v1alpha3 \
    paths=./apis/v1alpha4 \
    paths=./apis/v1beta1 \
    crd:crdVersions=v1 \
    output:crd:dir=tmp"

test:generate ${GENARGS}

Note: we have deterministic output when dropping v1alpha3 & v1alpha4 from GENARGS:

@chrischdi
Copy link
Member

I did a git bisect on controller-gen/controller-tools.

Looks like this was introduced in kubernetes-sigs/controller-tools#663

There was already a fix after that PR (kubernetes-sigs/controller-tools#687) but this did not fix the issue here.

@ntoofu
Copy link
Contributor

ntoofu commented Jul 6, 2023

This issue looks the same as #1757.

I sent a PR, but it is not merged yet.

@chrischdi
Copy link
Member

Oh wow, that's actually great news 👍 just verified again that the PR would fix the issue 🎉

So yep, definitely seems to be a duplicate of #1757

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants