Race condition in CRS controller results in overwriting updates #10655
Labels
area/clusterresourceset
Issues or PRs related to clusterresourcesets
kind/bug
Categorizes issue or PR as related to a bug.
priority/backlog
Higher priority than priority/awaiting-more-evidence.
triage/accepted
Indicates an issue or PR is ready to be actively worked on.
What steps did you take and what happened?
Create multiple ClusterResourceSets (e.g. 5) that should apply to the same cluster. Watch how they are all successfully applied, but the ClusterResourceSetBinding for the cluster receives multiple updates that overwrite each other, resulting in inaccurate data in the ClusterResourceSetBinding. The ownerReferences are also registered inaccurately, again because updates are overwritten.
This seemed especially apparent when there was network latency between the management cluster and the workload cluster.
What did you expect to happen?
The ClusterResourceSetBinding should contain the correct state of all applied ClusterResourceSets for the relevant Cluster.
Cluster API version
v1.7.2
Kubernetes version
v1.29.4
Anything else you would like to add?
From https://kubernetes.slack.com/archives/C8TSNPY4T/p1716231649471649:
CRS which says it has been applied, cluster matches correctly, etc
CRSBinding, missing binding and ownerRefs for this CRS aws-ccm-v1.29.2-quick-start-z87nz9 (other ones are correct, plus there are others missing):
trace for updates to the CRSBinding (each resource output is the same but for each update):
Label(s) to be applied
/kind bug
/area clusterresourceset