BanzaiCloud KafkaCluster healthcheck returns Degraded state during rolling update when broker config changes #17993

Open
3 tasks done
musubi7726 opened this issue Apr 26, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@musubi7726
Contributor

Checklist:

  • I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
  • I've included steps to reproduce the bug.
  • I've pasted the output of argocd version.

Describe the bug

When a sync on a KafkaCluster resource includes broker configuration changes, the Kafka operator performs a rolling update on all brokers and marks certain brokers with a state of "ConfigOutOfSync". During this process, ArgoCD immediately reports the resource as "Degraded" and keeps it there until all brokers are updated, after which it reports "Healthy".

I believe "Progressing" state would be more appropriate in this circumstance instead of "Degraded", since it is in the process of updating and not an error state.

In addition, this causes issues when performing a sync/wait operation from a workflow task: the wait returns as soon as the resource reports "Degraded" instead of waiting until the resource is "Healthy".
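
Until the health check changes, a workflow task could work around this by polling and treating "Degraded" as non-terminal until a timeout expires. The following is only a sketch of that idea, not an ArgoCD feature: `wait_for_healthy` and the `get_health` callable are hypothetical names, and how `get_health` obtains the health status (for example from the ArgoCD API or CLI) is left to the caller.

```python
import time


def wait_for_healthy(get_health, timeout_s=600.0, poll_s=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until the resource reports "Healthy" or the timeout expires.

    get_health is a hypothetical caller-supplied function that returns the
    current health status as a string ("Healthy", "Progressing", "Degraded").
    "Degraded" is deliberately NOT treated as terminal here, because during a
    rolling update it is reported transiently.
    Returns the list of statuses observed, in order.
    """
    deadline = clock() + timeout_s
    seen = []
    while True:
        health = get_health()
        seen.append(health)
        if health == "Healthy":
            return seen
        if clock() >= deadline:
            raise TimeoutError(
                f"resource not Healthy after {timeout_s}s; last status: {health}"
            )
        sleep(poll_s)
```

The trade-off is that a genuinely degraded cluster is only detected when the timeout expires, so the timeout should be sized to the expected duration of a rolling update.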

Below are the states of each broker and the KafkaCluster status during the rolling update:

  1. Sync initiated and before any broker redeploy:
     kafkacluster - state: ClusterRollingUpgrading
     broker 101 - configurationState: ConfigOutOfSync
     broker 201 - configurationState: ConfigInSync
     broker 301 - configurationState: ConfigOutOfSync
  2. Redeploy broker 101:
     kafkacluster - state: ClusterRollingUpgrading
     broker 101 - configurationState: ConfigInSync
     broker 201 - configurationState: ConfigInSync
     broker 301 - configurationState: ConfigOutOfSync
  3. Broker 101 completes. Redeploy broker 301:
     kafkacluster - state: ClusterRollingUpgrading
     broker 101 - configurationState: ConfigInSync
     broker 201 - configurationState: ConfigOutOfSync
     broker 301 - configurationState: ConfigInSync
  4. Broker 301 completes. Redeploy broker 201:
     kafkacluster - state: ClusterRollingUpgrading
     broker 101 - configurationState: ConfigInSync
     broker 201 - configurationState: ConfigInSync
     broker 301 - configurationState: ConfigInSync
  5. After rolling update completes:
     kafkacluster - state: ClusterRunning
     broker 101 - configurationState: ConfigInSync
     broker 201 - configurationState: ConfigInSync
     broker 301 - configurationState: ConfigInSync
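
The mapping I have in mind for the states above can be sketched as follows. This is an illustration of the proposed logic in Python, not the actual ArgoCD health check. The function name `kafkacluster_health` is made up, and the status layout (a top-level `state` plus a `brokersState` map keyed by broker ID) is an assumption about how the values listed above are exposed.

```python
def kafkacluster_health(status):
    """Map a KafkaCluster status dict to a proposed health result.

    Assumed input shape (an assumption, not confirmed by the operator docs):
      {"state": "...", "brokersState": {"101": {"configurationState": "..."}}}
    """
    state = status.get("state")
    brokers = status.get("brokersState", {})
    out_of_sync = sorted(
        broker_id
        for broker_id, broker in brokers.items()
        if broker.get("configurationState") == "ConfigOutOfSync"
    )

    if state == "ClusterRollingUpgrading":
        # Proposed change: a rolling update is progress, not an error.
        pending = ", ".join(out_of_sync) or "none"
        return {
            "status": "Progressing",
            "message": f"Rolling update in progress; brokers out of sync: {pending}",
        }
    if state == "ClusterRunning" and not out_of_sync:
        return {"status": "Healthy", "message": "All brokers report ConfigInSync"}
    # Anything else is outside what this issue describes; report Unknown
    # rather than guess at the right status.
    return {"status": "Unknown", "message": f"Unhandled cluster state: {state}"}
```

With this mapping, the four snapshots taken during the rolling update would all report "Progressing", and only the final snapshot would report "Healthy".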

To Reproduce

  1. Set up a KafkaCluster with 3 brokers.
  2. Modify helm chart or values.yaml file to update the broker config for the KafkaCluster (such as adding/removing allow.everyone.if.no.acl.found: true from spec.readOnlyConfig)
  3. Perform a sync.

Expected behavior

After invoking a sync, the KafkaCluster is in "Progressing" state while it is performing the rolling update to the brokers. After the rolling update is complete, the state changes to "Healthy".

Screenshots

N/A

Version

argocd-server: v2.8.15+e94f6a8
  BuildDate: 2024-04-04T23:36:03Z
  GitCommit: e94f6a8f840714b55006cf8c8e7f3017b3a0d2ea
  GitTreeState: clean
  GoVersion: go1.20.10
  Compiler: gc
  Platform: linux/amd64
  Kustomize Version: v5.1.0 2023-06-19T16:58:18Z
  Helm Version: v3.12.1+gf32a527
  Kubectl Version: v0.24.2
  Jsonnet Version: v0.20.0

Logs

N/A
