[GEP-19] Migrate monitoring stack to prometheus-operator #9065

Closed
46 tasks done
rfranzke opened this issue Jan 23, 2024 · 2 comments
Assignees
Labels
area/dev-productivity: Developer productivity related (how to improve development)
area/ipcei: IPCEI (Important Project of Common European Interest)
area/monitoring: Monitoring (including availability monitoring and alerting) related
kind/enhancement: Enhancement, improvement, extension
kind/epic: Large multi-story topic

Comments

@rfranzke
Member

rfranzke commented Jan 23, 2024

How to categorize this issue?

/area dev-productivity monitoring
/kind enhancement

What would you like to be added:
The monitoring stack should be migrated from the current custom-built Helm charts to the prometheus-operator as proposed in GEP-19.

Why is this needed:
GEP-19 was accepted and merged a long time ago, so we should complete its implementation. In addition, the garden cluster (managed via gardener-operator, ref #7016) does not have a monitoring stack yet. The migration also improves development productivity by cleaning up technical debt and improving the code.

Tasks:


General notes for the migration (taken from #6319):

  • Add temporary migration code for the persistent volume. This ensures that no data is lost.
    1. Find the "old" PVC and its PV, and set persistentVolumeReclaimPolicy=Retain on the PV.
    2. Delete the "old" PVC.
    3. Create a Prometheus object with a volumeClaimTemplate that references the PV via volumeName=<existing-pv>.
    4. Migrate the data using an init container.
    5. Remove the migration code after 1-2 releases.
  • Add all existing Prometheus configuration to an additionalScrapeConfig. This allows us to switch to the prometheus-operator without creating PodMonitors and ServiceMonitors for each component up front, and to do that migration step by step instead.
  • Add all extension Prometheus configuration to the same additionalScrapeConfig. This gives extensions time to migrate as well.
  • Replace existing rules with PrometheusRules.
  • Once all of these steps are completed, most of the configuration in the additionalScrapeConfig can be migrated to PodMonitors and ServiceMonitors.
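
As a rough illustration of steps 1 and 3 and of the additionalScrapeConfig idea, the following Python sketch builds the relevant Kubernetes objects as plain dictionaries. It is not taken from the Gardener code base: the function names, the object names, and the secret key `prometheus.yaml` are hypothetical, and the sketch assumes the prometheus-operator `Prometheus` resource exposes `spec.storage.volumeClaimTemplate` and `spec.additionalScrapeConfigs` (a reference to a key in a Secret).

```python
import json


def retain_pv_patch():
    """Patch body for step 1: keep the PV (and its data) when the old PVC is deleted."""
    return {"spec": {"persistentVolumeReclaimPolicy": "Retain"}}


def prometheus_manifest(name, namespace, existing_pv, storage_size, scrape_secret):
    """Build a Prometheus object (step 3) whose claim template pins the existing PV.

    All argument values are supplied by the caller; none of them are
    Gardener defaults.
    """
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "Prometheus",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "storage": {
                "volumeClaimTemplate": {
                    "spec": {
                        "accessModes": ["ReadWriteOnce"],
                        # Bind the new claim to the retained volume.
                        "volumeName": existing_pv,
                        "resources": {"requests": {"storage": storage_size}},
                    }
                }
            },
            # Existing scrape configuration is carried over via a Secret;
            # the key name "prometheus.yaml" is a hypothetical choice.
            "additionalScrapeConfigs": {"name": scrape_secret, "key": "prometheus.yaml"},
        },
    }


if __name__ == "__main__":
    manifest = prometheus_manifest(
        "example", "garden", "<existing-pv>", "10Gi", "example-scrape-configs"
    )
    print(json.dumps(retain_pv_patch(), indent=2))
    print(json.dumps(manifest, indent=2))
```

The resulting dictionaries could be serialized and applied with any Kubernetes client. A PV whose claim was deleted is typically left in the Released phase with a claimRef to the old claim, so that reference may need to be cleared or updated before the new claim binds; this step, like the init container that migrates the data (step 4), is not shown here.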
@rfranzke rfranzke self-assigned this Jan 23, 2024
@gardener-prow gardener-prow bot added area/dev-productivity Developer productivity related (how to improve development) area/monitoring Monitoring (including availability monitoring and alerting) related kind/enhancement Enhancement, improvement, extension labels Jan 23, 2024
@rfranzke rfranzke pinned this issue Jan 23, 2024
@rfranzke rfranzke added kind/epic Large multi-story topic area/ipcei IPCEI (Important Project of Common European Interest) labels May 23, 2024
@rfranzke
Member Author

All tasks have been completed.
/close

@gardener-prow gardener-prow bot closed this as completed May 29, 2024
Contributor

gardener-prow bot commented May 29, 2024

@rfranzke: Closing this issue.

In response to this:

All tasks have been completed.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@rfranzke rfranzke unpinned this issue May 29, 2024