
Upgrade GKE prometheus set up to prometheus-community/kube-prometheus-stack #10132

Merged: 2 commits into main on Aug 30, 2022

Conversation

npepinpe (Member) commented Aug 21, 2022

Description

This PR updates the prometheus-values.yaml we use to set up our monitoring stack on our GKE clusters. These are the latest values used, adapted for the new chart.

At the same time, I've already migrated us from the old deprecated chart to the new chart (prometheus-community/kube-prometheus-stack), and upgraded from 9.x to 16.0.0. In order to migrate, I did the following (based on this issue from our SREs):

  • Modify the PV reclaim policy to retain instead of delete; this allows us to delete the old PVC but keep the persistent volume, retaining our data (see the sketch after this list)
  • Pre-create the PVC that the new chart expects; the chart will then pick it up on creation instead of creating a new one, and we keep the old PV/data intact.
  • Follow these unofficial upgrade instructions; essentially we need to re-apply the CRDs ourselves, since helm upgrade doesn't install CRDs, so they have to be picked up from the updated operator version.
  • Migrate from the old chart to the new chart using helm upgrade metrics --debug --namespace default --dependency-update -f prometheus-operator-values.yml --version 10.0.0 prometheus-community/kube-prometheus-stack (first run with a --dry-run to ensure the PVC and so on will be kept)
  • Once done, follow the upgrade instructions for each major version as you go along, using the command above but updating the version. This was done up to version 16.0.0, which removes the last usage of deprecated APIs (kube-state-metrics).
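For reference, roughly what the first two steps look like; the PV/PVC names, storage class, and size below are placeholders, so look up the real ones with kubectl get pv,pvc first:

# Retain the volume so deleting the old PVC does not delete the data
kubectl patch pv <old-pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

# Pre-create the PVC the new chart expects, bound explicitly to the retained PV
# (if the PV ends up in "Released" state, its spec.claimRef may also need to be cleared first)
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: <pvc-name-expected-by-new-chart>
  namespace: default
spec:
  storageClassName: standard
  volumeName: <old-pv-name>
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 50Gi
EOF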

With that done, we could then upgrade the Kubernetes clusters to 1.23 without any issues. The next time we need to do all of this will be when upgrading to k8s 1.25, which removes further APIs. While it's possible to upgrade k8s first and then fix the Helm release, it's easier to first upgrade the charts to make sure nothing is using the deprecated APIs, and then upgrade k8s.
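A rough way to sanity-check this before a k8s upgrade is to grep the rendered release for API versions that are being removed (e.g. the 1.22/1.25 removals):

helm get manifest metrics | grep -E 'apiVersion: (policy/v1beta1|batch/v1beta1|networking.k8s.io/v1beta1)'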

One last thing: we could upgrade to 17.x and remove our pinned version of Grafana to upgrade Grafana to 8.x (like we have in SaaS). To do that, just edit the values file, remove the pinned tag for Grafana, update the necessary CRDs as described in the chart readme (link is above), and then run helm upgrade metrics --debug --namespace default --dependency-update -f prometheus-operator-values.yml --version 17.0.0 prometheus-community/kube-prometheus-stack.
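Assuming the pin sits under the bundled Grafana subchart's usual image tag key, removing it would look roughly like this in the values file (the 7.4.5 tag is just illustrative, taken from the Grafana version visible in the rendered manifests):

grafana:
  image:
    tag: "7.4.5"   # remove this pin so chart 17.x pulls its default Grafana 8.x image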

Related issues

closes #9074

Definition of Done

Not all items need to be done depending on the issue and the pull request.

Code changes:

  • The changes are backwards compatible with previous versions
  • If it fixes a bug, then PRs are created to backport the fix to the last two minor versions. You can trigger a backport by assigning labels (e.g. backport stable/1.3) to the PR; in case that fails, you need to create backports manually.

Testing:

  • There are unit/integration tests that verify all acceptance criteria of the issue
  • New tests are written to ensure backwards compatibility with further versions
  • The behavior is tested manually
  • The change has been verified by a QA run
  • The impact of the changes is verified by a benchmark

Documentation:

  • The documentation is updated (e.g. BPMN reference, configuration, examples, get-started guides, etc.)
  • New content is added to the release announcement
  • If the PR changes how BPMN processes are validated (e.g. support new BPMN element) then the Camunda modeling team should be informed to adjust the BPMN linting.

Please refer to our review guidelines.

@@ -9,6 +9,9 @@ grafana:
    userKey: admin-user
    passwordKey: admin-password
  grafana.ini:
    server:
      # REPLACE THIS WITH THE ACTUAL ROOT URL
      root_url: "http://localhost:3000"
npepinpe (Member Author):
I could use some suggestions here. You need to set the root URL correctly as otherwise the GitHub authentication will not work. If you configure nothing, then it will use localhost:3000, which it will pass to GitHub as the "redirect_uri", and GitHub will reject authentication calls saying it doesn't match what's configured in the OAuth app.

We could hardcode the right value, but since we use the same values file for each cluster, this carries the risk that we overwrite the URL for one cluster or the other. Any ideas? I'd like to keep it simple so we can keep using the same file, but I don't know. Maybe I missed some config option?

Member:

Not sure whether I get it. So you have to set the ingress URL here?

npepinpe (Member Author):

Yes, here it should be the ingress URL for each ingress that we have.
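For example, for the cluster whose Grafana ingress answers on 34.77.165.228 (the address that shows up in the manifest diff later in this thread), the value would roughly be:

grafana:
  grafana.ini:
    server:
      root_url: "http://34.77.165.228/"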

npepinpe (Member Author):

So one of the issues I mentioned was this: if we leave it at localhost:3000 and someone upgrades without changing it, then we break the OAuth. But if we set it to a particular URL, then we risk breaking the OAuth for one of the two Grafana instances we have.

I'm thinking of looking into Helmfile to manage our multiple deployments, sharing the same file with some overrides.
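A minimal helmfile.yaml sketch of that idea (the environment names and per-cluster files are hypothetical; each per-cluster file would only override grafana.grafana.ini.server.root_url):

repositories:
  - name: prometheus-community
    url: https://prometheus-community.github.io/helm-charts

environments:
  long-running: {}
  medic: {}

releases:
  - name: metrics
    namespace: default
    chart: prometheus-community/kube-prometheus-stack
    version: 16.0.0
    values:
      - prometheus-operator-values.yml               # shared values, unchanged
      - cluster-values/{{ .Environment.Name }}.yaml  # per-cluster root_url override

Deploying would then be something like helmfile -e long-running apply.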


github-actions bot commented Aug 21, 2022

Test Results

853 files (+5), 853 suites (+5), 1h 36m 35s ⏱️ (-5m 9s)
6,422 tests (-211): 6,412 ✔️ (-210), 10 💤 (-1), 0 failed (±0)
6,606 runs (-211): 6,596 ✔️ (-210), 10 💤 (-1), 0 failed (±0)

Results for commit c90aff8. ± Comparison against base commit abd5fb4.

♻️ This comment has been updated with latest results.

Zelldon (Member) left a comment:

Thank you very much for doing it 🚀 🤗

@@ -1,5 +1,5 @@
 alertmanager:
-  enabled: false
+  enabled: true
Member:

This is for your benchmark on the long running cluster?

npepinpe (Member Author):

Yes, but in general it'll be useful in the future to have alerts :)


Zelldon commented Aug 27, 2022

One last thing: we could upgrade to 17.x and remove our pinned version of Grafana to upgrade Grafana to 8.x (like we have in SaaS). To do that, just edit the values file, remove the pinned tag for Grafana, update the necessary CRDs as described on the chart readme (link is above), and then run helm upgrade metrics --debug --namespace default --dependency-update -f prometheus-operator-values.yml --version 17.0.0 prometheus-community/kube-prometheus-stack.

I would like us to do that right now, so we are in sync with SaaS and can check whether our dashboards work, etc.


Zelldon commented Aug 30, 2022

We had issues accessing Grafana this week. Our medic (@saig0) was not able to connect to our instances.

Idk why I was able to connect 🤷

What I did to fix it for now (hotfix):

$ helm list
NAME   	NAMESPACE	REVISION	UPDATED                                 	STATUS  	CHART                       	APP VERSION
metrics	default  	56      	2022-08-21 17:33:28.495923251 +0200 CEST	deployed	kube-prometheus-stack-16.0.0	0.47.1     


$ helm get values metrics > values.yaml # get installed values
$ vim values.yaml # replace localhost with our grafana instance ingress URL

$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts # add the helm repo
$ helm repo update

$ helm get manifest metrics > actual.yaml # get the actual manifest
$ helm template metrics prometheus-community/kube-prometheus-stack -f values.yaml --version 16.0.0 > afterUpgrade.yaml # do a dry-run to compare the output

Running diff actual.yaml afterUpgrade.yaml

Shows:

536c536
<     root_url = http://localhost:3000
---
>     root_url = http://34.77.165.228/
37244c37244
<         checksum/config: 2f61443ba5962e030d1d7a31c6e76232fb7a9f3dffcd94eb7673d4c0c5dde3c4
---
>         checksum/config: e4fb69f5cf1d57bf13e53d7e87ce8c96ac084c62adfddd2b5599655c545198dc
40716c40716,41035
< 
---
> ---
> # Source: kube-prometheus-stack/templates/prometheus-operator/admission-webhooks/job-patch/psp.yaml
> apiVersion: policy/v1beta1
> kind: PodSecurityPolicy
> metadata:
>   name: metrics-kube-prometheus-st-admission
>   annotations:
>     "helm.sh/hook": pre-install,pre-upgrade,post-install,post-upgrade
>     "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
>   labels:
>     app: kube-prometheus-stack-admission
>     
>     app.kubernetes.io/managed-by: Helm
>     app.kubernetes.io/instance: metrics
>     app.kubernetes.io/version: "16.0.0"
>     app.kubernetes.io/part-of: kube-prometheus-stack
>     chart: kube-prometheus-stack-16.0.0
>     release: "metrics"
>     heritage: "Helm"
> spec:
>   privileged: false
>   # Required to prevent escalations to root.
>   # allowPrivilegeEscalation: false
>   # This is redundant with non-root + disallow privilege escalation,
>   # but we can provide it for defense in depth.
>   #requiredDropCapabilities:
>   #  - ALL
>   # Allow core volume types.
>   volumes:
>     - 'configMap'
>     - 'emptyDir'
>     - 'projected'
>     - 'secret'
>     - 'downwardAPI'
>     - 'persistentVolumeClaim'
>   hostNetwork: false
>   hostIPC: false
>   hostPID: false
>   runAsUser:
>     # Permits the container to run with root privileges as well.
>     rule: 'RunAsAny'
>   seLinux:
>     # This policy assumes the nodes are using AppArmor rather than SELinux.
>     rule: 'RunAsAny'
>   supplementalGroups:
>     rule: 'MustRunAs'
>     ranges:
>       # Forbid adding the root group.
>       - min: 0
>         max: 65535
>   fsGroup:
>     rule: 'MustRunAs'
>     ranges:
>       # Forbid adding the root group.
>       - min: 0
>         max: 65535
>   readOnlyRootFilesystem: false
> ---
> # Source: kube-prometheus-stack/templates/prometheus-operator/admission-webhooks/job-patch/serviceaccount.yaml
> apiVersion: v1
> kind: ServiceAccount
> metadata:
>   name:  metrics-kube-prometheus-st-admission
>   namespace: default
>   annotations:
>     "helm.sh/hook": pre-install,pre-upgrade,post-install,post-upgrade
>     "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
>   labels:
>     app: kube-prometheus-stack-admission    
>     app.kubernetes.io/managed-by: Helm
>     app.kubernetes.io/instance: metrics
>     app.kubernetes.io/version: "16.0.0"
>     app.kubernetes.io/part-of: kube-prometheus-stack
>     chart: kube-prometheus-stack-16.0.0
>     release: "metrics"
>     heritage: "Helm"
> ---
> # Source: kube-prometheus-stack/templates/prometheus-operator/admission-webhooks/job-patch/clusterrole.yaml
> apiVersion: rbac.authorization.k8s.io/v1
> kind: ClusterRole
> metadata:
>   name:  metrics-kube-prometheus-st-admission
>   annotations:
>     "helm.sh/hook": pre-install,pre-upgrade,post-install,post-upgrade
>     "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
>   labels:
>     app: kube-prometheus-stack-admission    
>     app.kubernetes.io/managed-by: Helm
>     app.kubernetes.io/instance: metrics
>     app.kubernetes.io/version: "16.0.0"
>     app.kubernetes.io/part-of: kube-prometheus-stack
>     chart: kube-prometheus-stack-16.0.0
>     release: "metrics"
>     heritage: "Helm"
> rules:
>   - apiGroups:
>       - admissionregistration.k8s.io
>     resources:
>       - validatingwebhookconfigurations
>       - mutatingwebhookconfigurations
>     verbs:
>       - get
>       - update
>   - apiGroups: ['policy']
>     resources: ['podsecuritypolicies']
>     verbs:     ['use']
>     resourceNames:
>     - metrics-kube-prometheus-st-admission
> ---
> # Source: kube-prometheus-stack/templates/prometheus-operator/admission-webhooks/job-patch/clusterrolebinding.yaml
> apiVersion: rbac.authorization.k8s.io/v1
> kind: ClusterRoleBinding
> metadata:
>   name:  metrics-kube-prometheus-st-admission
>   annotations:
>     "helm.sh/hook": pre-install,pre-upgrade,post-install,post-upgrade
>     "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
>   labels:
>     app: kube-prometheus-stack-admission    
>     app.kubernetes.io/managed-by: Helm
>     app.kubernetes.io/instance: metrics
>     app.kubernetes.io/version: "16.0.0"
>     app.kubernetes.io/part-of: kube-prometheus-stack
>     chart: kube-prometheus-stack-16.0.0
>     release: "metrics"
>     heritage: "Helm"
> roleRef:
>   apiGroup: rbac.authorization.k8s.io
>   kind: ClusterRole
>   name: metrics-kube-prometheus-st-admission
> subjects:
>   - kind: ServiceAccount
>     name: metrics-kube-prometheus-st-admission
>     namespace: default
> ---
> # Source: kube-prometheus-stack/templates/prometheus-operator/admission-webhooks/job-patch/role.yaml
> apiVersion: rbac.authorization.k8s.io/v1
> kind: Role
> metadata:
>   name:  metrics-kube-prometheus-st-admission
>   namespace: default
>   annotations:
>     "helm.sh/hook": pre-install,pre-upgrade,post-install,post-upgrade
>     "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
>   labels:
>     app: kube-prometheus-stack-admission    
>     app.kubernetes.io/managed-by: Helm
>     app.kubernetes.io/instance: metrics
>     app.kubernetes.io/version: "16.0.0"
>     app.kubernetes.io/part-of: kube-prometheus-stack
>     chart: kube-prometheus-stack-16.0.0
>     release: "metrics"
>     heritage: "Helm"
> rules:
>   - apiGroups:
>       - ""
>     resources:
>       - secrets
>     verbs:
>       - get
>       - create
> ---
> # Source: kube-prometheus-stack/templates/prometheus-operator/admission-webhooks/job-patch/rolebinding.yaml
> apiVersion: rbac.authorization.k8s.io/v1
> kind: RoleBinding
> metadata:
>   name:  metrics-kube-prometheus-st-admission
>   namespace: default
>   annotations:
>     "helm.sh/hook": pre-install,pre-upgrade,post-install,post-upgrade
>     "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
>   labels:
>     app: kube-prometheus-stack-admission    
>     app.kubernetes.io/managed-by: Helm
>     app.kubernetes.io/instance: metrics
>     app.kubernetes.io/version: "16.0.0"
>     app.kubernetes.io/part-of: kube-prometheus-stack
>     chart: kube-prometheus-stack-16.0.0
>     release: "metrics"
>     heritage: "Helm"
> roleRef:
>   apiGroup: rbac.authorization.k8s.io
>   kind: Role
>   name: metrics-kube-prometheus-st-admission
> subjects:
>   - kind: ServiceAccount
>     name: metrics-kube-prometheus-st-admission
>     namespace: default
> ---
> # Source: kube-prometheus-stack/charts/grafana/templates/tests/test.yaml
> apiVersion: v1
> kind: Pod
> metadata:
>   name: metrics-grafana-test
>   labels:
>     helm.sh/chart: grafana-6.9.1
>     app.kubernetes.io/name: grafana
>     app.kubernetes.io/instance: metrics
>     app.kubernetes.io/version: "7.4.5"
>     app.kubernetes.io/managed-by: Helm
>   annotations:
>     "helm.sh/hook": test-success
>   namespace: default
> spec:
>   serviceAccountName: metrics-grafana-test
>   containers:
>     - name: metrics-test
>       image: "bats/bats:v1.1.0"
>       imagePullPolicy: "IfNotPresent"
>       command: ["/opt/bats/bin/bats", "-t", "/tests/run.sh"]
>       volumeMounts:
>         - mountPath: /tests
>           name: tests
>           readOnly: true
>   volumes:
>   - name: tests
>     configMap:
>       name: metrics-grafana-test
>   restartPolicy: Never
> ---
> # Source: kube-prometheus-stack/templates/prometheus-operator/admission-webhooks/job-patch/job-createSecret.yaml
> apiVersion: batch/v1
> kind: Job
> metadata:
>   name:  metrics-kube-prometheus-st-admission-create
>   namespace: default
>   annotations:
>     "helm.sh/hook": pre-install,pre-upgrade
>     "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
>   labels:
>     app: kube-prometheus-stack-admission-create    
>     app.kubernetes.io/managed-by: Helm
>     app.kubernetes.io/instance: metrics
>     app.kubernetes.io/version: "16.0.0"
>     app.kubernetes.io/part-of: kube-prometheus-stack
>     chart: kube-prometheus-stack-16.0.0
>     release: "metrics"
>     heritage: "Helm"
> spec:
>   template:
>     metadata:
>       name:  metrics-kube-prometheus-st-admission-create
>       labels:
>         app: kube-prometheus-stack-admission-create        
>         app.kubernetes.io/managed-by: Helm
>         app.kubernetes.io/instance: metrics
>         app.kubernetes.io/version: "16.0.0"
>         app.kubernetes.io/part-of: kube-prometheus-stack
>         chart: kube-prometheus-stack-16.0.0
>         release: "metrics"
>         heritage: "Helm"
>     spec:
>       containers:
>         - name: create
>           image: k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.2.0
>           imagePullPolicy: IfNotPresent
>           args:
>             - create
>             - --host=metrics-kube-prometheus-st-operator,metrics-kube-prometheus-st-operator.default.svc
>             - --namespace=default
>             - --secret-name=metrics-kube-prometheus-st-admission
>           resources:
>             {}
>       restartPolicy: OnFailure
>       serviceAccountName: metrics-kube-prometheus-st-admission
>       securityContext:
>         runAsGroup: 2000
>         runAsNonRoot: true
>         runAsUser: 2000
> ---
> # Source: kube-prometheus-stack/templates/prometheus-operator/admission-webhooks/job-patch/job-patchWebhook.yaml
> apiVersion: batch/v1
> kind: Job
> metadata:
>   name:  metrics-kube-prometheus-st-admission-patch
>   namespace: default
>   annotations:
>     "helm.sh/hook": post-install,post-upgrade
>     "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
>   labels:
>     app: kube-prometheus-stack-admission-patch    
>     app.kubernetes.io/managed-by: Helm
>     app.kubernetes.io/instance: metrics
>     app.kubernetes.io/version: "16.0.0"
>     app.kubernetes.io/part-of: kube-prometheus-stack
>     chart: kube-prometheus-stack-16.0.0
>     release: "metrics"
>     heritage: "Helm"
> spec:
>   template:
>     metadata:
>       name:  metrics-kube-prometheus-st-admission-patch
>       labels:
>         app: kube-prometheus-stack-admission-patch        
>         app.kubernetes.io/managed-by: Helm
>         app.kubernetes.io/instance: metrics
>         app.kubernetes.io/version: "16.0.0"
>         app.kubernetes.io/part-of: kube-prometheus-stack
>         chart: kube-prometheus-stack-16.0.0
>         release: "metrics"
>         heritage: "Helm"
>     spec:
>       containers:
>         - name: patch
>           image: k8s.gcr.io/ingress-nginx/kube-webhook-certgen:v1.2.0
>           imagePullPolicy: IfNotPresent
>           args:
>             - patch
>             - --webhook-name=metrics-kube-prometheus-st-admission
>             - --namespace=default
>             - --secret-name=metrics-kube-prometheus-st-admission
>             - --patch-failure-policy=Fail
>           resources:
>             {}
>       restartPolicy: OnFailure
>       serviceAccountName: metrics-kube-prometheus-st-admission
>       securityContext:
>         runAsGroup: 2000
>         runAsNonRoot: true
>         runAsUser: 2000

I think it is ok to ignore the webhooks; they are presumably only used during the upgrade.

$ helm upgrade metrics prometheus-community/kube-prometheus-stack -f values.yaml --version 16.0.0
W0830 10:33:31.971785  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:32.829207  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:43.925216  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:44.594887  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:44.616641  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:44.673887  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:44.761684  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:44.782592  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:44.830896  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:44.913698  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:44.934810  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:44.982406  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:45.071017  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:45.091741  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:45.180680  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:45.267297  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:45.288054  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:45.334801  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:45.444645  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:45.466939  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:45.517075  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:45.606472  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:45.628519  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:33:45.683109  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:34:04.751433  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:34:05.552517  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
W0830 10:34:16.316840  175112 warnings.go:70] policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
Release "metrics" has been upgraded. Happy Helming!
NAME: metrics
LAST DEPLOYED: Tue Aug 30 10:33:24 2022
NAMESPACE: default
STATUS: deployed
REVISION: 57
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
  kubectl --namespace default get pods -l "release=metrics"

Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.


Zelldon commented Aug 30, 2022

I logged myself out in order to reproduce the issue. What we see is the following:

[screenshot: Grafana GitHub login error]

error=redirect_uri_mismatch&error_description=The+redirect_uri+MUST+match+the+registered+callback+URL+for+this+application.


Zelldon commented Aug 30, 2022

Looks like we need to configure a GitHub OAuth application: https://grafana.com/docs/grafana/v9.0/setup-grafana/configure-security/configure-authentication/github/

Did you do that @npepinpe ?!
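For reference, the GitHub auth block in grafana.ini would look roughly like this in our values file; client ID/secret and the org below are placeholders, and the OAuth app's registered callback must be <root_url>/login/github, which is why the root_url matters here:

grafana:
  grafana.ini:
    server:
      root_url: "https://<grafana-ingress-host>/"   # must match the OAuth app callback
    auth.github:
      enabled: true
      allow_sign_up: true
      client_id: <oauth-app-client-id>
      client_secret: <oauth-app-client-secret>      # better injected from a secret in practice
      scopes: user:email,read:org
      auth_url: https://github.com/login/oauth/authorize
      token_url: https://github.com/login/oauth/access_token
      api_url: https://api.github.com/user
      allowed_organizations: <github-org>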


megglos commented Aug 30, 2022

@Zelldon the Grafana instance is still sending the localhost:3000 redirect URI, which GitHub rejects (2nd request in your dev tools network tab)

https://github.com/login/oauth/authorize?access_type=online&client_id=ec46cc47d20609aad31f&redirect_uri=http%3A%2F%2Flocalhost%3A3000%2Flogin%2Fgithub&response_type=code&scope=user%3Aemail+read%3Aorg&state=CaYrharRrHYr0TzfauzhaoGXU2Dn8KlcIROwyahV14A%3D

Maybe this helps us here? https://grafana.com/tutorials/run-grafana-behind-a-proxy/

npepinpe (Member Author):

It's related to the comment I mentioned about setting the root URL. I thought it was fixed since I tested it 🤔


Zelldon commented Aug 30, 2022

Summary: The problem was that the pods were stuck in init because the PV was still claimed by the previous pod, which is why my upgrade didn't work.
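Roughly how one can spot (and clear) that kind of contention; the pod names below are placeholders:

$ kubectl get pods -n default | grep grafana   # new pod stuck in Init/Pending
$ kubectl describe pod <stuck-pod> -n default  # events show the volume is still bound to the old pod
$ kubectl delete pod <old-pod> -n default      # free the volume so the new pod can claim it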

Zelldon (Member) left a comment:

👍

npepinpe (Member Author):

bors merge

zeebe-bors-camunda (Contributor):

Build succeeded:

@zeebe-bors-camunda zeebe-bors-camunda bot merged commit 7bfa617 into main Aug 30, 2022
@zeebe-bors-camunda zeebe-bors-camunda bot deleted the np-upgrade-prometheus branch August 30, 2022 12:43
Successfully merging this pull request may close these issues.

Migrated to newer prometheus operator