Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alerting: Implement /status for the notification system #33227

Merged
merged 19 commits into from Jun 15, 2021
Merged

Conversation

gotjosh
Copy link
Contributor

@gotjosh gotjosh commented Apr 21, 2021

Implements the necessary plumbing to have a /status endpoint on the
notification system.

Updated
Unmarshalling the cloud request used to fail with: no routes provided
because the response from the cloud returns the alertmanager configuration as string where there is no Route define:

{
    "cluster": {
        "peers": [],
        "status": "ready"
    },
    "config": {
        "original": "global:\n  resolve_timeout: 4m\n  http_config:\n    follow_redirects: true\n  smtp_from: youraddress@example.org\n  smtp_hello: localhost\n  smtp_smarthost: localhost:25\n  smtp_require_tls: true\n  pagerduty_url: https://events.pagerduty.com/v2/enqueue\n  opsgenie_api_url: https://api.opsgenie.com/\n  wechat_api_url: https://qyapi.weixin.qq.com/cgi-bin/\n  victorops_api_url: https://alert.victorops.com/integrations/generic/20131114/alert/\nroute:\n  receiver: example-email\n  continue: false\nreceivers:\n- name: example-email\n  email_configs:\n  - send_resolved: false\n    to: youraddress@example.org\n    from: youraddress@example.org\n    hello: localhost\n    smarthost: localhost:25\n    headers:\n      From: youraddress@example.org\n      Subject: '{{ template \"email.default.subject\" . }}'\n      To: youraddress@example.org\n    html: '{{template \"email.default.html\" .}}'\n    require_tls: true\ntemplates: []\n"
    },
    "uptime": "2021-05-27T14:02:48.341Z",
    "versionInfo": {
        "branch": "weekly-r134",
        "buildDate": "",
        "buildUser": "",
        "goVersion": "go1.16.3",
        "revision": "60e7248f",
        "version": "r134-60e7248f"
    }
}

therefore I have changed it so that also the grafana backend fork returns the configuration as string.
Therefore, I have implemented GettableStatus.UnmarshalJSON() that unmarshals config.original field from YAML and returns the same struct with grafana backend fork.

This fix requires prometheus/common#303 and/or prometheus/alertmanager#2607

Example requests:

curl http://admin:admin@localhost:3000/api/alertmanager/grafana/api/v2/status
curl http://admin:admin@localhost:3000/api/alertmanager/$alertManagerDatasourceID/api/v2/status

@gotjosh gotjosh requested a review from a team as a code owner April 21, 2021 15:54
@gotjosh gotjosh requested review from ying-jeanne and removed request for a team and ying-jeanne April 21, 2021 15:54
Implements the necessary plumbing to have a /status endpoint on the
notification system.
Copy link
Contributor

@papagian papagian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got the impression that we wanted to modify only RouteGetAlertingConfig and handle the 404 responses and in that case to get the default configuration from the /alertmanager/api/v2/status (given that GET /api/alertmanager/{Recipient}/config/api/v1/alerts for grafana returns the default configuration if it does not exist)
but I'm fine also having a distinct endpoint for that.

@gotjosh gotjosh requested a review from a team as a code owner May 31, 2021 17:30
@gotjosh
Copy link
Contributor Author

gotjosh commented Jun 2, 2021

@domasx2 can you test this branch to make sure it matches your expectations?

@domasx2
Copy link
Contributor

domasx2 commented Jun 2, 2021

@domasx2 can you test this branch to make sure it matches your expectations?

on it

@domasx2
Copy link
Contributor

domasx2 commented Jun 2, 2021

Could we have backend unmarshal config.original into proper JSON, same as it does for the config endpoint?
Now it's an oddly inconsitent, that one endpoint unmarshalls and the other does not. We said we wouldn't have yaml in the API

@gotjosh
Copy link
Contributor Author

gotjosh commented Jun 4, 2021

Looks from here, @domasx2 can have the final call on when to merge this if you have addressed his comments.

@domasx2
Copy link
Contributor

domasx2 commented Jun 7, 2021

Heya, tested! It seems config.receivers config param is MIA, as well as config.route.continue. config.global.http_config has tls_config prop which should not be there. Did not test exhaustively

From this endpoint:

HTTP/1.1 200 OK
Cache-Control: no-cache
Content-Encoding: gzip
Content-Type: application/json
Expires: -1
Pragma: no-cache
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
X-Frame-Options: deny
X-Xss-Protection: 1; mode=block
Date: Mon, 07 Jun 2021 11:23:22 GMT
Content-Length: 459
Connection: close

{
  "cluster": {
    "peers": [],
    "status": "ready"
  },
  "config": {
    "global": {
      "resolve_timeout": "5m",
      "http_config": {
        "proxy_url": null,
        "tls_config": {
          "insecure_skip_verify": false
        },
        "follow_redirects": true
      },
      "smtp_from": "noreply@grafana.net",
      "smtp_hello": "localhost",
      "smtp_smarthost": "smtprelay:2525",
      "pagerduty_url": "https://events.pagerduty.com/v2/enqueue",
      "opsgenie_api_url": "https://api.opsgenie.com/",
      "wechat_api_url": "https://qyapi.weixin.qq.com/cgi-bin/",
      "victorops_api_url": "https://alert.victorops.com/integrations/generic/20131114/alert/"
    },
    "route": {
      "receiver": "email"
    },
    "templates": []
  },
  "uptime": "2021-06-03T21:22:56.279Z",
  "versionInfo": {
    "branch": "weekly-r135",
    "buildDate": "",
    "buildUser": "",
    "goVersion": "go1.16.3",
    "revision": "3a0c4d51",
    "version": "r135-3a0c4d51"
  }
}

config directly from alertmanager:

global:
  resolve_timeout: 5m
  http_config:
    follow_redirects: true
  smtp_from: noreply@grafana.net
  smtp_hello: localhost
  smtp_smarthost: smtprelay:2525
  smtp_require_tls: false
  pagerduty_url: https://events.pagerduty.com/v2/enqueue
  opsgenie_api_url: https://api.opsgenie.com/
  wechat_api_url: https://qyapi.weixin.qq.com/cgi-bin/
  victorops_api_url: https://alert.victorops.com/integrations/generic/20131114/alert/
route:
  receiver: email
  continue: false
receivers:
  - name: email
    email_configs:
      - send_resolved: false
        to: <example@email.com>
        from: noreply@grafana.net
        hello: localhost
        smarthost: smtprelay:2525
        headers:
          From: noreply@grafana.net
          Subject: '{{ template "email.default.subject" . }}'
          To: <example@email.com>
        html: '{{ template "email.default.html" . }}'
        require_tls: false
templates: []

Copy link
Contributor

@domasx2 domasx2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment :)

@papagian
Copy link
Contributor

papagian commented Jun 8, 2021

Heya, tested! It seems config.receivers config param is MIA, as well as config.route.continue.

good catch! I didn't noticed it

@papagian
Copy link
Contributor

papagian commented Jun 9, 2021

Heya, tested! It seems config.receivers config param is MIA, as well as config.route.continue. config.global.http_config has tls_config prop which should not be there. Did not test exhaustively

I have fixed config.receivers.
Regarding config.route.continue, it happens because of the of the omitempty in the JSON tag (missing from the YAML tag).
Regarding config.global.http_config.tls_config I believe it happens because InsecureSkipVerify does not have omitempty while all the other fields do and probably YAML and JSON unmarshalling handle differently this case:
https://github.com/prometheus/common/blob/9e276a1555878cfdaeeea42127f40f2518861145/config/http_config.go#L650-L661
I believe it doesn't harm to leave them in the JSON response
WDYT @domasx2 @gotjosh

@papagian papagian requested a review from domasx2 June 9, 2021 11:01
@domasx2
Copy link
Contributor

domasx2 commented Jun 14, 2021

Heya, tested! It seems config.receivers config param is MIA, as well as config.route.continue. config.global.http_config has tls_config prop which should not be there. Did not test exhaustively

I have fixed config.receivers.
Regarding config.route.continue, it happens because of the of the omitempty in the JSON tag (missing from the YAML tag).
Regarding config.global.http_config.tls_config I believe it happens because InsecureSkipVerify does not have omitempty while all the other fields do and probably YAML and JSON unmarshalling handle differently this case:
https://github.com/prometheus/common/blob/9e276a1555878cfdaeeea42127f40f2518861145/config/http_config.go#L650-L661
I believe it doesn't harm to leave them in the JSON response
WDYT @domasx2 @gotjosh

Dunno, if there's no semantic difference for alertmanager, works for me. Couldn't we just convert yaml to json without going through intermediate structs and all these problems though?

@papagian
Copy link
Contributor

Dunno, if there's no semantic difference for alertmanager, works for me. Couldn't we just convert yaml to json without going through intermediate structs and all these problems though?

I don't think there is an easy way to do so and I believe that such a conversion most probably would also suffer from the same issues (because it would follow the same JSON semantics)
@gotjosh do you think there will be any issue from the alertmanager site (since this is only how default/missing values are handled I don't think so)

Copy link
Contributor

@domasx2 domasx2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tested more, lgtm 👍 preparing a frontend PR for it

@domasx2 domasx2 added this to the 8.1.0 milestone Jun 15, 2021
@domasx2
Copy link
Contributor

domasx2 commented Jun 15, 2021

relevant frontend changes: #35769

@papagian papagian merged commit f7ed353 into main Jun 15, 2021
@papagian papagian deleted the status-api branch June 15, 2021 16:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants