Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CC-26984] metric-export: add resource for prometheus integration #200

Merged
merged 2 commits into from May 17, 2024

Conversation

aa-joshi
Copy link
Contributor

@aa-joshi aa-joshi commented Apr 12, 2024

This change introduces resource MetricExportPrometheusConfig which manages prometheus metric export integration in gcp and aws cloud providers.

Commit checklist

  • Changelog
  • Doc gen (make generate)
  • Integration test(s)
  • Acceptance test(s)
  • Example(s)

@fantapop
Copy link
Collaborator

@aa-joshi @marksoper considering that there is no actual configuration for enabling prometheus what you do think of having it as an additional field on the cluster resource? It could be something like:

resource "cockroach_cluster" "dedicated" {
  name           = "cockroach-dedicated"
  cloud_provider = "GCP"
  prometheus_metric_export_enabled = true
  dedicated = {
  ....
  }

If we were planning to add configuration options to this at some point in the near future, it seems like keeping it as its own resource would be desirable but even in that case we could just have that resource just represent the configuration and keep the enabled / disabled flag on cluster.

For example,

resource "cockroach_cluster" "dedicated" {
  name           = "cockroach-dedicated"
  cloud_provider = "GCP"
  prometheus_metric_export_enabled = true
  dedicated = {
  ....
  }
resource "cockroach_cluster_prometheus_export_config" "dedicated" {
  cluster_id = cockroach_cluster.dedicated.id
  some_config_option = 1234
}

@fantapop
Copy link
Collaborator

I guess another consideration is whether we'd expose some of the outputs as part of the resource. I see that the api response has the following data:

PrometheusMetricExportInfo
{
    cluster_id*: string
    status: enum  // Allowed: NOT_DEPLOYED┃DISABLING┃ENABLING┃ENABLED┃ERROR
    targets: {   // targets is a map of ports exposing metrics to regions.
        [any-key]: string
    }
    user_message: string
}

@aa-joshi
Copy link
Contributor Author

@aa-joshi @marksoper considering that there is no actual configuration for enabling prometheus what you do think of having it as an additional field on the cluster resource? It could be something like:

resource "cockroach_cluster" "dedicated" {
  name           = "cockroach-dedicated"
  cloud_provider = "GCP"
  prometheus_metric_export_enabled = true
  dedicated = {
  ....
  }

If we were planning to add configuration options to this at some point in the near future, it seems like keeping it as its own resource would be desirable but even in that case we could just have that resource just represent the configuration and keep the enabled / disabled flag on cluster.

For example,

resource "cockroach_cluster" "dedicated" {
  name           = "cockroach-dedicated"
  cloud_provider = "GCP"
  prometheus_metric_export_enabled = true
  dedicated = {
  ....
  }
resource "cockroach_cluster_prometheus_export_config" "dedicated" {
  cluster_id = cockroach_cluster.dedicated.id
  some_config_option = 1234
}

We have existing metric export integration which has their own resources:

Hence we would like to keep it consistent with other metric export integrations. I will create a seperate backlog item to track integrations against the cluster resource.

@aa-joshi
Copy link
Contributor Author

I guess another consideration is whether we'd expose some of the outputs as part of the resource. I see that the api response has the following data:

PrometheusMetricExportInfo
{
    cluster_id*: string
    status: enum  // Allowed: NOT_DEPLOYED┃DISABLING┃ENABLING┃ENABLED┃ERROR
    targets: {   // targets is a map of ports exposing metrics to regions.
        [any-key]: string
    }
    user_message: string
}

Thanks for suggestion! I have made required changes to keep track of targets.

@fantapop
Copy link
Collaborator

Hence we would like to keep it consistent with other metric export integrations. I will create a seperate backlog item to track integrations against the cluster resource.

For future consideration, I think should always re-evaluate the best way to expose the api for any new resources. Existing resource patterns is of course valuable input into this decision but not the only consideration. In this case, since we want to expose the additional outputs, I agree that it should be a new resource.

Copy link
Collaborator

@fantapop fantapop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thanks for taking this on. I left some feedback for you.

@arjunmahishi
Copy link
Contributor

Went over the changes in a pair programming session (learnt a lot about terraform lifecycle today!). This LGTM.
Will wait for @fantapop's ✅

Copy link
Collaborator

@fantapop fantapop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a looking a lot cleaner. Thanks for your work here. I added some followup comments and some that I missed earlier.

@arjunmahishi arjunmahishi force-pushed the CC-26984 branch 2 times, most recently from 48fae71 to 28cde10 Compare May 3, 2024 04:48
@arjunmahishi
Copy link
Contributor

@fantapop Can you please give it one final look?
Thanks!

@arjunmahishi
Copy link
Contributor

@fantapop we have made the requested changes

CHANGELOG.md Outdated Show resolved Hide resolved
vendor/modules.txt Outdated Show resolved Hide resolved
Copy link
Collaborator

@fantapop fantapop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some additional small nits to the review. Please take a look at these before merging. Also, please rebase off the latest main branch. Thanks for everyone's work on this!

This change introduces resource MetricExportPrometheusConfig which manages
prometheus metric export integration in gcp and aws cloud providers.
* declare a seperate timeout for metric export enable/disable/status
  refresh
* Remove debug logs
* Rename retryFunc -> deletePromMetricExport
@aa-joshi aa-joshi merged commit bd06581 into cockroachdb:main May 17, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants