Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor node-mixin as node-observ-lib #2861

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

v-zhuravlev
Copy link
Contributor

NOTE: I put this observ-lib(mixin) into separate folder due to some incompatibilities with old mixin like:

  • USE dashboards (node-rsrc-use.json, node-cluster-rsrc-use.json,node-multicluster-rsrc-use.json)

Similar to windows-observ-lib, node-observ-lib is created, using many panels from commonlib:. It also refactors everything from node-mixin by using grafonnet (v10 schema).

Node observability lib would allow:

  • Easily read and maintain dashboards as code
  • Easily modify any dashboard, panel, target, variable... before rendering final result
  • Change instanceLabels, groupLabels from default job,instance to use dashboards in environments with extra or custom labels
  • Instantiate node dashboards more than once in a single environment (Uid and filteringSelector are used to avoid conflicts).

To learn more about suggested mixin packaging format (observ-lib) you can check dataless example here.

What is recreated/moved from old node-mixin:

  • Dashboards except USE.
  • All alerts
  • All recording rules

Added/Updated:

  • Dashboards:
    • Fleet overview
    • Overview dashboard (added inventory row)
    • Drill down dashboards:
      • Memory
      • CPU and system
      • Network (Interfaces & Sockstat / Netstat)
      • Disks and filesystem
      • Logs (only added if enableLokiLogs:true)
    • MacOS overview (added inventory row)
    • MacOS logs (only added if enableLokiLogs:true)
  • Prometheus annotations (for all dashboards):
    • Reboot
    • OOM kill detected
    • Kernel update
  • New loki annotations (only added if enableLokiLogs:true)
    • Service failed
    • Critical system event
    • Session opened
    • Session closed
  • Alerts:
    • filesystem alerts are moved into separate alerts group to avoid hitting limits for number of alerts allowed in group (in Grafana Mimir)
  • Title case is used across all recreated panels and dashboard names

image
image
image

* Add node-observ-lib

* Remove trends support (not in 10.0 schema)

* Make filteringSelector for logs dashboard configurable

* Temp change dependency (until PR is merged for commonlib)

* Refactor config

* Update jsonnetfile.json

* Update README

* Add separate loki example

* Add sep file example

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
@gaetanars
Copy link

Just a little question : why don't you render dashboards with .json suffix?

@v-zhuravlev
Copy link
Contributor Author

Just a little question : why don't you render dashboards with .json suffix?

Nothing specific, It was a little bit more convenient to reference dashboards by key (dashboards.overview, dashboards.network etc..)..

@SuperQ
Copy link
Member

SuperQ commented Feb 3, 2024

If this is intended to replace the node-mixin, should we remove the old mixin?

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
@v-zhuravlev v-zhuravlev force-pushed the for-upstream-node-observ-lib branch from 150c28c to c75a766 Compare May 4, 2024 18:58
@v-zhuravlev
Copy link
Contributor Author

Just a little question : why don't you render dashboards with .json suffix?
@gaetanars, I think you are right, it is expected for end files to have .json suffix. Brought it back.

If this is intended to replace the node-mixin, should we remove the old mixin?
@SuperQ , we could, but we need to either to reimplement use dashboards as well, or agree that they could be dropped.

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants