
Feat: consider adopting procfs lib FS.Meminfo() for memory collector #2957

Open · tjhop opened this issue Mar 15, 2024 · 4 comments

Comments

@tjhop

tjhop commented Mar 15, 2024

As of #2952, the node exporter has been bumped to procfs lib v0.13.0, which includes a fix for safer parsing of /proc/meminfo. This means it's possible to move away from the node exporter's current custom meminfo parsing and use the updated library's parsing instead.

Considerations:
The node exporter memory collector's Update() func expects memory info to be returned as a map[string]float64 from the various platform implementations, which means that even if we adopt the library's updated memory info parsing, we would then need to convert the struct into the expected map type. This can be done with a quick json Marshal/Unmarshal dance (playground), if we're willing to pull encoding/json in as a dependency; see the sketch below. I'd really rather avoid manually/explicitly parsing out the struct fields, as that feels fragile and prone to breakage on procfs updates, so ideas are welcome.
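
A rough sketch of that Marshal/Unmarshal conversion, assuming procfs v0.13.0's FS.Meminfo(). The helper name meminfoToMap is hypothetical, and the map keys follow the struct's Go field names (e.g. "MemTotal"), which may differ from the keys the collector uses today. Nil pointer fields in the struct marshal to JSON null, so the sketch decodes through pointers and drops the nils:

```go
package main

import (
	"encoding/json"
	"fmt"

	"github.com/prometheus/procfs"
)

// meminfoToMap converts a procfs.Meminfo struct into the
// map[string]float64 shape the memory collector's Update() expects,
// without naming any struct field explicitly.
func meminfoToMap(mi procfs.Meminfo) (map[string]float64, error) {
	raw, err := json.Marshal(mi)
	if err != nil {
		return nil, err
	}
	// Fields absent from /proc/meminfo are nil pointers in the
	// struct and marshal to JSON null; decode through pointers so
	// they can be skipped instead of becoming spurious zeros.
	var withNulls map[string]*float64
	if err := json.Unmarshal(raw, &withNulls); err != nil {
		return nil, err
	}
	out := make(map[string]float64, len(withNulls))
	for k, v := range withNulls {
		if v != nil {
			out[k] = *v
		}
	}
	return out, nil
}

func main() {
	fs, err := procfs.NewDefaultFS()
	if err != nil {
		panic(err)
	}
	mi, err := fs.Meminfo()
	if err != nil {
		panic(err)
	}
	m, err := meminfoToMap(mi)
	if err != nil {
		panic(err)
	}
	fmt.Printf("MemTotal: %v\n", m["MemTotal"])
}
```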

I'm willing to implement the changes if the concepts here are accepted 👍

@tjhop changed the title from "Feat: considering adopting procfs lib FS.Meminfo() for memory collector" to "Feat: consider adopting procfs lib FS.Meminfo() for memory collector" Mar 18, 2024
@discordianfish
Member

I don't think we should marshal them to a map though; we should finally make the meminfo metrics follow best practices more closely, e.g. using labels for metrics that can be summed up. For that, I'd suggest creating a new meminfo collector and deprecating the old one, then in the next major release enabling the new one by default and disabling the deprecated one.

@SuperQ wdyt?

@rexagod
Contributor

rexagod commented May 28, 2024

I can work on that if @SuperQ is +1, and @tjhop has no plans in the near future to take this up.

@tjhop
Author

tjhop commented May 29, 2024

Thanks @rexagod! I was mostly waiting on a green light to proceed, and I'm still willing to take this on. However, I'd be very happy/grateful if you'd help review the PR once it's pushed, and/or PR against my branch if you want to collaborate more.

Initial thoughts/questions for feedback:

  • do we want a feature flag to toggle the new collector? I would think so

  • prometheus/procfs is clearly pretty *nix oriented -- do we also convert to the proposed new metrics format for darwin/netbsd/openbsd? I would think so, at least for consistency reasons

  • similar question as above for the meminfo numa collector -- should it also get normalized to the new format?

  • I'd suggest creating a new meminfo collector and deprecating the old one -- should the metrics stay in the memory_ subsystem namespace still?

  • E.g. using labels for metrics that can be summed up -- this is a great idea. The metric naming docs provide the following guidance:

    As a rule of thumb, either the sum() or the avg() over all dimensions of a given metric should be meaningful (though not necessarily useful). If it is not meaningful, split the data up into multiple metrics. For example, having the capacity of various queues in one metric is good, while mixing the capacity of a queue with the current number of elements in the queue is not.

    With this in mind, how many metrics/labels do we want to have? Some metrics in the darwin/netbsd/openbsd meminfo collectors are counters; should they remain counters (and thus separate metrics)?

  • There are lots of downstream repos that will likely need to be updated to account for these changes (monitoring mixin rules, etc.), and likely not all of them are under the purview of the Prometheus project itself. How do we best communicate the intended changes?

(sorry for the stream of consciousness, like I said, initial thoughts 🙃 )

@discordianfish
Member

do we want a feature flag to toggle the new collector? I would think so

If it's a new collector, it can be disabled/enabled like any other collector -- so no 'feature flag' specifically

prometheus/procfs is clearly pretty *nix oriented -- do we also convert to the proposed new metrics format for darwin/netbsd/openbsd? I would think so, at least for consistency reasons

If we can support the other OSes with the new collector, cool - if not, we can add support for that later.

similar question as above for the meminfo numa collector -- should it also get normalized to the new format?

If it fits the scope of the new collector, why not.

I'd suggest creating a new meminfo collector and deprecating the old one -- should the metrics stay in the memory_ subsystem namespace still?

Yes, I'd say we make the collectors mutually exclusive so you can use the same metric names where it makes sense.

With this in mind, how many metrics/labels do we want to have? Some metrics in the darwin/netbsd/openbsd meminfo collectors are counters; should they remain counters (and thus separate metrics)?

The general best practices apply, so yeah, we shouldn't mix counters and gauges. Only things where sum() makes sense should be labels in the same metric.
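
A hypothetical sketch of that labeled-gauge shape, using prometheus/client_golang. The metric name node_memory_bytes, the "type" label, and the hard-coded values are illustrative only, not something decided in this thread; the real collector would read values from procfs.Meminfo:

```go
package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

// A single gauge whose "type" label partitions memory, so that
// sum() over the label is meaningful per the naming guidance
// quoted earlier in the thread.
var memBytesDesc = prometheus.NewDesc(
	prometheus.BuildFQName("node", "memory", "bytes"),
	"Memory in bytes, partitioned by usage type.",
	[]string{"type"}, nil,
)

type meminfoCollector struct{}

func (c meminfoCollector) Describe(ch chan<- *prometheus.Desc) {
	ch <- memBytesDesc
}

func (c meminfoCollector) Collect(ch chan<- prometheus.Metric) {
	// Values hard-coded for illustration; a real implementation
	// would pull them from the procfs.Meminfo struct.
	ch <- prometheus.MustNewConstMetric(memBytesDesc, prometheus.GaugeValue, 1<<30, "free")
	ch <- prometheus.MustNewConstMetric(memBytesDesc, prometheus.GaugeValue, 512<<20, "cached")
}

func main() {
	reg := prometheus.NewRegistry()
	reg.MustRegister(meminfoCollector{})
	families, err := reg.Gather()
	if err != nil {
		panic(err)
	}
	for _, mf := range families {
		fmt.Println(mf)
	}
}
```

With this shape, sum(node_memory_bytes) over the type label would be meaningful; which label values actually partition memory cleanly (free, cached, buffers, ...) is the open design question raised above.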

There are lots of downstream repos that will likely need to be updated to account for these changes (monitoring mixin rules, etc.), and likely not all of them are under the purview of the Prometheus project itself. How do we best communicate the intended changes?

That's why I suggest a new collector (and marking the old one deprecated eventually) -- downstream projects can still use the old one, but will get warnings that it is deprecated
