kepler_node_core_joules_total=0 on RHEL9/x86_64 #1346

jharriga · 2024-04-11T17:55:08Z

What happened?

Downloaded and installed

https://github.com/sustainable-computing-io/kepler/releases/download/v0.7.9/kepler.rpm.tar.gz

On server running

5.14.0-417.kpq1.el9.x86_64
Red Hat Enterprise Linux 9.4 (Plow)

Ran several CPU-intensive workloads, and the metric remained at 0.

What did you expect to happen?

Expected the metric reading to increase and track system CPU usage.

How can we reproduce it (as minimally and precisely as possible)?

Download & install rpm
start service
root# systemctl start container-kepler --now
root# curl localhost:8888/metrics | grep
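
A single scrape only shows the current counter value. To confirm that the counter stays flat under load, it helps to scrape twice and compare. The following is a minimal sketch, assuming the endpoint from the steps above serves Prometheus text format; `metric_total` and `metric_delta` are hypothetical helper names, not part of Kepler.

```shell
# Sum every sample of one metric from Prometheus text-format input on stdin.
# Hypothetical helper; assumes the sample value is the last field on the line
# (i.e. no trailing timestamp).
metric_total() {
  local name="$1"
  awk -v name="$name" '
    # Match "name value" or "name{labels} value"; "# HELP"/"# TYPE" lines are skipped
    # because their first field is "#".
    $1 == name || index($1, name "{") == 1 { sum += $NF }
    END { printf "%.3f\n", sum }
  '
}

# Scrape twice, INTERVAL seconds apart, and print how much the metric grew.
# The default URL is the one used in this report.
metric_delta() {
  local name="$1" interval="${2:-30}" url="${3:-localhost:8888/metrics}"
  local before after
  before=$(curl -s "$url" | metric_total "$name")
  sleep "$interval"
  after=$(curl -s "$url" | metric_total "$name")
  awk -v a="$after" -v b="$before" 'BEGIN { printf "%.3f\n", a - b }'
}
```

Running `metric_delta kepler_node_core_joules_total 60` while a CPU-bound workload is active should print a positive number if the counter is advancing, and `0.000` if it is stuck.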

Anything else we need to know?

No response

Kepler image tag

v0.7.9

Kubernetes version

NONE

Cloud provider or bare metal

bare metal

OS version

<details>
# On Linux:
$ cat /etc/os-release
Red Hat Enterprise Linux 9.4 (Plow)

$ uname -a
Linux perf-intel-28.perf.eng.bos2.dc.redhat.com 5.14.0-417.kpq1.el9.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Feb 2 14:05:04 EST 2024 x86_64 x86_64 x86_64 GNU/Linux
</details>


### Install tools

<details>
# rpm --version
RPM version 4.16.1.3
</details>


### Kepler deployment config

<details>
For standalone:
# put your Kepler command argument here
root# systemctl start container-kepler --now
root# curl localhost:8888/metrics | grep
</details>


### Container runtime (CRI) and version (if applicable)

<details>

</details>


### Related plugins (CNI, CSI, ...) and versions (if applicable)

<details>

</details>

rootfs · 2024-05-22T20:10:14Z

@jharriga can you double check if it is kepler_node_core_joules_total or kepler_node_package_joules_total?

The current Ampere xgene hwmon driver only reports CPU and I/O power (per doc here), so we cannot get DRAM power. To align with the RAPL reporting, Kepler only reports kepler_node_core_joules_total (per code here).
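
For the x86 side of this comparison, one way to see which RAPL domains a host exposes is to list the zones under the kernel's powercap sysfs tree. This is a sketch that assumes the standard `/sys/class/powercap` layout; `list_rapl_zones` is a hypothetical helper, and whether Kepler reads these same files on a given host is not confirmed in this thread.

```shell
# Print "zone name energy_uj" for each intel-rapl zone under a powercap root.
# Hypothetical helper; the root is a parameter so it can be pointed at a copy.
list_rapl_zones() {
  local root="${1:-/sys/class/powercap}"
  local zone
  for zone in "$root"/intel-rapl:*; do
    # Skip the unexpanded glob (no zones) and anything without a name file.
    [ -r "$zone/name" ] || continue
    # energy_uj is often readable only by root; report that instead of failing.
    printf '%s %s %s\n' "$(basename "$zone")" "$(cat "$zone/name")" \
      "$(cat "$zone/energy_uj" 2>/dev/null || echo unreadable)"
  done
}
```

If no zone named `core` appears in the output, the host does not expose a separate core domain through this interface.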

jharriga · 2024-06-03T18:19:45Z

This was originally reported on x86. Running v0.7.10 on x86, I do see that the metric kepler_node_core_joules_total has a value:
root# curl localhost:8888/metrics | grep kepler_node_core_joules_total

kepler_node_core_joules_total{instance="nuc7",mode="dynamic",package="0",source="intel_rapl"} 39.07
kepler_node_core_joules_total{instance="nuc7",mode="idle",package="0",source="intel_rapl"} 61360.029

As for ARM, on an Ampere server running v0.7.10 I see:

kepler_node_core_joules_total{instance="perf-arm-11.perf.eng.bos2.dc.redhat.com",mode="dynamic",package="0",source="intel_rapl"} 98036.551
kepler_node_package_joules_total{instance="perf-arm-11.perf.eng.bos2.dc.redhat.com",mode="dynamic",package="0",source="intel_rapl"} 98057.08

Both the kepler_node_core_joules_total and kepler_node_package_joules_total metrics have values.
This doesn't seem to align with what you described in the previous comment.
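
To put the core and package readings side by side on any host, the node-level joule counters can be reduced to one line per metric and mode. A minimal sketch, assuming the label layout shown in the samples above; `joules_by_mode` is a hypothetical helper name.

```shell
# Read Prometheus text format on stdin and print "metric mode value" for each
# kepler_node_*_joules_total sample. Hypothetical helper; assumes the value is
# the last field on the line.
joules_by_mode() {
  awk '
    /^kepler_node_[a-z]+_joules_total[{]/ {
      metric = $1
      sub(/[{].*/, "", metric)          # drop the label set from the name
      mode = "-"                        # placeholder when no mode label exists
      if (match($0, /mode="[^"]*"/))
        mode = substr($0, RSTART + 6, RLENGTH - 7)
      print metric, mode, $NF
    }
  '
}
```

For example, `curl -s localhost:8888/metrics | joules_by_mode` gives a compact view that is easier to compare across the x86 and ARM hosts than the raw output.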

At any rate I think this Issue can be CLOSED since the originally reported problem on x86 appears to have been resolved.

jharriga added the kind/bug report bug issue label Apr 11, 2024

rootfs self-assigned this Apr 12, 2024

rootfs added the non-k8s environment Kepler running on Linux without Kubernetes label Apr 18, 2024
