CPU usage seems to be not working on OpenBSD 7.0-CURRENT #1239

Closed
1 of 5 tasks
chrissnell-okta opened this issue Feb 1, 2022 · 4 comments


@chrissnell-okta

Describe the bug
I'm using this library via telegraf, running on OpenBSD. Telegraf is unable to get CPU usage stats from my OpenBSD system. Telegraf calls the library here and the library produces this error here

Digging further, I ran ktrace(1). The library issues sysctl calls for kern.cp_time2 at CPU indices 0, 2, and 4, and the call for index 4 fails with errno 2 (No such file or directory):

 28984 telegraf CALL  kevent(3,0,0,0x2ba7277f0,64,0x2ba7277c8)
 28984 telegraf STRU  struct timespec { 0 }
 28984 telegraf STRU  struct kevent [4] { ident=10, filter=EVFILT_WRITE, flags=0x8021<EV_ADD|EV_CLEAR|EV_EOF>, fflags=0<>, data=0, udata=0x2d53529c0 } { ident=20, filter=EVFILT_WRITE, flags=0x21<EV_ADD|EV_CLEAR>, fflags=0<>, data=16384, udata=0x2d53531e8 } { ident=9, filter=EVFILT_WRITE, flags=0x21<EV_ADD|EV_CLEAR>, fflags=0<>, data=16384, udata=0x2d5352b90 } { ident=10, filter=EVFILT_READ, flags=0x8021<EV_ADD|EV_CLEAR|EV_EOF>, fflags=0<>, data=1456, udata=0x2d53529c0 }
 28984 telegraf RET   kevent 4
 28984 telegraf CALL  sysctl(6.24<hw.smt>,0,0xc00066c6e0,0,0)
 28984 telegraf RET   sysctl 0
 28984 telegraf CALL  sysctl(6.24<hw.smt>,0xc000fc2cd8,0xc00066c6e0,0,0)
 28984 telegraf RET   sysctl 0
 28984 telegraf CALL  sysctl(1.71.0<kern.cp_time2.0>,0,0xc00066c760,0,0)
 28984 telegraf RET   sysctl 0
 28984 telegraf CALL  sysctl(1.71.0<kern.cp_time2.0>,0xc000674180,0xc00066c760,0,0)
 28984 telegraf RET   sysctl 0
 28984 telegraf CALL  sysctl(1.71.2<kern.cp_time2.2>,0,0xc000599760,0,0)
 28984 telegraf RET   sysctl 0
 28984 telegraf CALL  sysctl(1.71.2<kern.cp_time2.2>,0xc0006741b0,0xc000599760,0,0)
 28984 telegraf RET   sysctl 0
 28984 telegraf CALL  sysctl(1.71.4<kern.cp_time2.4>,0,0xc000599760,0,0)
 28984 telegraf RET   sysctl -1 errno 2 No such file or directory
 28984 telegraf CALL  write(2,0xc000588000,0x68)
 28984 telegraf GIO   fd 2 wrote 104 bytes
       "2022-02-01T03:33:00Z E! [inputs.cpu] Error in plugin: error getting CPU info: no such file or directory
       "

I wish I knew more about OpenBSD syscalls to help with this one.
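To make the failing call easier to spot in a longer trace, here is a minimal sketch that scans ktrace text for kern.cp_time2 sysctl calls and reports which CPU index was followed by a failing return. The helper name failedIndices and the constant sampleTrace are hypothetical; the sample lines are copied from the trace above, and the parsing assumes the ktrace line layout shown there.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// sampleTrace holds a few lines copied from the ktrace output in this issue.
const sampleTrace = ` 28984 telegraf CALL  sysctl(1.71.0<kern.cp_time2.0>,0,0xc00066c760,0,0)
 28984 telegraf RET   sysctl 0
 28984 telegraf CALL  sysctl(1.71.2<kern.cp_time2.2>,0,0xc000599760,0,0)
 28984 telegraf RET   sysctl 0
 28984 telegraf CALL  sysctl(1.71.4<kern.cp_time2.4>,0,0xc000599760,0,0)
 28984 telegraf RET   sysctl -1 errno 2 No such file or directory`

// callRe captures the CPU index from the decoded MIB name, e.g. <kern.cp_time2.4>.
var callRe = regexp.MustCompile(`sysctl\([\d.]+<kern\.cp_time2\.(\d+)>`)

// retFailRe matches a sysctl return line that reports failure (-1).
var retFailRe = regexp.MustCompile(`RET\s+sysctl\s+-1\b`)

// failedIndices (hypothetical helper) returns the CPU indices whose
// kern.cp_time2 call was immediately followed by a failing RET line.
func failedIndices(trace string) []int {
	var failed []int
	last := -1 // index of the most recent kern.cp_time2 call, or -1
	for _, line := range strings.Split(trace, "\n") {
		switch {
		case strings.Contains(line, " CALL "):
			last = -1
			if m := callRe.FindStringSubmatch(line); m != nil {
				last, _ = strconv.Atoi(m[1])
			}
		case last >= 0 && retFailRe.MatchString(line):
			failed = append(failed, last)
		}
	}
	return failed
}

func main() {
	fmt.Println(failedIndices(sampleTrace)) // prints [4]
}
```

On the lines above, only index 4 is reported, which matches the ENOENT seen in the trace.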

To Reproduce
On an OpenBSD system:
% doas pkg_add telegraf
Then, edit /etc/telegraf/telegraf.conf and enable the CPU input plugin:

[[inputs.cpu]]
  ## Whether to report per-cpu stats or not
  percpu = true
  ## Whether to report total system cpu stats or not
  totalcpu = false
  ## If true, collect raw CPU time metrics
  collect_cpu_time = true
  ## If true, compute and report the sum of all non-idle CPU states
  report_active = true

Then start telegraf:
% rcctl start telegraf

and watch /var/log/daemon

Expected behavior
Accurate CPU stats gathered.

Environment (please complete the following information):

  • Windows: [paste the result of ver]
  • Linux: [paste contents of /etc/os-release and the result of uname -a]
  • Mac OS: [paste the result of sw_vers and uname -a]
  • FreeBSD: [paste the result of freebsd-version -k -r -u and uname -a]
  • OpenBSD: OpenBSD bermuda.island.nu 7.0 GENERIC.MP#293 amd64

Additional context

This machine is a 4-core Intel system.

[Cross-compiling? Paste the command you are using to cross-compile and the result of the corresponding go env]

@chrissnell-okta
Author

CC: @tklauser since you worked on some of this code (relatively) recently

@chrissnell-okta chrissnell-okta changed the title CPU usage seems to be not working on OpenBSD CPU usage seems to be not working on OpenBSD 7.0-CURRENT Feb 1, 2022
@chrissnell

I don't think the cpu module works at all on OpenBSD current. Here's a simple piece of sample code that will produce the error when run on this OS:


package main

import (
	"log"

	"github.com/shirou/gopsutil/v3/cpu"
)

func main() {
	perCPUTimes, err := cpu.Times(true)
	if err != nil {
		log.Fatalln("error getting cpu times:", err)
	}

	for k, v := range perCPUTimes {
		log.Printf("cpu %v: %+v", k, v)
	}

}
2022/02/03 21:57:38 error getting cpu times: no such file or directory

I will keep digging.

@chrissnell

The problem is in the SMT detection. Either the smt() function incorrectly returns false for my CPU or the multiplication of the CPU counter by 2 here is wrong. Multiplying the counter by 2 causes the syscall to attempt to read a CPU that does not exist. If I simply remove the j *= 2, I am able to fetch CPU times for all four cores.

My money is on the SMT detection algorithm. I'll dig in further.
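To illustrate why the doubling breaks on this machine, here is a small simulation. It is not the library's code: cpuIndex, lookup, and firstFailure are hypothetical helpers, and the simulation assumes the kernel exposes kern.cp_time2 only for indices 0 through 3 on this 4-core system.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoSuchCPU stands in for the ENOENT that sysctl returned in the trace.
var errNoSuchCPU = errors.New("no such file or directory")

// cpuIndex (hypothetical) mirrors the mapping described above: when SMT is
// reported as disabled, the loop counter is multiplied by 2.
func cpuIndex(i int, smtEnabled bool) int {
	if !smtEnabled {
		return i * 2
	}
	return i
}

// lookup (hypothetical) simulates querying kern.cp_time2.<idx> on a system
// that only exposes indices 0..present-1. This contiguous layout is an assumption.
func lookup(idx, present int) error {
	if idx >= present {
		return errNoSuchCPU
	}
	return nil
}

// firstFailure walks ncpu loop iterations and returns the first index whose
// lookup fails, or -1 if every lookup succeeds.
func firstFailure(ncpu, present int, smtEnabled bool) int {
	for i := 0; i < ncpu; i++ {
		idx := cpuIndex(i, smtEnabled)
		if err := lookup(idx, present); err != nil {
			return idx
		}
	}
	return -1
}

func main() {
	fmt.Println(firstFailure(4, 4, false)) // with doubling: 0 and 2 succeed, 4 fails
	fmt.Println(firstFailure(4, 4, true))  // without doubling: prints -1
}
```

With doubling, the sequence of indices is 0, 2, 4, the same sequence seen in the ktrace output, and index 4 is the first one that does not exist.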

@Lomanic
Collaborator

Lomanic commented Apr 21, 2022

With #1244 merged, can we close this?
