Process createTime calculation for linux lxc guest is incorrect #1562

Open
drubinMeta opened this issue Dec 13, 2023 · 4 comments

Comments

@drubinMeta

Describe the bug
createTime for processes running in an lxc container is calculated from two inconsistent sources: the container boot time, derived from uptime (gopsutil/internal/common/common_linux.go:78),
and field 22 of /proc/[pid]/stat, the process start time, which is measured relative to host boot time (gopsutil/process/process_linux.go:1074).
As a result, when the host boot time differs considerably from the lxc boot time, the createTime of lxc processes is in the future.

To Reproduce

On a long-running host, start an lxc container and query the createTime of its processes.

Expected behavior
Possible fix: when VirtualizationSystem is lxc and VirtualizationRole is guest, use the host boot time from /proc/stat in conjunction with
/proc/[pid]/stat field 22 (gopsutil/process/process_linux.go:1074).

Environment:
Ubuntu 20.04.6 LTS
Linux 5.15.30-051530-generic
Lxc 4.0.9

drubinMeta added a commit to nsofnetworks/opentelemetry-collector-contrib that referenced this issue Dec 13, 2023
…cess createTime bogus value

The process createTime is collected using the gopsutil/process module and is used as
the startTime of the metric sequence.
When running inside lxc, the function that calculates the create time mixes host
and guest stats.
See shirou/gopsutil#1562 for more details.
This leads to two issues:
1. The calculated createTime is in the future, eventually causing the scraper to
skip process collection.
2. The dp series startTime is set to a future time value; the consequences
   are unknown.
To overcome this issue, we will set createTime to the system boot time
(in the case of lxc, the container boot time).
drubinMeta added a commit to nsofnetworks/opentelemetry-collector-contrib that referenced this issue Dec 14, 2023
…cess createTime bogus value

drubinMeta added a commit to nsofnetworks/opentelemetry-collector-contrib that referenced this issue Dec 15, 2023
…cess createTime bogus value

@shirou
Owner

shirou commented Dec 17, 2023

Sorry, I don't have time to dig deep into this over the weekend, but is this problem related to this issue? If so, that one only happened on LXC 2.0.7; does it still exist on 4.0.9?

@drubinMeta
Author

drubinMeta commented Dec 17, 2023

The issue is different: both the btime in /proc/stat and /proc/[pid]/stat represent host time,
but the process create time in process_linux.go is calculated from the container boot time.
This is what actually causes the mix-up between the virtualized boot time and the non-virtualized start time.

drubinMeta added a commit to nsofnetworks/opentelemetry-collector-contrib that referenced this issue Dec 17, 2023
…cess createTime bogus value

@shirou
Owner

shirou commented Dec 24, 2023

Let me confirm my understanding. Please point out if I'm wrong.

  • host: uses the btime field in /proc/stat. It shows the host boot time.
  • lxc guest: uses /proc/uptime. It also shows the host boot time.

The LXC guest was changed to read /proc/uptime in #390. The current behavior works for host.BootTime(). However, it goes wrong when the boot time is used to calculate Process.CreateTime().

What we want as Process.CreateTime() is the ctime in /proc/[pid]/stat minus the "BootTime".

Here, "BootTime" is clearly the host BootTime in the host case. In the case of an lxc container guest, however, what we want is not the host BootTime (== /proc/uptime) but the LXC container BootTime (== the btime in /proc/stat).

If my understanding is correct, then I would expect the same thing to happen with Docker containers.

@drubinMeta
Author

Not quite:
the lxc guest uses /proc/uptime, but that shows the container uptime.
However, /proc/[pid]/stat field 22 is based on the host boot time.
So for the calculation of create time in lxc, the btime field of /proc/stat should be used.
