Additional podman stats metrics #9258

Closed
srcshelton opened this issue Feb 7, 2021 · 32 comments · Fixed by #10696
Labels
Good First Issue This issue would be a good issue for a first time contributor to undertake. In Progress This issue is actively being worked by the assignee, please do not work on this at this time. kind/feature Categorizes issue or PR as related to a new feature. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@srcshelton
Contributor

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind feature

Description

It would be really helpful if additional column(s) could be added to those available when specifying custom podman stats output to indicate:

  • maximum/high-water-mark memory usage;
  • average CPU usage (since start or in a given period);
  • total CPU time;
  • average I/O reads/second;
  • average I/O writes/second;
  • etc.

... of each container.

@openshift-ci-robot openshift-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Feb 7, 2021
@rhatdan
Member

rhatdan commented Feb 8, 2021

Well PRs welcome.

@rhatdan rhatdan added the Good First Issue This issue would be a good issue for a first time contributor to undertake. label Feb 8, 2021
@mpolden

mpolden commented Feb 10, 2021

Some of these metrics can be retrieved through runc: runc events --stats <complete-container-id>.
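
As a rough illustration of how that output might be consumed, here is a minimal Python sketch that pulls total CPU time and memory figures out of one event. The sample event is hypothetical, and the JSON field layout (`data.cpu.usage.total` in nanoseconds, `data.memory.usage.usage` and `data.memory.usage.max` in bytes) is an assumption that should be checked against the runc version in use.

```python
import json

# Hypothetical sample shaped like one event from `runc events --stats <id>`.
# The field layout is an assumption; verify it against your runc version.
SAMPLE_EVENT = """
{"type": "stats", "id": "abc123",
 "data": {"cpu": {"usage": {"total": 2500000000}},
          "memory": {"usage": {"usage": 104857600, "max": 209715200}}}}
"""


def summarize_event(raw: str) -> dict:
    """Extract CPU time (seconds) and memory figures (MiB) from one event."""
    event = json.loads(raw)
    data = event.get("data", {})
    # Assumed to be cumulative CPU time in nanoseconds.
    cpu_ns = data.get("cpu", {}).get("usage", {}).get("total", 0)
    mem = data.get("memory", {}).get("usage", {})
    return {
        "cpu_seconds": cpu_ns / 1e9,
        "mem_mib": mem.get("usage", 0) / 2**20,
        # "max" is assumed to be the high-water mark, when the kernel reports it.
        "mem_peak_mib": mem.get("max", 0) / 2**20,
    }


print(summarize_event(SAMPLE_EVENT))
```

In practice the raw string would come from the command's standard output rather than a constant.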

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@anishasthana

I'd like to work on this.

@TomSweeneyRedHat
Member

@anishasthana by all means and TYVM!

@TomSweeneyRedHat TomSweeneyRedHat added the In Progress This issue is actively being worked by the assignee, please do not work on this at this time. label May 10, 2021
@srcshelton
Contributor Author

(Also, could someone fix the github-actions bot so that the stale-issue label is removed when someone replies to an issue?!)

@rhatdan
Member

rhatdan commented May 16, 2021

Please open a PR to make the change.

@srcshelton
Contributor Author

srcshelton commented May 16, 2021

Please open a PR to make the change.

Possibly related to actions/stale#441?

(Although actions/stale#346 sounds useful to have)

I just checked, and we do have remove-stale-when-updated set to true (and this should be supported by the included version), so it looks as if something's not working, rather than anything having been omitted.

I can push a PR to bump the release of stale used from v1 to v3, in case anything's changed in the meantime (but this is more of a "track recent updates" than "known to fix")?

@rhatdan
Member

rhatdan commented May 17, 2021

SGTM

@rhatdan
Member

rhatdan commented May 17, 2021

I would tell you that the stale notification is the squeaky wheel that gets me and others to look at old issues. We don't close them automatically; we just attempt to see if they are still active or have been fixed in the last 30 days.

@cdoern
Collaborator

cdoern commented Jun 8, 2021

If there is no progress on this one, can I take a look at it? It might be a good foray into the libpod side of things.

@rhatdan
Member

rhatdan commented Jun 9, 2021

@cdoern Take it.

cdoern pushed a commit to cdoern/podman that referenced this issue Jun 16, 2021
added Avg Cpu calculation and CPU up time to podman stats. Adding different feature sets in different PRs, CPU first.

resolves containers#9258

Signed-off-by: cdoern <cbdoer23@g.holycross.edu>
@srcshelton
Contributor Author

I've commented on the (merged) change above:

#10696 (comment)

In short, it appears that the current change only averages CPU usage (I've not checked time) over the period the single stats invocation is running, so when run with --no-stream the results aren't usable.

I feel that the counters for these stats should be associated with each running container, allowing accurate figures to be retrieved at any point whilst they still run.
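
To illustrate the concern, here is a minimal Python sketch of an average that lives only inside the process doing the sampling. It is not the code from the merged change; the `RunningAverage` class and the sample values are hypothetical. With a single sample, as with --no-stream, the "average" is just that one instantaneous reading.

```python
class RunningAverage:
    """Cumulative mean over the samples seen by one process."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def add(self, sample: float) -> float:
        # Incremental mean update: no need to keep every sample around.
        self.count += 1
        self.mean += (sample - self.mean) / self.count
        return self.mean


# A streaming invocation sees several samples and averages them.
stream = RunningAverage()
for cpu_percent in (10.0, 50.0, 90.0):
    stream.add(cpu_percent)

# A one-shot invocation sees only the latest sample, so its "average"
# says nothing about the container's earlier behaviour.
one_shot = RunningAverage()
one_shot.add(90.0)

print(stream.mean, one_shot.mean)
```

Because the accumulator's state disappears when the process exits, every new invocation starts again from zero samples.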

@cdoern
Collaborator

cdoern commented Jun 29, 2021

@srcshelton I took a look earlier. You are right that --no-stream doesn't seem to work with these; I will look into a fix. Time seems to work as intended, since the timers don't reset on every stats call.

@mheon
Member

mheon commented Jun 29, 2021

I don't know if we can store metrics persistently. How would we capture them, considering Podman doesn't have a daemon?

@cdoern
Collaborator

cdoern commented Jun 29, 2021

@mheon I was just thinking the same thing. All of our other stats are captured "in the moment". The avg CPU usage (besides time) is the only cumulative stat we currently have. Time only works because it is a cgroupStat.

@srcshelton
Contributor Author

Is there no facility for the long-running podman run, conmon, or runc/crun processes to be interrogated by a one-shot podman stats invocation to retrieve enough data to provide this form of statistic?

If not, could podman gain an additional mode (podman monitor?) which would attach to a running container (or all running containers?), gather this data in a lightweight fashion, and then provide it to a separate podman stats call on demand: an optional mini-daemon, effectively?

@rhatdan
Member

rhatdan commented Jun 30, 2021

runc/crun run once and then die.
Rootless Podman can launch other services like fuse-overlayfs, slirp4netns and conmon.
Rootful usually just does conmon.

rugk pushed a commit to rugk/podman that referenced this issue Jul 9, 2021
added Avg Cpu calculation and CPU up time to podman stats. Adding different feature sets in different PRs, CPU first.

resolves containers#9258

Signed-off-by: cdoern <cbdoer23@g.holycross.edu>
@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Aug 1, 2021

@srcshelton @cdoern @mheon What should we do with this issue?

@cdoern
Collaborator

cdoern commented Aug 2, 2021

@rhatdan thanks for the poke. One idea I had was to implement the other stats as I did with avg CPU %: they reset each time you call stats but persist as long as the stats stream is running. Not sure if that is what @srcshelton would like, but I don't think the way we store stats is conducive to any other solution.

@rhatdan
Member

rhatdan commented Aug 2, 2021

Works for me.

@srcshelton
Contributor Author

srcshelton commented Aug 3, 2021

@rhatdan thanks for the poke. One idea I had was to implement the other stats as I did with avg CPU %: they reset each time you call stats but persist as long as the stats stream is running. Not sure if that is what @srcshelton would like, but I don't think the way we store stats is conducive to any other solution.

Hypothetically with this solution, would it be possible to have a lightweight daemon service keeping the stream (many streams for numerous containers?) alive, and then persist and offer-up these stats for the lifetime of the process?

The ideal would be to somehow and at any point find out the average CPU usage of a container over its entire lifetime to date - but if another collector process is needed to provide this to one-shot tools wishing to interrogate this data, that doesn't sound so bad?

@rhatdan
Member

rhatdan commented Aug 3, 2021

I don't see this as a requirement for Podman. There is some information available from cgroups for this, I believe. I am thinking cgroup accounting. @nalind @giuseppe Any ideas?

@giuseppe
Member

giuseppe commented Aug 4, 2021

The ideal would be to somehow and at any point find out the average CPU usage of a container over its entire lifetime to date

Would that just require the CPU usage and how long the container has been running?

We can grab the first piece of information from cgroups and the second from the OCI runtime (or by reading starttime directly from /proc/$CONTAINER_PID/stat).
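
A minimal Python sketch of that idea follows, under several assumptions: a cgroup v2 host whose cpu.stat file exposes `usage_usec`, a container whose cgroup was created at about the same time as its main process, and a tick rate taken from `os.sysconf("SC_CLK_TCK")` (defaulted to 100 here). The function names are hypothetical, and the parsers work on file contents passed in as text so they can be tried without a running container.

```python
def parse_usage_usec(cpu_stat_text: str) -> int:
    """Return usage_usec from the text of a cgroup v2 cpu.stat file."""
    for line in cpu_stat_text.splitlines():
        key, _, value = line.partition(" ")
        if key == "usage_usec":
            return int(value)
    raise ValueError("usage_usec not found")


def parse_start_ticks(proc_stat_text: str) -> int:
    """Return starttime (field 22 of /proc/<pid>/stat), in clock ticks since boot."""
    # comm (field 2) may contain spaces or parentheses, so split after the last ')'.
    rest = proc_stat_text.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 22 is rest[19].
    return int(rest[19])


def lifetime_avg_cpu_percent(usage_usec: int, start_ticks: int,
                             uptime_seconds: float,
                             ticks_per_second: int = 100) -> float:
    """Average CPU use since process start, as a percentage of one CPU."""
    elapsed = uptime_seconds - start_ticks / ticks_per_second
    if elapsed <= 0:
        return 0.0
    usage_seconds = usage_usec / 1e6
    return usage_seconds * 100.0 / elapsed


# Hypothetical file contents, for illustration only.
cpu_stat = "usage_usec 30000000\nuser_usec 20000000\nsystem_usec 10000000\n"
proc_stat = "1234 (my app) S " + " ".join(["0"] * 18 + ["50000", "0"])

avg = lifetime_avg_cpu_percent(parse_usage_usec(cpu_stat),
                               parse_start_ticks(proc_stat),
                               uptime_seconds=1100.0)
print(f"{avg:.1f}%")
```

On a real host the uptime would come from the first field of /proc/uptime. The result can exceed 100% on multi-core machines, since the cgroup counter sums CPU time across all cores and all processes in the cgroup.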

@cdoern
Collaborator

cdoern commented Aug 4, 2021

@giuseppe do cgroups store all of the old CPU % values as well? The issue we are running into is that the time is correct, but the percentage only holds the most recent read, making this a stream-only feature.

@giuseppe
Member

giuseppe commented Aug 6, 2021

The cgroup stores the total CPU usage. To calculate the percentage, we use two measurements.
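
As a sketch of the two-measurement calculation (not necessarily the exact formula Podman uses), the percentage over an interval is the change in the cumulative counter divided by the wall-clock time between the two reads. The function name and the nanosecond units are assumptions for illustration.

```python
def cpu_percent(prev_usage_ns: int, prev_time_ns: int,
                cur_usage_ns: int, cur_time_ns: int) -> float:
    """CPU use between two samples, as a percentage of one CPU."""
    elapsed_ns = cur_time_ns - prev_time_ns
    if elapsed_ns <= 0:
        # No time has passed (or the clock went backwards): nothing to report.
        return 0.0
    return (cur_usage_ns - prev_usage_ns) / elapsed_ns * 100.0


# 0.5 s of CPU time consumed over a 1 s wall-clock interval.
print(cpu_percent(1_000_000_000, 0, 1_500_000_000, 1_000_000_000))
```

Taking the first sample at container start (usage 0) turns the same formula into a lifetime average, which is why the cumulative counter alone is enough for a one-shot reading.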

@github-actions

github-actions bot commented Sep 9, 2021

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Sep 9, 2021

@giuseppe @cdoern any progress on this?

@giuseppe
Member

I am not working on this issue.

@cdoern any update? Can we close this issue?

@cdoern
Collaborator

cdoern commented Sep 29, 2021

No update from me. We can close for now, and then if there is demand I can investigate the /proc/CONTAINER_PID/stat approach?

@rhatdan rhatdan closed this as completed Sep 30, 2021
TomSweeneyRedHat pushed a commit to TomSweeneyRedHat/podman that referenced this issue May 26, 2023
added Avg Cpu calculation and CPU up time to podman stats. Adding different feature sets in different PRs, CPU first.

resolves containers#9258

Addresses: https://bugzilla.redhat.com/show_bug.cgi?id=2210139

Signed-off-by: cdoern <cbdoer23@g.holycross.edu>
Signed-off-by: Tom Sweeney <tsweeney@redhat.com>
TomSweeneyRedHat pushed a commit to TomSweeneyRedHat/podman that referenced this issue May 26, 2023
added Avg Cpu calculation and CPU up time to podman stats. Adding different feature sets in different PRs, CPU first.

resolves containers#9258

Signed-off-by: cdoern <cbdoer23@g.holycross.edu>
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 21, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 21, 2023
9 participants