Add 'kernel_restarts' prometheus metric #1240

Open
yuvipanda opened this issue Mar 22, 2023 · 3 comments · May be fixed by #1241

Comments

@yuvipanda
Contributor

"My kernel restarted, why?" is one of the most common questions when supporting JupyterHubs. Being able to tell when a kernel was restarted, and ideally to differentiate between user-initiated and nanny-initiated restarts, would be extremely helpful to JupyterHub admins in providing this support. It would also help detect when memory limits and guarantees need to be changed (because too many users are running into memory limits).

@minrk
Contributor

minrk commented Mar 22, 2023

Should the metric include the kernel ID? It might be useful if you want to identify a specific kernel that's restarting a lot (e.g. correlating it with logs), but I'm not sure that's usually useful.

@yuvipanda
Contributor Author

Let's just start without kernel IDs, and we can add them later if necessary? I can't think of a use for them right now, though.
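
For illustration only, here is a rough sketch of what a restart counter without a kernel ID label might look like, using the `prometheus_client` library. The metric name comes from the issue title; the `source` label, the `RestartSource` enum, its values, and the `record_restart` helper are all hypothetical and are not taken from #1241.

```python
from enum import Enum

from prometheus_client import CollectorRegistry, Counter


class RestartSource(Enum):
    """Hypothetical enum of who initiated a restart (an assumption)."""

    USER = "user"    # restart requested by the user
    NANNY = "nanny"  # automatic restart after the kernel died


# A separate registry keeps this sketch isolated from the global default.
registry = CollectorRegistry()

# No kernel ID label, per the discussion above: one time series per source.
KERNEL_RESTARTS = Counter(
    "kernel_restarts",
    "Number of kernel restarts, by who initiated them",
    ["source"],  # hypothetical label name
    registry=registry,
)


def record_restart(source: RestartSource) -> None:
    """Increment the restart counter for the given source."""
    KERNEL_RESTARTS.labels(source=source.value).inc()


# Example: one user-initiated restart and two nanny-initiated restarts.
record_restart(RestartSource.USER)
record_restart(RestartSource.NANNY)
record_restart(RestartSource.NANNY)
```

Note that `prometheus_client` exposes counter samples with a `_total` suffix, so an admin would query `kernel_restarts_total{source="nanny"}` to watch for restarts that may point at memory limits being hit.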

@minrk minrk linked a pull request Mar 22, 2023 that will close this issue
@minrk
Contributor

minrk commented Mar 22, 2023

#1241 should do it. I'd love it if you have strong opinions on label names and enum values!
