Feature: Auto-classification/labeling of threads based on rules & shared library of thread func (general purpose profiler) #4986

Open
vvuk opened this issue May 9, 2024 · 1 comment


vvuk commented May 9, 2024

(General purpose profiler feature)

On Windows especially, there are a lot of mystery unnamed threads in most processes that do interesting things. Many of these could at least be classified by examining the call stacks of all the samples in the thread. For example:

...
fun_554920 [nvwgf2um.dll]
fun_554900 [nvwgf2um.dll]
fun_1da9764 [nvwgf2um.dll]
BaseThreadInitThunk [kernel32.dll]
RtlUserThreadStart [ntdll.dll]

this thread could be labeled as "nvidia goop".

...
fun_15cc90 [MessageBus.dll]
fun_180a60 [MessageBus.dll]
fun_3d6b50 [MessageBus.dll]
BaseThreadInitThunk [kernel32.dll]
RtlUserThreadStart [ntdll.dll]

this thread could be labeled as "MessageBus.dll thread".

Something like: look at the samples in the thread, scanning from the root of the stack (on Windows, RtlUserThreadStart or LdrInitializeThunk). Check each frame against a list of rules, where each rule has a regexp for the library name and/or symbol name, plus a classification. If nothing matches, go down one level toward the leaf. If the call tree branches into multiple paths, stop. As a fallback, maybe use the first DLL name outside of ntdll.dll. If there are multiple trees at the root (i.e. the user thread start + init thunk), walk them separately, and only classify if they produce the same result. A little hand-wavy but could be helpful.
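
The walk described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing profiler API: `Frame`, `Rule`, `classify_tree`, and `classify_thread` are hypothetical names, stacks are assumed to be lists of frames ordered root-first, and the skip list also includes kernel32.dll (an assumption, so that BaseThreadInitThunk does not win the fallback).

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Frame:
    func: str
    lib: str

@dataclass
class Rule:
    """Hypothetical rule: optional regexps for library and symbol, plus a label."""
    label: str
    lib_re: Optional[str] = None
    func_re: Optional[str] = None

    def matches(self, frame: Frame) -> bool:
        if self.lib_re and not re.search(self.lib_re, frame.lib, re.IGNORECASE):
            return False
        if self.func_re and not re.search(self.func_re, frame.func):
            return False
        return True

# Assumption: libraries to skip for the "first DLL name" fallback.
SKIP_LIBS = frozenset({"ntdll.dll", "kernel32.dll"})

def common_prefix(stacks):
    """Frames shared by every stack, root-first; stops where the tree branches."""
    prefix = []
    for frames in zip(*stacks):
        if any(f != frames[0] for f in frames):
            break
        prefix.append(frames[0])
    return prefix

def classify_tree(stacks, rules):
    """Classify one root tree: first rule match going down, else first non-system DLL."""
    prefix = common_prefix(stacks)
    for frame in prefix:
        for rule in rules:
            if rule.matches(frame):
                return rule.label
    for frame in prefix:
        if frame.lib.lower() not in SKIP_LIBS:
            return f"{frame.lib} thread"
    return None

def classify_thread(stacks, rules):
    """Walk each root tree separately; classify only if all trees agree."""
    by_root = {}
    for stack in stacks:
        if stack:
            by_root.setdefault(stack[0], []).append(stack)
    labels = {classify_tree(group, rules) for group in by_root.values()}
    return labels.pop() if len(labels) == 1 else None

# The rule below is a made-up example matching the first stack in this issue.
RULES = [Rule(label="nvidia goop", lib_re=r"^nvwgf2um\.dll$")]

nvidia_stack = [
    Frame("RtlUserThreadStart", "ntdll.dll"),
    Frame("BaseThreadInitThunk", "kernel32.dll"),
    Frame("fun_1da9764", "nvwgf2um.dll"),
    Frame("fun_554900", "nvwgf2um.dll"),
    Frame("fun_554920", "nvwgf2um.dll"),
]
messagebus_stack = [
    Frame("RtlUserThreadStart", "ntdll.dll"),
    Frame("BaseThreadInitThunk", "kernel32.dll"),
    Frame("fun_3d6b50", "MessageBus.dll"),
    Frame("fun_180a60", "MessageBus.dll"),
    Frame("fun_15cc90", "MessageBus.dll"),
]

print(classify_thread([nvidia_stack], RULES))
print(classify_thread([messagebus_stack], RULES))
```

With these inputs, the first thread is labeled by the explicit rule and the second falls through to the "first DLL outside the skip list" fallback. A thread whose root trees disagree (or where nothing matches) stays unlabeled.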


mstange (Contributor) commented May 9, 2024

With symbolication being asynchronous, anything that depends on the symbol name is a bit brittle. We require categories to be set by the profile generator rather than detected in the front-end, for similar reasons:

  • We want the information to be consistent even if we can't get symbols for some reason (e.g. symbol server unreachable, no symbols present for the system library of a new OS version, etc.)
  • We want to minimize the set of hard-coded strings in the front-end.

And for the things you can detect based on the library name, couldn't you set these fallback thread names when you generate the profile? The front-end currently does not know which thread names are real and which ones are just placeholders based on pids.
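
As a rough illustration of the generator-side alternative suggested here, the sketch below picks a fallback thread name from library names only, so it does not depend on symbolication. Everything in it is an assumption for illustration: `ThreadName`, its `is_placeholder` flag, `fallback_thread_name`, the `Thread <tid>` placeholder format, and the system library list are hypothetical and not part of any existing profile format.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Assumption: libraries that should never be used as a fallback name.
SYSTEM_LIBS = frozenset({"ntdll.dll", "kernel32.dll"})

@dataclass
class ThreadName:
    name: str
    # Hypothetical flag letting a consumer tell real names from placeholders.
    is_placeholder: bool

def fallback_thread_name(
    os_name: Optional[str], tid: int, root_first_libs: Sequence[str]
) -> ThreadName:
    """Prefer the OS-reported name, then the first non-system library, then the tid."""
    if os_name:
        return ThreadName(os_name, is_placeholder=False)
    for lib in root_first_libs:
        if lib.lower() not in SYSTEM_LIBS:
            return ThreadName(f"{lib} thread", is_placeholder=True)
    return ThreadName(f"Thread <{tid}>", is_placeholder=True)

libs = ["ntdll.dll", "kernel32.dll", "MessageBus.dll"]
print(fallback_thread_name(None, 1234, libs))
```

Because only library names are consulted, the result is the same whether or not symbols are available.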
