Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flamegraph does not contain sys_enter_* calls with params #308

Open
SergejIsbrecht opened this issue Nov 13, 2023 · 3 comments
Open

flamegraph does not contain sys_enter_* calls with params #308

SergejIsbrecht opened this issue Nov 13, 2023 · 3 comments

Comments

@SergejIsbrecht
Copy link

SergejIsbrecht commented Nov 13, 2023

Version

Inferno 0.11.18

Issue

When using Inferno to process sampled perf script output, some stacktraces a missing in a generated flamegraph.svg

perf record -F499 -e 'syscalls:sys_enter_*' --call-graph dwarf -- command
perf script > out

The generated flamegraph only contains sys calls without any parameters. For example:

File: out

XXXXXX XXXXX [005] 44427.957866: syscalls:sys_enter_sched_yield: 
	      xxxxxxxxxx __sched_yield (/gpl/lib/libc)
	      ...
          xxxxxxxxxx [unknown] (/gpl/lib/libpthread)
	      xxxxxxxxxx [unknown] (/gpl/lib/libc)
XXXX XXXX [005] 44422.848908: syscalls:sys_enter_getuid: 
	      XXXXXXXXXX getuid (/gpl/lib/libc)
...
	      XXXXXXXXXX __libc_start_main (/gpl/lib/libc)
	      XXXXXXXXXX _start (/opt/hmi/native_image/command)

This calls are visible in the flamegraph. On the contrary following entries are missing in the flamegraph.

XXXXXXXXX XXXXX [001] 44422.868561: syscalls:sys_enter_munmap: addr: 0xXXXXX, len: 0xXXXXX
	      XXXXXXX __munmap (/gpl/lib/libc)
          ....
	      XXXXXXX [unknown] (/gpl/lib/libpthread)
	      XXXXXXX [unknown] (/gpl/lib/libc)

Output: inferno-collapse-perf

perf script

XXXXXXXXXXXX XXXXX [003] 44422.850672: syscalls:sys_enter_sched_getaffinity: pid: 0xXXXXXXXXXX, len: 0xXXXXXXXXXX, user_mask_ptr: 0xXXXXXXXXXX
	      XXXXXXXXXX pthread_getaffinity_np (/gpl/lib/libpthread)
	      XXXXXXXXXX pthread_getattr_np (/gpl/lib/libpthread)
          ....
	      XXXXXXXXXX [unknown] (/gpl/lib/libpthread)
	      XXXXXXXXXX [unknown] (/gpl/lib/libc)

inferno-collapse-perf

[WARN  inferno::collapse::perf::logging] Weird stack line: pid: 0xXXXXXXXXXX, len: 0xXXXXXXXXXX, user_mask_ptr: 0xXXXXXXXXXX
[WARN  inferno::collapse::perf::logging] Weird event line: 	      XXXXXXXXXX pthread_getaffinity_np (/gpl/lib/libpthread)
[WARN  inferno::collapse::perf::logging] Weird event line: 	      XXXXXXXXXX pthread_getattr_np (/gpl/lib/libpthread)
...
[WARN  inferno::collapse::perf::logging] Weird event line: 	      XXXXXXXXXX [unknown] (/gpl/lib/libpthread)
[WARN  inferno::collapse::perf::logging] Weird event line: 	      XXXXXXXXXX [unknown] (/gpl/lib/libc)

Perf

perf version 4.4

FlameGraph

Seems like FlameGraph by Brendan Gregg is able to pars perf script properly

  • ./stackcollapse-perf.pl
  • ./flamegraph.pl.
@SergejIsbrecht
Copy link
Author

Example

perf 19953 [001] 33076.615333:                   syscalls:sys_enter_ioctl: fd: 0x0000040e, cmd: 0x00002400, arg: 0x00000000
	          112d8f [unknown] (/usr/lib/x86_64-linux-gnu/libc.so.6)
	          30bc56 __evlist__enable+0x96 (inlined)
	          26c177 __cmd_record+0x24b7 (inlined)
	ffffffffffffffff [unknown] ([unknown])

pipewire-pulse  2110 [000] 33076.615351:                syscalls:sys_enter_recvfrom: fd: 0x0000001a, ubuf: 0x56449a86f51c, size: 0x00000014, flags: 0x00000040, addr: 0x00000000, addr_len: 0x00000000
	          12063c __libc_recv+0x1ec (inlined)
	          12063c __libc_recv+0x1ec (inlined)

pipewire-pulse  2110 [000] 33076.615356:              syscalls:sys_enter_epoll_wait: epfd: 0x00000004, events: 0x7ffd9af1c170, maxevents: 0x00000020, timeout: 0xffffffff
	          11eb66 epoll_wait+0x106 (inlined)
	            8a22 [unknown] (/usr/lib/x86_64-linux-gnu/spa-0.2/support/libspa-support.so)
	    56449a7bb8df [unknown] ([heap])

@jonhoo
Copy link
Owner

jonhoo commented Nov 25, 2023

Oh interesting! This suggests we have a corner-case to handle in

let mut by_colons = line[end..].splitn(3, ':').skip(1);
let event = by_colons
.next()
.and_then(|has_event| has_event.rsplit(' ').next());
if let Some(event) = event {
if let Some(ref event_filter) = self.event_filter {
if event != event_filter {
self.stack_filter = StackFilter::Skip;
return;
}
} else {
// By default only show events of the first encountered event type.
// Merging together different types, such as instructions and cycles,
// produces misleading results.
logging::filtering_for_events_of_type(event);
self.event_filter = Some(event.to_string());
}
}
// some event lines _include_ a stack line if the stack only has one frame.
// in that case, the event will be followed by the stack.
let single_stack = if let Some(post_event) = by_colons.next() {
// we need to deal with a couple of cases here:
//
// vote 913 72.176760: 257597 cycles:uppp:
// false 64414 20110.539270: 34467 cycles:u: ffffffff9aa3c8de [unknown] ([unknown])
// false 64414 20110.539270: 34467 cycles: ffffffff9aa3c8de [unknown] ([unknown])
//
// the first should not be handled as a stack, whereas the latter two both should
// the trick is going to be to trim until we encounter a space or a :, whichever
// comes first, and then evaluate from there.
let post_event_start = post_event
.find(|c| c == ':' || c == ' ')
.map(|i| i + 1)
.unwrap_or(0);
let post_event = post_event[post_event_start..].trim();
if !post_event.is_empty() {
// we have a stack!
Some(post_event)
} else {
None
}
} else {
None
};

I suspect it's actually not the arguments that are tripping us up here, but rather the : in the event name (syscalls:sys_enter_ioctl). event ends up getting set to syscalls here

let mut by_colons = line[end..].splitn(3, ':').skip(1);
let event = by_colons
.next()
.and_then(|has_event| has_event.rsplit(' ').next());

and then the remainder (sys_enter_epoll and the arguments) end up in post_event here

// some event lines _include_ a stack line if the stack only has one frame.
// in that case, the event will be followed by the stack.
let single_stack = if let Some(post_event) = by_colons.next() {
// we need to deal with a couple of cases here:
//
// vote 913 72.176760: 257597 cycles:uppp:
// false 64414 20110.539270: 34467 cycles:u: ffffffff9aa3c8de [unknown] ([unknown])
// false 64414 20110.539270: 34467 cycles: ffffffff9aa3c8de [unknown] ([unknown])
//
// the first should not be handled as a stack, whereas the latter two both should
// the trick is going to be to trim until we encounter a space or a :, whichever
// comes first, and then evaluate from there.
let post_event_start = post_event

I'm somewhat short on time these days, so waiting for me to take a look at fixing this would probably take some time, but if you want to take a stab at adding a test case, tracing through the code here to figure out where it gets confused, and then submit a PR with a fix, I'd be happy to try and help guide you through!

@SergejIsbrecht
Copy link
Author

I will give it a go, but I write Java most of the time. I have already been looking into the code base, but reading Rust is quite hard to me.

Anyhow, I think it should be easy to fix it after one is familiar with the code base.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants