Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"offwaketime" output causes 'attempt to multiply with overflow' #170

Open
Hi-Angel opened this issue Apr 21, 2020 · 4 comments
Open

"offwaketime" output causes 'attempt to multiply with overflow' #170

Hi-Angel opened this issue Apr 21, 2020 · 4 comments

Comments

@Hi-Angel
Copy link
Contributor

Worth noting btw, initially I have built it in release, and it has shown just invalid top-half of stacktraces. Related report on vanilla flamegraph

Steps to reproduce

  1. Make sure bcc-tools are installed (or alternatively use the attached out.txt).
  2. cargo build to compile the project in debug mode
  3. Create a 15 seconds trace sudo /usr/share/bcc/tools/offwaketime -f 15 > /tmp/out.txt, (or alternatively use the attached out.txt).
  4. target/debug/inferno-flamegraph /tmp/out.txt > /tmp/out.svg

An out.txt from my system: out.txt

Expected

It creates an svg file

Actual

It crashes with

thread 'main' panicked at 'attempt to multiply with overflow', /home/constantine/Projects/inferno/src/flamegraph/mod.rs:539:23
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Version

latest git as of commit adc83b9

@jonhoo
Copy link
Owner

jonhoo commented Apr 21, 2020

Hmm, looking at your out.txt, the last four lines look suspicious:

swapper/0;secondary_startup_64;start_kernel;cpu_startup_entry;do_idle;schedule_idle;__sched_text_start;--; 92233720369095933
swapper/3;secondary_startup_64;start_secondary;cpu_startup_entry;do_idle;schedule_idle;__sched_text_start;--; 129127208516546613
swapper/2;secondary_startup_64;start_secondary;cpu_startup_entry;do_idle;schedule_idle;__sched_text_start;--; 239807672958798250
swapper/1;secondary_startup_64;start_secondary;cpu_startup_entry;do_idle;schedule_idle;__sched_text_start;--; 239807672958951499

They have sample counts that are far higher than they should be. Not sure if this is an artifact of how offwaketime works, but those numbers are so vast you'd end up with a pretty uninteresting figure even if we could compute the graph for it.

The issue occurs here:

let pct = (100 * samples) as f64 / (timemax as f64 * opt.factor);

Probably in timemax as f64 * opt.factor. We could try reorder this to avoid the overflow, but I'm still not sure you'd get sensible output with input like that?

@Hi-Angel
Copy link
Contributor Author

Hmm, looking at your out.txt, the last four lines look suspicious:

swapper/0;secondary_startup_64;start_kernel;cpu_startup_entry;do_idle;schedule_idle;__sched_text_start;--; 92233720369095933
swapper/3;secondary_startup_64;start_secondary;cpu_startup_entry;do_idle;schedule_idle;__sched_text_start;--; 129127208516546613
swapper/2;secondary_startup_64;start_secondary;cpu_startup_entry;do_idle;schedule_idle;__sched_text_start;--; 239807672958798250
swapper/1;secondary_startup_64;start_secondary;cpu_startup_entry;do_idle;schedule_idle;__sched_text_start;--; 239807672958951499

They have sample counts that are far higher than they should be. Not sure if this is an artifact of how offwaketime works, but those numbers are so vast you'd end up with a pretty uninteresting figure even if we could compute the graph for it.

Oh, I see! Thank you very much, that explains why I am not seeing any other stackstraces in vanilla flamegraph. The vanilla flamegraph manages to calculate this, but probably because of how much the columns takes, it just gets rid of any other data (i.e. because they would be too thin otherwise). Removing the last four lines makes it work.

The issue occurs here:

let pct = (100 * samples) as f64 / (timemax as f64 * opt.factor);

Probably in timemax as f64 * opt.factor. We could try reorder this to avoid the overflow, but I'm still not sure you'd get sensible output with input like that?

Yeah, I'm wondering if it would make sense to print a warning that events less then X are too small to be seen on the flamegraph? That may help to someone in the future, because apparently offwaketime always produces such traces, I got similar results from two different systems.

@Hi-Angel
Copy link
Contributor Author

Hi-Angel commented Apr 21, 2020

Yeah, I'm wondering if it would make sense to print a warning that events less then X are too small to be seen on the flamegraph? That may help to someone in the future, because apparently offwaketime always produces such traces, I got similar results from two different systems.

Rather I think "Some events are too small to be visible in the svg. The least quantity that will be in the output is X".

@jonhoo
Copy link
Owner

jonhoo commented Apr 21, 2020

Hmm, that's an interesting idea. You can already adjust at what point things get eliminated from the flamegraph with Options::min_width (and the equivalent command-line flag), and I think that output could get very noisy because we often end up trimming out a lot of very small slivers (typically things with only one or two samples). I wonder if maybe we should go the other way: if some sample value is so large that everything else becomes irrelevant, then give a warning. Or alternatively, just do saturating_ versions of the pct computation above, and if the divisor is == std::f64::MAX, then we know that the thing is too large?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants