[PoC] Use timestamp_delta to weight samples #166

Open · wants to merge 1 commit into master
Conversation

casperisfine (Contributor)

NB: this is just proof-of-concept quality, meant to start a discussion. A releasable version of this patch would involve quite a bit more work.

Problem

We recently noticed that we were overlooking a major hotspot because stackprof's regular reporting is biased.

Here's the stackprof summary of one of our application boots, using stackprof 0.2.17:

==================================
  Mode: cpu(1000)
  Samples: 95925 (13.33% miss rate)
  GC: 30564 (31.86%)
==================================
     TOTAL    (pct)     SAMPLES    (pct)     FRAME
     27790  (29.0%)       27790  (29.0%)     (marking)
      6278   (6.5%)        6277   (6.5%)     RubyVM::InstructionSequence.load_from_binary
...
       311   (0.3%)         311   (0.3%)     Module#using

And now here's the same profiling data, but rendered by Speedscope:

[Screenshot, 2021-09-03: Speedscope rendering of the same profile]

Notice how Module#using accounts for almost nothing according to stackprof, but for 11% according to Speedscope.

This is because stackprof's own reporting is based on the number of samples, whereas Speedscope uses the raw_timestamp_deltas data that is intended for flamegraphs.

And since Module#using goes over the whole heap to flush method caches, I assume the stackprof signal isn't delivered for a long time while it runs. Based on the stackprof data, Module#using was sampled 311 times, but each of those samples accounted for 41ms on average instead of the intended 1ms.

This is particularly visible for Module#using, but from analyzing raw_timestamp_deltas, a large proportion of the samples are longer than intended.
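To make the bias concrete, here's a rough sketch (not stackprof's or Speedscope's actual code) of how attributing time from raw_timestamp_deltas differs from counting samples: walk the :raw stack records and sum each sample's delta onto the leaf frame. The frame ids and numbers below are made up for illustration; in a real profile the frame ids are integer keys into profile[:frames].

```ruby
# Sketch: attribute wall time per leaf frame from a stackprof raw profile.
# :raw is a flat array of [stack_depth, frame_id, ..., repeat_count] records;
# :raw_timestamp_deltas holds one microsecond delta per recorded sample.
def time_per_leaf_frame(profile)
  raw = profile[:raw]
  deltas = profile[:raw_timestamp_deltas]
  totals = Hash.new(0)
  sample_index = 0
  i = 0
  while i < raw.size
    depth = raw[i]
    leaf = raw[i + depth]       # last frame id in the record is the leaf
    count = raw[i + depth + 1]  # this stack was sampled `count` times in a row
    count.times do
      totals[leaf] += deltas[sample_index] # weight by elapsed time, not by 1
      sample_index += 1
    end
    i += depth + 2
  end
  totals # microseconds attributed to each leaf frame id
end
```

With this kind of aggregation, a frame with few samples but huge deltas (like Module#using above) dominates, which matches what Speedscope shows.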

Possible solution

Again this is just a proof of concept, but if we were to always compute the timestamp_delta, we could "weight" the samples to somewhat correct this bias.

It's a bit hard to implement because of other profiling modes, such as object, which aren't time based. So maybe instead of weighting the samples, we should simply record the timestamp_delta as part of the profile, and use that instead of the sample count for reporting.
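A minimal sketch of the weighting idea, assuming a 1000µs sampling interval and illustrative (frame, delta) pairs rather than stackprof's actual internals: each sample counts as delta / interval, so a sample that arrived 41ms late counts as roughly 41 samples instead of 1.

```ruby
# Sketch: weight each sample by how late it actually fired, relative to the
# nominal sampling interval. `samples` is a hypothetical list of
# [leaf_frame, timestamp_delta_us] pairs; `interval_us` is the configured
# sampling interval in microseconds.
def weighted_sample_counts(samples, interval_us)
  counts = Hash.new(0.0)
  samples.each do |frame, delta_us|
    counts[frame] += delta_us.to_f / interval_us
  end
  counts
end
```

For the cpu(1000) profile above, a Module#using sample with a 41,000µs delta would contribute a weight of 41 instead of 1, closing most of the gap between stackprof's table and Speedscope's view.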

@tenderlove @XrXr what do you think?
