Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use more accurate profiling on Ruby 3.1 and fix async-signal-safety #172

Merged
merged 2 commits into from Feb 16, 2022

Conversation

jhawthorn
Copy link
Contributor

Accuracy Regression

In Ruby 3.1, stackprof was building with the warning:

../../../../ext/stackprof/stackprof.c:31:5: warning: "RUBY_API_VERSION_MAJOR" is not defined, evaluates to 0 [-Wundef]
   31 | #if RUBY_API_VERSION_MAJOR < 3
      |     ^~~~~~~~~~~~~~~~~~~~~~

Because of this, stackprof wasn't using the much improved accuracy added in #150 for Ruby 3.0+. This is fixed by adding an include of ruby/version.h

Async signal safety

Revisiting #150, I noticed that approach wasn't quite safe 馃槄.

Unfortunately though rb_profile_frames seems to be safe to call inside a signal handler, stackprof_record_sample_for_stack is not, since it calls malloc (itself or through st_*) which is not async-signal-safe. This could potentially result in deadlocks or memory corruption depending on the malloc implementation.

This commit attempts to restore the safety of the postponed job approach while keeping the accuracy of measuring immediately. Inside the signal handler we record the frames to a buffer, then enqueue a postponed job to do the actual recording.

I am interested in further improving this by adding a worker thread to do the processing of the buffer instead of doing it in the postponed job, however I'd like to apply this as a smaller fix first and attempt that in a future PR.

This is necessary for RUBY_API_VERSION_MAJOR, at least in Ruby 3.1
Copy link
Collaborator

@tenderlove tenderlove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a couple comments, but the overall approach makes sense to me.

ext/stackprof/stackprof.c Show resolved Hide resolved
ext/stackprof/stackprof.c Outdated Show resolved Hide resolved
In 6c3f4d7 we started recording signals
immediately on receiving a signal, after observing that in Ruby 3.0+ the
CFP structure was safe to read. This resulted in much more accurate
measurements.

Unfortunately though rb_profile_frames seems to be safe to call inside a
signal handler, stackprof_record_sample_for_stack is not, since it calls
malloc (itself or through st_*) which is not async-signal-safe. This
could potentially result in deadlocks or memory corruption depending on
the malloc implementation.

This commit attempts to restore the safety of the postponed job approach
while keeping the accuracy of measuring immediately. Inside the signal
handler we record the frames to a buffer, then enqueue a postponed job
to do the actual recording.
@casperisfine
Copy link
Contributor

Quick heads up, while testing ruby/ruby#6572, I got a VM crash in stackprof:

[BUG] Segmentation fault at 0x0000000000000020
ruby 3.2.0dev (2022-10-17) [x86_64-linux]
 
-- Control frame information -----------------------------------------------
 
-- Machine register context ------------------------------------------------
 RIP: 0x000055f85890e7e9 RBP: 0x0000000000000800 RSP: 0x00007f470031a660
 RAX: 0x0000000000000000 RBX: 0x0000000000000000 RCX: 0x00007f47004fd570
 RDX: 0x0000000000000000 RDI: 0x0000000000000000 RSI: 0x0000000000000800
  R8: 0x00007f4708ed40b8  R9: 0x000000000000258f R10: 0x0000000001245e0e
 R11: 0x000ab94ae0ebba42 R12: 0x0000000000000000 R13: 0xffffffffffffffc0
 R14: 0x00007f47004fd570 R15: 0x0000000000000000 EFL: 0x0000000000010246
 
-- C level backtrace information -------------------------------------------
/usr/local/ruby/bin/ruby(rb_print_backtrace+0x11) [0x55f8589107ad] vm_dump.c:762
/usr/local/ruby/bin/ruby(rb_vm_bugreport) vm_dump.c:1057
/usr/local/ruby/bin/ruby(rb_bug_for_fatal_signal+0xec) [0x55f8589bfa8c] error.c:814
/usr/local/ruby/bin/ruby(sigsegv+0x4d) [0x55f85886310d] signal.c:964
/lib/x86_64-linux-gnu/libpthread.so.0(__restore_rt+0x0) [0x7f47087da420]
/usr/local/ruby/bin/ruby(VM_ENV_FLAGS+0x0) [0x55f85890e7e9] vm_backtrace.c:1559
/usr/local/ruby/bin/ruby(VM_FRAME_CFRAME_P) vm_core.h:1373
/usr/local/ruby/bin/ruby(VM_FRAME_RUBYFRAME_P) vm_core.h:1381
/usr/local/ruby/bin/ruby(rb_profile_frames) vm_backtrace.c:1560
/tmp/bundle/ruby/3.2.0+2/gems/stackprof-0.2.21/lib/stackprof/stackprof.so(stackprof_buffer_sample+0x4c) [0x7f47004f51bc] stackprof.c:617
/tmp/bundle/ruby/3.2.0+2/gems/stackprof-0.2.21/lib/stackprof/stackprof.so(stackprof_buffer_sample) (null):0
/tmp/bundle/ruby/3.2.0+2/gems/stackprof-0.2.21/lib/stackprof/stackprof.so(stackprof_signal_handler+0x8d) [0x7f47004f537d] stackprof.c:742
/lib/x86_64-linux-gnu/libpthread.so.0(__restore_rt+0x0) [0x7f47087da420]
/usr/local/ruby/bin/ruby(thread_start_func_2+0x1b5) [0x55f8588aaca5] thread.c:676
/usr/local/ruby/bin/ruby(thread_start_func_1+0xe1) [0x55f8588ab9a1] thread_pthread.c:1170
/lib/x86_64-linux-gnu/libpthread.so.0(start_thread+0xd9) [0x7f47087ce609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f47086f3133]

I haven't investigated it much just yet, but I suspect ruby/ruby#6572 might break the async-signal safety assumption?

@casperisfine
Copy link
Contributor

Also note that the crash only happens in CPU mode, WALL mode still work fine.

@jhawthorn
Copy link
Contributor Author

I would be surprised to see async-signal-safety related to the crash you saw, since that PR is also using vm_push_frame. It seems more likely just a problem with rb_profile_frames (which had changes) and dummy frames. I'd also suspect :wall can crash too (as far as I know wall can fire at all the same times as :cpu)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants