Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to debug catastrophic crash #34

Closed
lucaskuan opened this issue May 8, 2018 · 5 comments
Closed

How to debug catastrophic crash #34

lucaskuan opened this issue May 8, 2018 · 5 comments

Comments

@lucaskuan
Copy link
Contributor

I have encountered follow error

app_1        | 00:09:45 web.1     | [BUG] Segmentation fault at 0x0000000000000018
app_1        | 00:09:45 web.1     | ruby 2.4.3p205 (2017-12-14 revision 61247) [x86_64-linux]
app_1        | 00:09:45 web.1     | 
app_1        | 00:09:45 web.1     | -- Control frame information -----------------------------------------------
app_1        | 00:09:45 web.1     | c:0001 p:---- s:0003 E:001f00 (none) [FINISH]
app_1        | 00:09:45 web.1     | 
app_1        | 00:09:45 web.1     | 
app_1        | 00:09:45 web.1     | -- Machine register context ------------------------------------------------
app_1        | 00:09:45 web.1     |  RIP: 0x00007fb22a1cb34e RBP: 0x00007fb207fff490 RSP: 0x00007fb207ffc770
app_1        | 00:09:45 web.1     |  RAX: 0x0000000000000000 RBX: 0x0000000000000000 RCX: 0x0000000000000000
app_1        | 00:09:45 web.1     |  RDX: 0x00007fb22a636a00 RDI: 0x0000000000001200 RSI: 0x0000000000000000
app_1        | 00:09:45 web.1     |   R8: 0x00007fb22a3e1540  R9: 0x00007fb207ffc870 R10: 0x00007fb22dd31760
app_1        | 00:09:45 web.1     |  R11: 0x0000000000000000 R12: 0x00007fb22a636a10 R13: 0x0000000000001200
app_1        | 00:09:45 web.1     |  R14: 0x00007fb22b63a9f0 R15: 0x00007fb22a3e1540 EFL: 0x0000000000010206
app_1        | 00:09:45 web.1     | 
app_1        | 00:09:45 web.1     | -- C level backtrace information -------------------------------------------
app_1        | 00:09:45 web.1     | /usr/local/lib/libruby.so.2.4(rb_vm_bugreport+0x4f3) [0x7fb22e8ba893] vm_dump.c:684
app_1        | 00:09:45 web.1     | /usr/local/lib/libruby.so.2.4(rb_bug_context+0xd4) [0x7fb22e7378d4] error.c:506
app_1        | 00:09:45 web.1     | /usr/local/lib/libruby.so.2.4(sigsegv+0x3e) [0x7fb22e82d8be] signal.c:907
app_1        | 00:09:45 web.1     | /lib/x86_64-linux-gnu/libpthread.so.0 [0x7fb22e483890]
app_1        | 00:09:45 web.1     | /bundle/gems/iodine-0.4.14/lib/iodine/iodine.so(sock_force_close+0x20e) [0x7fb22a1cb34e]
app_1        | 00:09:45 web.1     | /bundle/gems/iodine-0.4.14/lib/iodine/iodine.so(sock_flush+0x1e8) [0x7fb22a1cb768]
app_1        | 00:09:45 web.1     | /bundle/gems/iodine-0.4.14/lib/iodine/iodine.so(http1_response_finish+0x1d) [0x7fb22a1af0fd]
app_1        | 00:09:45 web.1     | /bundle/gems/iodine-0.4.14/lib/iodine/iodine.so(http_response_finish+0x2b) [0x7fb22a1af96b]
app_1        | 00:09:45 web.1     | /bundle/gems/iodine-0.4.14/lib/iodine/iodine.so [0x7fb22a1b495b]
app_1        | 00:09:45 web.1     | /usr/local/lib/libruby.so.2.4(rb_thread_call_with_gvl+0x1d3) [0x7fb22e86b8a3] thread.c:1525
app_1        | 00:09:45 web.1     | /bundle/gems/iodine-0.4.14/lib/iodine/iodine.so [0x7fb22a1bb854]
app_1        | 00:09:45 web.1     | /bundle/gems/iodine-0.4.14/lib/iodine/iodine.so [0x7fb22a1ac910]
app_1        | 00:09:45 web.1     | /bundle/gems/iodine-0.4.14/lib/iodine/iodine.so(http1_fio_parser_fn+0xa8) [0x7fb22a1ad598]
app_1        | 00:09:45 web.1     | /bundle/gems/iodine-0.4.14/lib/iodine/iodine.so [0x7fb22a1acc72]
app_1        | 00:09:45 web.1     | /bundle/gems/iodine-0.4.14/lib/iodine/iodine.so [0x7fb22a1a2fbc]
app_1        | 00:09:45 web.1     | /bundle/gems/iodine-0.4.14/lib/iodine/iodine.so(defer_perform+0x9d) [0x7fb22a1a1b2d]
app_1        | 00:09:45 web.1     | /bundle/gems/iodine-0.4.14/lib/iodine/iodine.so [0x7fb22a1a1bd1]
app_1        | 00:09:45 web.1     | /usr/local/lib/libruby.so.2.4(rb_thread_call_without_gvl2+0x42) [0x7fb22e866da2] thread.c:1320
app_1        | 00:09:45 web.1     | /bundle/gems/iodine-0.4.14/lib/iodine/iodine.so [0x7fb22a1bbd57]
app_1        | 00:09:45 web.1     | /usr/local/lib/libruby.so.2.4(thread_start_func_2+0x1b1) [0x7fb22e869131] thread.c:587
app_1        | 00:09:45 web.1     | /usr/local/lib/libruby.so.2.4(thread_start_func_1+0xd5) [0x7fb22e869bf5] thread_pthread.c:887
app_1        | 00:09:45 web.1     | /lib/x86_64-linux-gnu/libpthread.so.0 [0x7fb22e47c064]
app_1        | 00:09:45 web.1     | /lib/x86_64-linux-gnu/libc.so.6(__clone+0x6d) [0x7fb22da7562d]
app_1        | 00:09:45 web.1     | 
app_1        | 00:09:45 web.1     | -- Other runtime information -----------------------------------------------
app_1        | 00:09:45 web.1     | 
app_1        | 00:09:45 web.1     | * Loaded script: /bundle/bin/iodine
app_1        | 00:09:45 web.1     | 
app_1        | 00:09:45 web.1     | * Loaded features:
app_1        | 00:09:45 web.1     | 
app_1        | 00:09:45 web.1     |     0 enumerator.so
app_1        | 00:09:45 web.1     |     1 thread.rb
app_1        | 00:09:45 web.1     |     2 rational.so
app_1        | 00:09:45 web.1     |     3 complex.so
app_1        | 00:09:45 web.1     |     4 /usr/local/lib/ruby/2.4.0/x86_64-linux/enc/encdb.so
app_1        | 00:09:45 web.1     |     5 /usr/local/lib/ruby/2.4.0/x86_64-linux/enc/trans/transdb.so
app_1        | 00:09:45 web.1     |     6 /usr/local/lib/ruby/2.4.0/unicode_normalize.rb
app_1        | 00:09:45 web.1     |     7 /usr/local/lib/ruby/2.4.0/x86_64-linux/rbconfig.rb
app_1        | 00:09:45 web.1     |     8 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/compatibility.rb
app_1        | 00:09:45 web.1     |     9 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/defaults.rb
app_1        | 00:09:45 web.1     |    10 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/deprecate.rb
app_1        | 00:09:45 web.1     |    11 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/errors.rb
app_1        | 00:09:45 web.1     |    12 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/version.rb
app_1        | 00:09:45 web.1     |    13 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/requirement.rb
app_1        | 00:09:45 web.1     |    14 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/platform.rb
app_1        | 00:09:45 web.1     |    15 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/basic_specification.rb
app_1        | 00:09:45 web.1     |    16 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/stub_specification.rb
app_1        | 00:09:45 web.1     |    17 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/util/list.rb
app_1        | 00:09:45 web.1     |    18 /usr/local/lib/ruby/2.4.0/x86_64-linux/stringio.so
app_1        | 00:09:45 web.1     |    19 /usr/local/lib/ruby/2.4.0/uri/rfc2396_parser.rb
app_1        | 00:09:45 web.1     |    20 /usr/local/lib/ruby/2.4.0/uri/rfc3986_parser.rb
app_1        | 00:09:45 web.1     |    21 /usr/local/lib/ruby/2.4.0/uri/common.rb
app_1        | 00:09:45 web.1     |    22 /usr/local/lib/ruby/2.4.0/uri/generic.rb
app_1        | 00:09:45 web.1     |    23 /usr/local/lib/ruby/2.4.0/uri/ftp.rb
app_1        | 00:09:45 web.1     |    24 /usr/local/lib/ruby/2.4.0/uri/http.rb
app_1        | 00:09:45 web.1     |    25 /usr/local/lib/ruby/2.4.0/uri/https.rb
app_1        | 00:09:45 web.1     |    26 /usr/local/lib/ruby/2.4.0/uri/ldap.rb
app_1        | 00:09:45 web.1     |    27 /usr/local/lib/ruby/2.4.0/uri/ldaps.rb
app_1        | 00:09:45 web.1     |    28 /usr/local/lib/ruby/2.4.0/uri/mailto.rb
app_1        | 00:09:45 web.1     |    29 /usr/local/lib/ruby/2.4.0/uri.rb
app_1        | 00:09:45 web.1     |    30 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/specification.rb
app_1        | 00:09:45 web.1     |    31 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/exceptions.rb
app_1        | 00:09:45 web.1     |    32 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/dependency.rb
app_1        | 00:09:45 web.1     |    33 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/core_ext/kernel_gem.rb
app_1        | 00:09:45 web.1     |    34 /usr/local/lib/ruby/2.4.0/monitor.rb
app_1        | 00:09:45 web.1     |    35 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/core_ext/kernel_require.rb
app_1        | 00:09:45 web.1     |    36 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems.rb
app_1        | 00:09:45 web.1     |    37 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/path_support.rb
app_1        | 00:09:45 web.1     |    38 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/bundler_version_finder.rb
app_1        | 00:09:45 web.1     |    39 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/version.rb
app_1        | 00:09:45 web.1     |    40 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/compatibility_guard.rb
app_1        | 00:09:45 web.1     |    41 /usr/local/lib/ruby/2.4.0/x86_64-linux/pathname.so
app_1        | 00:09:45 web.1     |    42 /usr/local/lib/ruby/2.4.0/pathname.rb
app_1        | 00:09:45 web.1     |    43 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/constants.rb
app_1        | 00:09:45 web.1     |    44 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/util.rb
app_1        | 00:09:45 web.1     |    45 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/user_interaction.rb
app_1        | 00:09:45 web.1     |    46 /usr/local/lib/ruby/2.4.0/x86_64-linux/etc.so
app_1        | 00:09:45 web.1     |    47 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/config_file.rb
app_1        | 00:09:45 web.1     |    48 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/rubygems_integration.rb
app_1        | 00:09:45 web.1     |    49 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/current_ruby.rb
app_1        | 00:09:45 web.1     |    50 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/shared_helpers.rb
app_1        | 00:09:45 web.1     |    51 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/vendor/fileutils/lib/fileutils.rb
app_1        | 00:09:45 web.1     |    52 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/vendored_fileutils.rb
app_1        | 00:09:45 web.1     |    53 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/errors.rb
app_1        | 00:09:45 web.1     |    54 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/environment_preserver.rb
app_1        | 00:09:45 web.1     |    55 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/plugin/api.rb
app_1        | 00:09:45 web.1     |    56 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/plugin.rb
app_1        | 00:09:45 web.1     |    57 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/source/git.rb
app_1        | 00:09:45 web.1     |    58 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/source/installed.rb
app_1        | 00:09:45 web.1     |    59 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/source/specific_file.rb
app_1        | 00:09:45 web.1     |    60 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/source/local.rb
app_1        | 00:09:45 web.1     |    61 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/source/lock.rb
app_1        | 00:09:45 web.1     |    62 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/source/vendor.rb
app_1        | 00:09:45 web.1     |    63 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/source.rb
app_1        | 00:09:45 web.1     |    64 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/gem_helpers.rb
app_1        | 00:09:45 web.1     |    65 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/match_platform.rb
app_1        | 00:09:45 web.1     |    66 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/rubygems_ext.rb
app_1        | 00:09:45 web.1     |    67 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/build_metadata.rb
app_1        | 00:09:45 web.1     |    68 /usr/local/lib/ruby/site_ruby/2.4.0/bundler.rb
app_1        | 00:09:45 web.1     |    69 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/settings.rb
app_1        | 00:09:45 web.1     |    70 /usr/local/lib/ruby/site_ruby/2.4.0/rubygems/ext/builder.rb
app_1        | 00:09:45 web.1     |    71 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/source.rb
app_1        | 00:09:45 web.1     |    72 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/source/path.rb
app_1        | 00:09:45 web.1     |    73 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/source/git.rb
app_1        | 00:09:45 web.1     |    74 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/source/rubygems.rb
app_1        | 00:09:45 web.1     |    75 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/lockfile_parser.rb
app_1        | 00:09:45 web.1     |    76 /usr/local/lib/ruby/2.4.0/set.rb
app_1        | 00:09:45 web.1     |    77 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/definition.rb
app_1        | 00:09:45 web.1     |    78 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/dependency.rb
app_1        | 00:09:45 web.1     |    79 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/ruby_dsl.rb
app_1        | 00:09:45 web.1     |    80 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/dsl.rb
app_1        | 00:09:45 web.1     |    81 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/source_list.rb
app_1        | 00:09:45 web.1     |    82 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/source/metadata.rb
app_1        | 00:09:45 web.1     |    83 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/feature_flag.rb
app_1        | 00:09:45 web.1     |    84 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/ruby_version.rb
app_1        | 00:09:45 web.1     |    85 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/lazy_specification.rb
app_1        | 00:09:45 web.1     |    86 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/index.rb
app_1        | 00:09:45 web.1     |    87 /usr/local/lib/ruby/2.4.0/tsort.rb
app_1        | 00:09:45 web.1     |    88 /usr/local/lib/ruby/2.4.0/forwardable/impl.rb
app_1        | 00:09:45 web.1     |    89 /usr/local/lib/ruby/2.4.0/forwardable.rb
app_1        | 00:09:45 web.1     |    90 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/spec_set.rb
app_1        | 00:09:45 web.1     |    91 /usr/local/lib/ruby/2.4.0/shellwords.rb
app_1        | 00:09:45 web.1     |    92 /usr/local/lib/ruby/2.4.0/delegate.rb
app_1        | 00:09:45 web.1     |    93 /usr/local/lib/ruby/2.4.0/fileutils.rb
app_1        | 00:09:45 web.1     |    94 /usr/local/lib/ruby/2.4.0/tmpdir.rb
app_1        | 00:09:45 web.1     |    95 /usr/local/lib/ruby/2.4.0/tempfile.rb
app_1        | 00:09:45 web.1     |    96 /usr/local/lib/ruby/site_ruby/2.4.0/bundler/source/git/git_proxy.rb
app_1        | 00:09:45 web.1     |    97 /usr/local/lib/ruby/2.4.0/x86_64-linux/digest.so
app_1        | 00:09:45 web.1     |    98 /usr/local/lib/ruby/2.4.0/digest.rb
app_1        | 00:09:45 web.1     |    99 /usr/local/lib/ruby/2.4.0/x86_64-linux/digest/sha1.so

The app is totally dead and need to restart. Do you have an idea of how to better debug this kind of problem to prevent it from happening?

@boazsegev
Copy link
Owner

Thank you very much for the opening this issue and posting your logs.

I notice you're running version 0.4.14. Can you see if the issue occurs on iodine 0.4.19?

1. Debugging and fixing:

Do you have any way to replicate the issue?

If I can replicate it on my system I can usually debug it.

I use both gdb and lldb to debug, but I just as often write a bunch of fprinf statements and recompile in an attempt to pinpoint where the code breaks. Sadly, the Ruby layer makes the C lever harder to debug, but it's still achievable.

Another approach I use is re-writing the app in C using the facil.io framework to test if the issue is iodine relate or facil.io relate.

2. Possible Workarounds:

One possible work-around is to try iodine 0.5.1. The 0.4.x versions are based on a totally different design. The design was for a number of reasons that would show up on high stress machines (such as limitations on the use of pipes in the pub/sub system).

A second possible solution is to allow iodine to crash in cluster mode. When a worker crushes in cluster mode, iodine will re spawn another worker to keep the application running. I think this was true also for the 0.4.x versions, but I might be mixing things up (it might be an 0.5.x feature).

@lucaskuan
Copy link
Contributor Author

Thanks for your super quick and detail response 💪
The thing is I'm not sure how to reproduce the error. Sounds to me it is caused by frequent request from my REST API and Plezi websocket request.
I have encountered this error before which is caused by uncaught Ruby exception such as JSON.parse(null), I fixed that by wrapping the code with begin block.
I will try the 0.5.1 version and see if something similar could happen, before that I have some quick questions.
1. Any migration is needed on upgrading to 0.5.1? or simply a gem update thing
2. I saw you mention cluster mode, how to config that? currently I'm using bundle exec iodine -v -t 3 -w 1

@boazsegev
Copy link
Owner

The thing is I'm not sure how to reproduce the error. Sounds to me it is caused by frequent request from my REST API and Plezi websocket request.

I test stress Iodine before each release, but I use a trivial HTTP/WebSocket "broadcast" application broadcasting everything to a single channel (the HTTP tests are a simple "hello"). I find it hard to think that the stress alone would cause an issue.

I have encountered this error before which is caused by uncaught Ruby exception such as JSON.parse(null), I fixed that by wrapping the code with begin block.

Yes, there was an issue in the C<=>Ruby bridge that I fixed in version 0.4.17. I think this observation might be related to that issue.

1. Any migration is needed on upgrading to 0.5.1? or simply a gem update thing

Hmm... say yes, though it really depends how you use iodine.

I would recommend to upgrade to 0.4.19 first and see if this solves the issue.

The 0.5.0 release changes the pub/sub API considerably and uses a pub/sub subscription object instead of an unsubscribe method (this was part of an attempt at reaching a standard pub/sub API with a college and it didn't pan out).

I'm not sure if this effects your application, but it's part of the reason Plezi is still using the 0.4.19 release (though you can "hack" this by forcing Plezi to version 0.15.0).

Also, some of the callbacks used for WebSockets were updated, replacing on_ready with on_drained (this was part of a collaborative attempt at designing a standard Rack specification for WebSockets and SSE, which is now being reviewed (sadly more API changes will be coming due to this fact).

Some of the HTTP security features were updated, so if you're using the API to change the server default settings, that might need to be reviewed.

I should point out that 0.5.x versions were only partly successful and (sadly) more API changes are coming.

Plezi itself didn't migrate to 0.5.x yet and it might skip 0.5.x versions altogether. I'm already working on version 0.6.0 in accordance with some of the requested changes to the newly proposed Rack specification draft. Also, the unsubscribe method will return in 0.6.0, along with a new "connection" object.

2. I saw you mention cluster mode, how to config that? currently I'm using bundle exec iodine -v -t 3 -w 1

Any value of -w above 1 will invoke a cluster mode. i.e:

bundle exec iodine -v -t 3 -w 2

@lucaskuan
Copy link
Contributor Author

Thank you! I think I will try cluster mode before updating version because I mainly use plezi to connect my IoT devices.
It is quite amazing that you built such good library for people who like RoR so much as hate activecable as much to have raw socket to play with. I was quite desperate at the beginning when I try to build my server, since I'm using custom protocol which is totally different from what activecable is doing 😛
I will try cluster mode first and see what happen.
You may close this case. Good day!

@boazsegev
Copy link
Owner

You're welcome, and good luck!

Please re-open the issue (or start a new one) if you experience anything else. I'm happy to help.

B.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants