Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix for #241 #244

Merged
merged 1 commit into from Sep 5, 2020
Merged

fix for #241 #244

merged 1 commit into from Sep 5, 2020

Conversation

boazsegev
Copy link
Contributor

@boazsegev boazsegev commented Sep 1, 2020

This should fix the dangling pointer issue caused by GC.compact (#241) moving the monitor object's memory address to a new location.

This should fix the dangling pointer issue caused by `GC.compact` moving the monitor object's memory address to a new location.
@ioquatix
Copy link
Member

ioquatix commented Sep 1, 2020

@tenderlove can you please comment on this? Thanks.

@tenderlove
Copy link

This will indeed prevent the pointer at self from moving. I'm not sure why it needs to hold a reference to itself (since self should be passed in with all method calls) but I don't know this code base at all 😅

@tarcieri
Copy link
Contributor

tarcieri commented Sep 2, 2020

@tenderlove when monitoring for IO readiness (e.g. poll/epoll/kqueue), it registers a pointer to a C struct NIO_Monitor with the kernel. The kernel returns this pointer back to the C struct, and the self pointer allows it to get back to the Ruby object associated with the C struct.

@ioquatix
Copy link
Member

ioquatix commented Sep 5, 2020

Thanks everyone for your feedback, it looks like this should solve the problems we are facing. Thanks for @boazsegev for figuring it out, for this specific problem I didn't even know where to start.

@ioquatix ioquatix merged commit 507c199 into socketry:master Sep 5, 2020
@Willianvdv
Copy link

Willianvdv commented Sep 15, 2020

Hey folks! It seems that this PR introduced a regression on Ruby 2.6.6. When upgrading from 2.5.2 to 2.5.3, we're experiencing segfaults on our CIs.

Here's one of the backtraces we got:

/bundle/ruby/2.6.0/gems/vcr-3.0.3/lib/vcr/structs.rb:123: [BUG] Segmentation fault at 0x0000000000000080
ruby 2.6.6p146 (2020-03-31 revision 67876) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0031 p:0078 s:0176 e:000171 BLOCK  /bundle/ruby/2.6.0/gems/vcr-3.0.3/lib/vcr/structs.rb:123 [FINISH]
c:0030 p:---- s:0166 e:000165 CFUNC  :each
c:0029 p:0033 s:0162 E:000d08 METHOD /bundle/ruby/2.6.0/gems/vcr-3.0.3/lib/vcr/structs.rb:115
c:0028 p:0014 s:0157 e:000156 METHOD /bundle/ruby/2.6.0/gems/vcr-3.0.3/lib/vcr/structs.rb:106
c:0027 p:0009 s:0152 e:000151 METHOD /bundle/ruby/2.6.0/gems/vcr-3.0.3/lib/vcr/structs.rb:62
c:0026 p:0035 s:0147 e:000146 METHOD /bundle/ruby/2.6.0/gems/vcr-3.0.3/lib/vcr/structs.rb:206 [FINISH]
c:0025 p:---- s:0141 e:000140 CFUNC  :new
c:0024 p:0036 s:0133 e:000132 METHOD /bundle/ruby/2.6.0/gems/vcr-3.0.3/lib/vcr/library_hooks/webmock.rb:46
c:0023 p:0022 s:0128 e:000127 METHOD /bundle/ruby/2.6.0/gems/vcr-3.0.3/lib/vcr/library_hooks/webmock.rb:119
c:0022 p:0047 s:0124 e:000122 METHOD /bundle/ruby/2.6.0/gems/vcr-3.0.3/lib/vcr/request_handler.rb:107
c:0021 p:0007 s:0117 e:000114 METHOD /bundle/ruby/2.6.0/gems/vcr-3.0.3/lib/vcr/request_handler.rb:7
c:0020 p:0026 s:0110 e:000109 BLOCK  /bundle/ruby/2.6.0/gems/vcr-3.0.3/lib/vcr/library_hooks/webmock.rb:144
c:0019 p:0013 s:0106 e:000102 BLOCK  /bundle/ruby/2.6.0/gems/webmock-3.8.3/lib/webmock/stub_registry.rb:28
c:0018 p:0140 s:0099 e:000098 METHOD /bundle/ruby/2.6.0/gems/webmock-3.8.3/lib/webmock/request_pattern.rb:40
c:0017 p:0011 s:0093 e:000092 BLOCK  /bundle/ruby/2.6.0/gems/webmock-3.8.3/lib/webmock/stub_registry.rb:58 [FINISH]
c:0016 p:---- s:0089 e:000088 IFUNC 
c:0015 p:---- s:0086 e:000085 CFUNC  :each
c:0014 p:---- s:0083 e:000082 CFUNC  :detect
c:0013 p:0015 s:0079 e:000078 METHOD /bundle/ruby/2.6.0/gems/webmock-3.8.3/lib/webmock/stub_registry.rb:57
c:0012 p:0006 s:0074 e:000073 METHOD /bundle/ruby/2.6.0/gems/webmock-3.8.3/lib/webmock/stub_registry.rb:50
c:0011 p:0026 s:0068 e:000067 METHOD /bundle/ruby/2.6.0/gems/webmock-3.8.3/lib/webmock/http_lib_adapters/em_http_request_adapter.rb:140
c:0010 p:0012 s:0064 e:000063 METHOD /bundle/ruby/2.6.0/gems/webmock-3.8.3/lib/webmock/http_lib_adapters/em_http_request_adapter.rb:58
c:0009 p:0056 s:0057 e:000056 METHOD /bundle/ruby/2.6.0/gems/em-http-request-1.1.5/lib/em-http/http_connection.rb:92
c:0008 p:0028 s:0050 e:000049 METHOD (eval):5
c:0007 p:0193 s:0042 e:000041 METHOD /bundle/ruby/2.6.0/bundler/gems/puffing-billy-a37a97da9c18/lib/billy/handlers/proxy_handler.rb:28
c:0006 p:0021 s:0029 e:000028 BLOCK  /bundle/ruby/2.6.0/bundler/gems/puffing-billy-a37a97da9c18/lib/billy/handlers/request_handler.rb:22 [FINISH]
c:0005 p:---- s:0024 e:000023 CFUNC  :each
c:0004 p:0023 s:0020 e:000019 METHOD /bundle/ruby/2.6.0/bundler/gems/puffing-billy-a37a97da9c18/lib/billy/handlers/request_handler.rb:21
c:0003 p:0023 s:0009 e:000008 BLOCK  /bundle/ruby/2.6.0/bundler/gems/puffing-billy-a37a97da9c18/lib/billy/proxy_connection.rb:61
c:0002 p:0003 s:0006 e:000005 BLOCK  /bundle/ruby/2.6.0/gems/em-synchrony-1.0.6/lib/em-synchrony.rb:30 [FINISH]
c:0001 p:---- s:0003 e:000002 (none) [FINISH]

Another one in the same isolated nio4r upgrade :

/bundle/ruby/2.6.0/gems/em-synchrony-1.0.6/lib/em-synchrony.rb:30: [BUG] try to mark T_NONE object
ruby 2.6.6p146 (2020-03-31 revision 67876) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0012 p:---- s:0054 e:000053 CFUNC  :initialize
c:0011 p:---- s:0051 e:000050 CFUNC  :new
c:0010 p:0029 s:0047 E:002460 METHOD /bundle/ruby/2.6.0/gems/em-synchrony-1.0.6/lib/em-synchrony.rb:30
c:0009 p:0011 s:0041 E:0023f0 METHOD /bundle/ruby/2.6.0/bundler/gems/puffing-billy-a37a97da9c18/lib/billy/proxy_connection.rb:60
c:0008 p:0121 s:0037 e:000036 METHOD /bundle/ruby/2.6.0/bundler/gems/puffing-billy-a37a97da9c18/lib/billy/proxy_connection.rb:46 [FINISH]
c:0007 p:---- s:0032 e:000031 CFUNC  :<<
c:0006 p:0008 s:0027 e:000026 METHOD /bundle/ruby/2.6.0/bundler/gems/puffing-billy-a37a97da9c18/lib/billy/proxy_connection.rb:20 [FINISH]
c:0005 p:---- s:0022 e:000021 CFUNC  :run_machine
c:0004 p:0270 s:0018 e:000017 METHOD /bundle/ruby/2.6.0/gems/eventmachine-1.0.9.1/lib/eventmachine.rb:193
c:0003 p:0011 s:0010 E:0009d0 METHOD /bundle/ruby/2.6.0/bundler/gems/puffing-billy-a37a97da9c18/lib/billy/proxy.rb:70
c:0002 p:0005 s:0006 e:000005 BLOCK  /bundle/ruby/2.6.0/bundler/gems/puffing-billy-a37a97da9c18/lib/billy/proxy.rb:19 [FINISH]
c:0001 p:---- s:0003 e:000002 (none) [FINISH]

My knowledge on C libraries is limited and I don't know if the above backtraces is useful at all.

@tarcieri
Copy link
Contributor

That's a NULL pointer exception, although the location it's occurring isn't helpful:

https://github.com/vcr/vcr/blob/v3.0.3/lib/vcr/structs.rb#L123

It may be a sign of memory corruption.

@Willianvdv
Copy link

@tarcieri thanks for looking into my comment, much appreciated! I was also thinking about memory corruption. The reason I ended up in this PR is that I can reproduce segfaults (like the one I posted above to this MR). The failure rate with this PR merged on our CI increased with 8% and all issues are related to segfaults.

Our stack is huge so it's hard to pinpoint exactly what's going on. I'll try to create a dummy project to get this in a reproducible state but I can't promise any success.

@Willianvdv
Copy link

#251 fixed the issue, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants