method `session_new_cb' called on hidden T_DATA object (NotImplementedError) #464

Closed
casperisfine opened this issue Oct 13, 2021 · 7 comments

@casperisfine

A weird error that happened once in production.

Ruby 3.0.2 (linux)

Full backtrace:

/usr/local/ruby/lib/ruby/3.0.0/openssl/buffering.rb:205:in `sysread_nonblock': hidden object cannot have instance variables (TypeError)
    from /usr/local/ruby/lib/ruby/3.0.0/openssl/buffering.rb:205:in `read_nonblock'
    from gems/net-protocol-0.1.1/lib/net/protocol.rb:212:in `rbuf_fill'
    from gems/net-protocol-0.1.1/lib/net/protocol.rb:193:in `readuntil'
    from gems/net-protocol-0.1.1/lib/net/protocol.rb:203:in `readline'
    from /usr/local/ruby/lib/ruby/3.0.0/net/http/response.rb:42:in `read_status_line'
    from /usr/local/ruby/lib/ruby/3.0.0/net/http/response.rb:31:in `read_new'
    from /usr/local/ruby/lib/ruby/3.0.0/net/http.rb:1557:in `block in transport_request'
    from /usr/local/ruby/lib/ruby/3.0.0/net/http.rb:1548:in `catch'
    from /usr/local/ruby/lib/ruby/3.0.0/net/http.rb:1548:in `transport_request'
...

Caused by: /usr/local/ruby/lib/ruby/3.0.0/openssl/buffering.rb:205:in `sysread_nonblock': method `session_new_cb' called on hidden T_DATA object (0x00007f17df28b0d8 flags=0xc) (NotImplementedError)
    from /usr/local/ruby/lib/ruby/3.0.0/openssl/buffering.rb:205:in `read_nonblock'
    from gems/net-protocol-0.1.1/lib/net/protocol.rb:212:in `rbuf_fill'
    from gems/net-protocol-0.1.1/lib/net/protocol.rb:193:in `readuntil'
    from gems/net-protocol-0.1.1/lib/net/protocol.rb:203:in `readline'
    from /usr/local/ruby/lib/ruby/3.0.0/net/http/response.rb:42:in `read_status_line'
    from /usr/local/ruby/lib/ruby/3.0.0/net/http/response.rb:31:in `read_new'
    from /usr/local/ruby/lib/ruby/3.0.0/net/http.rb:1557:in `block in transport_request'
    from /usr/local/ruby/lib/ruby/3.0.0/net/http.rb:1548:in `catch'
    from /usr/local/ruby/lib/ruby/3.0.0/net/http.rb:1548:in `transport_request'
...

We do call GC.compact once, so maybe there's a static VALUE reference that isn't properly declared to the GC?

@casperisfine
Author

I tried to reproduce with GC.verify_compaction_references, but it doesn't reveal anything: casperisfine@8bdb729
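
For readers unfamiliar with it, here is a minimal sketch of using `GC.verify_compaction_references` as a stress check. The object under test has to exist before compaction and be used afterwards, otherwise nothing can go stale. The helper name `compact_then_use` and its return symbols are hypothetical, and the workload is a stand-in rather than the code from the linked commit. On platforms without compaction support the helper reports `:unsupported` instead of raising.

```ruby
# Hypothetical helper (not part of ruby/openssl): compact the heap while
# `subject` is alive, then hand it to the block to be exercised.
def compact_then_use(subject)
  return :unsupported unless GC.respond_to?(:verify_compaction_references)

  begin
    # Moves objects around so that references unknown to the GC become invalid.
    GC.verify_compaction_references
  rescue NotImplementedError
    return :unsupported
  end

  yield subject
  :ok
end

# Stand-in workload: the objects are created before compaction and used after it.
strings = Array.new(1_000) { |i| "object #{i}" }
result = compact_then_use(strings) { |objs| objs.each(&:upcase) }
puts result
```

A real check would pass in an object backed by the native extension under suspicion instead of plain strings.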

So maybe it's a red herring and this bug has nothing to do with GC compaction?

@rhenium
Member

rhenium commented Oct 13, 2021

> Caused by: /usr/local/ruby/lib/ruby/3.0.0/openssl/buffering.rb:205:in `sysread_nonblock': method `session_new_cb' called on hidden T_DATA object (0x00007f17df28b0d8 flags=0xc) (NotImplementedError)

I believe this is coming from

cb = rb_funcall(ssl_obj, rb_intern("session_new_cb"), 0);

The receiver is the OpenSSL::SSL::SSLSocket looked up from OpenSSL's native SSL object: each SSLSocket wraps an SSL, and the SSL holds a reverse reference back to its SSLSocket object.

Unfortunately I know little about the compactor. Do we need to mark/pin the T_DATA/SSLSocket itself explicitly in such a case?

@casperisfine
Author

> Unfortunately I know little about the compactor.

That was just my assumption: I have dealt with several compactor issues in native gems, and they usually revolve around methods being called on random objects. This one might be entirely different.

@rhenium
Member

rhenium commented Oct 14, 2021

The following script causes a segfault, so compaction does indeed seem to be the root cause.

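require "socket"
require "openssl"

# A connected pair of UNIX sockets stands in for a real network connection.
p1, p2 = UNIXSocket.pair
# Anonymous ciphers at security level 0 avoid the need for certificates.
ctx = OpenSSL::SSL::SSLContext.new.tap { |x| x.security_level = 0; x.ciphers = "aNULL" }
s1 = OpenSSL::SSL::SSLSocket.new(p1, ctx)
s2 = OpenSSL::SSL::SSLSocket.new(p2, ctx)
# Move objects after the sockets exist, so a reference the GC was not told about can go stale.
GC.verify_compaction_references
# Run both sides of the handshake concurrently.
[
  Thread.new { s1.accept },
  Thread.new { s2.connect },
].map(&:join)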
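
Because the script above can crash the interpreter on affected versions, one cautious way to run it is in a child process. This is a minimal sketch that assumes a hypothetical helper named `run_isolated`. It relies only on `Open3.capture3` and `RbConfig.ruby` from the standard library and reports whether the child was terminated by a signal.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper: run a Ruby snippet in a separate interpreter so that a
# crash there cannot take down the calling process.
def run_isolated(source)
  _stdout, stderr, status = Open3.capture3(RbConfig.ruby, "-e", source)
  {
    crashed: status.signaled?,    # true when the child was killed by a signal
    exit_code: status.exitstatus, # nil when the child was killed by a signal
    stderr: stderr
  }
end

# Stand-in snippet; the reproduction script would be passed as a string instead.
report = run_isolated("exit 3")
puts "crashed=#{report[:crashed]} exit_code=#{report[:exit_code].inspect}"
```

Passing the reproduction script as the `source` string keeps any interpreter crash confined to the child, and the captured `stderr` holds whatever the child printed before exiting.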

@casperisfine
Author

Yes, @XrXr came to the same conclusion: XrXr@ad02bb9

@rhenium
Member

rhenium commented Oct 14, 2021

#465 should fix this by adding rb_gc_mark() calls.

@rhenium
Member

rhenium commented Oct 16, 2021

openssl 2.1.3 and 2.2.1 have been released with the fix. Thanks for reporting this issue!
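
A quick way to check whether a process is running a release with the fix is to compare `OpenSSL::VERSION` against the two versions named above. This is a sketch under stated assumptions: the fix is taken to be present from 2.1.3 on the 2.1 line and from 2.2.1 otherwise, and anything older than 2.1.3 is treated as not fixed. The helper name `compaction_fix_included?` is hypothetical.

```ruby
require "openssl"
require "rubygems"

# Hypothetical helper: true if the version is one of the fixed releases or newer.
# Assumption: the 2.1 line is fixed from 2.1.3, everything else from 2.2.1.
def compaction_fix_included?(version_string)
  version = Gem::Version.new(version_string)
  if version < Gem::Version.new("2.2.0")
    version >= Gem::Version.new("2.1.3")
  else
    version >= Gem::Version.new("2.2.1")
  end
end

puts "openssl gem #{OpenSSL::VERSION}: fix included? #{compaction_fix_included?(OpenSSL::VERSION)}"
```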

rhenium closed this as completed Oct 16, 2021