Segfault running Dalli test suite #3438

Open
nirvdrum opened this issue Feb 2, 2024 · 5 comments

nirvdrum (Collaborator) commented Feb 2, 2024

While working on fixing a TruffleRuby compatibility issue with Dalli, a popular memcached client, I encountered a segfault while running its test suite. The segfault is intermittent, and I don't know how often it occurs.

I'm running the latest TruffleRuby release on macOS ARM, testing 4c4a5a2354707604456f6f1bf08d020f1909b49e of Dalli:

> ruby -v
truffleruby 23.1.2, like ruby 3.2.2, Oracle GraalVM Native [aarch64-darwin]

I'll try out dev builds and see if it's still a problem there. If not, at least we'll have a log with an open source project to refer to.

Segfault log:

Fatal error: No exception handler registered for deopt target
encodedBci: 37 (bci 5 rethrowException)
Method info: org.truffleruby.language.control.WhileNode.execute(WhileNode.java:37)
Partial Deoptimized Stack
org.truffleruby.language.control.WhileNode.execute(WhileNode.java:37)

eregon added the bug label Feb 5, 2024

nirvdrum (Collaborator, Author) commented Feb 6, 2024

I'm still trying different versions and combinations to see which TruffleRuby builds are affected. So far I've reproduced the segfault with three different configurations on x86_64 Linux (Ubuntu 23.10). The point at which it segfaults differs in each case, but they all happen at a rethrowException.

eregon (Member) commented Feb 7, 2024

Thanks for the report and for trying various versions.
"Fatal error: No exception handler registered for deopt target" certainly feels like a Native Image bug.

Could you post detailed repro instructions so I can share them on the internal issue?
Did you just run bundle exec rake? With/without RUN_SASL_TESTS=1? On which branch/commit?
EDIT: from this log it's clear it's bundle exec rake.
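
Because the crash is intermittent, one way to reproduce it is to rerun the suite until a run fails. The following is only a sketch and not something used in this thread: `run_until_failure` is a hypothetical helper name, the limit of 20 runs is an arbitrary assumption, and it relies solely on Ruby's standard `Open3` library.

```ruby
require "open3"

# Hypothetical helper (not part of Dalli or TruffleRuby): rerun a command
# until it fails or max_runs is reached. Returns a hash describing the first
# failing run, or nil if every run succeeded.
def run_until_failure(command, max_runs: 20, env: {})
  1.upto(max_runs) do |attempt|
    # capture2e merges stdout and stderr so the fatal-error text is captured
    # regardless of which stream it was written to.
    output, status = Open3.capture2e(env, *command)
    next if status.success?

    return {
      attempt: attempt,
      exit_status: status.exitstatus, # nil if the process did not exit normally
      signal: status.termsig,         # signal number if killed by a signal, else nil
      fatal_error: output.include?("Fatal error:"),
      output: output
    }
  end
  nil
end

# Example call (assumes it is run from a Dalli checkout):
#   failure = run_until_failure(%w[bundle exec rake])
#   puts failure ? failure[:output] : "no failure observed"
```

A failing run is not necessarily the segfault, so the `fatal_error` flag is there to separate ordinary test failures from runs whose output contains a "Fatal error:" line.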

eregon (Member) commented Feb 7, 2024

I'll regroup the header of each log here for convenience:

truffleruby 23.1.2, like ruby 3.2.2, Oracle GraalVM Native [aarch64-darwin]
Fatal error: No exception handler registered for deopt target
encodedBci: 37 (bci 5 rethrowException)
Method info: org.truffleruby.language.control.WhileNode.execute(WhileNode.java:37)
Partial Deoptimized Stack
org.truffleruby.language.control.WhileNode.execute(WhileNode.java:37)

---

truffleruby 24.1.0-dev-eedda850, like ruby 3.2.2, Oracle GraalVM Native [x86_64-linux]
Fatal error: No exception handler registered for deopt target
encodedBci: 313 (bci 74 rethrowException)
Method info: org.truffleruby.core.array.ArrayEachIteratorNode.iterateMany(ArrayEachIteratorNode.java:66)
Partial Deoptimized Stack
org.truffleruby.core.array.ArrayEachIteratorNode.iterateMany(ArrayEachIteratorNode.java:66)

---

truffleruby 24.1.0-dev-677ac08b, like ruby 3.2.2, GraalVM CE Native [x86_64-linux]
Fatal error: No exception handler registered for deopt target
encodedBci: 113 (bci 24 rethrowException)
Method info: org.truffleruby.language.RubyProcRootNode.execute(RubyProcRootNode.java:77)
Partial Deoptimized Stack
org.truffleruby.language.RubyProcRootNode.execute(RubyProcRootNode.java:77)

---

truffleruby 23.1.2, like ruby 3.2.2, Oracle GraalVM Native [x86_64-linux]
Fatal error: No exception handler registered for deopt target
encodedBci: 313 (bci 74 rethrowException)
Method info: org.truffleruby.core.array.ArrayEachIteratorNode.iterateMany(ArrayEachIteratorNode.java:66)
Partial Deoptimized Stack
org.truffleruby.core.array.ArrayEachIteratorNode.iterateMany(ArrayEachIteratorNode.java:66)

So they are all Fatal error: No exception handler registered for deopt target but for different methods.
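
For comparing more logs than these four, the header fields can be pulled out mechanically. This is a minimal sketch, assuming every header follows the same three-line shape as the ones quoted here; `parse_crash_header` is a hypothetical name and the regular expressions are derived only from these samples.

```ruby
# Hypothetical parser for the crash-log headers quoted in this thread.
# Returns nil if any of the three expected lines is missing.
def parse_crash_header(text)
  error = text[/^Fatal error: (.+)$/, 1]
  bci = text.match(/^encodedBci: (\d+) \(bci (\d+) (\w+)\)$/)
  method_info = text.match(/^Method info: (\S+)\((\S+):(\d+)\)$/)
  return nil unless error && bci && method_info

  {
    error: error,
    encoded_bci: bci[1].to_i,
    bci: bci[2].to_i,
    kind: bci[3],            # e.g. "rethrowException"
    method: method_info[1],  # fully qualified Java method name
    file: method_info[2],
    line: method_info[3].to_i
  }
end

log = <<~LOG
  Fatal error: No exception handler registered for deopt target
  encodedBci: 37 (bci 5 rethrowException)
  Method info: org.truffleruby.language.control.WhileNode.execute(WhileNode.java:37)
LOG

header = parse_crash_header(log)
puts "#{header[:kind]} in #{header[:method]} (#{header[:file]}:#{header[:line]})"
```

Grouping the parsed headers by `:error` and `:kind` would make the shared failure mode visible while still listing the distinct methods.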

And the stacktraces are all similar:

Stacktrace for the failing thread 0x00007f6a88004f80 (A=AOT compiled, J=JIT compiled, D=deoptimized, i=inlined):
  i  SP 0x00007f6a911fa830 IP 0x00007f6b58152550 size=48    [image code] com.oracle.svm.core.jdk.VMErrorSubstitutions.shutdown(VMErrorSubstitutions.java:148)
  A  SP 0x00007f6a911fa830 IP 0x00007f6b58152550 size=48    [image code] com.oracle.svm.core.jdk.VMErrorSubstitutions.shouldNotReachHere(VMErrorSubstitutions.java:141)
  A  SP 0x00007f6a911fa860 IP 0x00007f6b58226954 size=16    [image code] com.oracle.svm.core.util.VMError.shouldNotReachHere(VMError.java:90)
  A  SP 0x00007f6a911fa870 IP 0x00007f6b580cb305 size=80    [image code] com.oracle.svm.core.deopt.Deoptimizer.fatalDeoptimizationError0(Deoptimizer.java:1322)
  i  SP 0x00007f6a911fa8c0 IP 0x00007f6b580c553c size=48    [image code] com.oracle.svm.core.deopt.Deoptimizer.fatalDeoptimizationError(Deoptimizer.java:1301)
  A  SP 0x00007f6a911fa8c0 IP 0x00007f6b580c553c size=48    [image code] com.oracle.svm.core.deopt.DeoptimizedFrame.throwMissingExceptionHandler(DeoptimizedFrame.java:413)
  A  SP 0x00007f6a911fa8f0 IP 0x00007f6b580c5457 size=80    [image code] com.oracle.svm.core.deopt.DeoptimizedFrame.takeException(DeoptimizedFrame.java:402)
  i  SP 0x00007f6a911fa940 IP 0x00007f6b5820203f size=224   [image code] com.oracle.svm.core.snippets.ExceptionUnwind.deoptTakeExceptionInterruptible(ExceptionUnwind.java:266)
  A  SP 0x00007f6a911fa940 IP 0x00007f6b5820203f size=224   [image code] com.oracle.svm.core.snippets.ExceptionUnwind.defaultUnwindException(ExceptionUnwind.java:223)
  A  SP 0x00007f6a911faa20 IP 0x00007f6b58202e70 size=32    [image code] com.oracle.svm.core.snippets.ExceptionUnwind.unwindExceptionInterruptible(ExceptionUnwind.java:129)
  A  SP 0x00007f6a911faa40 IP 0x00007f6b582031ce size=1232  [image code] com.oracle.svm.core.snippets.ExceptionUnwind.unwindExceptionWithCalleeSavedRegisters(ExceptionUnwind.java:106)
  A  SP 0x00007f6a911faf10 IP 0x00007f6b5825a8f5 size=1280  [image code] com.oracle.svm.truffle.api.SubstrateThreadLocalHandshake.pollStub(SubstrateThreadLocalHandshake.java)
  D  SP 0x00007f6a911fb410 IP 0x00007f6b580ca190 size=64    [image code, deopt] org.truffleruby.language.RubyProcRootNode.execute**(RubyProcRootNode.java:77)
  D  SP 0x00007f6a911fb410 IP 0x00007f6b580ca190 size=64    [image code, deopt] com.oracle.truffle.runtime.OptimizedCallTarget.executeRootNode**(OptimizedCallTarget.java:745)

This looks like a TruffleSafepoint poll that ends up throwing and then triggers this error.

eregon (Member) commented Feb 7, 2024

FWIW I tried to repro by just raising an exception on the main thread from another thread, but that didn't trigger this issue (maybe because it doesn't deopt):

ruby -e 't=Thread.new { sleep 1; 100.times { Thread.pass until Thread.main.backtrace&.first&.include? "each"; Thread.main.raise "bye" } }; a=Array.new(100_000_000, 0); while true; begin; a.each { |e| }; rescue => e; p e.class; end; end'

I will try to repro on dalli.

eregon (Member) commented Feb 7, 2024

Filed as GR-51913 internally.
