Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weird and inexplicable crashes when using TypedArrayContents #922

Open
robinchrist opened this issue Oct 20, 2021 · 26 comments
Open

Weird and inexplicable crashes when using TypedArrayContents #922

robinchrist opened this issue Oct 20, 2021 · 26 comments

Comments

@robinchrist
Copy link

I'm getting really weird crashes with v8 array buffers / typed arrays

Base:

size_t numEntries = 2 * TOTAL_NUM_BANDS;
size_t numBytes = sizeof(float) * numEntries;
v8::Local<v8::ArrayBuffer> dataBuffer = v8::ArrayBuffer::New(v8::Isolate::GetCurrent(), numBytes);
v8::Local<v8::Float32Array> dataArr = v8::Float32Array::New(minMaxDataBuffer, 0, numEntries);

works:

 auto dataPtr = static_cast<float*>(dataBuffer->GetContents().Data());

segfaults:

auto test = Nan::TypedArrayContents<float>(dataArr);

...but it doesn't segfault immediately, but rather a couple of seconds later, probably when some GC happens.

I don't even need to do anything with the pointer. It is enough to construct a (temporary) Nan::TypedArrayContents

I have digged through the nan_typedarray_contents.h header

once I comment out

data = static_cast<char*>(buffer->GetBackingStore()->Data()) + byte_offset;

there is no crash.

I tried to find out what this is, but now it's getting really weird:

auto dataPtr2 = static_cast<float*>(dataBuffer->GetBackingStore()->Data());
//auto dataPtr3 = static_cast<float*>(dataBuffer->GetBackingStore()->Data());

works.
Where as

auto dataPtr2 = static_cast<float*>(dataBuffer->GetBackingStore()->Data());
auto dataPtr3 = static_cast<float*>(dataBuffer->GetBackingStore()->Data());

(just the first call repeated)
segfaults immediately... I suppose this is a node bug?

I'm compiling against electron 8.0.0, node version should be 12.13.0

I'll try with a later node / electron version now

@kkoopa
Copy link
Collaborator

kkoopa commented Oct 20, 2021

Based on auto dataPtr = static_cast<float*>(dataBuffer->GetContents().Data()); working and it not crashing when you comment out https://github.com/nodejs/nan/blob/main/nan_typedarray_contents.h#L37, I suspect that something is off in a mismatch between the version of V8 being linked against and the V8 headers.
https://github.com/nodejs/nan/blob/main/nan_typedarray_contents.h#L36-L40 identifies whether to dereference a class pointer or reference based on the V8 version indicated by the headers.

@robinchrist
Copy link
Author

Hmm, but that doesn't explain why a single call static_cast<float*>(dataBuffer->GetBackingStore()->Data()); works, but doing it twice crashes

@addaleax
Copy link
Member

suspect that something is off in a mismatch between the version of V8 being linked against and the V8 headers.

If this is the case, my guess would be that this is a std::shared_ptr<> ABI mismatch (for the temporary std::shared_ptr<v8::BackingStore>). We’ve had problems with that on at least arm64 and Windows.

@robinchrist
Copy link
Author

robinchrist commented Oct 20, 2021

AMD64 and Linux... Yeah, an ABI mismatch is a possibility, but I don't know how I could prove or disprove this...

@addaleax
Copy link
Member

@robinchrist Can you maybe share the exact compiler version and flags used when building the addon?

@kkoopa
Copy link
Collaborator

kkoopa commented Oct 20, 2021

Hmm, but that doesn't explain why a single call static_cast<float*>(dataBuffer->GetBackingStore()->Data()); works, but doing it twice crashes

I missed that doing it twice causes a problem. However, auto test = Nan::TypedArrayContents<float>(dataArr); seems a bit off, Nan::TypedArrayContents does not permit assigning, copying or even moving.

@robinchrist
Copy link
Author

I missed that doing it twice causes a problem. However, auto test = Nan::TypedArrayContents<float>(dataArr); seems a bit off, Nan::TypedArrayContents does not permit assigning, copying or even moving.

Well that'd give a compile time error, so I'd say your "do not permit assigning" macro doesn't work as intended.
See godbolt example here: https://godbolt.org/z/j8584bYWj

@robinchrist Can you maybe share the exact compiler version and flags used when building the addon?
I will try to extract the flags from the build process

Ubuntu clang version 12.0.1-++20211011094644+fed41342a82f-1~exp1~20211011215105.3
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

@kkoopa
Copy link
Collaborator

kkoopa commented Oct 20, 2021

Strange, I did get a compile time error with Apple clang version 12.0.0 (clang-1200.0.32.29)
Target: x86_64-apple-darwin19.6.0 when trying it.

@kkoopa
Copy link
Collaborator

kkoopa commented Oct 20, 2021

Ah, problem is C++17.

@kkoopa
Copy link
Collaborator

kkoopa commented Oct 20, 2021

Trying to do a std::move() fails as expected. Perhaps C++17 allows this assignment expression because it is guaranteed to be identical to directly using the constructor as foo(arg);? Either way, moving it should probably be fine, non-movability is just what we default to for all objects of this nature.

@robinchrist
Copy link
Author

robinchrist commented Oct 21, 2021

I've got a stacktrace with a debug version of Electron (9.0.0, but same for all other versions)

bt
* thread #1, name = 'electron', stop reason = signal SIGILL: illegal instruction operand
  * frame #0: 0x000055c6699fe572 electron`v8::base::OS::Abort() at platform-posix.cc:457:5
    frame #1: 0x000055c6699f8ad5 electron`::V8_Fatal() at logging.cc:167:3
    frame #2: 0x000055c6699f86a5 electron`v8::base::(anonymous namespace)::DefaultDcheckHandler(char const*, int, char const*) at logging.cc:57:3
    frame #3: 0x000055c665c26b20 electron`::Unregister() at backing-store.cc:662:5
    frame #4: 0x000055c665c264a3 electron`::~BackingStore() at backing-store.cc:166:3
    frame #5: 0x000055c66583cec7 electron`::__on_zero_shared() [inlined] operator() at memory:2378:5
    frame #6: 0x000055c66583ceba electron`::__on_zero_shared() at memory:3551:5
    frame #7: 0x00007f2676e4a2b5 addon.node`std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release(this=0x00002696eefcf020) at shared_ptr_base.h:155:6
    frame #8: 0x00007f2676e4a26a addon.node`std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count(this=0x00007ffd41f7cf80) at shared_ptr_base.h:730:11
    frame #9: 0x00007f2676f2f079 addon.node`std::__shared_ptr<v8::BackingStore, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr(this=0x00007ffd41f7cf78) at shared_ptr_base.h:1169:31
    frame #10: 0x00007f2676f2f055 addon.node`std::shared_ptr<v8::BackingStore>::~shared_ptr(this=0x00007ffd41f7cf78) at shared_ptr.h:103:11
    frame #11: 0x00007f2676f25670 addon.node`auto CVO::node::StreamingSurfaceSPLWorker::HandleProgressCallback(this=(0x00002696ee7d5a30, 0x00002696ee7d5a38), arg=0x2696ef18e000)::$_4::operator()<SNIP&>(SNIP&) const at SPLCalculator.cc:723:25
    frame #12: 0x00007f2676f24a82 addon.node`void std::__invoke_impl<void, CVO::node::StreamingSurfaceSPLWorker::HandleProgressCallback(SNIP)::$_4, SNIP&>((null)=__invoke_other @ 0x00007ffd41f7d3e8, __f=(0x00002696ee7d5a30, 0x00002696ee7d5a38), __args=0x2696ef18e000)::$_4&&, SNIP&) at invoke.h:60:14
    frame #13: 0x00007f2676f24a12 addon.node`std::__invoke_result<CVO::node::StreamingSurfaceSPLWorker::HandleProgressCallback(SNIP)::$_4, SNIP&>::type std::__invoke<CVO::node::StreamingSurfaceSPLWorker::HandleProgressCallback(__fn=(0x00002696ee7d5a30, 0x00002696ee7d5a38), __args=0x2696ef18e000)::$_4, SNIP&>(CVO::node::StreamingSurfaceSPLWorker::HandleProgressCallback(SNIP)::$_4&&, SNIP&) at invoke.h:95:14
    frame #14: 0x00007f2676f249da addon.node`std::__detail::__variant::__gen_vtable_impl<true, std::__detail::__variant::_Multi_array<void (*)(CVO::node::StreamingSurfaceSPLWorker::HandleProgressCallback(SNIP)::$_4&&, std::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::unique_ptr<CVO::simengine::SurfaceResult, std::default_delete<CVO::simengine::SurfaceResult> > > const&)>, std::tuple<std::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::unique_ptr<CVO::simengine::SurfaceResult, std::default_delete<CVO::simengine::SurfaceResult> > > const&>, std::integer_sequence<unsigned long, 1ul> >::__visit_invoke_impl(__visitor=(0x00002696ee7d5a30, 0x00002696ee7d5a38), __vars=0x00002696ef0917e8)::$_4&&, std::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::unique_ptr<CVO::simengine::SurfaceResult, std::default_delete<CVO::simengine::SurfaceResult> > > const&) at variant:973:11
    frame #15: 0x00007f2676f24992 addon.node`std::__detail::__variant::__gen_vtable_impl<true, std::__detail::__variant::_Multi_array<void (*)(CVO::node::StreamingSurfaceSPLWorker::HandleProgressCallback(SNIP)::$_4&&, std::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::unique_ptr<CVO::simengine::SurfaceResult, std::default_delete<CVO::simengine::SurfaceResult> > > const&)>, std::tuple<std::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::unique_ptr<CVO::simengine::SurfaceResult, std::default_delete<CVO::simengine::SurfaceResult> > > const&>, std::integer_sequence<unsigned long, 1ul> >::__do_visit_invoke(__visitor=(0x00002696ee7d5a30, 0x00002696ee7d5a38), __vars=0x00002696ef0917e8)::$_4&&, std::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::unique_ptr<CVO::simengine::SurfaceResult, std::default_delete<CVO::simengine::SurfaceResult> > > const&) at variant:981:9
    frame #16: 0x00007f2676f246b2 addon.node`std::__detail::__variant::__gen_vtable_impl<true, std::__detail::__variant::_Multi_array<void (*)(CVO::node::StreamingSurfaceSPLWorker::HandleProgressCallback(SNIP)::$_4&&, std::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::unique_ptr<CVO::simengine::SurfaceResult, std::default_delete<CVO::simengine::SurfaceResult> > > const&)>, std::tuple<std::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::unique_ptr<CVO::simengine::SurfaceResult, std::default_delete<CVO::simengine::SurfaceResult> > > const&>, std::integer_sequence<unsigned long, 1ul> >::__visit_invoke(__visitor=(0x00002696ee7d5a30, 0x00002696ee7d5a38), __vars=0x00002696ef0917e8)::$_4&&, std::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::unique_ptr<CVO::simengine::SurfaceResult, std::default_delete<CVO::simengine::SurfaceResult> > > const&) at variant:997:11
   frame #17: 0x00007f2676f245f5 addon.node`decltype(__visitor=(0x00002696ee7d5a30, 0x00002696ee7d5a38), __variants=0x00002696ef0917e8) std::__do_visit<false, true, CVO::node::StreamingSurfaceSPLWorker::HandleProgressCallback(SNIP)::$_4, std::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::unique_ptr<CVO::simengine::SurfaceResult, std::default_delete<CVO::simengine::SurfaceResult> > > const&>(CVO::node::StreamingSurfaceSPLWorker::HandleProgressCallback(SNIP)::$_4&&, std::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::unique_ptr<CVO::simengine::SurfaceResult, std::default_delete<CVO::simengine::SurfaceResult> > > const&) at variant:1644:14
    frame #18: 0x00007f2676f232e4 addon.node`decltype(__visitor=(0x00002696ee7d5a30, 0x00002696ee7d5a38), __variants=0x00002696ef0917e8) std::visit<CVO::node::StreamingSurfaceSPLWorker::HandleProgressCallback(SNIP)::$_4, std::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::unique_ptr<CVO::simengine::SurfaceResult, std::default_delete<CVO::simengine::SurfaceResult> > > const&>(CVO::node::StreamingSurfaceSPLWorker::HandleProgressCallback(SNIP)::$_4&&, std::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::unique_ptr<CVO::simengine::SurfaceResult, std::default_delete<CVO::simengine::SurfaceResult> > > const&) at variant:1655:14
    frame #19: 0x00007f2676f22f3b addon.node`CVO::node::StreamingSurfaceSPLWorker::HandleProgressCallback(this=0x00002696ee7d5a20, progressStructArr=0x00002696ef0917e8, count=1) at SPLCalculator.cc:676:13
    frame #20: 0x00007f2676f3e90a addon.node`CVO::node::AsyncStreamingWorker<std::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::unique_ptr<CVO::simengine::SurfaceResult, std::default_delete<CVO::simengine::SurfaceResult> > > >::WorkProgress(this=0x00002696ee7d5a20) at SPLCalculator.h:124:27
    frame #21: 0x00007f2676f3e0d0 addon.node`CVO::node::AsyncStreamingWorker<std::variant<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::unique_ptr<CVO::simengine::SurfaceResult, std::default_delete<CVO::simengine::SurfaceResult> > > >::AsyncProgress_(async=0x00002696ee7d5a40) at SPLCalculator.h:147:21
    frame #22: 0x000055c66c6fad24 electron`uv__async_io at async.c:147:5
    frame #23: 0x000055c66c70a1a8 electron`uv__io_poll at linux-core.c:421:11
    frame #24: 0x000055c66c6fb181 electron`uv_run at core.c:375:5
    frame #25: 0x000055c66469d05a electron`::UvRunOnce() at node_bindings.cc:425:11
    frame #26: 0x000055c66455fd04 electron`::Run() [inlined] Invoke<void (electron::api::BrowserWindow::*)(), const base::WeakPtr<electron::api::BrowserWindow> &> at bind_internal.h:489:12
    frame #27: 0x000055c66455fc89 electron`::Run() [inlined] MakeItSo<void (electron::api::BrowserWindow::*const &)(), const base::WeakPtr<electron::api::BrowserWindow> &> at bind_internal.h:643:5
    frame #28: 0x000055c66455fc66 electron`::Run() [inlined] RunImpl<void (electron::api::BrowserWindow::*const &)(), const std::__1::tuple<base::WeakPtr<electron::api::BrowserWindow>> &, 0> at bind_internal.h:696:12
    frame #29: 0x000055c66455fc66 electron`::Run() at bind_internal.h:678:12
    frame #30: 0x000055c667df2842 electron`::RunTask() [inlined] Run at callback.h:98:12
    frame #31: 0x000055c667df27e2 electron`::RunTask() at task_annotator.cc:142:33
    frame #32: 0x000055c667e0f124 electron`::DoWorkImpl() at thread_controller_with_message_pump_impl.cc:324:23
    frame #33: 0x000055c667e0edf2 electron`::DoWork() at thread_controller_with_message_pump_impl.cc:248:7
    frame #34: 0x000055c667db3064 electron`::Run() at message_pump_default.cc:39:55
    frame #35: 0x000055c667e0f8d7 electron`::Run() at thread_controller_with_message_pump_impl.cc:429:12
    frame #36: 0x000055c667dd6d22 electron`::Run() at run_loop.cc:124:14
    frame #37: 0x000055c66c35fd65 electron`::RendererMain() at renderer_main.cc:226:16
    frame #38: 0x000055c666d7d9cb electron`::Run() at content_main_runner_impl.cc:882:10
    frame #39: 0x000055c669865df6 electron`::Main() at main.cc:454:29
    frame #40: 0x000055c665641871 electron`content::ContentMain(content::ContentMainParams const&) at content_main.cc:19:10
    frame #41: 0x000055c66453380f electron`main at electron_main.cc:231:10
    frame #42: 0x00007f26843a90b3 libc.so.6`__libc_start_main + 243
    frame #43: 0x000055c66453352a electron`_start + 42

The assertion / check is raised in backing-store.cc in the DCHECK Line (662)

void GlobalBackingStoreRegistry::Unregister(BackingStore* backing_store) {
  if (!backing_store->globally_registered_) return;

  DCHECK_NOT_NULL(backing_store->buffer_start());

  base::MutexGuard scope_lock(&impl()->mutex_);
  const auto& result = impl()->map_.find(backing_store->buffer_start());
  if (result != impl()->map_.end()) {
    DCHECK(!result->second.lock());
    impl()->map_.erase(result);
  }
  backing_store->globally_registered_ = false;
}

@robinchrist
Copy link
Author

robinchrist commented Oct 21, 2021

@addaleax

Compile Flags for Electron 9 (that happens to be the version I have a debug build of...)

/usr/bin/c++ -DEIGEN_MPL2_ONLY -DNOGDI -DNOMINMAX -DSPDLOG_ACTIVE_LEVEL=SPDLOG_LEVEL_DEBUG -DSPDLOG_COMPILED_LIB -DSPDLOG_FMT_EXTERNAL -DSPDLOG_NO_ATOMIC_LEVELS -DV8_31BIT_SMIS_ON_64BIT_ARCH -DV8_COMPRESS_POINTERS -DVCL_NAMESPACE=vcl -DWIN32_LEAN_AND_MEAN -D_USE_MATH_DEFINES -Daddon_EXPORTS -I/home/robin/.cmake-ts/electron/linux/x64/v9.0.0/include/node -I../../../../../node_modules/@coda/nan -I<LOTS OF INCLUDES> -g3 -glldb -O0 -fPIC -march=core-avx2 -mtune=generic -fcolor-diagnostics -std=c++17 -MD -MT CMakeFiles/addon.dir/src/SPLCalculator.cc.o -MF CMakeFiles/addon.dir/src/SPLCalculator.cc.o.d -o CMakeFiles/addon.dir/src/SPLCalculator.cc.o -c ../../../../../src/SPLCalculator.cc
Ubuntu clang version 12.0.1-++20211011094644+fed41342a82f-1~exp1~20211011215105.3
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

@kkoopa
Copy link
Collaborator

kkoopa commented Oct 22, 2021

Based on what @addaleax mentioned, the std::shared_ptr<v8::BackingStore> in the backtrace does stick out.

@robinchrist
Copy link
Author

Yeah it definitely sticks out... But I'm not sure where the ABI mismatch could be.

I'm definitely building against the right node version (otherwise the addon wouldn't be able to be loaded by node because the node ABI version is embedded in the path where the actual shared library is located)

@kkoopa
Copy link
Collaborator

kkoopa commented Oct 22, 2021

Are there threads involved somehow?

@robinchrist
Copy link
Author

robinchrist commented Oct 22, 2021

Yes, in the original version. But this definitely happens inside the main loop (in an async worker).

I have prepared another minimal example (don't know whether I'll be able to push it today, it's 3AM), without any threads or whatever involved.

The issue depends on the size of the buffer!

NAN_METHOD(CalculateSync) {
    Nan::HandleScope scope;
    v8::Local<v8::ArrayBuffer> fieldDataBuffer = v8::ArrayBuffer::New(v8::Isolate::GetCurrent(), 245761 * sizeof(float));
    v8::Local<v8::Float32Array> fieldDataArr = v8::Float32Array::New(fieldDataBuffer, 0, 245761);
    
    float* ptr = (float*)(fieldDataBuffer->GetBackingStore()->Data());

    ptr[0] = 1.0;

    info.GetReturnValue().Set(fieldDataArr);
  }

Renderer crashes for 245761, but works for 245760

If you remove the Float32Array and only return the Buffer, everything works fine.

Yes, I have actually bisected the issue

9 works
134217728 crash
67108868 crash
33554438 crash
100000 works
16827219 crash
8463609 crash
528975 crash
264487 crash
182243 works
223365 works
202804 works
243926 works
254206 crash
249066 crash
246496 crash
245211 works
245853 crash
245532 works
245692 works
245772 crash
245732 works
245752 works
245762 crash
245757 works

245760 works
245761 crash

@robinchrist
Copy link
Author

bt
* thread #1, name = 'electron', stop reason = signal SIGILL: illegal instruction operand
  * frame #0: 0x000055d4a39df572 electron`v8::base::OS::Abort() at platform-posix.cc:457:5
    frame #1: 0x000055d4a39d9ad5 electron`::V8_Fatal() at logging.cc:167:3
    frame #2: 0x000055d4a39d96a5 electron`v8::base::(anonymous namespace)::DefaultDcheckHandler(char const*, int, char const*) at logging.cc:57:3
    frame #3: 0x000055d49fc07b20 electron`::Unregister() at backing-store.cc:662:5
    frame #4: 0x000055d49fc074a3 electron`::~BackingStore() at backing-store.cc:166:3
    frame #5: 0x000055d49f81dec7 electron`::__on_zero_shared() [inlined] operator() at memory:2378:5
    frame #6: 0x000055d49f81deba electron`::__on_zero_shared() at memory:3551:5
    frame #7: 0x00007f8a82b50fd5 addon.node`std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release(this=0x0000197ae65b07c0) at shared_ptr_base.h:155:6
    frame #8: 0x00007f8a82b50f8a addon.node`std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count(this=0x00007ffc5baea030) at shared_ptr_base.h:730:11
frame #9: 0x00007f8a82b50f59 addon.node`std::__shared_ptr<v8::BackingStore, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr(this=0x00007ffc5baea028) at shared_ptr_base.h:1169:31
    frame #10: 0x00007f8a82b50e25 addon.node`std::shared_ptr<v8::BackingStore>::~shared_ptr(this=0x00007ffc5baea028) at shared_ptr.h:103:11
    frame #11: 0x00007f8a82b50c83 addon.node`CVO::node::CalculateSync(info=0x00007ffc5baea0c8) at sync.cc:25:18
    frame #12: 0x00007f8a82b4fba9 addon.node`Nan::imp::FunctionCallbackWrapper(info=0x00007ffc5baea450) at nan_callbacks_12_inl.h:176:3
    frame #13: 0x000055d49f8878d7 electron`::Call() at api-arguments-inl.h:158:3
    frame #14: 0x000055d49f886094 electron`::HandleApiCallHelper<false>() at builtins-api.cc:111:36
    frame #15: 0x000055d49f884227 electron`::Builtin_Impl_HandleApiCall() at builtins-api.cc:141:5
    frame #16: 0x000055d49f883d0a electron`::Builtin_HandleApiCall() at builtins-api.cc:129:1
    frame #17: 0x000055d4a07e349f electron`Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit + 63
    frame #18: 0x000055d4a05c5bb5 electron`Builtins_InterpreterEntryTrampoline + 213
    frame #19: 0x000055d4a05bc65a electron`Builtins_JSEntryTrampoline + 90
    frame #20: 0x000055d4a05bc438 electron`Builtins_JSEntry + 120
    frame #21: 0x000055d49f9b1f4c electron`::Invoke() [inlined] Call at simulator.h:142:12
    frame #22: 0x000055d49f9b1f41 electron`::Invoke() at execution.cc:367:33
    frame #23: 0x000055d49f9b0d68 electron`::Call() at execution.cc:461:10
    frame #24: 0x000055d49f95ab80 electron`::Global() at debug-evaluate.cc:54:32
    frame #25: 0x000055d49f811286 electron`::EvaluateGlobal() at api.cc:10119:7
    frame #26: 0x000055d4a026ae09 electron`::evaluate() at v8-runtime-agent-impl.cc:288:24
    frame #27: 0x000055d4a02206b3 electron`::evaluate() at Runtime.cpp:2084:16
    frame #28: 0x000055d4a0201b28 electron`::__call_impl<std::__1::__function::__default_alloc_func<(lambda at gen/v8/src/inspector/protocol/Console.cpp:235:10), void (const v8_crdtp::Dispatchable &)>>() [inlined] operator() at Console.cpp:241:5
    frame #29: 0x000055d4a0201ab8 electron`::__call_impl<std::__1::__function::__default_alloc_func<(lambda at gen/v8/src/inspector/protocol/Console.cpp:235:10), void (const v8_crdtp::Dispatchable &)>>() [inlined] __invoke<(lambda at gen/v8/src/inspector/protocol/Console.cpp:235:10) &, const v8_crdtp::Dispatchable &> at type_traits:3529:1
    frame #30: 0x000055d4a0201ab8 electron`::__call_impl<std::__1::__function::__default_alloc_func<(lambda at gen/v8/src/inspector/protocol/Console.cpp:235:10), void (const v8_crdtp::Dispatchable &)>>() [inlined] __call<(lambda at gen/v8/src/inspector/protocol/Console.cpp:235:10) &, const v8_crdtp::Dispatchable &> at __functional_base:348:9
    frame #31: 0x000055d4a0201ab8 electron`::__call_impl<std::__1::__function::__default_alloc_func<(lambda at gen/v8/src/inspector/protocol/Console.cpp:235:10), void (const v8_crdtp::Dispatchable &)>>() [inlined] operator() at functional:1590:12
    frame #32: 0x000055d4a0201ab8 electron`::__call_impl<std::__1::__function::__default_alloc_func<(lambda at gen/v8/src/inspector/protocol/Console.cpp:235:10), void (const v8_crdtp::Dispatchable &)>>() at functional:2071:16
    frame #33: 0x000055d4a0293d83 electron`::Run() [inlined] operator() at functional:2203:16
    frame #34: 0x000055d4a0293d7d electron`::Run() [inlined] operator() at functional:2473:12
    frame #35: 0x000055d4a0293d7d electron`::Run() at dispatch.cc:501:3
    frame #36: 0x000055d4a0263ca2 electron`::dispatchProtocolMessage() at v8-inspector-session-impl.cc:378:39
    frame #37: 0x000055d4a470abb8 electron`::DispatchProtocolCommandImpl() at devtools_session.cc:223:18
    frame #38: 0x000055d49f6ae459 electron`::Accept() at devtools_agent.mojom-blink.cc:835:13
    frame #39: 0x000055d4a21124af electron`::HandleValidatedMessage() at interface_endpoint_client.cc:554:54
    frame #40: 0x000055d4a2118fcb electron`::Accept() at message_dispatcher.cc:41:19
    frame #41: 0x000055d4a2113c95 electron`::HandleIncomingMessage() at interface_endpoint_client.cc:356:22
    frame #42: 0x000055d4a2444f1f electron`::AcceptOnProxyThread() at ipc_mojo_bootstrap.cc:933:24
    frame #43: 0x000055d4a244193c electron`::RunOnce() [inlined] Invoke<void (IPC::(anonymous namespace)::ChannelAssociatedGroupController::*)(mojo::Message), scoped_refptr<IPC::(anonymous namespace)::ChannelAssociatedGroupController>, mojo::Message> at bind_internal.h:489:12
    frame #44: 0x000055d4a2441913 electron`::RunOnce() [inlined] MakeItSo<void (IPC::(anonymous namespace)::ChannelAssociatedGroupController::*)(mojo::Message), scoped_refptr<IPC::(anonymous namespace)::ChannelAssociatedGroupController>, mojo::Message> at bind_internal.h:623:12
    frame #45: 0x000055d4a2441913 electron`::RunOnce() [inlined] RunImpl<void (IPC::(anonymous namespace)::ChannelAssociatedGroupController::*)(mojo::Message), std::__1::tuple<scoped_refptr<IPC::(anonymous namespace)::ChannelAssociatedGroupController>, mojo::Message>, 0, 1> at bind_internal.h:696:12
    frame #46: 0x000055d4a24418c9 electron`::RunOnce() at bind_internal.h:665:12
    frame #47: 0x000055d4a1dd3842 electron`::RunTask() [inlined] Run at callback.h:98:12
    frame #48: 0x000055d4a1dd37e2 electron`::RunTask() at task_annotator.cc:142:33
    frame #49: 0x000055d4a1df0124 electron`::DoWorkImpl() at thread_controller_with_message_pump_impl.cc:324:23
    frame #50: 0x000055d4a1defdf2 electron`::DoWork() at thread_controller_with_message_pump_impl.cc:248:7
    frame #51: 0x000055d4a1d94064 electron`::Run() at message_pump_default.cc:39:55
    frame #52: 0x000055d4a1df08d7 electron`::Run() at thread_controller_with_message_pump_impl.cc:429:12
    frame #53: 0x000055d4a1db7d22 electron`::Run() at run_loop.cc:124:14
    frame #54: 0x000055d4a6340d65 electron`::RendererMain() at renderer_main.cc:226:16
    frame #55: 0x000055d4a0d5e9cb electron`::Run() at content_main_runner_impl.cc:882:10
    frame #56: 0x000055d4a3846df6 electron`::Main() at main.cc:454:29
    frame #57: 0x000055d49f622871 electron`content::ContentMain(content::ContentMainParams const&) at content_main.cc:19:10
    frame #58: 0x000055d49e51480f electron`main at electron_main.cc:231:10
    frame #59: 0x00007f8a87a8b0b3 libc.so.6`__libc_start_main + 243
    frame #60: 0x000055d49e51452a electron`_start + 42

@kkoopa
Copy link
Collaborator

kkoopa commented Oct 23, 2021

I tried this and did not see a crash. Could this problem be specific to Electron or is it exposing a deeper flaw?

#include <nan.h>

#include <cstdint>

NAN_METHOD(CalculateSync) {
    Nan::HandleScope scope;
    int32_t len = Nan::To<int32_t>(info[0]).FromJust();
    v8::Local<v8::ArrayBuffer> fieldDataBuffer = v8::ArrayBuffer::New(v8::Isolate::GetCurrent(), len * sizeof(float));
    v8::Local<v8::Float32Array> fieldDataArr = v8::Float32Array::New(fieldDataBuffer, 0, len);

    float* ptr = (float*)(fieldDataBuffer->GetBackingStore()->Data());

    ptr[0] = 1.0;

    info.GetReturnValue().Set(fieldDataArr);
  }

NAN_MODULE_INIT(InitAll) {
  Nan::Set(target, Nan::New<v8::String>("calculateSync").ToLocalChecked(),
    Nan::GetFunction(Nan::New<v8::FunctionTemplate>(CalculateSync)).ToLocalChecked());
}

NODE_MODULE(addon, InitAll)
var addon = require('./build/Release/addon');

for (var size = 245760; size != 528975; ++size)
	addon.calculateSync(size);

@robinchrist
Copy link
Author

robinchrist commented Oct 25, 2021

Update:
Only in the release version does the renderer crash depend on the array size
For the debug version, the renderer crash happens regardless of the array size

@devyetii
Copy link

I think this is related to what happens here : #892

@robinchrist
Copy link
Author

robinchrist commented Nov 21, 2021

Oh yeah, I forgot to follow up.

TL;DR
I'm almost certain that this is an ABI issue

Electron and Node are built against libc++ whereas the default on Linux is still libstdc++

Everything was fine until "now" (meaning the point when I upgraded electron and thus node). The native node API was very C-ish, so the ABI issue which was there all the time didn't matter.

But at some point, the node guys decided that it would be a clever thing to change the API and now pass a non-trivial object, aka shared_ptr across an ABI boundary.
Spoiler: It is not a good idea.

While I understand the reason (more intuitive and idiomatic code within node), it is a huge issue for everyone using the native node API.
They do offer an alternative, N-API, but it may be a whole lot of work to port your addon from native node API to N-API.

So for me, the only viable solution was to port the entire addon (more than 10k lines) to N-API / node-addon-api.

I am still a bit mad about that decision from the node devs, but I have successfully ported the addon to node-addon-api / N-API and everything is working fine.

My personal recommendation:
Stop using nan and switch to node-addon-api. There is not a single feature I am missing (and I use quite a lot).
node-addon-api / N-API avoids subtle ABI issues and inherently solves the context awareness issue.
nan should be abandoned / archived, its use officially be discouraged and superseded by node-addon-api`

If you want more bloody details, here's a story from a guy on medium which almost applys 1:1
https://nornagon.medium.com/a-libc-odyssey-973e51649063

@addaleax
Copy link
Member

I am still a bit mad about that decision from the node devs

Not to deflect, but unfortunately we weren’t given a choice here – this came from V8/Chromium. They don’t prioritize situations like the Node.js one, where the V8 “embedder” doesn’t consist of a monolithic codebase and instead potentially includes sub-embedders like Electron and users who write addons against the V8 API.

Stop using nan and switch to node-addon-api

👍 👍 👍

@robinchrist
Copy link
Author

Maybe this should be added to the "Known issues" section? And it would probably be good to place a warning at the very top

like

Please be aware that nan, as the name says, uses the native Node.js API
Due to the way Node.js (and Chromium) work, using nan or the native Node.js API may lead to a variety of issues, most importantly ABI issues which may crash the application when certain features are used.
This is especially important if you're planning to write an addon for Electron
Unless you're sure you want to go this way and know what you're doing, please consider using node-addon-api (based on the ABI stable Node-API / N-API) instead

Opinions on this?

@robinchrist
Copy link
Author

robinchrist commented Nov 23, 2021

Not to deflect, but unfortunately we weren’t given a choice here – this came from V8/Chromium. They don’t prioritize situations like the Node.js one, where the V8 “embedder” doesn’t consist of a monolithic codebase and instead potentially includes sub-embedders like Electron and users who write addons against the V8 API.

Well, in that case I'm mad at the Chromium / V8 devs now 😁

@kkoopa
Copy link
Collaborator

kkoopa commented Nov 23, 2021 via email

@addaleax
Copy link
Member

NAN, being a header-only library, has never promised a stable ABI.

Right – to be clear, there’s also nothing wrong with NAN here. It’s Node.js that does more or less promise to expose a stable ABI through its module version number, and fails to do so here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants