crash in v8::ArrayBuffer::GetBackingStore #54

Open
gireeshpunathil opened this issue Feb 11, 2021 · 35 comments

@gireeshpunathil

Steps to reproduce:

gireesh@p1:~/mq/rec1$ node -v
v14.15.4

gireesh@p1:~/mq/rec1$ cat rec.js

var ref = require('ref-napi');
var st = require('ref-struct-di')(ref);
var at = require('ref-array-di')(ref);

var rt = st({
  asid: at(ref.types.char,1024),
  os: st({ptr: ref.refType(ref.types.void)})
})

var idx = 0
const r = new rt()
r.os.ptr = Buffer.alloc(10)
setInterval(() => {
  ref.reinterpret(r.os.ptr, 0, 'utf8');
  if (++idx % 1000 === 0) console.log(`${idx}th iteration`)
  r.os.ptr = Buffer.alloc(10)
},1)

gireesh@p1:~/mq/rec1$ node rec

1000th iteration


#
# Fatal error in , line 0
# Check failed: result.second.
#
#
#
#FailureMessage Object: 0x7ffd839cc040
 1: 0xa71261  [node]
 2: 0x19d3254 V8_Fatal(char const*, ...) [node]
 3: 0xe5b549 v8::internal::GlobalBackingStoreRegistry::Register(std::shared_ptr<v8::internal::BackingStore>) [node]
 4: 0xba6a98 v8::ArrayBuffer::GetBackingStore() [node]
 5: 0x9c1be0 napi_get_typedarray_info [node]
 6: 0x7fea6e5ef0ef  [/home/gireesh/mq/rec1/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node]
 7: 0x7fea6e5f15bb  [/home/gireesh/mq/rec1/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node]
 8: 0x7fea6e5f7ceb Napi::details::CallbackData<void (*)(Napi::CallbackInfo const&), void>::Wrapper(napi_env__*, napi_callback_info__*) [/home/gireesh/mq/rec1/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node]
 9: 0x9b8c6f  [node]
10: 0xbe571b  [node]
11: 0xbe6cc6  [node]
12: 0xbe7346 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [node]
13: 0x14012d9  [node]
Illegal instruction (core dumped)
gireesh@p1:~/mq/rec1$ 
@gireeshpunathil
Author

I have a more elaborate stack trace, if that helps. It shows WritePointer on the stack at the point where the backing store is required.

(lldb) bt
* thread #1, name = 'node', stop reason = signal SIGILL
  * frame #0: 0x0000000000962729 node`v8::base::OS::Abort() + 9
    frame #1: 0x00000000019d3265 node`V8_Fatal(char const*, ...) + 357
    frame #2: 0x0000000000e5b549 node`v8::internal::GlobalBackingStoreRegistry::Register(std::shared_ptr<v8::internal::BackingStore>) + 697
    frame #3: 0x0000000000ba6a98 node`v8::ArrayBuffer::GetBackingStore() + 88
    frame #4: 0x00000000009c1be0 node`napi_get_typedarray_info + 256
    frame #5: 0x00007f65a886b0ef node.napi.node`(anonymous namespace)::GetBufferData(Napi::Value) + 95
    frame #6: 0x00007f65a886d5bb node.napi.node`(anonymous namespace)::WritePointer(Napi::CallbackInfo const&) + 155
    frame #7: 0x00007f65a8873ceb node.napi.node`Napi::details::CallbackData<void (*)(Napi::CallbackInfo const&), void>::Wrapper(napi_env__*, napi_callback_info__*) + 235
    frame #8: 0x00000000009b8c6f node`v8impl::(anonymous namespace)::FunctionCallbackWrapper::Invoke(v8::FunctionCallbackInfo<v8::Value> const&) + 127
    frame #9: 0x0000000000be571b node`v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) + 491
    frame #10: 0x0000000000be6cc6 node`v8::internal::Builtin_Impl_HandleApiCall(v8::internal::BuiltinArguments, v8::internal::Isolate*) + 166
    frame #11: 0x0000000000be7346 node`v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) + 22
    frame #12: 0x00000000014012d9 node`Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit + 57
    frame #13: 0x000000000139a582 node`Builtins_InterpreterEntryTrampoline + 194
    frame #14: 0x00001b9541a06435
    frame #15: 0x000000000139a582 node`Builtins_InterpreterEntryTrampoline + 194
    frame #16: 0x000000000139829a node`Builtins_JSEntryTrampoline + 90
    frame #17: 0x0000000001398078 node`Builtins_JSEntry + 120
    frame #18: 0x0000000000cc2cf1 node`v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) + 449
    frame #19: 0x0000000000cc3b5f node`v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) + 95
    frame #20: 0x0000000000f266ab node`v8::internal::Object::SetPropertyWithAccessor(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::Maybe<v8::internal::ShouldThrow>) + 347
    frame #21: 0x0000000000f469e3 node`v8::internal::Object::SetPropertyInternal(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::Maybe<v8::internal::ShouldThrow>, v8::internal::StoreOrigin, bool*) + 611
    frame #22: 0x0000000000f46fb2 node`v8::internal::Object::SetProperty(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::internal::StoreOrigin, v8::Maybe<v8::internal::ShouldThrow>) + 98
    frame #23: 0x0000000000dd61cf node`v8::internal::StoreIC::Store(v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Name>, v8::internal::Handle<v8::internal::Object>, v8::internal::StoreOrigin) + 575
    frame #24: 0x0000000000dd6af6 node`v8::internal::Runtime_StoreIC_Miss(int, unsigned long*, v8::internal::Isolate*) + 342
    frame #25: 0x00000000014011f9 node`Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit + 57
    frame #26: 0x0000000001476a7e node`Builtins_StaNamedPropertyHandler + 126
    frame #27: 0x000000000139a582 node`Builtins_InterpreterEntryTrampoline + 194
    frame #28: 0x000000000139a582 node`Builtins_InterpreterEntryTrampoline + 194
    frame #29: 0x000000000139a582 node`Builtins_InterpreterEntryTrampoline + 194
    frame #30: 0x000000000139a582 node`Builtins_InterpreterEntryTrampoline + 194
    frame #31: 0x000000000139a582 node`Builtins_InterpreterEntryTrampoline + 194
    frame #32: 0x000000000139a582 node`Builtins_InterpreterEntryTrampoline + 194
    frame #33: 0x000000000139a582 node`Builtins_InterpreterEntryTrampoline + 194
    frame #34: 0x000000000139a582 node`Builtins_InterpreterEntryTrampoline + 194
    frame #35: 0x000000000139829a node`Builtins_JSEntryTrampoline + 90
    frame #36: 0x0000000001398078 node`Builtins_JSEntry + 120
    frame #37: 0x0000000000cc2cf1 node`v8::internal::(anonymous namespace)::Invoke(v8::internal::Isolate*, v8::internal::(anonymous namespace)::InvokeParams const&) + 449
    frame #38: 0x0000000000cc3b5f node`v8::internal::Execution::Call(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, int, v8::internal::Handle<v8::internal::Object>*) + 95
    frame #39: 0x0000000000b8ba54 node`v8::Function::Call(v8::Local<v8::Context>, v8::Local<v8::Value>, int, v8::Local<v8::Value>*) + 324
    frame #40: 0x000000000096b347 node`node::InternalMakeCallback(node::Environment*, v8::Local<v8::Object>, v8::Local<v8::Object>, v8::Local<v8::Function>, int, v8::Local<v8::Value>*, node::async_context) + 631
    frame #41: 0x000000000096b51c node`node::MakeCallback(v8::Isolate*, v8::Local<v8::Object>, v8::Local<v8::Function>, int, v8::Local<v8::Value>*, node::async_context) + 156
    frame #42: 0x00000000009d68fb node`napi_make_callback + 875
    frame #43: 0x00007f65a8641bc0 ffi_bindings.node`FFI::FFI::FinishAsyncFFICall(uv_work_s*, int) + 288
    frame #44: 0x00000000013774ed node`uv__work_done(handle=0x000000000446c870) at threadpool.c:313:5
    frame #45: 0x000000000137bae6 node`uv__async_io.part.1 at async.c:163:5
    frame #46: 0x000000000138e5c5 node`uv__io_poll at linux-core.c:462:11
    frame #47: 0x000000000137c418 node`uv_run(loop=0x000000000446c7c0, mode=UV_RUN_DEFAULT) at core.c:385:5
    frame #48: 0x0000000000a44974 node`node::NodeMainInstance::Run() + 580
    frame #49: 0x00000000009d1e15 node`node::Start(int, char**) + 277
    frame #50: 0x00007f65ab2aa0b3 libc.so.6`__libc_start_main + 243
    frame #51: 0x00000000009694cc node`_start + 41
(lldb) 

Could this be due to GC (incorrectly) collecting the Buffer object, as it is not referenced by anything in JS space?

@gireeshpunathil
Author

Chatting with @mhdawson, it was revealed that the crash materializes because an entry for the backing store was found in the global backing store registry where one was not expected.

I am not good at interpreting the trace log from --trace-backing-store, but looking at the last entry

BS:reg bs=0x55b41104b420 mem=0x55b411217430 (length=10, capacity=10)

and searching backwards, I found a matching registration for a huge chunk:

BS:reg bs=0x55b411217450 mem=0x55b411217430 (length=1073741823, capacity=1073741823)
The backing stores are different, with capacities of 1 GB and 10 bytes, but they share backing_store->buffer_start() (the starting address of the chunk?). There is no evidence of the large chunk having been freed yet.
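The collision described above can be sketched in plain JavaScript (assumption: this is an illustrative model, not V8's actual implementation). V8's global backing store registry keys entries by the buffer's start address, and registering a second backing store whose start address is already present trips the fatal CHECK(result.second) seen in the crash:

```javascript
// Simplified model of GlobalBackingStoreRegistry::Register (assumption:
// illustrative only). Two distinct backing stores that share a start
// address collide on the registry key.
const registry = new Map();

function register(startAddress, backingStore) {
  if (registry.has(startAddress)) {
    // corresponds to "Check failed: result.second." in backing-store.cc
    throw new Error('Check failed: result.second.');
  }
  registry.set(startAddress, backingStore);
}

register(0x55b411217430, { length: 1073741823 }); // the 1 GB chunk
try {
  register(0x55b411217430, { length: 10 });       // 10-byte chunk, same start
} catch (e) {
  console.log(e.message); // Check failed: result.second.
}
```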

@mhdawson
Contributor

mhdawson commented Feb 11, 2021

After incrementally adding a bunch of instrumentation, it looked to be related to WritePointer being called, as that seemed to be what was always on the stack when the crash occurred. (EDIT: I see Gireesh pointed that out above, but I'd started down the instrumentation path before then.)

In writePointer

exports.writePointer = function writePointer (buf, offset, ptr) {
  debug('writing pointer to buffer', buf, offset, ptr);
  exports._writePointer(buf, offset, ptr);
  exports._attach(buf, ptr);
};

_attach tries to avoid the buffer to which the pointer refers being freed before the buffer to which the pointer was written. However, I seem to remember @addaleax mentioning that it was difficult to ensure that the cleanup for the buffer to which the pointer was written was complete before the referenced buffer was freed. Currently it only ensures that both buffers become eligible to be collected and their finalizers run at the same time, but does not ensure the order in which that will happen.
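The lifetime-coupling idea behind _attach can be sketched in plain JavaScript (assumption: a simplified model, not ref-napi's actual code): a reference is stored from the target buffer to the pointed-to buffer, so the latter stays reachable at least as long as the former.

```javascript
// Sketch of _attach-style lifetime coupling (assumption: simplified,
// illustrative model). Storing `dependency` on `target` keeps the memory
// the pointer refers to reachable while `target` is alive, but says
// nothing about the ORDER in which their finalizers later run.
function attach(target, dependency) {
  if (target._refs === undefined) {
    Object.defineProperty(target, '_refs', { value: [], enumerable: false });
  }
  target._refs.push(dependency);
}

const outer = Buffer.alloc(8);   // buffer the pointer is written into
const inner = Buffer.alloc(10);  // buffer the pointer refers to
attach(outer, inner);
console.log(outer._refs.length); // 1
```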

This patch will ensure that the buffer to which the pointer refers will only become eligible to be collected AFTER the finalizers for the buffer to which the pointer was written have been run. It seems to avoid the crash:

diff --git a/src/binding.cc b/src/binding.cc
index d1fefbd..2cbc5f7 100644
--- a/src/binding.cc
+++ b/src/binding.cc
@@ -355,6 +355,17 @@ void WritePointer(const CallbackInfo& args) {
   if (input.IsNull()) {
     *reinterpret_cast<char**>(ptr) = nullptr;
   } else {
+    // create a napi reference and finalizer to ensure that
+    // the buffer whose pointer is written can only be
+    // collected after the finalizers for the buffer
+    // to which the pointer was written have already run
+    napi_ref* _ref = new napi_ref;
+    napi_create_reference(env, args[2], 1, _ref);
+    args[0].As<Object>().AddFinalizer([](Env env, napi_ref* ref) {
+      napi_delete_reference(env, *ref);
+      delete ref;
+    }, _ref);
+         
     char* input_ptr = GetBufferData(input);
     *reinterpret_cast<char**>(ptr) = input_ptr;
   }

There are other places that use _attach, so this might not be the full fix, but I thought I'd share what I've figured out so far.

@mhdawson
Contributor

OK, the same thing but using node-addon-api instead of dropping down into the Node-API C calls:

diff --git a/src/binding.cc b/src/binding.cc
index d1fefbd..7d8906e 100644
--- a/src/binding.cc
+++ b/src/binding.cc
@@ -355,6 +355,16 @@ void WritePointer(const CallbackInfo& args) {
   if (input.IsNull()) {
     *reinterpret_cast<char**>(ptr) = nullptr;
   } else {
+    // create a node-api reference and finalizer to ensure that
+    // the buffer whose pointer is written can only be
+    // collected after the finalizers for the buffer
+    // to which the pointer was written have already run
+    Reference<Value>* ref = new Reference<Value>;
+    *ref = Persistent(args[2]);
+    args[0].As<Object>().AddFinalizer([](Env env, Reference<Value>* ref) {
+      delete ref;
+    }, ref);
+
     char* input_ptr = GetBufferData(input);
     *reinterpret_cast<char**>(ptr) = input_ptr;
   }

I'm going to open a PR, and discussion on whether it is the right fix can continue there.

mhdawson added a commit to mhdawson/ref-napi that referenced this issue Feb 12, 2021
Keep a buffer written by WritePointer alive
until the finalizers for the buffer to which
the pointer was written have been executed:

Refs: node-ffi-napi#54

Signed-off-by: Michael Dawson <mdawson@devrus.com>
@mhdawson
Contributor

Created PR: #55

@prabhatmishra33

Is my issue related to this one, @gireeshpunathil @mhdawson? Here is the issue.

@mhdawson
Contributor

@prabhatmishra33 I see this on the stack in the issue you posted: 6: 0x7f268814b0ef [/app/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node], so it's possible.

Since it's in production it might be hard to try the change in #55, but that would be the best way to confirm one way or the other.

addaleax pushed a commit that referenced this issue Mar 17, 2021
* src: keep reference buffer alive longer

Keep a buffer written by WritePointer alive
until the finalizers for the buffer to which
the pointer was written have been executed:

Refs: #54

Signed-off-by: Michael Dawson <mdawson@devrus.com>

* limit change to writePointer

Update to avoid changing behaviour for
_writePointer and limit change to writePointer
where objects were already being associated
with the buffer.

Signed-off-by: Michael Dawson <mdawson@devrus.com>

* remove _attach call in writePointer

remove _attach call in writePointer as
it is no longer needed since a stronger
version is done in the native code
when true is passed as the fourth parameter
@nshmyrev

nshmyrev commented May 3, 2021

Hi

We still see the crash with ref-napi 3.0.2.

The code we use looks like this; it essentially creates an object and releases it afterwards:

var vosk = require('..')
const model = new vosk.Model("model")

for (var i = 0; i < 1000; i++) {
    console.log("iter", i)
    const rec = new vosk.Recognizer({model: model, sampleRate: 16000});
    console.log("handle", rec.handle)
    rec.free();
    console.log("free")
}

I'll probably try to simplify the code. The crash is not stable; it occurs at different iterations, so it is likely a threading issue. The backtrace looks like this:

Thread 1 "node_g" received signal SIGILL, Illegal instruction.
v8::base::OS::Abort () at ../deps/v8/src/base/platform/platform-posix.cc:502
502	    V8_IMMEDIATE_CRASH();
(gdb) bt
#0  v8::base::OS::Abort () at ../deps/v8/src/base/platform/platform-posix.cc:502
#1  0x00005555581b5fd5 in V8_Fatal (file=file@entry=0x555558bae718 "../deps/v8/src/objects/backing-store.cc", line=line@entry=696, 
    format=format@entry=0x555558b27044 "Check failed: %s.") at ../deps/v8/src/base/logging.cc:167
#2  0x0000555556c11d19 in v8::internal::GlobalBackingStoreRegistry::Register (
    backing_store=std::shared_ptr<v8::internal::BackingStore> (use count 3, weak count 1) = {...})
    at ../deps/v8/src/objects/backing-store.cc:696
#3  0x000055555670eb8e in v8::ArrayBuffer::GetBackingStore (this=<optimized out>) at ../deps/v8/src/api/api.cc:3726
#4  0x00005555563bca51 in napi_get_typedarray_info (env=0x55555b4a2990, typedarray=0x7fffffffcee0, type=0x7fffffffc960, 
    length=0x7fffffffc968, data=0x7fffffffc970, arraybuffer=0x0, byte_offset=0x0) at ../src/js_native_api_v8.cc:2946
#5  0x00007ffff50870ff in (anonymous namespace)::GetBufferData(Napi::Value) ()
   from /home/shmyrev/travis/vosk-api/nodejs/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node
#6  0x00007ffff5089a4d in (anonymous namespace)::InstanceData::GetBufferData(napi_value__*) ()
   from /home/shmyrev/travis/vosk-api/nodejs/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node
#7  0x00007ffff507001d in FFI::FFI::FFICall(Napi::CallbackInfo const&) ()
   from /home/shmyrev/travis/vosk-api/nodejs/node_modules/ffi-napi/build/Release/ffi_bindings.node
#8  0x00007ffff5072e0d in Napi::details::CallbackData<void (*)(Napi::CallbackInfo const&), void>::Wrapper(napi_env__*, napi_callback_info__*) () from /home/shmyrev/travis/vosk-api/nodejs/node_modules/ffi-napi/build/Release/ffi_bindings.node
#9  0x00005555563b1674 in v8impl::(anonymous namespace)::CallbackWrapperBase::<lambda(napi_env)>::operator()(napi_env) const (
    __closure=0x7fffffffcb70, env=0x55555b443960) at ../src/js_native_api_v8.cc:599
#10 0x00005555563bdf89 in napi_env__::CallIntoModule<v8impl::(anonymous namespace)::CallbackWrapperBase::InvokeCallback()::<lambda(napi_env)> >(v8impl::(anonymous namespace)::CallbackWrapperBase::<lambda(napi_env)> &&, void (&&)(napi_env__ *, v8::Local<v8::Value>)) (
    this=0x55555b443960, call=..., handle_exception=
    @0x5555563beb92: {void (napi_env__ *, v8::Local<v8::Value>)} 0x5555563beb92 <napi_env__::HandleThrow(napi_env__*, v8::Local<v8::Value>)>) at ../src/js_native_api_v8.h:95
#11 0x00005555563b16e5 in v8impl::(anonymous namespace)::CallbackWrapperBase::InvokeCallback (this=0x7fffffffcbb0)
    at ../src/js_native_api_v8.cc:598
#12 0x00005555563b173a in v8impl::(anonymous namespace)::FunctionCallbackWrapper::Invoke (info=...) at ../src/js_native_api_v8.cc:616
#13 0x000055555677fb3d in v8::internal::FunctionCallbackArguments::Call (this=this@entry=0x7fffffffcd20, handler=..., 
    handler@entry=...) at ../deps/v8/src/api/api-arguments-inl.h:158
#14 0x00005555567808b9 in v8::internal::(anonymous namespace)::HandleApiCallHelper<false> (isolate=isolate@entry=0x55555b3a0860, 
    function=..., function@entry=..., new_target=..., new_target@entry=..., fun_data=..., receiver=..., receiver@entry=..., args=...)
    at ../deps/v8/src/builtins/builtins-api.cc:113
#15 0x000055555678521b in v8::internal::Builtin_Impl_HandleApiCall (args=..., isolate=isolate@entry=0x55555b3a0860)
    at ../deps/v8/src/handles/handles.h:132
#16 0x0000555556786048 in v8::internal::Builtin_HandleApiCall (args_length=9, args_object=0x7fffffffcee8, isolate=0x55555b3a0860)
    at ../deps/v8/src/builtins/builtins-api.cc:131
#17 0x0000555557732b00 in Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit ()
    at ../../deps/v8/src/builtins/torque-internal.tq:84
#18 0x00005555574c2585 in Builtins_InterpreterEntryTrampoline () at ../../deps/v8/src/objects/string.tq:183

The backing-store trace is attached too.

bs.log

Still looking into this issue, but maybe you have some ideas. I don't think the patch above solved the problem.

@nshmyrev

nshmyrev commented May 3, 2021

Another backtrace, with WritePointer instead of FFICall:

502	    V8_IMMEDIATE_CRASH();
(gdb) bt
#0  v8::base::OS::Abort () at ../deps/v8/src/base/platform/platform-posix.cc:502
#1  0x00005555581b5fd5 in V8_Fatal (file=file@entry=0x555558bae718 "../deps/v8/src/objects/backing-store.cc", line=line@entry=696, 
    format=format@entry=0x555558b27044 "Check failed: %s.") at ../deps/v8/src/base/logging.cc:167
#2  0x0000555556c11d19 in v8::internal::GlobalBackingStoreRegistry::Register (
    backing_store=std::shared_ptr<v8::internal::BackingStore> (use count 3, weak count 1) = {...})
    at ../deps/v8/src/objects/backing-store.cc:696
#3  0x000055555670eb8e in v8::ArrayBuffer::GetBackingStore (this=<optimized out>) at ../deps/v8/src/api/api.cc:3726
#4  0x00005555563bca51 in napi_get_typedarray_info (env=0x55555b493fc0, typedarray=0x7fffffffd260, type=0x7fffffffcc80, 
    length=0x7fffffffcc88, data=0x7fffffffcc90, arraybuffer=0x0, byte_offset=0x0) at ../src/js_native_api_v8.cc:2946
#5  0x00007ffff50870ff in (anonymous namespace)::GetBufferData(Napi::Value) ()
   from /home/shmyrev/travis/vosk-api/nodejs/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node
#6  0x00007ffff50878a8 in (anonymous namespace)::AddressForArgs(Napi::CallbackInfo const&, unsigned long) ()
   from /home/shmyrev/travis/vosk-api/nodejs/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node
#7  0x00007ffff5089591 in (anonymous namespace)::WritePointer(Napi::CallbackInfo const&) ()
   from /home/shmyrev/travis/vosk-api/nodejs/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node
#8  0x00007ffff508fd6b in Napi::details::CallbackData<void (*)(Napi::CallbackInfo const&), void>::Wrapper(napi_env__*, napi_callback_info__*) () from /home/shmyrev/travis/vosk-api/nodejs/node_modules/ref-napi/prebuilds/linux-x64/node.napi.node
#9  0x00005555563b1674 in v8impl::(anonymous namespace)::CallbackWrapperBase::<lambda(napi_env)>::operator()(napi_env) const (
    __closure=0x7fffffffcf00, env=0x55555b493fc0) at ../src/js_native_api_v8.cc:599
#10 0x00005555563bdf89 in napi_env__::CallIntoModule<v8impl::(anonymous namespace)::CallbackWrapperBase::InvokeCallback()::<lambda(napi_env)> >(v8impl::(anonymous namespace)::CallbackWrapperBase::<lambda(napi_env)> &&, void (&&)(napi_env__ *, v8::Local<v8::Value>)) (
    this=0x55555b493fc0, call=..., handle_exception=
    @0x5555563beb92: {void (napi_env__ *, v8::Local<v8::Value>)} 0x5555563beb92 <napi_env__::HandleThrow(napi_env__*, v8::Local<v8::Value>)>) at ../src/js_native_api_v8.h:95
#11 0x00005555563b16e5 in v8impl::(anonymous namespace)::CallbackWrapperBase::InvokeCallback (this=0x7fffffffcf40)
    at ../src/js_native_api_v8.cc:598
#12 0x00005555563b173a in v8impl::(anonymous namespace)::FunctionCallbackWrapper::Invoke (info=...) at ../src/js_native_api_v8.cc:616
#13 0x000055555677fb3d in v8::internal::FunctionCallbackArguments::Call (this=this@entry=0x7fffffffd0b0, handler=..., 
    handler@entry=...) at ../deps/v8/src/api/api-arguments-inl.h:158
#14 0x00005555567808b9 in v8::internal::(anonymous namespace)::HandleApiCallHelper<false> (isolate=isolate@entry=0x55555b3a0860, 
    function=..., function@entry=..., new_target=..., new_target@entry=..., fun_data=..., receiver=..., receiver@entry=..., args=...)
    at ../deps/v8/src/builtins/builtins-api.cc:113
#15 0x000055555678521b in v8::internal::Builtin_Impl_HandleApiCall (args=..., isolate=isolate@entry=0x55555b3a0860)
    at ../deps/v8/src/handles/handle

@mhdawson
Contributor

mhdawson commented May 3, 2021

It was confirmed to have solved the test case that @gireeshpunathil had provided, and the larger real-world case it was derived from. You might have a different case that triggers a similar issue.

@nshmyrev

nshmyrev commented May 3, 2021

@mhdawson Looking at the original problem, I see it is about the fact that a buffer is created for the same address twice. The patch deals with garbage collection, but from what I see the problem is not related to garbage collection at all.

What happens is that during program execution we allocate a chunk of memory and store it as a buffer. Then we free this chunk and allocate again. If by accident the new chunk has the same address as before, we get a collision of addresses: the old buffer has not been garbage collected and not removed from memory.

From what I see, there is no method in ref-napi to release a pointer and free the corresponding buffer. If one were introduced, I could use that method to release the Buffer object when I free the memory chunk.

Another solution I see is not to use N-API buffers to store pointers at all; it feels like they are totally broken.
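One pattern for the missing release hook (assumption: an illustrative sketch, not an existing ref-napi API; nativeFree stands in for a real call such as libc.free via ffi-napi) is a FinalizationRegistry that frees the native chunk once the wrapper is collected, similar to what weak-napi offers:

```javascript
// Sketch of GC-driven native memory release (assumption: illustrative
// only). The registry callback runs after GC reclaims the wrapper, so
// the native chunk is released only once JS can no longer reach it.
const freedAddresses = [];
function nativeFree(address) { freedAddresses.push(address); } // stand-in

const registry = new FinalizationRegistry((address) => nativeFree(address));

function trackPointer(wrapper, address, token) {
  registry.register(wrapper, address, token);
}

// If the chunk is freed manually first, unregister so the callback
// cannot double-free a reused address later.
const token = {};
let wrapper = {};
trackPointer(wrapper, 0x2000, token);
console.log(registry.unregister(token)); // true: the cell was still registered
```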

@nshmyrev

nshmyrev commented May 3, 2021

Also note from the backtrace that the second chunk, which conflicts with the first one, is allocated as an argument for function calls, not as an object. Maybe an additional check is missing there.

@nshmyrev

nshmyrev commented May 3, 2021

Here is a more detailed backtrace with the intermediate calls in between. Note that val.As<Buffer<char>>() in GetBufferData is supposed to simply unpack the value. Instead it registers a new object in the backing store. Maybe it is a node-addon-api issue:

#1  0x00005555581b5fd5 in V8_Fatal (file=file@entry=0x555558bae718 "../deps/v8/src/objects/backing-store.cc", line=line@entry=696, 
    format=format@entry=0x555558b27044 "Check failed: %s.") at ../deps/v8/src/base/logging.cc:167
167	  v8::base::OS::Abort();
(gdb) 
#2  0x0000555556c11d19 in v8::internal::GlobalBackingStoreRegistry::Register (
    backing_store=std::shared_ptr<v8::internal::BackingStore> (use count 3, weak count 1) = {...})
    at ../deps/v8/src/objects/backing-store.cc:696
696	  CHECK(result.second);
(gdb) 
#3  0x000055555670eb8e in v8::ArrayBuffer::GetBackingStore (this=<optimized out>) at ../deps/v8/src/api/api.cc:3726



3726	  i::GlobalBackingStoreRegistry::Register(backing_store);
(gdb) 

#4  0x00005555563bca51 in napi_get_typedarray_info (env=0x55555b49d370, typedarray=0x7fffffffd230, type=0x7fffffffcbd0, 
    length=0x7fffffffcbd8, data=0x7fffffffcbe0, arraybuffer=0x0, byte_offset=0x0) at ../src/js_native_api_v8.cc:2946
2946	    *data = static_cast<uint8_t*>(buffer->GetBackingStore()->Data()) +
(gdb) 
#5  0x00007ffff528c2a4 in Napi::TypedArrayOf<unsigned char>::TypedArrayOf (this=0x7fffffffcbc0, env=0x55555b49d370, 
    value=0x7fffffffd230) at /home/shmyrev/travis/vosk-api/nodejs/ref-napi/node_modules/node-addon-api/napi-inl.h:1877
1877	    status = napi_get_typedarray_info(
(gdb) 

#6  0x00007ffff528bae5 in Napi::Buffer<char>::Buffer (this=0x7fffffffcbc0, env=0x55555b49d370, value=0x7fffffffd230)
    at /home/shmyrev/travis/vosk-api/nodejs/ref-napi/node_modules/node-addon-api/napi-inl.h:2209
2209	  : Uint8Array(env, value), _length(0), _data(nullptr) {
(gdb) 
#7  0x00007ffff528b012 in Napi::Value::As<Napi::Buffer<char> > (this=0x7fffffffcbb0)
    at /home/shmyrev/travis/vosk-api/nodejs/ref-napi/node_modules/node-addon-api/napi-inl.h:656
656	  return T(_env, _value);
(gdb) 
#8  0x00007ffff5280df8 in (anonymous namespace)::GetBufferData (val=...) at ../src/binding.cc:194
194	  Buffer<char> buf = val.As<Buffer<char>>();
(gdb) 
#9  0x00007ffff5281805 in (anonymous namespace)::WritePointer (args=...) at ../src/binding.cc:379
379	    char* input_ptr = GetBufferData(input);
(gdb) 
#10 0x00007ffff528c08c in Napi::details::CallbackData<void (*)(Napi::CallbackInfo const&), void>::Wrapper(napi_env__*, napi_callback_info__*)::{lambda()#1}::operator()() const (this=0x7fffffffcdb0)
    at /home/shmyrev/travis/vosk-api/nodejs/ref-napi/node_modules/node-addon-api/napi-inl.h:129
129	      callbackData->callback(callbackInfo);

@nshmyrev

nshmyrev commented May 4, 2021

After investigation, it looks like GetBufferData should use the pointer_to_orig_buffer map instead of creating an ArrayBuffer directly on the backing store.
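The suggestion above can be sketched as a lookup cache (assumption: a simplified JavaScript model of the C++ idea; pointer_to_orig_buffer is the map named in the comment, but this code is illustrative):

```javascript
// Sketch: look up the original buffer by pointer before creating a new
// external view, so the same address never yields two distinct
// externally-backed objects (assumption: simplified model).
const pointerToOrigBuffer = new Map();

function bufferForPointer(address, createExternal) {
  let buf = pointerToOrigBuffer.get(address);
  if (buf === undefined) {
    buf = createExternal(address);      // create a view only once per address
    pointerToOrigBuffer.set(address, buf);
  }
  return buf;
}

let created = 0;
const make = () => { created++; return { external: true }; };
const a = bufferForPointer(0x4000, make);
const b = bufferForPointer(0x4000, make);
console.log(created, a === b); // 1 true
```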

@nshmyrev

nshmyrev commented May 4, 2021

Simplified version that crashes quickly:

var ffi = require('ffi-napi');
var ref = require('ref-napi');

const ptr = ref.refType(ref.types.void);

var libc = ffi.Library('libc', {
  'malloc': [ ptr, [ 'int' ] ],
  'free': [ 'void', [ ptr ] ]
});

for (var i = 0; i < 10000; i++) {
    p = libc.malloc(100000);
    console.log(p);
    libc.free(p);
}

@nshmyrev

nshmyrev commented May 4, 2021

Essentially the same discussion as in nodejs/node#33321

@nshmyrev

nshmyrev commented May 4, 2021

Also #47 ?

@nshmyrev

nshmyrev commented May 5, 2021

I could have used weak-napi:

var ffi = require('ffi-napi');
var weak = require('weak-napi');

var libc = ffi.Library('libc', {
  'malloc': [ 'pointer', [ 'int' ] ],
  'free': [ 'void', [ 'pointer' ] ]
});

weak(libc.malloc(10000), function(buf) {
    console.log("destroy")
    libc.free(buf)
})

but this also crashes:

Thread 1 "node_g" received signal SIGSEGV, Segmentation fault.
v8::internal::ThreadManager::IsLockedByCurrentThread (this=0x0) at ../deps/v8/src/execution/v8threads.h:74
74	    return mutex_owner_.load(std::memory_order_relaxed) == ThreadId::Current();
(gdb) bt
#0  v8::internal::ThreadManager::IsLockedByCurrentThread (this=0x0) at ../deps/v8/src/execution/v8threads.h:74
#1  v8::HandleScope::Initialize (isolate=0x55555b352010, this=0x55555b519c70) at ../deps/v8/src/api/api.cc:847
#2  v8::HandleScope::HandleScope (this=0x55555b519c70, isolate=0x55555b352010) at ../deps/v8/src/api/api.cc:836
#3  0x00005555563b06fb in v8impl::(anonymous namespace)::HandleScopeWrapper::HandleScopeWrapper (this=0x55555b519c70, 
    isolate=0x55555b352010) at ../src/js_native_api_v8.cc:114
#4  0x00005555563baabe in napi_open_handle_scope (env=0x55555b501c40, result=0x7fffffffdbe8) at ../src/js_native_api_v8.cc:2573
#5  0x00007ffff4e5bfc3 in SetImmediate<(anonymous namespace)::ObjectInfo::OnFree()::{lambda()#1}>(napi_env__*, (anonymous namespace)::ObjectInfo::OnFree()::{lambda()#1}&&)::{lambda(uv_check_s*)#2}::_FUN(uv_check_s*) ()
   from /home/shmyrev/t/node_modules/weak-napi/prebuilds/linux-x64/node.napi.uv1.node
#6  0x0000555557487119 in uv__run_check (loop=0x55555b33ab40 <default_loop_struct>) at ../deps/uv/src/unix/loop-watcher.c:67
#7  0x000055555747d1d6 in uv_run (loop=0x55555b33ab40 <default_loop_struct>, mode=UV_RUN_ONCE) at ../deps/uv/src/unix/core.c:394
#8  0x0000555556370c51 in node::Environment::CleanupHandles (this=0x55555b44d5b0) at ../src/env.cc:643
#9  0x00005555563710df in node::Environment::RunCleanup (this=0x55555b44d5b0) at ../src/env.cc:704
#10 0x00005555562e6749 in node::FreeEnvironment (env=0x55555b44d5b0) at ../src/api/environment.cc:371
#11 0x00005555562e2880 in node::FunctionDeleter<node::Environment, &node::FreeEnvironment>::operator() (this=0x7fffffffe1a0, 
    pointer=0x55555b44d5b0) at ../src/util.h:636
#12 0x00005555562e2252 in std::unique_ptr<node::Environment, node::FunctionDeleter<node::Environment, &node::FreeEnvironment> >::~unique_ptr (this=0x7fffffffe1a0, __in_chrg=<optimized out>) at /usr/include/c++/9/bits/unique_ptr.h:292
#13 0x000055555649e4e9 in node::NodeMainInstance::Run (this=0x7fffffffe240, env_info=0x55555b32ace0 <node::env_info>)
    at ../src/node_main_instance.cc:135
#14 0x00005555563d2c91 in node::Start (argc=3, argv=0x7fffffffe4c8) at ../src/node.cc:1083
#15 0x0000555557e1ab72 in main (argc=3, argv=0x7fffffffe4c8) at ../src/node_main.cc:127

@nshmyrev

nshmyrev commented May 5, 2021

@addaleax any comment please?

@addaleax
Contributor

addaleax commented May 5, 2021

@nshmyrev I’ll get to this when I can. If you have a good understanding of what’s happening, you can also open a PR, which would speed things up a bit.

@nshmyrev

nshmyrev commented May 5, 2021

Great, thank you. For now my proposal is to use a separate class for wrapping pointers, not Buffer. Something like PointerBuffer, with a separate memory management model avoiding all the BackingStore issues. We won't need a centralized map then either, and the logic will be much more straightforward.
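The proposal above could look something like this (assumption: a hypothetical design sketch; the names and shape are illustrative, not the actual implementation in the branches linked later). The raw address is held as a plain number, so V8 never sees the external memory and never registers a backing store for it:

```javascript
// Hypothetical PointerBuffer sketch (assumption: illustrative design).
// Lifetime is managed explicitly instead of by aliasing external memory
// through a Node.js Buffer and the V8 backing-store registry.
class PointerBuffer {
  constructor(address, size, freeFn) {
    this.address = address;  // raw native address, kept as a number
    this.size = size;
    this._free = freeFn;     // e.g. libc.free bound via ffi (stand-in here)
    this.released = false;
  }
  release() {
    if (!this.released) {    // idempotent: native free runs only once
      this.released = true;
      this._free(this.address);
    }
  }
}

let freedCount = 0;
const p = new PointerBuffer(0x3000, 100, () => { freedCount++; });
p.release();
p.release();
console.log(freedCount); // 1
```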

@addaleax
Contributor

addaleax commented May 5, 2021

@nshmyrev That would have been ideal a long time ago, but major backwards-compatibility breakage isn’t really an option for this package, I’m afraid.

@nshmyrev

nshmyrev commented May 8, 2021

FYI, my branches have the full implementation with separate structures. All tests pass, though it might not be 100% compatible:

https://github.com/alphacep/ref-napi
https://github.com/alphacep/node-ffi-napi

Please let me know if you are interested in merging, or we can just have new packages inside this project.

@AlexMisiulia

Hi everyone. Any updates on this? It seems that @nshmyrev proposed a solution. I think he could make a fork, but that would not be as useful for the community, because everyone ends up fixing the same bugs in their own implementation.

@addaleax do you have plans and time to look into this issue?

@TimothyJones

I'm also looking for a fix for this problem.

but major backwards-compatibility breakage isn’t really an option for this package, I’m afraid.

Forgive my ignorance, but is there a reason that breaking changes can't be made along with a major semver bump?

@addaleax
Contributor

Since my maintenance of this package is mostly unpaid, and this would unfortunately be a larger rewrite, it’s not at the top of my agenda. @nshmyrev’s proposed solution conceptually works, and if there is a PR at some point, I’d be happy to take a look.

@TimothyJones

Thanks! I hear you on the developer priorities thing - funnily enough, I'm after this fix in order to do unpaid maintenance elsewhere 😅

@nshmyrev is there anything that you need in order to put together a PR? Can I help out in some way?

@nshmyrev

Thanks! I hear you on the developer priorities thing - funnily enough, I'm after this fix in order to do unpaid maintenance elsewhere 😅

Thank you, Tim. I could take a look at the PR a bit later, but if you could help and pull the changes in, it would be much faster.

@TimothyJones

@addaleax - as a first step, I have opened a PR from @nshmyrev's fork of ref-napi here: #64. Please let me know if you need any changes made to it before merging.

@cstavaru

cstavaru commented Oct 5, 2021

I'm also having this exact issue.

I haven't seen a Node FFI implementation working well since Node 12.x with the lxe pull request. Tonight I thought I could finally switch to a higher Node.js version by using ffi-napi instead of node-ffi, but no luck... Hopefully the proposed pull request above is accepted.

@samuelraeburn

Using Node v17.0.1, the issue no longer reproduces for me. Could somebody point me to the change in Node that fixed this? I am guessing it is nodejs/node#40178 but I am not sure.

@dacevedo12

Using Node v17.0.1, the issue no longer reproduces for me. Could somebody point me to the change in Node that fixed this? I am guessing it is nodejs/node#40178 but I am not sure.

Fixed indeed, but now I'm experiencing a significant slowdown. Maybe related to nodejs/node#32226?

@sangshuduo

Using Node v17.0.1, the issue no longer reproduces for me. Could somebody point me to the change in Node that fixed this? I am guessing it is nodejs/node#40178 but I am not sure.

I see v17.4.0 solved the issue I mentioned in #50. I wish the solution could be backported to 14/16 soon.

@tomek113

Using Node v17.0.1, the issue no longer reproduces for me. Could somebody point me to the change in Node that fixed this? I am guessing it is nodejs/node#40178 but I am not sure.

I see v17.4.0 solved the issue I mentioned in #50. I wish the solution could be backported to 14/16 soon.

Has anyone got any news on implementing the fix in v14 or v16?
