Segmentation Fault after a few minutes of running .... #333

Open
nvp152 opened this issue May 6, 2024 · 1 comment

Comments

@nvp152

nvp152 commented May 6, 2024

Hi,

I am experiencing segmentation faults every few minutes while running the driver. I am really not sure where to go beyond the information I have provided.

  • nodejs v18.20.1
  • msodbcsql18 version: 18.3.3.1-1
  • debian 11 (Darwin Kernel Version 22.6.0: Tue Nov 7 21:48:06 PST 2023; root:xnu-8796.141.3.702.9~2/RELEASE_X86_64 x86_64)
  • The application is running in a container in kubernetes.
  • Connecting to a MSSQL Server 2022 using Kerberos auth and an instance name.

I made sure to update to OpenSSL 3.2.1 as per the documentation. The lsof output confirms that it is loaded while the application is running.

node    5996 root  mem       REG      0,379  6212984    1739188 /usr/local/ssl/lib64/libcrypto.so.3
node    5996 root  mem       REG      0,379  1143344    1739191 /usr/local/ssl/lib64/libssl.so.3
node    5996 root  mem       REG      0,379   473136    1746022 /usr/lib/x86_64-linux-gnu/libodbcinst.so.2.0.0
node    5996 root  mem       REG      0,379  2092656    1745998 /opt/microsoft/msodbcsql18/lib64/libmsodbcsql-18.3.so.3.1
node    5996 root  mem       REG      0,379  2351638    1746016 /usr/lib/x86_64-linux-gnu/libodbc.so.2.0.0
node    5996 root  mem       REG      0,379   100736    2624235 /lib/x86_64-linux-gnu/libgcc_s.so.1
node    5996 root  mem       REG      0,379  1870824    2625010 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28

I don't always get exactly the same segfault, but it often seems related to memory allocation (malloc). When these memory errors occur, a SELECT INTO FROM query appears to be running.

Here is a stack trace:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `node /usr/local/bin/cubejs server'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7f6161afa700 (LWP 6009))]
(gdb) set substitute-path . /opt/src/glibc-2.31/
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f6169612537 in __GI_abort () at abort.c:79
#2  0x00007f616966a3e8 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f6169788390 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3  0x00007f61696716da in malloc_printerr (str=str@entry=0x7f616978a680 "corrupted size vs. prev_size in fastbins") at malloc.c:5347
#4  0x00007f616967258c in malloc_consolidate (av=av@entry=0x7f6144000020) at malloc.c:4493
#5  0x00007f61696743d5 in _int_malloc (av=av@entry=0x7f6144000020, bytes=bytes@entry=1096) at malloc.c:3699
#6  0x00007f6169675f19 in __GI___libc_malloc (bytes=bytes@entry=1096) at malloc.c:3066
#7  0x00007f6160060a7e in __post_internal_error_ex_w_noprefix (error_header=error_header@entry=0x7f6144069a88, 
    sqlstate=sqlstate@entry=0x7f6161af7670, native_error=851968, msg=msg@entry=0x7f6161af7e90, class_origin=class_origin@entry=0, 
    subclass_origin=subclass_origin@entry=0) at __info.c:4068
#8  0x00007f6160049891 in SQLDriverConnectW (hdbc=0x7f61440690b0, hwnd=<optimized out>, conn_str_in=<optimized out>, len_conn_str_in=172, 
    conn_str_out=0x0, conn_str_out_max=0, ptr_conn_str_out=0x0, driver_completion=0) at SQLDriverConnectW.c:844
#9  0x00007f61602ca033 in mssql::OdbcConnection::try_open(std::shared_ptr<std::vector<unsigned short, std::allocator<unsigned short> > >, int) ()
   from /cube/node_modules/msnodesqlv8/build/Release/sqlserverv8.node
#10 0x00007f61602dcfce in mssql::OpenOperation::TryInvokeOdbc() () from /cube/node_modules/msnodesqlv8/build/Release/sqlserverv8.node
#11 0x00007f61602cf15f in mssql::OdbcOperation::Execute() () from /cube/node_modules/msnodesqlv8/build/Release/sqlserverv8.node
#12 0x000000000166ebb4 in worker (arg=0x0) at ../deps/uv/src/threadpool.c:122
#13 0x00007f61697cbea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
#14 0x00007f61696eba6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Here is another. When this one happens, I see the error "[Microsoft][ODBC Driver 18 for SQL Server]SSPI Provider: Clock skew too great". The clock skew itself is not ideal, but it really should not cause a segfault.

Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `node /usr/local/bin/cubejs server'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __GI___libc_free (mem=0x120102033005a8) at malloc.c:3102
3102	malloc.c: No such file or directory.
[Current thread is 1 (Thread 0x7fba137fe700 (LWP 9367))]
(gdb) set substitute-path . /opt/src/glibc-2.31/
(gdb) bt
#0  __GI___libc_free (mem=0x120102033005a8) at malloc.c:3102
#1  0x0000000001870a96 in BIO_free_all ()
#2  0x00007fba02809e2c in ossl_ssl_connection_free () from /usr/local/ssl/lib64/libssl.so.3
#3  0x00007fba02808f7d in SSL_free () from /usr/local/ssl/lib64/libssl.so.3
#4  0x00007fba02d5bf3c in ?? () from /opt/microsoft/msodbcsql18/lib64/libmsodbcsql-18.3.so.3.1
#5  0x00007fba02d5eb9c in ?? () from /opt/microsoft/msodbcsql18/lib64/libmsodbcsql-18.3.so.3.1
#6  0x00007fba02d524e1 in ?? () from /opt/microsoft/msodbcsql18/lib64/libmsodbcsql-18.3.so.3.1
#7  0x00007fba02d551b3 in ?? () from /opt/microsoft/msodbcsql18/lib64/libmsodbcsql-18.3.so.3.1
#8  0x00007fba02d1b003 in ?? () from /opt/microsoft/msodbcsql18/lib64/libmsodbcsql-18.3.so.3.1
#9  0x00007fba02d1c0b4 in ?? () from /opt/microsoft/msodbcsql18/lib64/libmsodbcsql-18.3.so.3.1
#10 0x00007fba02d1cf14 in ?? () from /opt/microsoft/msodbcsql18/lib64/libmsodbcsql-18.3.so.3.1
#11 0x00007fba02c8b16c in ?? () from /opt/microsoft/msodbcsql18/lib64/libmsodbcsql-18.3.so.3.1
#12 0x00007fba02cbfe8e in ?? () from /opt/microsoft/msodbcsql18/lib64/libmsodbcsql-18.3.so.3.1
#13 0x00007fba02c89d2a in SQLDriverConnectW () from /opt/microsoft/msodbcsql18/lib64/libmsodbcsql-18.3.so.3.1
#14 0x00007fba0303530d in SQLDriverConnectW (hdbc=0x7fba080013c0, hwnd=0x0, conn_str_in=0x7fba28b7c320, len_conn_str_in=172, conn_str_out=0x0, 
    conn_str_out_max=0, ptr_conn_str_out=0x0, driver_completion=0) at SQLDriverConnectW.c:776
#15 0x00007fba100aa033 in mssql::OdbcConnection::try_open(std::shared_ptr<std::vector<unsigned short, std::allocator<unsigned short> > >, int) ()
   from /cube/node_modules/msnodesqlv8/build/Release/sqlserverv8.node
#16 0x00007fba100bcfce in mssql::OpenOperation::TryInvokeOdbc() () from /cube/node_modules/msnodesqlv8/build/Release/sqlserverv8.node
#17 0x00007fba100af15f in mssql::OdbcOperation::Execute() () from /cube/node_modules/msnodesqlv8/build/Release/sqlserverv8.node
#18 0x000000000166ebb4 in worker (arg=0x0) at ../deps/uv/src/threadpool.c:122
#19 0x00007fba2bc8aea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
#20 0x00007fba2bbaaa6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
@TimelordUK
Owner

Please note, I do get the impression at times that people assume I am part of the Microsoft organisation, or at least that this driver is owned and maintained by them. This is not the case. This library was forked long ago from an original MS source and is now maintained by me, and it is becoming far too costly in terms of time.

These sorts of problems can be extremely complex to solve. One stack trace you provide shows obvious memory corruption: the internal malloc structures have been corrupted, so glibc raises an abort because the process is no longer viable.

My only practical initial suggestion is to build the exact Docker image here https://github.com/TimelordUK/node-sqlserver-v8/tree/master/docker/debian-msnodesqlv8 and run some of the queries that you would want to run inside this container.

If this does not crash, then we are at least in a position to start a comparison with your container.

I am sorry, I could spend huge amounts of time investigating things like this, but this is time I do not have. I run the driver on Debian frequently and have not seen these particular problems. As you state, there may well be a problem causing memory corruption for some queries, but I would need a simple reproducible example that causes such a crash to even start an investigation.
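
As a starting point for such an example, here is a minimal sketch. It assumes the crash can be triggered by connection opens alone, since both stack traces above abort inside SQLDriverConnectW on a libuv worker thread. It is not a confirmed reproduction. The names `openOnce` and `hammerOpen`, the `MSSQL_CONN_STR` environment variable, and the default `SELECT 1 AS ok` probe are hypothetical. The driver calls used are `open(connStr, cb)`, `query(sql, cb)` and `close(cb)`.

```javascript
// Hypothetical repro harness: opens, queries and closes connections in
// concurrent batches so that several connection opens are in flight at once.
// `driver` is expected to look like msnodesqlv8.

function openOnce(driver, connStr, sqlText) {
  return new Promise((resolve, reject) => {
    driver.open(connStr, (openErr, conn) => {
      if (openErr) return reject(openErr);
      conn.query(sqlText, (queryErr, rows, more) => {
        if (more) return; // wait for the final callback of this query
        // Always close the connection, even when the query failed.
        conn.close(() => (queryErr ? reject(queryErr) : resolve(rows)));
      });
    });
  });
}

async function hammerOpen(driver, connStr, options = {}) {
  // concurrency 4 matches the default libuv thread pool size.
  const { rounds = 100, concurrency = 4, sqlText = 'SELECT 1 AS ok' } = options;
  const stats = { ok: 0, failed: 0 };
  for (let round = 0; round < rounds; round++) {
    const batch = Array.from({ length: concurrency }, () =>
      openOnce(driver, connStr, sqlText));
    const results = await Promise.allSettled(batch);
    for (const result of results) {
      if (result.status === 'fulfilled') stats.ok++;
      else stats.failed++;
    }
  }
  return stats;
}

// Assumption: only load the real driver when a connection string is supplied.
if (process.env.MSSQL_CONN_STR) {
  const sql = require('msnodesqlv8');
  hammerOpen(sql, process.env.MSSQL_CONN_STR, { rounds: 1000 })
    .then((stats) => console.log(stats))
    .catch((err) => console.error(err));
}
```

Running the same script in the reference container and in the failing container, with the same connection string, would give a like-for-like comparison. Passing the SELECT INTO statement as `sqlText` would test the query path mentioned in the report.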

I am really winding down the time I spend on this driver. It will likely soon be limited to supporting new Node releases and absolutely critical fixes.
