Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bus Error on M1 mac #281

Open
sano-jin opened this issue Feb 11, 2022 · 11 comments
Open

Bus Error on M1 mac #281

sano-jin opened this issue Feb 11, 2022 · 11 comments
Assignees
Labels

Comments

@sano-jin
Copy link
Contributor

slim runs successfully but outputs

[1]    45373 bus error  /Users/sano/work/slim/bin/slim

in the last line on Sano's M1 mac.

need investigation on other M1 macs: tbd.

@sano-jin sano-jin added the low label Feb 11, 2022
@sano-jin
Copy link
Contributor Author

sano-jin commented Mar 23, 2022

実行終了後にメモリを解放する段階で bus error が起きているようだ.

とりあえず,ここはうまくいっていない:

st_foreach(ccallback_tbl, (st_iter_func)free_v, 0);

が,この(これとこの次の)2行をコメントアウトしても,

slim(28531,0x104700580) malloc: *** error for object 0x1045ec000: pointer being freed was not allocated
slim(28531,0x104700580) malloc: *** set a breakpoint in malloc_error_break to debug
>>>> lmn_profiler_finalize();>>>> destroy_rules();>>>> delete lmn_functor_table;[1]    28531 abort      /Users/sano/work/lmntal/slim/bin/slim test5.il

となったので,まだ調査が必要.

@sano-jin
Copy link
Contributor Author

暫定的な解決法

main.cpp の slim_finalize(); をコメントアウトすれば(メモリリークを気にしなければ)問題なく動く.

slim_finalize();

@nikosai
Copy link
Contributor

nikosai commented Apr 7, 2023

これは関係するだろうか?
ffi/ffi#864

@kyawaway
Copy link
Contributor

kyawaway commented Feb 13, 2024

使用: Apple clang version 14.0.0 (clang-1400.0.29.202)

デバッガ(lldb)によると,

Process 77010 stopped                                                                                                  
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 3.1                                             
    frame #0: 0x00000001000d6498 slim`st_foreach(table=0x000000010100a960, func=(slim`CCallback::free_v(unsigned long, 
unsigned long, unsigned long) at ccallback.cpp:64:3), arg=0)(...), unsigned long) at st.cpp:517:26 [opt]               
   514    enum st_retval retval;                                                                                       
   515    int i;
   516 
-> 517    for (i = 0; i < table->num_bins; i++) {
   518      last = 0;
   519      for (ptr = table->bins[i];
   520           ptr != 0;) { /* エントリーの存在するチェインでのみfuncが呼ばれる */
Target 0: (slim) stopped.
(lldb) s
Process 77010 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
    frame #0: 0x00000001000d64cc slim`st_foreach(table=0x000000010100a960, func=(slim`CCallback::free_v(unsigned long, 
unsigned long, unsigned long) at ccallback.cpp:64:3), arg=0)(...), unsigned long) at st.cpp:519:23 [opt]
   516 
   517    for (i = 0; i < table->num_bins; i++) {
   518      last = 0;
-> 519      for (ptr = table->bins[i];
   520           ptr != 0;) { /* エントリーの存在するチェインでのみfuncが呼ばれる */
   521        retval = (enum st_retval)(*func)(ptr->key, ptr->record, arg);
   522        switch (retval) {
Target 0: (slim) stopped.
(lldb) 
Process 77010 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
    frame #0: 0x00000001000d64f8 slim`st_foreach(table=0x000000010100a960, func=(slim`CCallback::free_v(unsigned long, 
unsigned long, unsigned long) at ccallback.cpp:64:3), arg=0)(...), unsigned long) at st.cpp:521:45 [opt]
   518      last = 0;
   519      for (ptr = table->bins[i];
   520           ptr != 0;) { /* エントリーの存在するチェインでのみfuncが呼ばれる */
-> 521        retval = (enum st_retval)(*func)(ptr->key, ptr->record, arg);
   522        switch (retval) {
   523        case ST_CHECK: /* check if hash is modified during iteration */
   524          tmp = 0;
Target 0: (slim) stopped.
(lldb) 
Process 77010 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
    frame #0: 0x00000001000771f4 slim`CCallback::free_v(key=<unavailable>, v=4295455220, _t=<unavailable>) at ccallback
.cpp:64:3 [opt]
   61   }
   62  
   63   int CCallback::free_v(st_data_t key, st_data_t v, st_data_t _t) {
-> 64     delete (CCallback *)v;
   65     return ST_CONTINUE;
   66   }
   67   /* コールバックを名前nameで登録する。arityはコールバックに引数として
Target 0: (slim) stopped.
(lldb) 
Process 77010 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in                                                    
    frame #0: 0x00000001000cb050 slim`operator delete(p=0x00000001000771f4) at alloc.cpp:169:3 [opt]                   
   166  }
   167 
   168  void operator delete(void* p) noexcept {
-> 169    lmn_free(p);
   170  }
   171 
   172  void operator delete[](void* p) noexcept {
Target 0: (slim) stopped.
(lldb)  
Process 77010 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
    frame #0: 0x00000001000cb014 slim`lmn_free(p=0x00000001000771f4) at alloc.cpp:158:26 [opt]
   155    return result;
   156  }
   157 
-> 158  void lmn_free(void *p) { free(p); }
   159 
   160  void* operator new(std::size_t num) {
   161    return lmn_malloc(num);
Target 0: (slim) stopped.
(lldb) 
slim(77010,0x1f9833a80) malloc: *** error for object 0x1000771f4: pointer being freed was not allocated
slim(77010,0x1f9833a80) malloc: *** set a breakpoint in malloc_error_break to debug
Process 77010 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
    frame #0: 0x000000019e4961b0 libsystem_kernel.dylib`__pthread_kill + 8
libsystem_kernel.dylib`:
->  0x19e4961b0 <+8>:  b.lo   0x19e4961d0               ; <+40>
    0x19e4961b4 <+12>: pacibsp 
    0x19e4961b8 <+16>: stp    x29, x30, [sp, #-0x10]!
    0x19e4961bc <+20>: mov    x29, sp
Target 0: (slim) stopped.
(lldb) 
Process 77010 exited with status = 6 (0x00000006) Terminated due to signal 6
(lldb) 

ccallback_tbl 中のなんらかの要素に対してfreeしようとして,これはallocしてないぞと怒られている.

@kyawaway
Copy link
Contributor

kyawaway commented Feb 13, 2024

使用: gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
intelマシンでvalgrindを回してみる.

command: valgrind --tool=memcheck --leak-check=yes --show-reachable=yes
result:

==954898== Memcheck, a memory error detector                                                                           
==954898== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.                                             
==954898== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info                                          
==954898== Command: ../lmntal/installed/bin/slim demo.il                                                               
==954898==                                                                                                             
a(b).                                                                                                                  
==954898==                                                                                                             
==954898== HEAP SUMMARY:                                                                                               
==954898==     in use at exit: 224 bytes in 2 blocks                                                                   
==954898==   total heap usage: 1,495 allocs, 1,493 frees, 252,594 bytes allocated                                      
==954898==                                                                                                             
==954898== 96 bytes in 1 blocks are still reachable in loss record 1 of 2                                              
==954898==    at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)                        
==954898==    by 0x4A0C990: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)                                        
==954898==    by 0x4A0B212: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)                                        
==954898==    by 0x4A0B2E4: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)                                        
==954898==    by 0x400647D: call_init.part.0 (dl-init.c:70) 
==954898==    by 0x4006567: call_init (dl-init.c:33)
==954898==    by 0x4006567: _dl_init (dl-init.c:117)
==954898==    by 0x40202C9: ??? (in /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
==954898==    by 0x1: ???
==954898==    by 0x1FFF0004D6: ???
==954898==    by 0x1FFF0004F3: ???
==954898== 
==954898== 128 bytes in 1 blocks are still reachable in loss record 2 of 2
==954898==    at 0x484DCD3: realloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==954898==    by 0x4A0C9B8: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==954898==    by 0x4A21C54: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==954898==    by 0x4A0B2C8: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==954898==    by 0x400647D: call_init.part.0 (dl-init.c:70) 
==954898==    by 0x4006567: call_init (dl-init.c:33)
==954898==    by 0x4006567: _dl_init (dl-init.c:117)
==954898==    by 0x40202C9: ??? (in /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
==954898==    by 0x1: ???
==954898==    by 0x1FFF0004D6: ???
==954898==    by 0x1FFF0004F3: ???
==954898== 
==954898== LEAK SUMMARY:
==954898==    definitely lost: 0 bytes in 0 blocks
==954898==    indirectly lost: 0 bytes in 0 blocks
==954898==      possibly lost: 0 bytes in 0 blocks
==954898==    still reachable: 224 bytes in 2 blocks
==954898==         suppressed: 0 bytes in 0 blocks
==954898== 
==954898== For lists of detected and suppressed errors, rerun with: -s
==954898== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

still reachableではあるが,2箇所でメモリリークが起きているのがわかる.
(追記)これはそんなに関係ない気がしてきた.

@kyawaway
Copy link
Contributor

alloc,freeしているアドレスを全部printしてみたところ,最後にfreeに失敗しているアドレスは,

  • 一度もallocされていない
    • 二重解放では無い
  • 他の全てのアドレスは末尾が0だが,これだけ末尾が0ではない.
    • これはintel側では末尾が0ではないアドレスを解放しようとしたりはしていなかった.つまり,M1固有の問題?

@kyawaway
Copy link
Contributor

kyawaway commented Feb 13, 2024

使用:
M1: Apple clang version 14.0.0 (clang-1400.0.29.202)
intel: gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

デバッガ
M1:

Process 37362 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
    frame #0: 0x00000001000d64d0 slim`st_foreach(table=0x0000000101206220, func=(slim`CCallback::free_v(unsigned long, unsigned long, unsigned long) at ccallback.cpp:64:3), arg=0)(...), unsigned long) at st.cpp:521:45 [opt]
   518      last = 0;
   519      for (ptr = table->bins[i];
   520           ptr != 0;) { /* エントリーの存在するチェインでのみfuncが呼ばれる */
-> 521        retval = (enum st_retval)(*func)(ptr->key, ptr->record, arg);
   522        switch (retval) {
   523        case ST_CHECK: /* check if hash is modified during iteration */
   524          tmp = 0;
Target 0: (slim) stopped.
(lldb) v
(st_table_t) table = 0x0000000101206220
(int (*)(...)) func = 0x0000000100077158 (slim`CCallback::free_v(unsigned long, unsigned long, unsigned long) at ccallback.cpp:64:3)
(st_data_t) arg = 0
(int) i = 0
(st_table_entry *) ptr = 0x0000000100707e20
(st_table_entry *) last = nullptr
(st_table_entry *) tmp = <variable not available>

(st_retval) retval = <variable not available>
.
.
.

* thread #1, queue = 'com.apple.main-thread', stop reason = step in
    frame #0: 0x0000000100077158 slim`CCallback::free_v(key=<unavailable>, v=4295455064, _t=<unavailable>) at ccallback.cpp:64:3 [opt]
   61   }
   62  
   63   int CCallback::free_v(st_data_t key, st_data_t v, st_data_t _t) {
-> 64     delete (CCallback *)v;
   65     return ST_CONTINUE;
   66   }
   67   /* コールバックを名前nameで登録する。arityはコールバックに引数として
Target 0: (slim) stopped.
(lldb) v
(st_data_t) key = <no location, value may have been optimized out>

(st_data_t) v = 4295455064
(st_data_t) _t = <no location, value may have been optimized out>
Process 37340 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
    frame #0: 0x00000001000cb028 slim`operator delete(p=0x0000000100077158) at alloc.cpp:176:3 [opt]
   173  }
   174 
   175  void operator delete(void* p) noexcept {
-> 176    lmn_free(p);
   177  }
   178 
   179  void operator delete[](void* p) noexcept {
Target 0: (slim) stopped.
(lldb) v
(void *) p = 0x0000000100077158

intel:

(gdb) s521           retval = (enum st_retval)(*func)(ptr->key, ptr->record, arg);
(gdb) p func
$6 = (int (*)(void)) 0x5555555afab8 <CCallback::free_v(unsigned long, unsigned long, unsigned long)>
(gdb) p ptr->key
$7 = 88
(gdb) p ptr->record
$8 = 93824993358816
(gdb) p arg
$9 = 0
(gdb) p ptr
$10 = (st_table_entry *) 0x5555556709c0
(gdb)
.
.
.
CCallback::free_v (key=88, v=93824993358816, _t=0) at ccallback.cpp:65
65      int CCallback::free_v(st_data_t key, st_data_t v, st_data_t _t) {
(gdb) p v
$4 = 93824993358816
(gdb) s
66        delete (CCallback *)v;
(gdb) p v
$5 = 93824993358816
operator delete (p=0x5555556673e0) at alloc.cpp:172
172       lmn_free(p);

のように,delete (CCallback *)v;の直前までは挙動が同じで,(CCallback *)vの結果で差が出ているように思われる.

deleteしたpに関して,M1では未allocのアドレスが出てきたが,intelではalloc済みのpが出てきた.

(CCallback *)vの値に関して,M1ではfuncのアドレスが出てきて,intelでは前にallocしたどこかのアドレスが出てきた.

vの値がおかしい?

@kyawaway
Copy link
Contributor

vの値を追う.

st_foreach時,

intel:

st_foreach中:

Breakpoint 4, st_foreach(st_table*, int (*)(...), unsigned long) (table=0x55555566de50, func=func@entry=0x5555555afab8 <CCallback::free_v(unsigned long, unsigned long, unsigned long)>, arg=arg@entry=0) at st.cpp:512
512     int st_foreach(st_table_t table, int (*func)(ANYARGS), st_data_t arg) {
(gdb) s
517       for (i = 0; i < table->num_bins; i++) {
(gdb) s
519         for (ptr = table->bins[i];
(gdb) s
520              ptr != 0;) { /* エントリーの存在するチェインでのみfuncが呼ばれる */
(gdb) s
521           retval = (enum st_retval)(*func)(ptr->key, ptr->record, arg);
(gdb) p ptr->key
$26 = 88
(gdb) p ptr->record
$27 = 93824993358816
(gdb) p ptr
$28 = (st_table_entry *) 0x5555556709c0

(v=93824993358816)
key=88の要素のinsert時:

Breakpoint 3, CCallback::lmn_register_c_fun (name=name@entry=0x55555560feba “cb_make_input_string”, f=f@entry=0x5555555f5cee <cb_make_input_string(LmnReactCxt*, LmnMembrane*, void*, unsigned char, void*, unsigned char)>, arity=arity@entry=2) at ccallback.cpp:71
71      void CCallback::lmn_register_c_fun(const char *name, void *f, int arity) {
(gdb) 
Continuing.
malloc: 0x5555556673e0
malloc: 0x555555670960
malloc: 0x555555670990

Breakpoint 1, st_insert (table=0x55555566de50, key=88, value=value@entry=93824993358816) at st.cpp:305
305     int st_insert(st_table_t table, st_data_t key, st_data_t value) {
(gdb) 
Continuing.

valueは間違いない.直後のfree_v呼び出しでも,

CCallback::free_v (key=88, v=93824993358816, _t=0) at ccallback.cpp:65
65      int CCallback::free_v(st_data_t key, st_data_t v, st_data_t _t) {
(gdb) s
66        delete (CCallback *)v;
(gdb) p v
$29 = 93824993358816

当然元のvalue=93824993358816が渡っている.

一方,M1では,

st_foreach中:

Process 10945 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
    frame #0: 0x00000001000d64d8 slim`st_foreach(table=0x0000000101005140, func=(slim`CCallback::free_v(unsigned long, unsigned long, unsigned long) at ccallback.cpp:65), arg=0)(...), unsigned long) at st.cpp:522:45 [opt]
   519 
   520    for (i = 0; i < table->num_bins; i++) {
   521      last = 0;
-> 522      for (ptr = table->bins[i];
   523           ptr != 0;) { /* エントリーの存在するチェインでのみfuncが呼ばれる */
   524        retval = (enum st_retval)(*func)(ptr->key, ptr->record, arg);
   525        switch (retval) {
Target 0: (slim) stopped.
(lldb) v
(st_table_t) table = 0x0000000101005140
(int (*)(...)) func = 0x0000000100077188 (slim`CCallback::free_v(unsigned long, unsigned long, unsigned long) at ccallback.cpp:65)
(st_data_t) arg = 0
(int) i = 0
(st_table_entry *) ptr = 0x0000000101404660
(st_table_entry *) last = nullptr
(st_table_entry *) tmp = <variable not available>

(st_retval) retval = <variable not available>

(lldb) v ptr->record
(st_data_t) ptr->record = 4302336288
(lldb) v ptr->key
(st_data_t) ptr->key = 88

(v=4302336288)
key=88要素のinsertは,

* thread #1, queue = ‘com.apple.main-thread’, stop reason = breakpoint 2.1
    frame #0: 0x00000001000771e8 slim`CCallback::lmn_register_c_fun(name=“cb_make_input_string”, f=0x00000001000d4820, arity=2) at ccallback.cpp:76:18 [opt]
   73   /* コールバックを名前nameで登録する。arityはコールバックに引数として
   74      渡されるアトムのリンク数 */
   75   void CCallback::lmn_register_c_fun(const char *name, void *f, int arity) {
-> 76     CCallback *c = new CCallback;
   77     c->f = f;
   78     c->arity = arity;
   79     st_insert(ccallback_tbl, (st_data_t)lmn_intern(name), (st_data_t)c);
Target 0: (slim) stopped.
(lldb) 
Process 10945 resuming
Process 10945 stopped
* thread #1, queue = ‘com.apple.main-thread’, stop reason = breakpoint 3.1
    frame #0: 0x00000001000d5d74 slim`st_insert(table=0x0000000101005140, key=88, value=4302336288) at st.cpp:306:13 [opt]
   303  /* ハッシュ表に新たなエントリーを追加する.
   304   * エントリが存在した場合は, エントリの値のみを更新し,
   305   * キーは元々のものを更新しない. エントリが存在しなかった場合に0,
-> 306   * エントリが存在した場合には1以上の整数を返す. */
   307  int st_insert(st_table_t table, st_data_t key, st_data_t value) {
   308    int ret = st_insert_inner(table, key, value);
   309    if (!ret && (table->num_entries / table->num_bins) > ST_DEFAULT_MAX_DENSITY) {
Target 0: (slim) stopped.

ここまでは問題ない.しかし,次のfree_v呼び出し時では,

Process 10945 stopped* thread #1, queue = 'com.apple.main-thread', stop reason = step in
    frame #0: 0x00000001000771ac slim`CCallback::free_v(key=<unavailable>, v=4295455112, _t=<unavailable>) at ccallback.cpp:70:3 [opt]
   67     //printf("v: %p\n", (CCallback *)v);
   68     //(CCallback *)vをprint
   69     printf("v(cast): %p\n", (CCallback *)v);
-> 70     delete (CCallback *)v;
   71     return ST_CONTINUE;
   72   }
   73   /* コールバックを名前nameで登録する。arityはコールバックに引数として
Target 0: (slim) stopped.
(lldb) v
(st_data_t) key = <no location, value may have been optimized out>

(st_data_t) v = 4295455112
(st_data_t) _t = <no location, value may have been optimized out>

vが違う値(v=4295455112)になっている.vに関しては見えなくなっている.

@kyawaway
Copy link
Contributor

<no location, value may have been optimized out>: bytecodealliance/wasmtime#3884

@k-ueda
Copy link
Contributor

k-ueda commented Feb 15, 2024

【clang】
MacBook Pro (M1, 2020, Sonoma 14.0, clang 15.0) でも bus error 現象を確認しました.(bison, ruby, autoconf@2.69 は brew で別途導入)
また,test/system_check/testsuite/hyperlink の case5 は,bus error の前に segmentation fault します.
(参考)Ubuntu22.04 PCで clang++-15 (+ libstdc++-12) でビルドしたら make check 全部通りました.

【gnu gcc】
M1 Mac でも,gcc-13 ならば make check 全部通りました.ただ,できた src/slim バイナリで .il ファイルは実行できるけど,.lmn を与えてコンパイル+実行しようとすると segmentation fault します.要調査.
gcc-12 では,make check が,上記の case5 だけ FAIL して,調べてみるとこの例題だけ segmentation fault しています.
bus error とは別に,case5.lmn の内部挙動を調査する必要があるかもしれません.

@kyawaway
Copy link
Contributor

#281 に関しては,(Ubuntu22.04, clang++-15)ではbus errorは出ないということならば,clang(llvm)は無関係で,M1固有の問題の可能性が高いということですね,ありがとうございます.

#294 に関しては,#281 とは独立しているということで,別途調査します.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants