watchdog: BUG: soft lockup - CPU#0 stuck for 186s #12

Open
stffrdhrn opened this issue Jun 11, 2020 · 8 comments

@stffrdhrn (Member)
Even after increasing memory to 512 MB on qemu, I still occasionally see this.

/ # [90216.140000] watchdog: BUG: soft lockup - CPU#0 stuck for 186s! [kworker/u3:3:2073]
[90216.140000] Modules linked in:
[90216.140000] CPU: 0 PID: 2073 Comm: kworker/u3:3 Tainted: G        W         5.7.0-rc2-simple-smp-00001-g6bd140e14d9a-dirty #109
[90216.140000] Workqueue: xprtiod xs_stream_data_receive_workfn
[90216.140000] CPU #: 0
[90216.140000]    PC: c03c1694    SR: 00008a7f    SP: d89a3c90
[90216.140000] GPR00: 00000000 GPR01: d89a3c90 GPR02: d89a3cb4 GPR03: 00000000
[90216.140000] GPR04: eb895d10 GPR05: 00000000 GPR06: c8000000 GPR07: 00000000
[90216.140000] GPR08: c029c738 GPR09: c006b898 GPR10: d89a2000 GPR11: fffffe5c
[90216.140000] GPR12: 1a69ac61 GPR13: 00000000 GPR14: df4f5800 GPR15: 00000000
[90216.140000] GPR16: c08c2c20 GPR17: eb895d10 GPR18: 20000000 GPR19: 00000001
[90216.140000] GPR20: 098a4d3e GPR21: 000001a3 GPR22: 000051df GPR23: 00000000
[90216.140000] GPR24: 0000001a GPR25: 00000000 GPR26: 01125cf2 GPR27: 00000000
[90216.140000] GPR28: 002c771e GPR29: c04c4fa1 GPR30: df4f5b9c GPR31: c04b63c0
[90216.140000]   RES: fffffe5c oGPR11: ffffffff
[90216.140000] Process kworker/u3:3 (pid: 2073, stackpage=d8850000)
[90216.140000]
[90216.140000] Stack:
[90216.140000] Call trace:
[90216.140000] [<8558de7f>] ktime_get+0x70/0x1bc
[90216.140000] [<8d9c6218>] tcp_mstamp_refresh+0x30/0x190
[90216.140000] [<d9699fc9>] tcp_rcv_space_adjust+0x38/0x300
[90216.140000] [<93cbbb5f>] tcp_recvmsg+0x558/0xebc
[90216.140000] [<a2442133>] ? __switch_to+0x50/0x7c
[90216.140000] [<e17f0602>] ? __enqueue_entity.constprop.0+0xc4/0x104
[90216.140000] [<85044956>] inet_recvmsg+0x3c/0x64
[90216.140000] [<96e0a9e6>] sock_recvmsg+0x24/0x34
[90216.140000] [<a02d272c>] xs_sock_recvmsg.constprop.0+0x44/0x88
[90216.140000] [<4c68e7cc>] xs_stream_data_receive_workfn+0x110/0xaac
[90216.140000] [<e17f0602>] ? __enqueue_entity.constprop.0+0xc4/0x104
[90216.140000] [<ee8f26ff>] process_one_work+0x254/0x538
[90216.140000] [<a246f55d>] worker_thread+0x64/0x694
[90216.140000] [<3bf47ca9>] ? worker_thread+0x0/0x694
[90216.140000] [<818fceb2>] kthread+0x120/0x150
[90216.140000] [<0db975eb>] ? kthread+0x0/0x150
[90216.140000] [<336cbdd5>] ret_from_fork+0x1c/0x9c
@stffrdhrn (Member Author)

[90434.850000] watchdog: BUG: soft lockup - CPU#0 stuck for 186s! [kworker/u3:0:2435]
[90434.850000] Modules linked in:
[90434.850000] CPU: 0 PID: 2435 Comm: kworker/u3:0 Tainted: G        W    L    5.7.0-rc2-simple-smp-00001-g6bd140e14d9a-dirty #109
[90434.850000] Workqueue: xprtiod xs_stream_data_receive_workfn
[90434.850000] CPU #: 0
[90434.850000]    PC: c0006d40    SR: 0000827f    SP: d88a5a90
[90434.850000] GPR00: 00000000 GPR01: d88a5a90 GPR02: d88a5a98 GPR03: 0000827f
[90434.850000] GPR04: bc00440c GPR05: 00000000 GPR06: d88a5b10 GPR07: d88a5a8c
[90434.850000] GPR08: 00000001 GPR09: c0274acc GPR10: d88a4000 GPR11: df0f0600
[90434.850000] GPR12: 00004000 GPR13: 00000138 GPR14: df0beec0 GPR15: 00000003
[90434.850000] GPR16: df0beec0 GPR17: 00008279 GPR18: d974da90 GPR19: fffffff9
[90434.850000] GPR20: 00424800 GPR21: 000f0000 GPR22: 00000408 GPR23: 00ff0000
[90434.850000] GPR24: 1f0f0600 GPR25: 00000000 GPR26: 0000040c GPR27: 00000008
[90434.850000] GPR28: 00000001 GPR29: 00000008 GPR30: 00000000 GPR31: 0000013a
[90434.850000]   RES: df0f0600 oGPR11: ffffffff
[90434.850000] Process kworker/u3:0 (pid: 2435, stackpage=d97e3b90)
[90434.850000]
[90434.850000] Stack:
[90434.850000] Call trace:
[90434.850000] [<948bc844>] ethoc_start_xmit+0x1e4/0x36c
[90434.850000] [<dc228f84>] ? skb_checksum_help+0xc4/0x1d4
[90434.850000] [<28d011c4>] ? netif_skb_features+0x148/0x3a0
[90434.850000] [<74dd9b51>] dev_hard_start_xmit+0xc4/0x1bc
[90434.850000] [<f3ac5f3d>] sch_direct_xmit+0x174/0x32c
[90434.850000] [<707490e7>] __qdisc_run+0x194/0x658
[90434.850000] [<8f2a1121>] ? __do_softirq+0x170/0x320
[90434.850000] [<ed9217a4>] __dev_queue_xmit+0x65c/0x830
[90434.850000] [<e10c603b>] dev_queue_xmit+0x18/0x28
[90434.850000] [<ea3a2b2b>] ip_finish_output2+0x490/0x784
[90434.850000] [<6fd69efd>] __ip_finish_output+0x248/0x298
[90434.850000] [<fb555f04>] ip_output+0xcc/0xec
[90434.850000] [<439f8ee8>] ip_local_out+0x58/0x74
[90434.850000] [<41232458>] __ip_queue_xmit+0x1d4/0x4f8
[90434.850000] [<d7f4ed47>] ? tcp_queue_rcv+0x54/0x21c
[90434.850000] [<b56edd9a>] ip_queue_xmit+0x18/0x28
[90434.850000] [<f41f34ed>] __tcp_transmit_skb+0x670/0xe2c
[90434.850000] [<8ff6d6ed>] __tcp_send_ack.part.0+0xf4/0x15c
[90434.850000] [<9b1f6bc5>] tcp_send_ack+0x2c/0x44
[90434.850000] [<39ea58f7>] tcp_cleanup_rbuf+0xb0/0x1bc
[90434.850000] [<cd8c1332>] tcp_recvmsg+0x358/0xebc
[90434.850000] [<f45595f9>] ? queue_work_on+0x78/0xb4
[90434.850000] [<85044956>] inet_recvmsg+0x3c/0x64
[90434.850000] [<96e0a9e6>] sock_recvmsg+0x24/0x34
[90434.850000] [<a02d272c>] xs_sock_recvmsg.constprop.0+0x44/0x88

@stffrdhrn (Member Author)

[90434.890000] BUG: workqueue lockup - pool cpus=0 flags=0x4 nice=0 stuck for 197s!
[90434.890000] Showing busy workqueues and worker pools:
[90434.890000] workqueue events: flags=0x0
[90434.890000]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[90434.890000]     pending: stop_one_cpu_nowait_workfn
[90434.890000] workqueue events_power_efficient: flags=0x80
[90434.890000]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=4/256 refcnt=5
[90434.890000]     pending: phy_state_machine, check_lifetime, do_cache_clean, neigh_periodic_work
[90434.890000] workqueue rpciod: flags=0xa
[90434.890000]   pwq 2: cpus=0 flags=0x4 nice=0 active=4/256 refcnt=6
[90434.890000]     in-flight: 2067:rpc_async_schedule
[90434.890000]     pending: rpc_async_schedule, rpc_async_schedule, __rpc_queue_timer_fn
[90434.900000] workqueue xprtiod: flags=0x1a
[90434.900000]   pwq 3: cpus=0 flags=0x4 nice=-20 active=1/256 refcnt=3
[90434.900000]     in-flight: 2435:xs_stream_data_receive_workfn
[90434.900000] pool 2: cpus=0 flags=0x4 nice=0 hung=197s workers=10 idle: 1304 2070 2445 2071 2444 2075 2066 1625 2076
[90434.900000] pool 3: cpus=0 flags=0x4 nice=-20 hung=197s workers=6 idle: 2082 2073 2074 2062 2068
[90434.910000] nfs: server 10.0.0.27 not responding, still trying

@stffrdhrn (Member Author) commented Jun 11, 2020

We hit a similar case with ktime_get_ts64 while handling a syscall from a user-space program.

[93353.230000] watchdog: BUG: soft lockup - CPU#0 stuck for 187s! [ld-linux-or1k.s:92]
[93353.230000] Modules linked in:
[93353.230000] CPU: 0 PID: 92 Comm: ld-linux-or1k.s Tainted: G        W    L    5.7.0-rc2-simple-smp-00001-g6bd140e14d9a-dirty #109
[93353.230000] CPU #: 0
[93353.230000]    PC: c03c16d4    SR: 00008a7f    SP: df49be34
[93353.230000] GPR00: 00000000 GPR01: df49be34 GPR02: df49be58 GPR03: 00000000
[93353.230000] GPR04: eb895f45 GPR05: 00000000 GPR06: c8000000 GPR07: 7fb158d4
[93353.230000] GPR08: 00000000 GPR09: c006acd0 GPR10: df49a000 GPR11: 0000c800
[93353.230000] GPR12: 4a6de800 GPR13: 301ac9b4 GPR14: 00000000 GPR15: 0000827e
[93353.230000] GPR16: 00016be3 GPR17: 4a6de800 GPR18: 20000000 GPR19: b8030800
[93353.230000] GPR20: df49beac GPR21: 00000000 GPR22: 16a0793e GPR23: 00000000
[93353.230000] GPR24: 00000000 GPR25: 00000010 GPR26: 0000001a GPR27: fffffff9
[93353.230000] GPR28: 0041ec48 GPR29: c011dad8 GPR30: 00000000 GPR31: d8895c24
[93353.230000]   RES: 0000c800 oGPR11: ffffffff
[93353.230000] Process ld-linux-or1k.s (pid: 92, stackpage=df538000)
[93353.230000]
[93353.230000] Stack:
[93353.230000] Call trace:
[93353.230000] [<48f53e3b>] ktime_get_ts64+0x80/0x2a4
[93353.230000] [<a5ff5e89>] poll_select_set_timeout+0xbc/0x134
[93353.230000] [<bd428b75>] do_pselect+0x94/0x148
[93353.230000] [<b1841f6e>] sys_pselect6_time32+0x70/0x94
[93353.230000] [<cfae1e30>] _syscall_return+0x0/0x4

@stffrdhrn (Member Author)

Warning again; strange that it's always ~187 seconds. I have a hunch that it's something in qemu, as I also notice the CPU time used by qemu is very high.

 # [  672.530000] watchdog: BUG: soft lockup - CPU#0 stuck for 187s! [find:68]
[  672.530000] Modules linked in:
[  672.530000] CPU: 0 PID: 68 Comm: find Not tainted 5.7.0-rc2-simple-smp-00001-g6bd140e14d9a-dirty #109
[  672.530000] CPU #: 0
[  672.530000]    PC: c03c1694    SR: 00008a7f    SP: df107c3c
[  672.530000] GPR00: 00000000 GPR01: df107c3c GPR02: df107c60 GPR03: 00000000
[  672.530000] GPR04: eb8a9e88 GPR05: 00000000 GPR06: c8000000 GPR07: df107bcf
[  672.530000] GPR08: fffffff0 GPR09: c006b898 GPR10: df106000 GPR11: fffffffd
[  672.530000] GPR12: 2224ce92 GPR13: 00000000 GPR14: df107df4 GPR15: 00000000
[  672.530000] GPR16: c08c2c20 GPR17: eb8a9e88 GPR18: e0000000 GPR19: 00000001
[  672.530000] GPR20: 5e948ab4 GPR21: 00000002 GPR22: 0000006e GPR23: 00000010
[  672.530000] GPR24: 0000001a GPR25: 00000563 GPR26: 00013640 GPR27: 00000001
[  672.530000] GPR28: 00db0fc5 GPR29: d8634000 GPR30: dca46010 GPR31: d86356c0
[  672.530000]   RES: fffffffd oGPR11: ffffffff
[  672.530000] Process find (pid: 68, stackpage=df136000)
[  672.530000]
[  672.530000] Stack:
[  672.530000] Call trace:
[  672.530000] [<a747bb5c>] ktime_get+0x70/0x1bc
[  672.530000] [<27329367>] rpc_new_task+0x160/0x1d4
[  672.530000] [<d7ee08da>] rpc_run_task+0x24/0x240
[  672.530000] [<4d80ff3e>] rpc_call_sync+0x60/0xec
[  672.530000] [<2b88228d>] nfs3_rpc_wrapper+0x48/0xbc
[  672.530000] [<ff2084be>] nfs3_proc_getattr+0x78/0x98
[  672.530000] [<02cfa7c7>] __nfs_revalidate_inode+0xd4/0x22c
[  672.530000] [<7f7a6fa3>] nfs_lookup_verify_inode+0xec/0x120
[  672.530000] [<4b0a1b28>] nfs_do_lookup_revalidate+0x2a4/0x394
[  672.530000] [<0f326b1c>] nfs_lookup_revalidate+0x90/0xc0
[  672.530000] [<8711b743>] lookup_fast+0x120/0x1e4
[  672.530000] [<a5a2bce5>] path_openat+0x160/0x124c
[  672.530000] [<adae4104>] do_filp_open+0x78/0xf8
[  672.530000] [<5a437d71>] ? __alloc_fd+0x5c/0x2c4
[  672.530000] [<054df2bd>] do_sys_openat2+0x32c/0x404
[  672.530000] [<6510ae45>] do_sys_open+0x54/0xa0
[  672.530000] [<b8f86578>] sys_openat+0x14/0x24
[  672.530000] [<30023bc1>] ? _syscall_return+0x0/0x4

@stffrdhrn (Member Author)

My hunch is that the CPU is being starved and cannot service the queue/timer requests. Doing some profiling, I found that qemu spends close to 100% of its CPU time executing l.mtspr to flush TLB entries.

I traced this to the fact that our kernel flushes every TLB entry on any TLB change.

With this patch, the CPU usage drops from 100% to 30% and the timeouts seem to go away, but we now sometimes get failures:

diff --git a/arch/openrisc/mm/tlb.c b/arch/openrisc/mm/tlb.c
index 4b680aed8f5f..36f65f554e61 100644
--- a/arch/openrisc/mm/tlb.c
+++ b/arch/openrisc/mm/tlb.c
@@ -28,7 +28,7 @@
 
 #define NO_CONTEXT -1
 
-#define NUM_DTLB_SETS (1 << ((mfspr(SPR_IMMUCFGR) & SPR_IMMUCFGR_NTS) >> \
+#define NUM_DTLB_SETS (1 << ((mfspr(SPR_DMMUCFGR) & SPR_DMMUCFGR_NTS) >> \
                            SPR_DMMUCFGR_NTS_OFF))
 #define NUM_ITLB_SETS (1 << ((mfspr(SPR_IMMUCFGR) & SPR_IMMUCFGR_NTS) >> \
                            SPR_IMMUCFGR_NTS_OFF))
@@ -49,13 +49,16 @@ void local_flush_tlb_all(void)
        unsigned long num_tlb_sets;
 
        /* Determine number of sets for IMMU. */
-       /* FIXME: Assumption is I & D nsets equal. */
        num_tlb_sets = NUM_ITLB_SETS;
 
        for (i = 0; i < num_tlb_sets; i++) {
-               mtspr_off(SPR_DTLBMR_BASE(0), i, 0);
                mtspr_off(SPR_ITLBMR_BASE(0), i, 0);
        }
+
+       num_tlb_sets = NUM_DTLB_SETS;
+       for (i = 0; i < num_tlb_sets; i++) {
+               mtspr_off(SPR_DTLBMR_BASE(0), i, 0);
+       }
 }
 
 #define have_dtlbeir (mfspr(SPR_DMMUCFGR) & SPR_DMMUCFGR_TEIRI)
@@ -116,20 +119,14 @@ void local_flush_tlb_range(struct vm_area_struct *vma,
        }
 }
 
-/*
- * Invalidate the selected mm context only.
- *
- * FIXME: Due to some bug here, we're flushing everything for now.
- * This should be changed to loop over over mm and call flush_tlb_range.
- */
-
 void local_flush_tlb_mm(struct mm_struct *mm)
 {
+       struct vm_area_struct *vma = mm->mmap;
 
-       /* Was seeing bugs with the mm struct passed to us. Scrapped most of
-          this function. */
-       /* Several architctures do this */
-       local_flush_tlb_all();
+       while (vma != NULL) {
+               local_flush_tlb_range(vma, vma->vm_start, vma->vm_end);
+               vma = vma->vm_next;
+       }
 }

The failures are crashes caused by accessing bad pages:

[    4.850000] Unable to handle kernel access
[    4.850000]  at virtual address 0x7665006f
[    4.860000]
[    4.860000] Oops#: 0000
[    4.860000] CPU #: 0
[    4.860000]    PC: c000b3a8    SR: 0000827f    SP: c1f9fd8c
[    4.860000] GPR00: 00000000 GPR01: c1f9fd8c GPR02: c1f9fd98 GPR03: 7665006b
[    4.860000] GPR04: 6b6e2d3e GPR05: 61637469 GPR06: 00000000 GPR07: 00000001
[    4.860000] GPR08: 00000000 GPR09: c000b3b4 GPR10: c1f9e000 GPR11: 00000006
[    4.860000] GPR12: 00000007 GPR13: 00000000 GPR14: fffffff8 GPR15: 00000000
[    4.860000] GPR16: 7665006b GPR17: 00000000 GPR18: 00000000 GPR19: 0000001c
[    4.860000] GPR20: c1f98ce0 GPR21: 0000001c GPR22: c1f98ce0 GPR23: 00000002
[    4.860000] GPR24: c1b84d70 GPR25: 00000000 GPR26: c1b92bc0 GPR27: 00000000
[    4.860000] GPR28: c1f99030 GPR29: 00000000 GPR30: 00000000 GPR31: ffffffff
[    4.860000]   RES: 00000006 oGPR11: ffffffff
[    4.860000] Process swapper (pid: 1, stackpage=c1f98ce0)
[    4.860000]
[    4.860000] Stack:
[    4.860000] Call trace:
[    4.860000] [<(ptrval)>] local_flush_tlb_mm+0x34/0x5c
[    4.870000] [<(ptrval)>] switch_mm+0x28/0x40
[    4.870000] [<(ptrval)>] begin_new_exec+0x658/0x970
[    4.870000] [<(ptrval)>] ? kernel_read+0x24/0x3c
[    4.870000] [<(ptrval)>] load_elf_binary+0x720/0x146c
[    4.870000] [<(ptrval)>] ? lock_acquire+0x118/0x4b8
[    4.870000] [<(ptrval)>] ? __do_execve_file+0x6d8/0xcf8
[    4.870000] [<(ptrval)>] ? lock_release+0x1bc/0x360
[    4.870000] [<(ptrval)>] __do_execve_file+0x6cc/0xcf8
[    4.870000] [<(ptrval)>] ? __do_execve_file+0x598/0xcf8
[    4.870000] [<(ptrval)>] do_execve+0x3c/0x4c
[    4.870000] [<(ptrval)>] run_init_process+0xdc/0x100
[    4.870000] [<(ptrval)>] ? kernel_init+0x0/0x144
[    4.870000] [<(ptrval)>] kernel_init+0x5c/0x144
[    4.870000] [<(ptrval)>] ? schedule_tail+0x54/0x94
[    4.870000] [<(ptrval)>] ret_from_fork+0x1c/0x150
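
For illustration, here is a defensive variant of the per-mm flush from the patch above. This is a sketch only: the NULL checks and the fall-back to local_flush_tlb_all() are illustrative hardening, not a verified fix for this oops, and it assumes the pre-5.8 vma list API (mm->mmap, vma->vm_next).

/* Sketch: defensive per-mm flush. The NULL checks and the fallback
 * are illustrative hardening, not the confirmed fix for the oops above. */
void local_flush_tlb_mm(struct mm_struct *mm)
{
	struct vm_area_struct *vma;

	/* Nothing to walk (e.g. mm not fully set up): flush everything. */
	if (!mm || !mm->mmap) {
		local_flush_tlb_all();
		return;
	}

	for (vma = mm->mmap; vma != NULL; vma = vma->vm_next)
		local_flush_tlb_range(vma, vma->vm_start, vma->vm_end);
}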

stffrdhrn pushed a commit that referenced this issue Aug 4, 2020
KASAN reports a null-ptr-deref error when register_netdev() fails:

KASAN: null-ptr-deref in range [0x00000000000003c0-0x00000000000003c7]
CPU: 2 PID: 422 Comm: ip Not tainted 5.8.0-rc4+ #12
Call Trace:
 ip6gre_init_net+0x4ab/0x580
 ? ip6gre_tunnel_uninit+0x3f0/0x3f0
 ops_init+0xa8/0x3c0
 setup_net+0x2de/0x7e0
 ? rcu_read_lock_bh_held+0xb0/0xb0
 ? ops_init+0x3c0/0x3c0
 ? kasan_unpoison_shadow+0x33/0x40
 ? __kasan_kmalloc.constprop.0+0xc2/0xd0
 copy_net_ns+0x27d/0x530
 create_new_namespaces+0x382/0xa30
 unshare_nsproxy_namespaces+0xa1/0x1d0
 ksys_unshare+0x39c/0x780
 ? walk_process_tree+0x2a0/0x2a0
 ? trace_hardirqs_on+0x4a/0x1b0
 ? _raw_spin_unlock_irq+0x1f/0x30
 ? syscall_trace_enter+0x1a7/0x330
 ? do_syscall_64+0x1c/0xa0
 __x64_sys_unshare+0x2d/0x40
 do_syscall_64+0x56/0xa0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

ip6gre_tunnel_uninit() has set 'ign->fb_tunnel_dev' to NULL, so a later
access to ign->fb_tunnel_dev causes a null-ptr-deref. Fix it by saving
'ign->fb_tunnel_dev' to the local variable ndev.

Fixes: dafabb6 ("ip6_gre: fix use-after-free in ip6gre_tunnel_lookup()")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
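
A minimal user-space sketch of the save-to-a-local pattern the commit message above describes (the struct layouts and fake_uninit() are toy stand-ins, not the actual net/ipv6/ip6_gre.c code):

#include <stdio.h>
#include <stdlib.h>

struct net_device { int id; };
struct ip6gre_net { struct net_device *fb_tunnel_dev; };

/* Stand-in for ip6gre_tunnel_uninit(), which clears the shared pointer. */
static void fake_uninit(struct ip6gre_net *ign)
{
	ign->fb_tunnel_dev = NULL;
}

int main(void)
{
	struct ip6gre_net ign;
	struct net_device *ndev;

	ign.fb_tunnel_dev = malloc(sizeof(*ndev));
	ndev = ign.fb_tunnel_dev;  /* save to a local before teardown */

	fake_uninit(&ign);         /* ign.fb_tunnel_dev is now NULL */

	free(ndev);                /* safe: uses the saved local, not the
	                              cleared shared pointer */
	printf("freed via saved local; no null-ptr-deref\n");
	return 0;
}
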
stffrdhrn pushed a commit that referenced this issue Aug 4, 2020
I compiled with AddressSanitizer and I had these memory leaks while I
was using the tep_parse_format function:

    Direct leak of 28 byte(s) in 4 object(s) allocated from:
        #0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe)
        #1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985
        #2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140
        #3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206
        #4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291
        #5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299
        #6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849
        #7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161
        #8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207
        #9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786
        #10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285
        #11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369
        #12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335
        #13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389
        #14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431
        #15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251
        #16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284
        #17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593
        #18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727
        #19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048
        #20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127
        #21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152
        #22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252
        #23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347
        #24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461
        #25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673
        #26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)

The token variable in the process_dynamic_array_len function is
allocated in the read_expect_type function, but is not freed before
calling the read_token function.

Free the token variable before calling read_token in order to plug the
leak.

Signed-off-by: Philippe Duplessis-Guindon <pduplessis@efficios.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/linux-trace-devel/20200730150236.5392-1-pduplessis@efficios.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
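
A self-contained sketch of the leak pattern this commit fixes: an owned pointer is overwritten by the next read without freeing the previous allocation, so the fix is to free before reacquiring. Here toy_read_token() is a stand-in for the event-parse read_token()/read_expect_type() helpers, not the real API:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for read_token(): returns a freshly allocated token that
 * the caller owns and must free. */
static char *toy_read_token(const char *src)
{
	return strdup(src);
}

int main(void)
{
	char *token = toy_read_token("[");  /* as from read_expect_type() */

	/* Leaky version: token = toy_read_token("]"); would orphan the
	 * first allocation. Free it before reading the next token: */
	free(token);
	token = toy_read_token("]");

	printf("token=%s\n", token);
	free(token);
	return 0;
}
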
@stffrdhrn (Member Author)

The errors showing that the xs_stream_data_receive_workfn thread is starving the RCU task are related to NFS. During these failures a find running over NFS was in progress. It seems that caused an extra-high load of NFS activity, starving the RCU cleanup thread.

@stffrdhrn (Member Author) commented Aug 11, 2020

We are also seeing this warning:

/ # ./setup.sh
creating user shorne ...
mounting home work ...
[   34.290000] mount (71) used greatest stack depth: 5360 bytes left
setting clock ...
Tue Aug 11 08:27:56 GMT 2020
enabling login for shorne ...
setting coredumps ...
setting up /dev/stderr /dev/stdin /dev/stdout ...
configuring dropbear private key ...
generating entropy for ssh, must be higher than 500 ...
before: 12
after: 12
starting dropbear ...
[   73.300000] ------------[ cut here ]------------
[   73.300000] WARNING: CPU: 1 PID: 16 at net/sched/sch_generic.c:442 dev_watchdog+0x36c/0x374
[   73.300000] NETDEV WATCHDOG: eth0 (ethoc): transmit queue 0 timed out
[   73.300000] Modules linked in:
[   73.300000] CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.8.0-simple-smp-00012-g55b2662ec665-dirty #257
[   73.300000] Call trace:
[   73.300000] [<(ptrval)>] dump_stack+0xc4/0x100
[   73.300000] [<(ptrval)>] __warn+0xf4/0x124
[   73.300000] [<(ptrval)>] ? dev_watchdog+0x36c/0x374
[   73.300000] [<(ptrval)>] warn_slowpath_fmt+0x7c/0x94
[   73.300000] [<(ptrval)>] dev_watchdog+0x36c/0x374
[   73.300000] [<(ptrval)>] ? dev_watchdog+0x0/0x374
[   73.320000] [<(ptrval)>] call_timer_fn.isra.0+0x24/0xa0
[   73.320000] [<(ptrval)>] run_timer_softirq+0x2a4/0x2b8
[   73.320000] [<(ptrval)>] ? dev_watchdog+0x0/0x374
[   73.320000] [<(ptrval)>] ? set_next_entity+0x154/0x558
[   73.320000] [<(ptrval)>] __do_softirq+0x16c/0x364
[   73.320000] [<(ptrval)>] ? __schedule+0x2fc/0xa30
[   73.320000] [<(ptrval)>] run_ksoftirqd+0x74/0x94
[   73.320000] [<(ptrval)>] smpboot_thread_fn+0x1e4/0x2a0
[   73.320000] [<(ptrval)>] ? _raw_spin_unlock_irqrestore+0x24/0x34
[   73.320000] [<(ptrval)>] ? smpboot_thread_fn+0x0/0x2a0
[   73.320000] [<(ptrval)>] kthread+0x124/0x154
[   73.320000] [<(ptrval)>] ? kthread+0x0/0x154
[   73.320000] [<(ptrval)>] ret_from_fork+0x1c/0x9c
[   73.320000] ---[ end trace 60dafde39c415454 ]---

Note this is right after starting up dropbear, which is accessed over NFS.

@stffrdhrn (Member Author) commented Aug 11, 2020

On another boot we see:

[ 7806.080000] INFO: timekeeping: Cycle offset (3951585492) is larger than the 'openrisc_timer' clock's 50% safety margin (2147483647)
[ 7806.080000]       timekeeping: Your kernel is still fine, but is feeling a bit nervous
[ 7967.090000] watchdog: BUG: soft lockup - CPU#1 stuck for 151s! [swapper/1:0]
[ 7967.090000] Modules linked in:
[ 7967.090000] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.8.0-simple-smp-00012-g55b2662ec665-dirty #257
[ 7967.090000] CPU #: 1
[ 7967.090000]    PC: c0005730    SR: 0000807f    SP: df039fb0
[ 7967.090000] GPR00: 00000000 GPR01: df039fb0 GPR02: df039fc0 GPR03: 0000827f
[ 7967.090000] GPR04: df038000 GPR05: df08fe60 GPR06: df08c050 GPR07: 00000000
[ 7967.090000] GPR08: 00000000 GPR09: c000570c GPR10: df038000 GPR11: 00000006
[ 7967.090000] GPR12: 08f0d180 GPR13: 00000000 GPR14: 00000000 GPR15: 00000000
[ 7967.090000] GPR16: c0572550 GPR17: 00000010 GPR18: 00000001 GPR19: 00000000
[ 7967.090000] GPR20: c0573ab4 GPR21: df038004 GPR22: 00010000 GPR23: 00000000
[ 7967.090000] GPR24: c053be64 GPR25: 587a8000 GPR26: c04e186c GPR27: 00000719
[ 7967.090000] GPR28: 00000060 GPR29: df03a000 GPR30: c05458e0 GPR31: fefefeff
[ 7967.090000]   RES: 00000006 oGPR11: ffffffff
[ 7967.090000] Process swapper/1 (pid: 0, stackpage=df033320)
[ 7967.090000]
[ 7967.090000] Stack:
[ 7967.090000] Call trace:
[ 7967.090000] [<(ptrval)>] arch_cpu_idle+0x18/0x4c
[ 7967.090000] [<(ptrval)>] default_idle_call+0x58/0x80
[ 7967.090000] [<(ptrval)>] do_idle+0x140/0x260
[ 7967.090000] [<(ptrval)>] cpu_startup_entry+0x34/0x3c
[ 7967.090000] [<(ptrval)>] ? secondary_start_kernel+0x11c/0x134
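
A quick check of the numbers in that timekeeping warning (a sketch; the 20 MHz rate is an assumption about QEMU's OpenRISC timer model, not something stated in this thread). The 50% safety margin is half the 32-bit counter range, and an offset of 3951585492 cycles means the clocksource went unread for close to a full wrap, roughly 197 s at 20 MHz, which is in the same range as the hung=197s workqueue reports earlier in this thread:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* openrisc_timer is a 32-bit up-counter, so the largest
	 * observable delta between two reads is 2^32 - 1 cycles. */
	uint64_t max_cycles = 0xffffffffULL;
	uint64_t margin     = max_cycles / 2;  /* 2147483647, as logged */
	uint64_t offset     = 3951585492ULL;   /* from the warning above */
	double   timer_hz   = 20e6;            /* assumed QEMU timer rate */

	printf("margin   = %llu cycles\n", (unsigned long long)margin);
	printf("offset   = %llu cycles = %.1f s at %.0f MHz\n",
	       (unsigned long long)offset, offset / timer_hz,
	       timer_hz / 1e6);
	printf("exceeded = %s\n", offset > margin ? "yes" : "no");
	return 0;
}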

stffrdhrn pushed a commit that referenced this issue Apr 3, 2021
The crypto engine (CAAM) on the LS1046A platform is configured
HW-coherent; mark the DT node accordingly.

As reported by Greg and Sascha, and explained by Robin, the lack of a
"dma-coherent" property for an IP that is configured HW-coherent
can lead to problems, e.g. on v5.11:

> kernel BUG at drivers/crypto/caam/jr.c:247!
> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.11.0-20210225-3-00039-g434215968816-dirty #12
> Hardware name: TQ TQMLS1046A SoM on Arkona AT1130 (C300) board (DT)
> pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
> pc : caam_jr_dequeue+0x98/0x57c
> lr : caam_jr_dequeue+0x98/0x57c
> sp : ffff800010003d50
> x29: ffff800010003d50 x28: ffff8000118d4000
> x27: ffff8000118d4328 x26: 00000000000001f0
> x25: ffff0008022be480 x24: ffff0008022c6410
> x23: 00000000000001f1 x22: ffff8000118d4329
> x21: 0000000000004d80 x20: 00000000000001f1
> x19: 0000000000000001 x18: 0000000000000020
> x17: 0000000000000000 x16: 0000000000000015
> x15: ffff800011690230 x14: 2e2e2e2e2e2e2e2e
> x13: 2e2e2e2e2e2e2020 x12: 3030303030303030
> x11: ffff800011700a38 x10: 00000000fffff000
> x9 : ffff8000100ada30 x8 : ffff8000116a8a38
> x7 : 0000000000000001 x6 : 0000000000000000
> x5 : 0000000000000000 x4 : 0000000000000000
> x3 : 00000000ffffffff x2 : 0000000000000000
> x1 : 0000000000000000 x0 : 0000000000001800
> Call trace:
>  caam_jr_dequeue+0x98/0x57c
>  tasklet_action_common.constprop.0+0x164/0x18c
>  tasklet_action+0x44/0x54
>  __do_softirq+0x160/0x454
>  __irq_exit_rcu+0x164/0x16c
>  irq_exit+0x1c/0x30
>  __handle_domain_irq+0xc0/0x13c
>  gic_handle_irq+0x5c/0xf0
>  el1_irq+0xb4/0x180
>  arch_cpu_idle+0x18/0x30
>  default_idle_call+0x3c/0x1c0
>  do_idle+0x23c/0x274
>  cpu_startup_entry+0x34/0x70
>  rest_init+0xdc/0xec
>  arch_call_rest_init+0x1c/0x28
>  start_kernel+0x4ac/0x4e4
> Code: 91392021 912c2000 d377d8c6 97f24d96 (d4210000)

Cc: <stable@vger.kernel.org> # v4.10+
Fixes: 8126d88 ("arm64: dts: add QorIQ LS1046A SoC support")
Link: https://lore.kernel.org/linux-crypto/fe6faa24-d8f7-d18f-adfa-44fa0caa1598@arm.com
Reported-by: Greg Ungerer <gerg@kernel.org>
Reported-by: Sascha Hauer <s.hauer@pengutronix.de>
Tested-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Acked-by: Greg Ungerer <gerg@kernel.org>
Acked-by: Li Yang <leoyang.li@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
stffrdhrn pushed a commit that referenced this issue Apr 3, 2021
I got several memory leak reports from ASan with a simple command.  It
was because the VDSO is not released due to the refcount.  Like in
__dsos_addnew_id(), it should put the refcount after adding to the list.

  $ perf record true
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.030 MB perf.data (10 samples) ]

  =================================================================
  ==692599==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 439 byte(s) in 1 object(s) allocated from:
    #0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559bce4aa8ee in dso__new_id util/dso.c:1256
    #2 0x559bce59245a in __machine__addnew_vdso util/vdso.c:132
    #3 0x559bce59245a in machine__findnew_vdso util/vdso.c:347
    #4 0x559bce50826c in map__new util/map.c:175
    #5 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787
    #6 0x559bce512f6b in machines__deliver_event util/session.c:1481
    #7 0x559bce515107 in perf_session__deliver_event util/session.c:1551
    #8 0x559bce51d4d2 in do_flush util/ordered-events.c:244
    #9 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323
    #10 0x559bce519bea in __perf_session__process_events util/session.c:2268
    #11 0x559bce519bea in perf_session__process_events util/session.c:2297
    #12 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017
    #13 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234
    #14 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026
    #15 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858
    #16 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
    #17 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
    #18 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
    #19 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539
    #20 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308

  Indirect leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559bce520907 in nsinfo__copy util/namespaces.c:169
    #2 0x559bce50821b in map__new util/map.c:168
    #3 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787
    #4 0x559bce512f6b in machines__deliver_event util/session.c:1481
    #5 0x559bce515107 in perf_session__deliver_event util/session.c:1551
    #6 0x559bce51d4d2 in do_flush util/ordered-events.c:244
    #7 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323
    #8 0x559bce519bea in __perf_session__process_events util/session.c:2268
    #9 0x559bce519bea in perf_session__process_events util/session.c:2297
    #10 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017
    #11 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234
    #12 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026
    #13 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858
    #14 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
    #15 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
    #16 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
    #17 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539
    #18 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308

  SUMMARY: AddressSanitizer: 471 byte(s) leaked in 2 allocation(s).

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210315045641.700430-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
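
A minimal sketch of the refcount rule the commit message above states, i.e. drop the creation reference once the list holds its own (the types and helpers below are toy stand-ins, not perf's actual dso API):

#include <stdio.h>
#include <stdlib.h>

struct dso { int refcnt; };

static struct dso *dso_new(void)
{
	struct dso *d = calloc(1, sizeof(*d));
	d->refcnt = 1;               /* creation reference */
	return d;
}

static void dso_get(struct dso *d) { d->refcnt++; }

static void dso_put(struct dso *d)
{
	if (--d->refcnt == 0) {
		free(d);
		puts("dso freed");
	}
}

/* The list takes its own reference when a dso is added to it. */
static void list_add_dso(struct dso *d) { dso_get(d); }
static void list_del_dso(struct dso *d) { dso_put(d); }

int main(void)
{
	struct dso *d = dso_new();   /* refcnt = 1 (creation) */
	list_add_dso(d);             /* refcnt = 2 (list) */
	dso_put(d);                  /* the fix: drop the creation ref */
	list_del_dso(d);             /* last ref gone: freed, no leak */
	return 0;
}
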
stffrdhrn pushed a commit that referenced this issue May 15, 2021
The following deadlock is detected:

  The truncate -> setattr path is waiting for pending direct IO to finish (inode->i_dio_count reaching zero) with inode->i_rwsem held (down_write).

  PID: 14827  TASK: ffff881686a9af80  CPU: 20  COMMAND: "ora_p005_hrltd9"
   #0  __schedule at ffffffff818667cc
   #1  schedule at ffffffff81866de6
   #2  inode_dio_wait at ffffffff812a2d04
   #3  ocfs2_setattr at ffffffffc05f322e [ocfs2]
   #4  notify_change at ffffffff812a5a09
   #5  do_truncate at ffffffff812808f5
   #6  do_sys_ftruncate.constprop.18 at ffffffff81280cf2
   #7  sys_ftruncate at ffffffff81280d8e
   #8  do_syscall_64 at ffffffff81003949
   #9  entry_SYSCALL_64_after_hwframe at ffffffff81a001ad

The dio completion path is going to complete one direct IO (decrementing
inode->i_dio_count), but before that it hangs while locking inode->i_rwsem:

   #0  __schedule+700 at ffffffff818667cc
   #1  schedule+54 at ffffffff81866de6
   #2  rwsem_down_write_failed+536 at ffffffff8186aa28
   #3  call_rwsem_down_write_failed+23 at ffffffff8185a1b7
   #4  down_write+45 at ffffffff81869c9d
   #5  ocfs2_dio_end_io_write+180 at ffffffffc05d5444 [ocfs2]
   #6  ocfs2_dio_end_io+85 at ffffffffc05d5a85 [ocfs2]
   #7  dio_complete+140 at ffffffff812c873c
   #8  dio_aio_complete_work+25 at ffffffff812c89f9
   #9  process_one_work+361 at ffffffff810b1889
  #10  worker_thread+77 at ffffffff810b233d
  #11  kthread+261 at ffffffff810b7fd5
  #12  ret_from_fork+62 at ffffffff81a0035e

Thus the above forms an ABBA deadlock.  The same deadlock was mentioned in
upstream commit 28f5a8a ("ocfs2: should wait dio before inode lock
in ocfs2_setattr()").  It seems that commit only removed the
cluster lock (the victim of the above deadlock) from the ABBA deadlock
party.

End-user visible effects: processes hang in the truncate -> ocfs2_setattr path,
and other processes hang in the ocfs2_dio_end_io_write path.

This patch fixes the deadlock itself: it removes the inode_lock() call from
the dio completion path and adds the ip_alloc_sem lock in the setattr path
to synchronize inode modifications.

[wen.gang.wang@oracle.com: remove the "had_alloc_lock" as suggested]
  Link: https://lkml.kernel.org/r/20210402171344.1605-1-wen.gang.wang@oracle.com

Link: https://lkml.kernel.org/r/20210331203654.3911-1-wen.gang.wang@oracle.com
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>