random machine reboots whilst using io_uring #42

Open
AlfieC opened this issue Dec 21, 2020 · 9 comments

AlfieC commented Dec 21, 2020

We deployed some io_uring code to end-user-facing services (essentially giant reverse proxies) that run Netty.

We usually average around 800 concurrent connections on each, scaling up to 2000 during busy periods. When we run this using the io_uring transport, the machines simply die after a few hours (nothing in the logs relating to a reboot, etc.). We've tried various Debian versions, different machines, and different kernel versions; the only constant is that when we run it with io_uring, the machines die. I'm willing to provide any logs requested, but the problem is that it's almost as if someone unplugged the power cable on these servers (no reboot feedback in syslog, events, etc.).

Could this be related to some form of leak, either in kernel io_uring or in the C part of the Netty implementation?

1Jo1 commented Dec 22, 2020

That's strange. Which kernel version? It's not a VM, right? It could be a kernel panic, in which case nothing gets written to the log. What you can do is redirect the console to the serial port, with a second machine connected to that serial port.
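
On Debian-style systems the serial console is usually enabled via the kernel command line. A minimal sketch, assuming GRUB, a serial port at ttyS0, and 115200 baud (adjust the device and speed to your hardware):

# /etc/default/grub on the crashing machine
GRUB_CMDLINE_LINUX="console=tty0 console=ttyS0,115200n8"
# apply the change and reboot
update-grub

# on the second machine, connected via a null-modem cable
minicom -D /dev/ttyS0 -b 115200

If no physical serial access is available, the kernel's netconsole module can stream panic output over UDP to another host instead.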

1Jo1 commented Dec 22, 2020

How can I reproduce it?

AlfieC commented Dec 23, 2020

> How can I reproduce it?

I'm actually not entirely sure. It happens to us a few hours into a constant load of roughly 500 connections (it's a reverse proxy, so 1000 total: one in, one out), and it happens a lot faster when the connection load is higher.
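
Since the trigger seems to be sustained load with constant reconnections, a reproduction attempt could be a client that opens and immediately closes connections in a tight loop through the proxy. A minimal sketch using the incubator io_uring transport; the target host, port, and iteration count are placeholders:

import io.netty.bootstrap.Bootstrap
import io.netty.channel.Channel
import io.netty.channel.ChannelInitializer
import io.netty.incubator.channel.uring.IOUringEventLoopGroup
import io.netty.incubator.channel.uring.IOUringSocketChannel

fun main() {
    val group = IOUringEventLoopGroup(2)
    try {
        val bootstrap = Bootstrap()
            .group(group)
            .channel(IOUringSocketChannel::class.java)
            .handler(object : ChannelInitializer<Channel>() {
                // No pipeline handlers needed; we only exercise connect/close churn.
                override fun initChannel(ch: Channel) {}
            })
        repeat(1_000_000) { // run long enough to cover the "few hours" window
            val ch = bootstrap.connect("proxy-host", 8080).sync().channel() // hypothetical target
            ch.close().sync()
        }
    } finally {
        group.shutdownGracefully().sync()
    }
}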

1Jo1 commented Dec 23, 2020

> That's strange. Which kernel version? It's not a VM, right? It could be a kernel panic, in which case nothing gets written to the log. What you can do is redirect the console to the serial port, with a second machine connected to that serial port.

OK, it's probably a kernel panic. Please could you try that?

AlfieC commented Dec 26, 2020

It happened again last night on a separate machine, kernel 5.9 with the latest Netty io_uring. The general rule of thumb we've noticed is that long, sustained high throughput with constant reconnections seems to cause kernel panics. I think there's a leak somewhere. I haven't been able to do the serial port part yet, as these are unmanaged remote servers, but I've sent in a ticket to get KVM access.

AlfieC commented Dec 26, 2020

We've also noticed that restarting the Java process every few hours prevents the machine deaths.

normanmaurer commented

That's interesting... can you gather memory usage (direct, heap, and kernel) while the app is running, so we can see if it grows? Also, do you create and shut down the IOUringEventLoopGroup frequently?
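
One way to capture those three numbers is a background sampler inside the app. A rough sketch: note that PlatformDependent is Netty-internal API, and the Slab line from /proc/meminfo is only a coarse proxy for kernel-side allocations (Linux only):

import io.netty.util.internal.PlatformDependent
import java.io.File
import java.lang.management.ManagementFactory

// Logs heap, Netty direct, and kernel slab usage once a minute.
fun startMemorySampler() {
    Thread {
        while (true) {
            val heap = ManagementFactory.getMemoryMXBean().heapMemoryUsage.used
            val direct = PlatformDependent.usedDirectMemory() // direct memory tracked by Netty
            // Kernel-side allocations (e.g. io_uring request objects) show up under Slab.
            val slab = File("/proc/meminfo").readLines()
                .first { it.startsWith("Slab:") }
                .filter { it.isDigit() }.toLong() // value is reported in kB
            println("heap=${heap / 1024}kB direct=${direct / 1024}kB slab=${slab}kB")
            Thread.sleep(60_000)
        }
    }.apply { isDaemon = true }.start()
}

If the slab number keeps climbing while heap and direct stay flat, the leak is on the kernel side rather than in the JVM.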

AlfieC commented Dec 28, 2020

I'll grab numbers next time we run the process under similar circumstances.

We don't create or shut down the event loop group frequently: we create it once at application startup and reuse it throughout.
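
A sketch of that lifecycle, assuming the incubator transport's server channel (the proxy's pipeline handlers, thread count, and port are placeholders):

import io.netty.bootstrap.ServerBootstrap
import io.netty.channel.Channel
import io.netty.channel.ChannelInitializer
import io.netty.incubator.channel.uring.IOUringEventLoopGroup
import io.netty.incubator.channel.uring.IOUringServerSocketChannel

// Created once at startup, reused for every channel, shut down only at process exit.
val group = IOUringEventLoopGroup(Runtime.getRuntime().availableProcessors())

fun start() {
    ServerBootstrap()
        .group(group)
        .channel(IOUringServerSocketChannel::class.java)
        .childHandler(object : ChannelInitializer<Channel>() {
            override fun initChannel(ch: Channel) {
                // proxy pipeline handlers would be added here
            }
        })
        .bind(8080)
        .sync()
}

fun stop() {
    group.shutdownGracefully().sync()
}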

kvcache commented Aug 19, 2021

This happens in my application too, and I've had to fall back to EpollEventLoopGroup and EpollServerSocketChannel.

I am using the AWS AMI amzn2-ami-kernel-5.10-hvm-2.0.20210721.2-x86_64-gp2, which is kernel 5.10.

I am using gRPC with io.netty.incubator:netty-incubator-transport-native-io_uring:0.0.5.Final:linux-x86_64.

My heap remains stable at around 9 GB while sustaining a throughput of around 10k requests per second.

My JVM process resident set also stays around 9 GB, as reported by htop. Free memory steadily declines over time until dmesg records:

[13474.186661] IOUringEventLoo invoked oom-killer: gfp_mask=0x42cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_COMP), order=0, oom_score_adj=0                                                                      
[13474.195024] CPU: 7 PID: 6778 Comm: IOUringEventLoo Not tainted 5.10.50-44.132.amzn2.x86_64 #1                                                                                                       
[13474.202102] Hardware name: Amazon EC2 c5.4xlarge/, BIOS 1.0 10/16/2017                                                                                                                              
[13474.206693] Call Trace:                                                                                                                                                                             
[13474.209603]  dump_stack+0x57/0x70                                                                                                                                                                   
[13474.212821]  dump_header+0x4a/0x1f0                                                                                                                                                                 
[13474.216081]  oom_kill_process.cold+0xb/0x10                                                                                                                                                         
[13474.219681]  out_of_memory.part.0+0xbd/0x260                                                                                                                                                        
[13474.223289]  out_of_memory+0x3d/0x80                                                                                                                                                                
[13474.226600]  __alloc_pages_slowpath.constprop.0+0x8a8/0x960                                                                                                                                         
[13474.230763]  __alloc_pages_nodemask+0x2d6/0x300                                                                                                                                                     
[13474.234472]  allocate_slab+0x33b/0x450                                                                                                                                                              
[13474.237929]  ___slab_alloc+0x355/0x470                                                                                                                                                              
[13474.241296]  ? io_submit_sqes+0x36d/0x5a0                                                                                                                                                           
[13474.345983]  ? kmem_cache_free+0x3a7/0x410                                                                                                                                                          
[13474.349509]  kmem_cache_alloc_bulk+0x17c/0x280                                                                                                                                                      
[13474.353164]  io_submit_sqes+0x36d/0x5a0                                                                                                                                                             
[13474.356593]  ? finish_wait+0x42/0x80                                                                                                                                                                
[13474.359878]  ? io_cqring_wait+0x1a5/0x310                                                                                                                                                           
[13474.363399]  __do_sys_io_uring_enter+0x23e/0x3a0                                                                                                                                                    
[13474.367196]  do_syscall_64+0x33/0x40                                                                                                                                                                
[13474.370468]  entry_SYSCALL_64_after_hwframe+0x44/0xa9                                                                                                                                               
[13474.374427] RIP: 0033:0x7f269e4a8e39                                                                                                                                                                
[13474.377712] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 17 d0 2b 00 f7 d8 64 89 01 48
[13474.390713] RSP: 002b:00007f25f3c6b718 EFLAGS: 00000202 ORIG_RAX: 00000000000001aa
[13474.397392] RAX: ffffffffffffffda RBX: 00000000000000a5 RCX: 00007f269e4a8e39
[13474.402599] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 00000000000000a5
[13474.407651] RBP: 00007f25f3c6b730 R08: 0000000000000000 R09: 00007f2500000008
[13474.412766] R10: 0000000000000001 R11: 0000000000000202 R12: 0000000000000001
[13474.417866] R13: 0000000000000002 R14: 0000000000000001 R15: 00007f2698f334f0

To work around it, I changed my initialization to:

import io.netty.channel.epoll.EpollEventLoopGroup
import io.netty.channel.epoll.EpollServerSocketChannel

serverBuilder.bossEventLoopGroup(EpollEventLoopGroup(4))
serverBuilder.workerEventLoopGroup(EpollEventLoopGroup(32))
serverBuilder.channelType(EpollServerSocketChannel::class.java)

and it no longer leaks memory while sustaining the load.

When the process is killed, the memory is returned, so this isn't a case of the kernel holding on to permanently leaked memory. I can also restart my service periodically to keep it alive, but I'd prefer it to stay alive without hourly restarts. I'd love to use io_uring because it delivers really nice, consistent low latency! I hope this information is helpful.
