Version
List the versions of all tokio crates you are using. The easiest way to get
this information is using cargo tree subcommand:
1.37.0
Platform
The output of uname -a (UNIX), or version and 32 or 64-bit (Windows)
Linux
Description
We use the block_in_place + block_on pattern described in #5843 a lot. It has many caveats, but it seems to work OK for us as an async drop workaround.
However, today I'm debugging a hang on shutdown. Basically, the Runtime is being dropped and the whole process hangs. When I attach with gdb I can see that only a handful of worker threads remain, plus a timer thread. All the worker threads seem to be inside a block_in_place + block_on section, parked, waiting for something to wake them up, but I don't think there's any thread left to actually poll the event loop anymore.
I don't know how well supported this pattern should be, and I might be wrong about the whole thing altogether, but it seems to me that if tokio just reserved a single worker for the purpose of polling events and shut it down last, or somehow just avoided getting all worker threads block_in_placed, or shut down the event polling thread last (if a dedicated thread is used), the whole thing would just work.
I would expect IO resources and timers to return errors once runtime shutdown starts, so it surprises me that it is hanging. Could you give more details?
impl Drop for ProcessHandleInner {
    fn drop(&mut self) {
        let Some(child) = &mut self.child else {
            return;
        };
        let name = self.name.clone();
        block_in_place(move || {
            tokio::runtime::Handle::current().block_on(async move {
                debug!(
                    target: LOG_DEVIMINT,
                    "sending SIGKILL to {name} and waiting for it to exit"
                );
                send_sigkill(child);
                if let Err(e) = child.wait().await {
                    warn!(target: LOG_DEVIMINT, "failed to wait for {name}: {e:?}");
                }
            })
        })
    }
}
@Darksonn the child is tokio::process::Child. Could it be that just this one particular case doesn't get handled?
Oh, shoot. Now I see it's not actually a send_sigkill and I'm not 100% sure if the process didn't hang. I don't think it did, because they all get killed on ctrl+c all the time reliably, but I'll try to verify.
Edit:
Nah, I changed to send_sigkill and I get the same result.
I pasted the relevant part of the gdb session: https://pastebin.com/VzHF0B5T , including the list of threads and a stack trace that is mostly tokio functions, in case it's of any help.
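For comparison, here is a minimal sketch of a Drop impl that kills and reaps a child without relying on the runtime to deliver a wake-up. It is not the code from this thread: it assumes the child is a std::process::Child rather than a tokio::process::Child, and `ProcessHandle`, `kill_and_reap`, and the 5-second timeout are hypothetical names and values chosen for illustration. The idea is that a synchronous kill followed by bounded polling cannot park forever, even if nothing is left to drive the event loop.

```rust
use std::io;
use std::process::{Child, Command, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: kill `child` and reap it without any async runtime.
/// Returns `Ok(None)` if the child has not exited before `timeout` elapses.
fn kill_and_reap(child: &mut Child, timeout: Duration) -> io::Result<Option<ExitStatus>> {
    // If the child already exited, there is nothing to kill.
    if let Some(status) = child.try_wait()? {
        return Ok(Some(status));
    }
    // On Unix this sends SIGKILL.
    child.kill()?;
    let deadline = Instant::now() + timeout;
    loop {
        // Non-blocking check, so this loop never parks waiting for a wake-up.
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(10));
    }
}

/// Hypothetical stand-in for the handle type in the thread.
struct ProcessHandle {
    name: String,
    child: Option<Child>,
}

impl Drop for ProcessHandle {
    fn drop(&mut self) {
        let Some(child) = self.child.as_mut() else {
            return;
        };
        match kill_and_reap(child, Duration::from_secs(5)) {
            Ok(Some(status)) => eprintln!("{} exited: {status}", self.name),
            Ok(None) => eprintln!("{} did not exit within the timeout", self.name),
            Err(e) => eprintln!("failed to kill {}: {e:?}", self.name),
        }
    }
}

fn main() -> io::Result<()> {
    // Assumes a Unix-like system with a `sleep` binary on PATH.
    let child = Command::new("sleep").arg("30").spawn()?;
    let handle = ProcessHandle {
        name: "sleep".to_string(),
        child: Some(child),
    };
    drop(handle); // kills and reaps the child synchronously
    Ok(())
}
```

tokio::process::Child also exposes the non-async start_kill and try_wait methods, so a similar shape may be possible there, though I have not verified how they behave while the runtime is shutting down.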