Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deadlock with tokio 1.7.0 #546

Closed
drmingdrmer opened this issue Jun 18, 2021 · 10 comments
Closed

Deadlock with tokio 1.7.0 #546

drmingdrmer opened this issue Jun 18, 2021 · 10 comments

Comments

@drmingdrmer
Copy link

With the latest tokio 1.7.0 there is a deadlock when tokio runtime is shutting down.

lib versions:

h2 = "0.3.3"
hyper = "0.14.9"
tokio = "1.7.0"

The problem

#[tokio::test(flavor = "multi_thread", worker_threads = 1)]
async fn test_meta_cluster_write_on_non_leader() -> anyhow::Result<()> {
    // - Start a server
    // - spawn a task sending RPC to the server...
    Ok(())
}

The problem code snippet is a unit test that brings up a grpc server and a client that keeps sending RPC to the server in another tokio task.

When the test quits(and tokio runtime is shutting down), the task that keeps sending RPC is still running.
Then there is a deadlock that hangs the world and never quits.

The same codes work fine with tokio 1.6.0; Since in 1.7.0 a new feature is added: tokio-rs/tokio#3752
which I believe causes this problem.

The detail

  • The deadlock happens when the tokio runtime is shutting down and trying to drop a stream: in src/proto/streams/streams.rs, it acquired the lock of the stream to do some cleanup jobs.

  • Then while holding the lock me, maybe_cancel() tries to wake up the task this stream belongs to.

  • Because tokio runtime is closed thus another round of dropping happens.
    Finally in src/proto/connection.rs, it tried again to acquire the same lock to release resource. deadlock.

All these happens in one thread with tokio 1.7.0 .

impl Drop for OpaqueStreamRef {
    fn drop(&mut self) {
        drop_stream_ref(&self.inner, self.key);
    }
}

fn drop_stream_ref(inner: &Mutex<Inner>, key: store::Key) {

    let mut me = match inner.lock() {
        Ok(inner) => inner,
        // ...
    };

    // ...

    me.counts.transition(stream, |counts, stream| {
        maybe_cancel(stream, actions, counts);
        // ...
    });
}
impl<T, P, B> Drop for Connection<T, P, B>
// ...
{
    fn drop(&mut self) {

        {
            // Check if lock is held
            let _v = self.inner.streams.inner.try_lock();
            if _v.is_err() {
                let bt = backtrace::Backtrace::new();
                tracing::debug!("--- bt: {:?}", bt);
            }
        }

        // BUG: recv_eof requires lock on self.inner.streams.inner
        let _ = self.inner.streams.recv_eof(true);
    }
}

Stack summary when deadlock:

   4: <h2::proto::connection::Connection<T,P,B> as core::ops::drop::Drop>::drop
             at /Users/drdrxp/xp/vcs/h2/src/proto/connection.rs:574:30

     ...

  52: h2::proto::streams::streams::drop_stream_ref
             at /Users/drdrxp/xp/vcs/h2/src/proto/streams/streams.rs:1423:5
  53: <h2::proto::streams::streams::OpaqueStreamRef as core::ops::drop::Drop>::drop
             at /Users/drdrxp/xp/vcs/h2/src/proto/streams/streams.rs:1380:9

The entire backtrace(first lock acquire is at frame 53):

   0: backtrace::backtrace::libunwind::trace
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.60/src/backtrace/libunwind.rs:90:5
      backtrace::backtrace::trace_unsynchronized
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.60/src/backtrace/mod.rs:66:5
   1: backtrace::backtrace::trace
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.60/src/backtrace/mod.rs:53:14
   2: backtrace::capture::Backtrace::create
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.60/src/capture.rs:176:9
   3: backtrace::capture::Backtrace::new
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.60/src/capture.rs:140:22
   4: <h2::proto::connection::Connection<T,P,B> as core::ops::drop::Drop>::drop
             at /Users/drdrxp/xp/vcs/h2/src/proto/connection.rs:574:30
   5: core::ptr::drop_in_place<h2::proto::connection::Connection<tonic::transport::service::io::BoxedIo,h2::client::Peer,hyper::proto::h2::SendBuf<bytes::bytes::Bytes>>>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
   6: core::ptr::drop_in_place<h2::client::Connection<tonic::transport::service::io::BoxedIo,hyper::proto::h2::SendBuf<bytes::bytes::Bytes>>>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
   7: core::ptr::drop_in_place<futures_util::future::either::Either<futures_util::future::poll_fn::PollFn<hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,h2::client::Connection<tonic::transport::service::io::BoxedIo,hyper::proto::h2::SendBuf<bytes::bytes::Bytes>>>>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
   8: core::ptr::drop_in_place<futures_util::future::try_future::into_future::IntoFuture<futures_util::future::either::Either<futures_util::future::poll_fn::PollFn<hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,h2::client::Connection<tonic::transport::service::io::BoxedIo,hyper::proto::h2::SendBuf<bytes::bytes::Bytes>>>>>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
   9: core::ptr::drop_in_place<futures_util::future::future::map::Map<futures_util::future::try_future::into_future::IntoFuture<futures_util::future::either::Either<futures_util::future::poll_fn::PollFn<hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,h2::client::Connection<tonic::transport::service::io::BoxedIo,hyper::proto::h2::SendBuf<bytes::bytes::Bytes>>>>,futures_util::fns::MapErrFn<hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>>>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
  10: core::ptr::drop_in_place<futures_util::future::future::Map<futures_util::future::try_future::into_future::IntoFuture<futures_util::future::either::Either<futures_util::future::poll_fn::PollFn<hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,h2::client::Connection<tonic::transport::service::io::BoxedIo,hyper::proto::h2::SendBuf<bytes::bytes::Bytes>>>>,futures_util::fns::MapErrFn<hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>>>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
  11: core::ptr::drop_in_place<futures_util::future::try_future::MapErr<futures_util::future::either::Either<futures_util::future::poll_fn::PollFn<hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,h2::client::Connection<tonic::transport::service::io::BoxedIo,hyper::proto::h2::SendBuf<bytes::bytes::Bytes>>>,hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
  12: core::ptr::drop_in_place<(futures_util::future::try_future::MapErr<futures_util::future::either::Either<futures_util::future::poll_fn::PollFn<hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,h2::client::Connection<tonic::transport::service::io::BoxedIo,hyper::proto::h2::SendBuf<bytes::bytes::Bytes>>>,hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,futures_util::future::future::Map<futures_util::stream::stream::into_future::StreamFuture<futures_channel::mpsc::Receiver<hyper::common::never::Never>>,hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>)>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
  13: core::ptr::drop_in_place<core::option::Option<(futures_util::future::try_future::MapErr<futures_util::future::either::Either<futures_util::future::poll_fn::PollFn<hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,h2::client::Connection<tonic::transport::service::io::BoxedIo,hyper::proto::h2::SendBuf<bytes::bytes::Bytes>>>,hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,futures_util::future::future::Map<futures_util::stream::stream::into_future::StreamFuture<futures_channel::mpsc::Receiver<hyper::common::never::Never>>,hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>)>>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
  14: core::ptr::drop_in_place<futures_util::future::select::Select<futures_util::future::try_future::MapErr<futures_util::future::either::Either<futures_util::future::poll_fn::PollFn<hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,h2::client::Connection<tonic::transport::service::io::BoxedIo,hyper::proto::h2::SendBuf<bytes::bytes::Bytes>>>,hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,futures_util::future::future::Map<futures_util::stream::stream::into_future::StreamFuture<futures_channel::mpsc::Receiver<hyper::common::never::Never>>,hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>>>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
  15: core::ptr::drop_in_place<hyper::proto::h2::client::conn_task<futures_util::future::try_future::MapErr<futures_util::future::either::Either<futures_util::future::poll_fn::PollFn<hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,h2::client::Connection<tonic::transport::service::io::BoxedIo,hyper::proto::h2::SendBuf<bytes::bytes::Bytes>>>,hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,futures_util::future::future::Map<futures_util::stream::stream::into_future::StreamFuture<futures_channel::mpsc::Receiver<hyper::common::never::Never>>,hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>>::{{closure}}>
             at /Users/drdrxp/xp/vcs/hyper/src/proto/h2/client.rs:176:11
  16: core::ptr::drop_in_place<core::future::from_generator::GenFuture<hyper::proto::h2::client::conn_task<futures_util::future::try_future::MapErr<futures_util::future::either::Either<futures_util::future::poll_fn::PollFn<hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,h2::client::Connection<tonic::transport::service::io::BoxedIo,hyper::proto::h2::SendBuf<bytes::bytes::Bytes>>>,hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,futures_util::future::future::Map<futures_util::stream::stream::into_future::StreamFuture<futures_channel::mpsc::Receiver<hyper::common::never::Never>>,hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>>::{{closure}}>>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
  17: core::ptr::drop_in_place<tokio::runtime::task::core::Stage<core::future::from_generator::GenFuture<hyper::proto::h2::client::conn_task<futures_util::future::try_future::MapErr<futures_util::future::either::Either<futures_util::future::poll_fn::PollFn<hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,h2::client::Connection<tonic::transport::service::io::BoxedIo,hyper::proto::h2::SendBuf<bytes::bytes::Bytes>>>,hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>,futures_util::future::future::Map<futures_util::stream::stream::into_future::StreamFuture<futures_channel::mpsc::Receiver<hyper::common::never::Never>>,hyper::proto::h2::client::handshake<tonic::transport::service::io::BoxedIo,tonic::body::BoxBody>::{{closure}}::{{closure}}::{{closure}}>>::{{closure}}>>>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
  18: tokio::runtime::task::core::CoreStage<T>::set_stage::{{closure}}
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/core.rs:296:35
  19: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/loom/std/unsafe_cell.rs:14:9
  20: tokio::runtime::task::core::CoreStage<T>::set_stage
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/core.rs:296:9
  21: tokio::runtime::task::core::CoreStage<T>::drop_future_or_output
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/core.rs:262:13
  22: tokio::runtime::task::harness::cancel_task::{{closure}}
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/harness.rs:392:9
  23: core::ops::function::FnOnce::call_once
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ops/function.rs:227:5
  24: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panic.rs:347:9
  25: std::panicking::try::do_call
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panicking.rs:401:40
  26: <unknown>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panicking.rs:429:6
  27: std::panicking::try
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panicking.rs:365:19
  28: std::panic::catch_unwind
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panic.rs:434:14
  29: tokio::runtime::task::harness::cancel_task
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/harness.rs:391:15
  30: tokio::runtime::task::harness::Harness<T,S>::shutdown
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/harness.rs:168:19
  31: tokio::runtime::task::raw::shutdown
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/raw.rs:130:5
  32: tokio::runtime::task::raw::RawTask::shutdown
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/raw.rs:90:18
  33: tokio::runtime::task::Task<S>::shutdown
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/mod.rs:164:9
  34: tokio::runtime::task::Notified<S>::shutdown
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/mod.rs:177:9
  35: tokio::runtime::queue::Inject<T>::push
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/queue.rs:521:13
  36: tokio::runtime::thread_pool::worker::Shared::schedule::{{closure}}
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/thread_pool/worker.rs:735:13
  37: tokio::macros::scoped_tls::ScopedKey<T>::with
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/macros/scoped_tls.rs:74:22
  38: tokio::runtime::thread_pool::worker::Shared::schedule
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/thread_pool/worker.rs:722:9
  39: tokio::runtime::thread_pool::worker::<impl tokio::runtime::task::Schedule for alloc::sync::Arc<tokio::runtime::thread_pool::worker::Worker>>::schedule
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/thread_pool/worker.rs:712:9
  40: tokio::runtime::task::core::Scheduler<S>::schedule::{{closure}}
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/core.rs:172:36
  41: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/loom/std/unsafe_cell.rs:10:9
  42: tokio::runtime::task::core::Scheduler<S>::schedule
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/core.rs:168:9
  43: tokio::runtime::task::harness::Harness<T,S>::wake_by_ref
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/harness.rs:139:13
  44: tokio::runtime::task::harness::Harness<T,S>::wake_by_val
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/harness.rs:133:9
  45: tokio::runtime::task::waker::wake_by_val
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/waker.rs:102:5
  46: core::task::wake::Waker::wake
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/task/wake.rs:217:18
  47: h2::proto::streams::prioritize::Prioritize::schedule_send
             at /Users/drdrxp/xp/vcs/h2/src/proto/streams/prioritize.rs:124:17
  48: h2::proto::streams::send::Send::schedule_implicit_reset
             at /Users/drdrxp/xp/vcs/h2/src/proto/streams/send.rs:242:9
  49: h2::proto::streams::streams::maybe_cancel
             at /Users/drdrxp/xp/vcs/h2/src/proto/streams/streams.rs:1446:9
  50: h2::proto::streams::streams::drop_stream_ref::{{closure}}
             at /Users/drdrxp/xp/vcs/h2/src/proto/streams/streams.rs:1424:9
  51: h2::proto::streams::counts::Counts::transition
             at /Users/drdrxp/xp/vcs/h2/src/proto/streams/counts.rs:127:19
  52: h2::proto::streams::streams::drop_stream_ref
             at /Users/drdrxp/xp/vcs/h2/src/proto/streams/streams.rs:1423:5
  53: <h2::proto::streams::streams::OpaqueStreamRef as core::ops::drop::Drop>::drop
             at /Users/drdrxp/xp/vcs/h2/src/proto/streams/streams.rs:1380:9
  54: core::ptr::drop_in_place<h2::proto::streams::streams::OpaqueStreamRef>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
  55: core::ptr::drop_in_place<h2::client::ResponseFuture>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
  56: core::ptr::drop_in_place<futures_util::future::future::map::Map<h2::client::ResponseFuture,<hyper::proto::h2::client::ClientTask<tonic::body::BoxBody> as core::future::future::Future>::poll::{{closure}}>>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
  57: core::ptr::drop_in_place<futures_util::future::future::Map<h2::client::ResponseFuture,<hyper::proto::h2::client::ClientTask<tonic::body::BoxBody> as core::future::future::Future>::poll::{{closure}}>>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
  58: core::ptr::drop_in_place<hyper::client::dispatch::Callback<http::request::Request<tonic::body::BoxBody>,http::response::Response<hyper::body::body::Body>>::send_when<futures_util::future::future::Map<h2::client::ResponseFuture,<hyper::proto::h2::client::ClientTask<tonic::body::BoxBody> as core::future::future::Future>::poll::{{closure}}>>::{{closure}}::{{closure}}>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
  59: core::ptr::drop_in_place<futures_util::future::poll_fn::PollFn<hyper::client::dispatch::Callback<http::request::Request<tonic::body::BoxBody>,http::response::Response<hyper::body::body::Body>>::send_when<futures_util::future::future::Map<h2::client::ResponseFuture,<hyper::proto::h2::client::ClientTask<tonic::body::BoxBody> as core::future::future::Future>::poll::{{closure}}>>::{{closure}}::{{closure}}>>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
  60: core::ptr::drop_in_place<hyper::client::dispatch::Callback<http::request::Request<tonic::body::BoxBody>,http::response::Response<hyper::body::body::Body>>::send_when<futures_util::future::future::Map<h2::client::ResponseFuture,<hyper::proto::h2::client::ClientTask<tonic::body::BoxBody> as core::future::future::Future>::poll::{{closure}}>>::{{closure}}>
             at /Users/drdrxp/xp/vcs/hyper/src/client/dispatch.rs:242:9
  61: core::ptr::drop_in_place<core::future::from_generator::GenFuture<hyper::client::dispatch::Callback<http::request::Request<tonic::body::BoxBody>,http::response::Response<hyper::body::body::Body>>::send_when<futures_util::future::future::Map<h2::client::ResponseFuture,<hyper::proto::h2::client::ClientTask<tonic::body::BoxBody> as core::future::future::Future>::poll::{{closure}}>>::{{closure}}>>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
  62: core::ptr::drop_in_place<tokio::runtime::task::core::Stage<core::future::from_generator::GenFuture<hyper::client::dispatch::Callback<http::request::Request<tonic::body::BoxBody>,http::response::Response<hyper::body::body::Body>>::send_when<futures_util::future::future::Map<h2::client::ResponseFuture,<hyper::proto::h2::client::ClientTask<tonic::body::BoxBody> as core::future::future::Future>::poll::{{closure}}>>::{{closure}}>>>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ptr/mod.rs:192:1
  63: tokio::runtime::task::core::CoreStage<T>::set_stage::{{closure}}
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/core.rs:296:35
  64: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/loom/std/unsafe_cell.rs:14:9
  65: tokio::runtime::task::core::CoreStage<T>::set_stage
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/core.rs:296:9
  66: tokio::runtime::task::core::CoreStage<T>::drop_future_or_output
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/core.rs:262:13
  67: tokio::runtime::task::harness::cancel_task::{{closure}}
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/harness.rs:392:9
  68: core::ops::function::FnOnce::call_once
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ops/function.rs:227:5
  69: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panic.rs:347:9
  70: std::panicking::try::do_call
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panicking.rs:401:40
  71: <unknown>
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panicking.rs:429:6
  72: std::panicking::try
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panicking.rs:365:19
  73: std::panic::catch_unwind
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panic.rs:434:14
  74: tokio::runtime::task::harness::cancel_task
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/harness.rs:391:15
  75: tokio::runtime::task::harness::Harness<T,S>::shutdown
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/harness.rs:168:19
  76: tokio::runtime::task::raw::shutdown
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/raw.rs:130:5
  77: tokio::runtime::task::raw::RawTask::shutdown
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/raw.rs:90:18
  78: tokio::runtime::task::core::Header::shutdown
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/core.rs:306:13
  79: tokio::runtime::thread_pool::worker::Core::pre_shutdown
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/thread_pool/worker.rs:557:13
  80: tokio::runtime::thread_pool::worker::Context::run
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/thread_pool/worker.rs:332:9
  81: tokio::runtime::thread_pool::worker::run::{{closure}}
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/thread_pool/worker.rs:303:17
  82: tokio::macros::scoped_tls::ScopedKey<T>::set
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/macros/scoped_tls.rs:61:9
  83: tokio::runtime::thread_pool::worker::run
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/thread_pool/worker.rs:300:5
  84: tokio::runtime::thread_pool::worker::Launch::launch::{{closure}}
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/thread_pool/worker.rs:279:45
  85: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/blocking/task.rs:42:21
  86: tokio::runtime::task::core::CoreStage<T>::poll::{{closure}}
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/core.rs:243:17
  87: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/loom/std/unsafe_cell.rs:14:9
  88: tokio::runtime::task::core::CoreStage<T>::poll
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/core.rs:233:13
  89: tokio::runtime::task::harness::poll_future::{{closure}}
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/harness.rs:427:23
  90: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panic.rs:347:9
  91: std::panicking::try::do_call
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panicking.rs:401:40
  92: <unknown>
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/harness.rs:422:18
  93: std::panicking::try
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panicking.rs:365:19
  94: std::panic::catch_unwind
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panic.rs:434:14
  95: tokio::runtime::task::harness::poll_future
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/harness.rs:414:19
  96: tokio::runtime::task::harness::Harness<T,S>::poll_inner
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/harness.rs:89:9
  97: tokio::runtime::task::harness::Harness<T,S>::poll
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/harness.rs:59:15
  98: tokio::runtime::task::raw::poll
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/raw.rs:104:5
  99: tokio::runtime::task::raw::RawTask::poll
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/raw.rs:66:18
 100: tokio::runtime::task::Notified<S>::run
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/mod.rs:171:9
 101: tokio::runtime::blocking::pool::Inner::run
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/blocking/pool.rs:265:17
 102: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/blocking/pool.rs:245:17
 103: std::sys_common::backtrace::__rust_begin_short_backtrace
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/sys_common/backtrace.rs:125:18
 104: std::thread::Builder::spawn_unchecked::{{closure}}::{{closure}}
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/thread/mod.rs:481:17
 105: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panic.rs:347:9
 106: std::panicking::try::do_call
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panicking.rs:401:40
 107: <unknown>
             at /Users/drdrxp/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.7.0/src/runtime/task/harness.rs:422:18
 108: std::panicking::try
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panicking.rs:365:19
 109: std::panic::catch_unwind
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/panic.rs:434:14
 110: std::thread::Builder::spawn_unchecked::{{closure}}
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/thread/mod.rs:480:30
 111: core::ops::function::FnOnce::call_once{{vtable.shim}}
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/core/src/ops/function.rs:227:5
 112: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/alloc/src/boxed.rs:1575:9
      <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/alloc/src/boxed.rs:1575:9
      std::sys::unix::thread::Thread::new::thread_start
             at /rustc/657bc01888e6297257655585f9c475a0801db6d2/library/std/src/sys/unix/thread.rs:71:17
 113: __pthread_body
 114: __pthread_start
@Darksonn
Copy link

Darksonn commented Jun 18, 2021

Thanks for the report. It seems like this is exposing both a problem in Tokio and a problem in h2.

  • Tokio shouldn't be calling shutdown here for tasks that are already on the runtime.
  • H2 should not be waking wakers while keeping a mutex locked. The waker can run arbitrary user code, and the waker could come from anywhere. Even if you're using Tokio, sub-schedulers such as FuturesUnordered can introduce their own wakers, and they might hold a ref-count on the future its going to wake. If the waker getting waked consumes the last ref-count, this would cause that future to get dropped, which is exactly what happened here.

@carllerche
Copy link
Collaborator

carllerche commented Jun 18, 2021

Alice is currently working to try create a repro for the Tokio side of the bug.

@Darksonn
Copy link

@drmingdrmer Are you able to post a complete code example that runs into the deadlock?

@carllerche
Copy link
Collaborator

It looks like we have a repro

@carllerche
Copy link
Collaborator

A fix has been posted as a PR: tokio-rs/tokio#3870 are you able to verify that it fixes your deadlock?

@BohuTANG
Copy link

@carllerche

Confirmed that this issue is fixed, thanks.

But we have fixed it by this:
https://github.com/datafuselabs/datafuse/pull/841/files

@drmingdrmer
Copy link
Author

@drmingdrmer Are you able to post a complete code example that runs into the deadlock?

https://github.com/datafuselabs/datafuse/pull/839/checks?check_run_id=2836068821

This is our CI that has this problem. Clone the repo and check out b7e6bb13772104cb2544bb200f0ce1b95f247932 and cargo test shows up this deadlock.

@nox
Copy link
Contributor

nox commented Oct 21, 2021

So is there anything more to do here?

@olix0r
Copy link
Collaborator

olix0r commented Oct 21, 2021

@nox I may have missed it, but as far as I know the second point in #546 (comment) has not yet been addressed in h2.

@drmingdrmer
Copy link
Author

@nox
Thanks for reminding!

tokio fixed this issue in: tokio-rs/tokio#3870

Let's close it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants