sync: Remove readiness assertion in `watch::Receiver::changed()` #2839

Merged
merged 3 commits into from Sep 22, 2020

Conversation

zaharidichev
Contributor

@zaharidichev zaharidichev commented Sep 17, 2020

Motivation

After chatting with @hawkw, I decided to try to tackle #2800. They pointed me to some code that might be useful for learning how the Notify type works. I decided to play with loom and see how it works by writing a few tests, and in doing so ran into a problem with the Watch type. The current loom smoke test for Watch exercises only one receiver. Changing the test to use more than one results in a panic:

running 1 test
test sync::tests::loom_watch::smoke ... thread 'main' panicked at 'assertion failed: !res.is_ready()', tokio/src/sync/watch.rs:258:13
stack backtrace:
   0: std::panicking::begin_panic
             at /Users/zaharidichev/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/src/rust/library/std/src/panicking.rs:505
   1: tokio::sync::watch::Receiver<T>::changed::{{closure}}::{{closure}}
             at ./src/sync/watch.rs:258
   2: <tokio::future::poll_fn::PollFn<F> as core::future::future::Future>::poll
             at ./src/future/poll_fn.rs:36
   3: tokio::sync::watch::Receiver<T>::changed::{{closure}}
             at ./src/sync/watch.rs:256
   4: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /Users/zaharidichev/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/future/mod.rs:79
   5: loom::future::block_on
             at /Users/zaharidichev/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.3.5/src/future/mod.rs:34
   6: tokio::sync::tests::loom_watch::smoke::{{closure}}
             at ./src/sync/tests/loom_watch.rs:18
   7: loom::model::Builder::check::{{closure}}
             at /Users/zaharidichev/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.3.5/src/model.rs:198
   8: core::ops::function::FnOnce::call_once{{vtable.shim}}
             at /Users/zaharidichev/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/ops/function.rs:227
   9: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once
             at /Users/zaharidichev/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/src/rust/library/alloc/src/boxed.rs:1042
  10: loom::rt::scheduler::spawn_threads::{{closure}}::{{closure}}
             at /Users/zaharidichev/.cargo/registry/src/github.com-1ecc6299db9ec823/loom-0.3.5/src/rt/scheduler.rs:140
  11: generator::gen_impl::GeneratorImpl<A,T>::init_code::{{closure}}
             at /Users/zaharidichev/.cargo/registry/src/github.com-1ecc6299db9ec823/generator-0.6.21/src/gen_impl.rs:308
  12: generator::stack::StackBox<F>::call_once
             at /Users/zaharidichev/.cargo/registry/src/github.com-1ecc6299db9ec823/generator-0.6.21/src/stack/mod.rs:135
  13: generator::stack::Func::call_once
             at /Users/zaharidichev/.cargo/registry/src/github.com-1ecc6299db9ec823/generator-0.6.21/src/stack/mod.rs:117
  14: generator::gen_impl::gen_init::{{closure}}
             at /Users/zaharidichev/.cargo/registry/src/github.com-1ecc6299db9ec823/generator-0.6.21/src/gen_impl.rs:513
  15: core::ops::function::FnOnce::call_once
             at /Users/zaharidichev/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/ops/function.rs:227
  16: std::panicking::try::do_call
             at /Users/zaharidichev/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/src/rust/library/std/src/panicking.rs:381
  17: ___rust_try
  18: std::panicking::try
             at /Users/zaharidichev/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/src/rust/library/std/src/panicking.rs:345
  19: std::panic::catch_unwind
             at /Users/zaharidichev/.rustup/toolchains/nightly-x86_64-apple-darwin/lib/rustlib/src/rust/library/std/src/panic.rs:382
  20: generator::gen_impl::gen_init
             at /Users/zaharidichev/.cargo/registry/src/github.com-1ecc6299db9ec823/generator-0.6.21/src/gen_impl.rs:527
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
FAILED

failures:

failures:
    sync::tests::loom_watch::smoke

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 69 filtered out

error: test failed, to rerun pass '--lib'

This happens because watch::Receiver::changed polls Notified once up front to ensure that the waiter is registered, and asserts that this first poll returns Pending. The problem is that when more than one Notified instance is tied to one Notify, a Notified can be dropped without receiving its notification. If that happens, the notification is left for another Notified instance to consume. This means it is not safe to assume that calling Notified::poll() for the first time will always return Pending, even if we never call notify_one.
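
To make the sequence above concrete, here is a minimal single-threaded sketch. It is a toy model and not tokio's implementation: `ToyNotify`, `Waiter`, `first_poll`, `drop_waiter` and the stored `permit` flag are all hypothetical names chosen for illustration. It only assumes what is described above, namely that a notified waiter which is dropped before consuming its notification hands it on, so a later first poll can come back ready.

```rust
use std::collections::VecDeque;

/// One registered waiter in the toy model (hypothetical, for illustration).
struct Waiter {
    id: u32,
    notified: bool,
}

/// Toy, single-threaded stand-in for `Notify`; not tokio's implementation.
#[derive(Default)]
struct ToyNotify {
    /// A stored notification that the next first poll consumes immediately.
    permit: bool,
    /// Waiters that have registered and not yet been dropped.
    waiters: VecDeque<Waiter>,
}

impl ToyNotify {
    /// First poll of a waiter: `true` models `Poll::Ready`, `false` models `Poll::Pending`.
    fn first_poll(&mut self, id: u32) -> bool {
        if self.permit {
            self.permit = false;
            return true;
        }
        self.waiters.push_back(Waiter { id, notified: false });
        false
    }

    /// Models `notify_waiters`: mark every registered waiter as notified.
    fn notify_waiters(&mut self) {
        for waiter in self.waiters.iter_mut() {
            waiter.notified = true;
        }
    }

    /// Models dropping a waiter under the behavior described above: an
    /// unconsumed notification is passed on instead of being discarded.
    fn drop_waiter(&mut self, id: u32) {
        if let Some(pos) = self.waiters.iter().position(|w| w.id == id) {
            if let Some(dropped) = self.waiters.remove(pos) {
                if dropped.notified {
                    if let Some(next) = self.waiters.iter_mut().find(|w| !w.notified) {
                        next.notified = true;
                    } else {
                        // Nobody is waiting, so the notification is stored.
                        self.permit = true;
                    }
                }
            }
        }
    }
}

/// Replays the scenario with two receivers and returns the result of the
/// second receiver's first poll.
fn orphaned_notification_makes_first_poll_ready() -> bool {
    let mut notify = ToyNotify::default();
    assert!(!notify.first_poll(1)); // receiver 1 registers and is pending
    notify.notify_waiters(); // the sender publishes a value
    notify.drop_waiter(1); // receiver 1 goes away without consuming it
    notify.first_poll(2) // receiver 2 polls for the first time
}
```

In this model the second receiver's first poll is ready even though `notify_one` was never called, which is the situation the assertion did not account for.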

Solution

We handle the case where polling the Notified future returns Ready right away.
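
As a rough illustration of "poll once and branch on the result", the sketch below uses only the standard library. `poll_once` and `NoopWake` are hypothetical helpers written for this example; the actual change in this PR goes through tokio's internal `poll_fn` instead.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for a single manual poll.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` exactly once and reports whether it was already complete.
/// `true` corresponds to the "notification was already there" case,
/// `false` to the "waiter is now registered" case.
fn poll_once<F>(fut: &mut F) -> bool
where
    F: Future<Output = ()> + Unpin,
{
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    match Pin::new(fut).poll(&mut cx) {
        Poll::Ready(()) => true,
        Poll::Pending => false,
    }
}
```

A caller would return early when `poll_once` yields `true` and otherwise go on to await the same future, rather than asserting that the first poll is always pending.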

Signed-off-by: Zahari Dichev zaharidichev@gmail.com

In `watch::Receiver::changed`, `Notified` was polled
for the first time to ensure the waiter is registered, while
assuming that the first poll will always return `Pending`.
However, if another instance of `Notified` is dropped
without receiving its notification, this "orphaned"
notification can be used to satisfy another waiter without
even registering it. This commit accounts for that scenario.

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
@Darksonn Darksonn requested a review from hawkw September 17, 2020 12:55
@Darksonn Darksonn added A-tokio Area: The main tokio crate C-enhancement Category: A PR with an enhancement or bugfix. M-sync Module: tokio/sync labels Sep 17, 2020
Member

@hawkw hawkw left a comment

This looks right to me, thanks for the fix! @carllerche, anything I'm overlooking?

I had a couple minor nits, but the change seems right.

Comment on lines 254 to 259
// Polling the future here has dual purpose. The first one is to register
// the waiter so when `notify_waiters` is called it is notified. The second
// is to cover the case where another instance of `Notiified` has been dropped
// without receiving its notification. If that was the case polling the
// future for the first time will use this "lost" notification and return
// `Ready` immediatelly without registering any waiter

a couple typos and grammar nits:

Suggested change
- // Polling the future here has dual purpose. The first one is to register
- // the waiter so when `notify_waiters` is called it is notified. The second
- // is to cover the case where another instance of `Notiified` has been dropped
- // without receiving its notification. If that was the case polling the
- // future for the first time will use this "lost" notification and return
- // `Ready` immediatelly without registering any waiter
+ // Polling the future here has a dual purpose. The first one is to register
+ // the waiter so that it is notified when `notify_waiters` is called. The second
+ // is to cover the case where another instance of `Notified` has been dropped
+ // without receiving its notification. If this has happened, polling the future
+ // for the first time will use this "lost" notification and return `Ready`
+ // immediately without registering any waiter.

Comment on lines 260 to 267
let aquired_lost_notification =
    crate::future::poll_fn(|cx| match Pin::new(&mut notified).poll(cx) {
        Poll::Ready(()) => Poll::Ready(true),
        Poll::Pending => Poll::Ready(false),
    })
    .await;

if aquired_lost_notification {

nit, take it or leave it: "acquired_lost_notification" is kind of a mouthful, what about something like

Suggested change
- let aquired_lost_notification =
-     crate::future::poll_fn(|cx| match Pin::new(&mut notified).poll(cx) {
-         Poll::Ready(()) => Poll::Ready(true),
-         Poll::Pending => Poll::Ready(false),
-     })
-     .await;
- if aquired_lost_notification {
+ let already_notified =
+     crate::future::poll_fn(|cx| match Pin::new(&mut notified).poll(cx) {
+         Poll::Ready(()) => Poll::Ready(true),
+         Poll::Pending => Poll::Ready(false),
+     })
+     .await;
+ if already_notified {

(also, "acquired" is spelled wrong in the code, so we should fix that even if we don't change the name :) )

@carllerche
Member

Ack, you are correct. There should be more tests 🙃 thanks for catching this.

I don't think this will entirely solve the issue as we can concurrently send many updates and reproduce the problem.

As I see it, we have two options:

a) Update Notify to not resend the notification when Notified is dropped if the notification is from a notify_waiters call. In general, this would be a better behavior.

b) Implement a loop in changed() that awaits on notified() and checks if the version increased.
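
For option (b), the idea of re-checking a version after every wakeup can be sketched with a blocking analogue built on `Mutex` and `Condvar`. This is not the async code in `watch`; `Shared`, `publish`, `wait_for_change` and `observe_one_update` are hypothetical names used only to show the shape of the loop.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical blocking stand-in for the watch channel's shared state.
struct Shared {
    version: Mutex<u64>,
    changed: Condvar,
}

impl Shared {
    /// Sender side: bump the version, then wake every waiter.
    fn publish(&self) {
        let mut version = self.version.lock().unwrap();
        *version += 1;
        self.changed.notify_all();
    }

    /// Receiver side: loop until the version differs from the one already
    /// seen, so a wakeup that carries no new value is simply ignored.
    fn wait_for_change(&self, seen: u64) -> u64 {
        let mut version = self.version.lock().unwrap();
        while *version == seen {
            version = self.changed.wait(version).unwrap();
        }
        *version
    }
}

/// Publishes once from another thread and returns the version the
/// receiver observes.
fn observe_one_update() -> u64 {
    let shared = Arc::new(Shared {
        version: Mutex::new(0),
        changed: Condvar::new(),
    });
    let sender = Arc::clone(&shared);
    let handle = thread::spawn(move || sender.publish());
    let observed = shared.wait_for_change(0);
    handle.join().unwrap();
    observed
}
```

The point of the loop is that a stray or leftover wakeup cannot be mistaken for a new value, because the receiver only returns once the version has actually moved.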

@zaharidichev
Contributor Author

@carllerche

> a) Update Notify to not resend the notification when Notified is dropped if the notification is from a notify_waiters call. In general, this would be a better behavior.

That indeed makes the most sense to me as well. Just out of curiosity, how can I trigger the problem after this change by sending concurrent updates? Could a loom test exercise that part?

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
@zaharidichev
Contributor Author

@hawkw I incorporated the suggestion to avoid resending notifications if they were triggered by notify_waiters.

Member

@carllerche carllerche left a comment

Looks good to me 👍 thanks. I'm not sure why CI is failing though. I will try restarting it.

Member

@hawkw hawkw left a comment

this looks right to me — i had a couple small style nits & typo corrections, but no issues with the implementation now that carl's comments have been addressed!

Comment on lines 110 to 113
// Notification trigerred by calling `notify_waiters`
AllWaiters,
// Notification trigerred by calling `notify_one`
OneWaiter,

typos:

Suggested change
- // Notification trigerred by calling `notify_waiters`
- AllWaiters,
- // Notification trigerred by calling `notify_one`
- OneWaiter,
+ // Notification triggered by calling `notify_waiters`
+ AllWaiters,
+ // Notification triggered by calling `notify_one`
+ OneWaiter,

Comment on lines 592 to 594
  // See if the node was notified but not received. In this case, the
- // notification must be sent to another waiter.
+ // notification must be sent to another waiter, only if it was
+ // triggered via `notify_one`

I might rephrase this to something like

            // See if the node was notified but not received. In this case, if
            // the notification was triggered via `notify_one`, it must be sent
            // to the next waiter.
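
The rule in that comment can be written down as a small decision function. The enum and its variants are the ones shown in this PR; `forward_on_drop` and the `Option` wrapper around the notification are assumptions made for this sketch, not the actual drop code in `Notify`.

```rust
/// Mirrors the enum added in this PR.
#[derive(Debug, Clone, Copy, PartialEq)]
enum NotificationType {
    // Notification triggered by calling `notify_waiters`
    AllWaiters,
    // Notification triggered by calling `notify_one`
    OneWaiter,
}

/// Hypothetical helper: decides whether a waiter that is dropped without
/// having consumed its notification must pass it to the next waiter.
/// `None` means the waiter was never notified.
fn forward_on_drop(notification: Option<NotificationType>) -> bool {
    // Only a `notify_one` notification is owed to exactly one waiter, so
    // only that kind needs to be handed on.
    matches!(notification, Some(NotificationType::OneWaiter))
}
```

Under this rule a notification from `notify_waiters` dies with the waiter that was dropped, so it can no longer show up as a spurious `Ready` on some other receiver's first poll.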

Comment on lines 107 to 109
}
#[derive(Debug, Clone, Copy)]
enum NotificationType {

nit: i'd add a newline here (surprised that rustfmt doesn't do that?)

Suggested change
- }
- #[derive(Debug, Clone, Copy)]
- enum NotificationType {
+ }
+
+ #[derive(Debug, Clone, Copy)]
+ enum NotificationType {

Signed-off-by: Zahari Dichev <zaharidichev@gmail.com>
@carllerche carllerche merged commit e7091fd into tokio-rs:master Sep 22, 2020

4 participants