sync: fix `notify_waiters` notifying sequential awaits #5404
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -198,10 +198,16 @@ type WaitList = LinkedList<Waiter, <Waiter as linked_list::Link>::Target>; | |
/// [`Semaphore`]: crate::sync::Semaphore | ||
#[derive(Debug)] | ||
pub struct Notify { | ||
// This uses 2 bits to store one of `EMPTY`, | ||
// `state` uses 2 bits to store one of `EMPTY`, | ||
// `WAITING` or `NOTIFIED`. The rest of the bits | ||
// are used to store the number of times `notify_waiters` | ||
// was called. | ||
// | ||
// Throughout the code there are two assumptions: | ||
// - state can be transitioned *from* `WAITING` only if | ||
// `waiters` lock is held | ||
// - number of times `notify_waiters` was called can | ||
// be modified only if `waiters` lock is held | ||
state: AtomicUsize, | ||
waiters: Mutex<WaitList>, | ||
} | ||
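The comment above describes `state` as a packed word: two low bits for `EMPTY`, `WAITING` or `NOTIFIED`, and the remaining bits for the `notify_waiters` call counter. The following is a minimal sketch of such a packing, not the code from this PR. The helper names are the ones used elsewhere in the diff, but their bodies, the numeric constant values, and the `STATE_MASK` name are assumptions made for illustration.

```rust
// Sketch only: these values are assumptions consistent with the comment
// ("2 bits for the state, the rest for the call counter").
const NOTIFY_WAITERS_SHIFT: usize = 2;
const STATE_MASK: usize = (1 << NOTIFY_WAITERS_SHIFT) - 1; // hypothetical name

const EMPTY: usize = 0;
const WAITING: usize = 1;
const NOTIFIED: usize = 2;

/// Replaces the low two bits of `data` with `state`, keeping the counter bits.
fn set_state(data: usize, state: usize) -> usize {
    (data & !STATE_MASK) | (state & STATE_MASK)
}

/// Extracts the low two bits (the state).
fn get_state(data: usize) -> usize {
    data & STATE_MASK
}

/// Extracts the `notify_waiters` call counter stored in the high bits.
fn get_num_notify_waiters_calls(data: usize) -> usize {
    (data & !STATE_MASK) >> NOTIFY_WAITERS_SHIFT
}

/// Adds one to the counter, wrapping on overflow. Adding a multiple of 4
/// never touches the two state bits.
fn inc_num_notify_waiters_calls(data: usize) -> usize {
    data.wrapping_add(1 << NOTIFY_WAITERS_SHIFT)
}
```

With this layout, `set_state(inc_num_notify_waiters_calls(curr), EMPTY)` bumps the counter and resets the state in one value, which can then be written with a single `store`.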
|
@@ -222,8 +228,13 @@ struct Waiter { | |
/// Waiting task's waker. | ||
waker: Option<Waker>, | ||
|
||
/// `true` if the notification has been assigned to this waiter. | ||
notified: Option<NotificationType>, | ||
/// Pointer to a containing queue decoupled in `notify_waiters`. | ||
/// This field is `None` if the waiter is stored in the waitlist | ||
/// `Notify::waiters` list, and `Some` if the waiter is stored | ||
/// in some other list owned by a `notify_waiters` call. | ||
notify_waiters_queue: Option<NonNull<WaitList>>, | ||
As far as I can tell, the invariant of this field is that it is
True, I've added a note about this. |
||
|
||
notification: Option<NotificationType>, | ||
|
||
/// Should not be `Unpin`. | ||
_p: PhantomPinned, | ||
|
@@ -237,6 +248,9 @@ generate_addr_of_methods! { | |
} | ||
} | ||
|
||
unsafe impl Send for Waiter {} | ||
unsafe impl Sync for Waiter {} | ||
|
||
/// Future returned from [`Notify::notified()`]. | ||
/// | ||
/// This future is fused, so once it has completed, any future calls to poll | ||
|
@@ -258,6 +272,8 @@ unsafe impl<'a> Sync for Notified<'a> {} | |
|
||
#[derive(Debug)] | ||
enum State { | ||
// Contains the lowest `notify_waiters` call number which should | ||
// notify this waiter. | ||
Init(usize), | ||
Waiting, | ||
Done, | ||
|
@@ -387,11 +403,12 @@ impl Notify { | |
let state = self.state.load(SeqCst); | ||
Notified { | ||
notify: self, | ||
state: State::Init(state >> NOTIFY_WAITERS_SHIFT), | ||
state: State::Init(get_num_notify_waiters_calls(state)), | ||
waiter: UnsafeCell::new(Waiter { | ||
pointers: linked_list::Pointers::new(), | ||
waker: None, | ||
notified: None, | ||
notify_waiters_queue: None, | ||
notification: None, | ||
_p: PhantomPinned, | ||
}), | ||
} | ||
|
@@ -500,12 +517,9 @@ impl Notify { | |
/// } | ||
/// ``` | ||
pub fn notify_waiters(&self) { | ||
let mut wakers = WakeList::new(); | ||
|
||
// There are waiters, the lock must be acquired to notify. | ||
let mut waiters = self.waiters.lock(); | ||
|
||
// The state must be reloaded while the lock is held. The state may only | ||
// The state must be loaded while the lock is held. The state may only | ||
// transition out of WAITING while the lock is held. | ||
let curr = self.state.load(SeqCst); | ||
|
||
|
@@ -516,30 +530,76 @@ impl Notify { | |
return; | ||
} | ||
|
||
// At this point, it is guaranteed that the state will not | ||
// concurrently change, as holding the lock is required to | ||
// transition **out** of `WAITING`. | ||
'outer: loop { | ||
while wakers.can_push() { | ||
match waiters.pop_back() { | ||
Some(mut waiter) => { | ||
// Safety: `waiters` lock is still held. | ||
let waiter = unsafe { waiter.as_mut() }; | ||
|
||
assert!(waiter.notified.is_none()); | ||
// Increment the number of times this method was called | ||
// and transition to empty. | ||
let new_state = set_state(inc_num_notify_waiters_calls(curr), EMPTY); | ||
self.state.store(new_state, SeqCst); | ||
|
||
waiter.notified = Some(NotificationType::AllWaiters); | ||
// We may decouple the waiters list and store it on the stack to ensure | ||
// atomicity. This variable binding is shadowed by a reference to it | ||
// to prevent it from moving. | ||
let mut decoupled_list: Option<UnsafeCell<WaitList>> = None; | ||
let decoupled_list = &mut decoupled_list; | ||
|
||
if let Some(waker) = waiter.waker.take() { | ||
wakers.push(waker); | ||
let mut wakers = WakeList::new(); | ||
'outer: loop { | ||
{ | ||
let queue = if let Some(decoupled) = decoupled_list.as_ref() { | ||
// Safety: we hold the `waiters` lock, so we can | ||
// mutably borrow the decoupled queue. | ||
unsafe { &mut *decoupled.get() } | ||
} else { | ||
&mut *waiters | ||
}; | ||
|
||
while wakers.can_push() { | ||
match queue.pop_back() { | ||
Some(mut waiter) => { | ||
// Safety: `waiters` lock is still held. | ||
let waiter = unsafe { waiter.as_mut() }; | ||
|
||
assert!(waiter.notification.is_none()); | ||
|
||
waiter.notification = Some(NotificationType::AllWaiters); | ||
waiter.notify_waiters_queue = None; | ||
|
||
if let Some(waker) = waiter.waker.take() { | ||
wakers.push(waker); | ||
} | ||
} | ||
None => { | ||
break 'outer; | ||
} | ||
} | ||
None => { | ||
break 'outer; | ||
} | ||
} | ||
} | ||
|
||
// If there are more batches than one, decouple the list before releasing | ||
// the lock to provide atomicity. Decoupled list will still be protected | ||
// by the `waiters` lock. | ||
if decoupled_list.is_none() { | ||
// Store the list directly in the stack variable. | ||
*decoupled_list = Some(UnsafeCell::new(std::mem::take(&mut *waiters))); | ||
|
||
// We inform every waiter that the list it is stored in has been moved by | ||
// storing a raw pointer to the list. The list is not moved from the stack | ||
// of this function call, so every pointer will remain valid. | ||
// Safety: pointer to the `decoupled_list` is not null. | ||
let list_ptr = | ||
unsafe { NonNull::new_unchecked(decoupled_list.as_ref().unwrap().get()) }; | ||
|
||
// Safety: we hold the `waiters` lock, so we can borrow the decoupled list. | ||
let list_ref = unsafe { list_ptr.as_ref() }; | ||
|
||
for mut waiter in list_ref { | ||
// Safety: we hold the `waiters` lock. | ||
let waiter = unsafe { waiter.as_mut() }; | ||
waiter.notify_waiters_queue = Some(list_ptr); | ||
} | ||
} | ||
|
||
// Release the lock before notifying. | ||
// There are no longer any borrows of the list inside `decoupled_list` cell. | ||
drop(waiters); | ||
|
||
wakers.wake_all(); | ||
|
@@ -548,12 +608,6 @@ impl Notify { | |
waiters = self.waiters.lock(); | ||
} | ||
|
||
// All waiters will be notified, the state must be transitioned to | ||
// `EMPTY`. As transitioning **from** `WAITING` requires the lock to be | ||
// held, a `store` is sufficient. | ||
let new = set_state(inc_num_notify_waiters_calls(curr), EMPTY); | ||
self.state.store(new, SeqCst); | ||
|
||
// Release the lock before notifying | ||
drop(waiters); | ||
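The decoupling step in `notify_waiters` can be illustrated with a much simpler model. The sketch below is an assumption-laden toy, not the PR's implementation: waiters are plain `u32` ids in a `VecDeque` rather than intrusive list nodes, `Model`, `BATCH` and `between_batches` are hypothetical names, and the decoupled queue is owned by the call instead of staying under the `waiters` lock (the real code keeps it under the lock because a waiter may remove itself on drop). It only shows why moving the list out before releasing the lock keeps late registrations from being woken by the same call.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Hypothetical stand-in for the `WakeList` capacity: waiters woken per batch.
const BATCH: usize = 2;

/// Toy model: waiter ids instead of intrusive linked-list nodes.
struct Model {
    waiters: Mutex<VecDeque<u32>>,
}

impl Model {
    /// Wakes every waiter registered at the time of the call and returns
    /// their ids. `between_batches` runs with the lock released, standing in
    /// for `wakers.wake_all()` and for threads registering new waiters.
    fn notify_all(&self, mut between_batches: impl FnMut(&Model)) -> Vec<u32> {
        let mut woken = Vec::new();
        // Decouple: move the whole queue out while holding the lock, so any
        // later registration lands in a fresh, empty queue.
        let mut decoupled = std::mem::take(&mut *self.waiters.lock().unwrap());
        while !decoupled.is_empty() {
            for _ in 0..BATCH {
                match decoupled.pop_front() {
                    Some(id) => woken.push(id),
                    None => break,
                }
            }
            // The lock is not held here, so new waiters may be registered.
            between_batches(self);
        }
        woken
    }
}
```

In this model, a waiter pushed into `waiters` from inside `between_batches` stays queued after `notify_all` returns, which is the behavior the PR title asks for.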
|
||
|
@@ -597,9 +651,9 @@ fn notify_locked(waiters: &mut WaitList, state: &AtomicUsize, curr: usize) -> Op | |
// Safety: `waiters` lock is still held. | ||
let waiter = unsafe { waiter.as_mut() }; | ||
|
||
assert!(waiter.notified.is_none()); | ||
assert!(waiter.notification.is_none()); | ||
|
||
waiter.notified = Some(NotificationType::OneWaiter); | ||
waiter.notification = Some(NotificationType::OneWaiter); | ||
let waker = waiter.waker.take(); | ||
|
||
if waiters.is_empty() { | ||
|
@@ -766,6 +820,13 @@ impl Notified<'_> { | |
return Poll::Ready(()); | ||
} | ||
|
||
// Optimistically check if notify_waiters has been called | ||
// after the future was created. | ||
if get_num_notify_waiters_calls(curr) != initial_notify_waiters_calls { | ||
*state = Done; | ||
return Poll::Ready(()); | ||
} | ||
|
||
// Clone the waker before locking, a waker clone can be | ||
// triggering arbitrary code. | ||
let waker = waker.cloned(); | ||
|
@@ -777,8 +838,7 @@ impl Notified<'_> { | |
// Reload the state with the lock held | ||
let mut curr = notify.state.load(SeqCst); | ||
|
||
// if notify_waiters has been called after the future | ||
// was created, then we are done | ||
// Check again if notify_waiters has been called in the meantime. | ||
if get_num_notify_waiters_calls(curr) != initial_notify_waiters_calls { | ||
*state = Done; | ||
return Poll::Ready(()); | ||
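Both checks above compare the current `notify_waiters` call count with the count recorded when the future was created. A minimal sketch of that comparison follows, assuming a bare counter with no state bits; `Calls`, `Snapshot` and `is_ready` are hypothetical names used only for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering::SeqCst};

/// Toy notifier: only the `notify_waiters` call counter is modelled.
struct Calls(AtomicUsize);

/// Toy future state: the counter value observed when the future was created.
struct Snapshot(usize);

impl Calls {
    /// Records the current call count, as `State::Init(..)` does in the diff.
    fn notified(&self) -> Snapshot {
        Snapshot(self.0.load(SeqCst))
    }

    /// Counts one more `notify_waiters` call.
    fn notify_waiters(&self) {
        self.0.fetch_add(1, SeqCst);
    }

    /// Mirrors the `!=` comparison: ready if and only if the counter moved
    /// after the snapshot was taken.
    fn is_ready(&self, snap: &Snapshot) -> bool {
        self.0.load(SeqCst) != snap.0
    }
}
```

A snapshot taken before a call to `notify_waiters` becomes ready, while one taken after it does not.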
|
@@ -856,11 +916,28 @@ impl Notified<'_> { | |
// Safety: called while locked | ||
let w = unsafe { &mut *waiter.get() }; | ||
|
||
if w.notified.is_some() { | ||
// Our waker has been notified. Reset the fields and | ||
// remove it from the list. | ||
w.waker = None; | ||
w.notified = None; | ||
if w.notification.is_some() { | ||
// Our waker has been notified and our waiter is already removed from | ||
// the list. Reset the notification and convert to `Done`. | ||
w.notification = None; | ||
*state = Done; | ||
} else if w.notify_waiters_queue.is_some() { | ||
// There is a call to `notify_waiters` in progress. Since we already | ||
// have the lock, remove our entry from the waiter list. | ||
|
||
w.waker.take(); | ||
|
||
let mut decoupled_list_ptr = w.notify_waiters_queue.take().unwrap(); | ||
|
||
// as_mut safety: we hold the `waiters` lock, so we can mutably | ||
// borrow the decoupled list. | ||
// remove safety: the waiter *MUST* be stored in the `decoupled_list` | ||
// because it had a pointer to it. | ||
unsafe { | ||
decoupled_list_ptr | ||
.as_mut() | ||
.remove(NonNull::new_unchecked(w)) | ||
}; | ||
|
||
*state = Done; | ||
} else { | ||
|
@@ -913,32 +990,45 @@ impl Drop for Notified<'_> { | |
// longer stored in the linked list. | ||
if matches!(*state, Waiting) { | ||
let mut waiters = notify.waiters.lock(); | ||
let mut notify_state = notify.state.load(SeqCst); | ||
|
||
// remove the entry from the list (if not already removed) | ||
// | ||
// safety: the waiter is only added to `waiters` by virtue of it | ||
// being the only `LinkedList` available to the type. | ||
unsafe { waiters.remove(NonNull::new_unchecked(waiter.get())) }; | ||
// Safety: called while locked | ||
let w = unsafe { &mut *waiter.get() }; | ||
|
||
if waiters.is_empty() && get_state(notify_state) == WAITING { | ||
notify_state = set_state(notify_state, EMPTY); | ||
notify.state.store(notify_state, SeqCst); | ||
} | ||
if let Some(mut decoupled_list_ptr) = w.notify_waiters_queue.take() { | ||
// We hold the `waiters` lock. | ||
let decoupled_list = unsafe { decoupled_list_ptr.as_mut() }; | ||
|
||
// See if the node was notified but not received. In this case, if | ||
// the notification was triggered via `notify_one`, it must be sent | ||
// to the next waiter. | ||
// | ||
// Safety: with the entry removed from the linked list, there can be | ||
// no concurrent access to the entry | ||
if matches!( | ||
unsafe { (*waiter.get()).notified }, | ||
Some(NotificationType::OneWaiter) | ||
) { | ||
if let Some(waker) = notify_locked(&mut waiters, &notify.state, notify_state) { | ||
drop(waiters); | ||
waker.wake(); | ||
// safety: the waiter *MUST* be stored in the `decoupled_list` | ||
// because it had a pointer to it. | ||
unsafe { decoupled_list.remove(NonNull::new_unchecked(w)) }; | ||
} else { | ||
let mut notify_state = notify.state.load(SeqCst); | ||
|
||
// remove the entry from the list (if not already removed) | ||
// | ||
// safety: the waiter must be stored in `waiters` because it does | ||
// not have pointer to any other linked list. | ||
unsafe { waiters.remove(NonNull::new_unchecked(w)) }; | ||
|
||
if waiters.is_empty() && get_state(notify_state) == WAITING { | ||
notify_state = set_state(notify_state, EMPTY); | ||
notify.state.store(notify_state, SeqCst); | ||
} | ||
|
||
// See if the node was notified but not received. In this case, if | ||
// the notification was triggered via `notify_one`, it must be sent | ||
// to the next waiter. | ||
// | ||
// Safety: with the entry removed from the linked list, there can be | ||
// no concurrent access to the entry | ||
if matches!( | ||
unsafe { (*waiter.get()).notification }, | ||
Some(NotificationType::OneWaiter) | ||
) { | ||
if let Some(waker) = notify_locked(&mut waiters, &notify.state, notify_state) { | ||
drop(waiters); | ||
waker.wake(); | ||
} | ||
} | ||
Comment on lines +1018 to 1032
This can also cause weird things ... I know this was already here, and maybe the answer is to just not change it. Imagine this sequence of actions:
This is weird since the futures that completed were created after the
I thought about this for a while and I think that's a tough problem. The issue is that we would like to think about cancelling a However, we can pass the permit from a future A to a waiting future B if and only if at any point between receiving the permit by A and pushing B to the queue the state of Unfortunately, the problem is deeper than that, and your example demonstrates this. We can drop multiple futures, and simulating that one of them was removed from the queue before receiving the permit will affect how we simulate the same for others. I like to think about it the following way: let The issue is that dropping a future enabled in the To sum up, we would need a complicated data structure to effectively track whether we should transfer the permit from a dropped future, for example a deque or some sort of segment tree, which probably is not feasible here. I think the best we can do is the heuristic I've mentioned earlier, but it doesn't even fix your case, so in my opinion we should probably leave this code as it is.
I believe that we ran into a similar problem during a discussion about adding a condition variable to Tokio. To me, it seems that the best option is to just document that dropping a future that has consumed a permit from a
Sure, I will be happy to open a PR improving the docs. |
||
} | ||
} | ||
|
Renamed this field because it was confusing to me because of the similarly named future.