Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove DCSS #4053

Open
wants to merge 11 commits into
base: develop
Choose a base branch
from
Open

Remove DCSS #4053

wants to merge 11 commits into from

Conversation

dkhalanskyjb
Copy link
Collaborator

@dkhalanskyjb dkhalanskyjb commented Feb 28, 2024

This removes the DCSS, in exchange for giving up some linearizability.

Below is an overview of the implementation of the changes.

  • Note: we have three classes of completion handlers.
    • Class 1: handlers that should be notified about completed.
    • Class 2: handlers that should be notified about cancelling or, if that doesn't happen, completed. They have onCancelling = true.
    • Class 3: handlers that should be notified about cancelling and completed. These are ChildHandleNode instances.
  • The only case where DCSS was used was the one where we already had a NodeList; not Empty, not a single node. Let's limit ourselves to this special case.
  • For a NodeList, we define a new operation, close, that forbids some of the incoming elements from entering. Which elements are forbidden is passed as a parameter.
  • We use the new operation in three ways:
    • close(completion) closes the list completely. No new state changes can happen after completion, and so no new handlers are accepted. close(completion) happens directly before completion notifications start to be sent out.
    • close(cancellation) closes the list for cancellation handlers. Children and completion handlers are still allowed to register their callbacks, as the state is not complete. close(cancellation) happens after a job is cancelled, just before notifications about cancellation are sent out.
    • close(children) closes the list for children specifically. This happens after we notice that all children already in the list are completed; after that, we double-check if some new children managed to enter the list between when we saw that there were no non-completed children and when we actually closed the list. If there were new children after all, we proceed even with the list closed.
  • Where DCSS ensured that inserting a handler into the list would fail if the state changed in the meantime, this doesn't happen any longer. Instead, now, handlers are only rejected if the notifications were already sent out.
  • Let's separately examine the two places where DCSS was used and make sure that ignoring extra state transitions is not harmful. Case 1 is one addLastAtomic usage (the one with less indentation), case 2 is the other one. First, here are the non-child cases:
    • Case 1a: state is Incomplete && !onCancelling. If the state changed from this to some other Incomplete state during insertion, interrupting it, we'd just have to restart the insertion. If the state changed to a non-Incomplete state, with the DCSS approach, this would mean that the handler can no longer be inserted into the list. With the new approach, there'd still be a time window between the state changing and the list closing, but the behavior is still linearizable, because there is no way to know from outside the Job that a new completion handler was installed even after the state was closed. If invokeOnCompletion happens-after the moment when we know that the state has changed to the final one, then invokeOnCompletion will fulfill its contract and run the computation inline, and no other thread could have observed the intermediate state of the completion handler not having been installed despite invokeOnCompletion winning the race with completion.
    • Case 1b: state is Incomplete && state !is Finishing && handler !is ChildHandleNode && onCancelling. We want to add the handler to the list so that it is notified about any sort of completed, including cancelled. In the worst case, in the meantime, the state could change to Finishing and even almost become completed without closing the list; then, this case transforms into 2a. Otherwise, the list was closed for class 2 (or even everyone), and in both cases, the handler should be rejected.
    • Case 2a: state is Finishing && onCancelling && rootCause == null && handler !is ChildHandleNode. This means it's not yet time to cancel anything, and we can safely add the node to the list. In the meantime, the state could change to the final one, but this just means that the handler will be called soon-ish. This specific behavior is linearizable: since non-child completion handlers are not exposed via the API, there's no way to observe from the outside that the completion handler didn't manage to register before the state finalized: all that matters is that the handler does get executed in the end.

Now, for how the children are handled. Because the child-related customization of invokeOnCompletion was too intrusive, attachChild no longer even calls invokeOnCompletion, instead keeping its own copy of the code directly in the body of attachChild.

  • Case 1c: state is Incomplete && state !is Finishing && handler is ChildHandleNode. In this case, we're interested in registering the child for cancelling and completed. Essentially, we try adding it to the list if it's not closed for children.
    • If the list is closed for all handlers, the completion handlers were already called and we can't enter the list. We can safely execute the logic that we normally would for attaching to completed jobs: return a NonDisposableHandle and, if the job completed with an exception, also invoke the handler with that exception.
    • If the list is closed only for children, it is still possible that the job didn't enter a final state and didn't invoke the completion handlers. The reason is that we forbid new children earlier than the state is finalized: in fact, it is possible that the list got closed even while there were some children stopping the job from completing. Because of that, when we fail to put a new child into the list, we can't just retry the operation, as that could spin-loop for an arbitrary amount of time. Still, we admit defeat, consider the state that we observe currently to be final (even though it could change later), and leave. This leads to the non-linearizable behavior below.
    • If adding a child succeeds, it's possible that the job is already in an exceptional state and we must notify the child about the cancellation. To avoid complex extra logic to support this, we do a small trick: first, we try to add the new child with an explicit cancelling permission, and only try adding it normally if that fails. If adding a child in time to be notified about cancellation fails but adding the child overall succeeds, we know that the job is cancelled: we witnessed a direct evidence of that.
  • Case 2b: state is Finishing && onCancelling && handler is ChildHandleNode && !state.isCompleting. In this case, we'd like to both put the child in the list and immediately notify it about exceptions (if there are any). With DCSS, if the state changed to be the final one, adding the element to the list would fail. We preserve this behavior by ensuring that the children are prohibited from entering the list even before the final state is reached, namely at the moment when children are queried.

In case 1c, the following non-linearizable behavior is possible:

  • Child A is added.
  • The parent starts waiting for the children.
  • Child A gets completed.
  • Child B enters the child list. (meanwhile, child A closes the parent's list)
  • Non-linearizable behavior 1: child C fails to be added, despite A being active.
  • Child B fails with an error.
  • Non-linearizable behavior 2: child D fails to be added and learns that the parent failed, despite C thinking that the parent completed normally.

This non-linearizable behavior is arguably not worse than what was there before this PR: see #3893, which this PR fixes.

@dkhalanskyjb
Copy link
Collaborator Author

Initially, I thought that 1a) also had new non-linearizable behavior: when isCompleted() was already observed to be true, we could still see new handlers being added. Now, however, I believe that behavior is still completely linearizable.

Let's look at the linearizability points of invokeOnCompletion in the case 1a). Crucially, we observed that the state was non-final, so at least some part of the operation happened while completion was in progress.

  • If we managed to add our new callback to the list, then the linearization point of the operation was back when we received the non-final state: logically, it was back then that we decided to add the callback. Since that moment, no observable-state-changing operations have happened: no other operation can observe that, since the moment we decided to add the callback to the moment it actually appeared on the list, it was missing.
  • If we didn't manage to add the callback, we restart, and the linearization point is decided by the next iteration, which will just be us observing that the state is now final.

@dkhalanskyjb
Copy link
Collaborator Author

There's a lot of new changes here.

  • Before, invokeOnCompletion contained the logic for both the normal handlers and the children. Now, this logic is split into two separate functions: attachChild gets its own copy of the invokeOnCompletion logic. This allows us to get rid of a few assertions, type casts, and other magic knowledge in attachChild of what's going on in invokeOnCompletion.
  • The contention is significantly reduced. Before, onCancelling handlers and children used to add themselves to the list while holding the lock on the state, ensuring that it couldn't become completing or obtain the cancellation root cause in the meantime. This is no longer done. For normal onCancelling handlers, the fact that they were added to the list means that someone will notify them; for children, we check whether there was an exception only after we add them to the list, and if there is one, we let the child know.
  • Inconsistent behaviour when child coroutine attaches to the parent during "completing" -> "completed" transition #3893 is fixed: now, children are recognized as a separate entity, and the moment they get forbidden from being added is the moment the parent thinks it has no more children to wait for. Either the parent waits for the child, or the child fails to be added.

@dkhalanskyjb
Copy link
Collaborator Author

Investigating the test failures.

@dkhalanskyjb dkhalanskyjb marked this pull request as ready for review March 26, 2024 13:11
@dkhalanskyjb dkhalanskyjb force-pushed the dk-remove-dcss-2 branch 3 times, most recently from dab2195 to 3f27c6f Compare March 26, 2024 13:13
This change is mostly a refactoring, except now, an arbitrary
`onCancelling` handler that's not a child will not add itself in
a `synchronized` block. Instead, only the root cause is read under
a lock.
Before this change, when children were prohibited from adding
themselves to the list, cancellation handlers were also prohibited
from doing so. This was plain incorrect, because a list that's
closed for new children could still get cancelled later, and also
because being closed for new children happened before advancing the
state to the final one.
`rootCause` is atomic anyway.
The failure went like this:
* A child arrives.
* In the meantime, the parent enters `tryMakeCompletingSlowPath`
  and remembers the current list of handlers, which is an empty
  or a single-element one.
* The parent updates the state to the finishing one.
* The child enters the list.
* The parent traverses *an old list*, the one from before the
  child arrived. It sees no children in the empty/single-element
  list and forgets about the child.

Why, then, was it that this worked before?

It was because there was a guarantee that no new children are going
to be registered if three conditions are true:
* The state of the `JobSupport` is a list,
* The root cause of the error is set to something,
* And the state is already "completing".

`tryMakeCompletingSlowPath` sets the state to completing, and
because it updates the state inside `synchronized`, there was a
guarantee that the child would see either the old state (and, if
it adds itself successfully, then `tryMakeCompletingSlowPath` will
retry) or the complete new one, with `isCompleting` and the error
set to something.

So, there could be no case when a child entered a *list*, but this
list was something different from what `tryMakeCompletingSlowPath`
stores in its state.
Copy link
Member

@qwwdfsad qwwdfsad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job!

I'm yet to internalize a few things and revisit some tests, but in general it looks ready.
I also checked a few our benchmarks and everything seems smooth

* Forbids adding new items to this list.
*/
public actual fun close(forbiddenElementsBit: Int) {
addLast(ListClosed(forbiddenElementsBit), forbiddenElementsBit)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the close call ever fail? Maybe we can assert that?

)
if (addedBeforeCancellation) {
// The child managed to be added before the parent started to cancel or complete. Success.
true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Pretty subjective] return@tryPutNodeIntoList true and getting rid of later nesting would've simplified my understanding here.

I feel a deja vu regarding that though 😅

…List.kt

Co-authored-by: Vsevolod Tolstopyatov <qwwdfsad@gmail.com>
if (child != null && tryWaitForChild(finishing, child, proposedUpdate))
return COMPLETING_WAITING_CHILDREN
list.close(LIST_CHILD_PERMISSION)
Copy link
Member

@qwwdfsad qwwdfsad May 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please elaborate on (and leave a comment in the code) about this sequence?

It's unclear why we first wait a child, then close a list, and then wait for the rest if the child was finished; I would naturally expect "close + wait" sequence

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you first close and then wait, then you'll forbid adding new children as soon as the job becomes completing, even if we have to wait arbitrarily long after that for the existing children to complete.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants