Rework binding of new tasks #3955

Darksonn · 2021-07-12T16:38:05Z

This PR reworks how new tasks are bound to the runtime they are spawned on. The gist of the PR is that the task is now bound to the runtime immediately as part of spawning the task, instead of being bound lazily on first poll. The PR also introduces a conceptual change for the meaning of the Notified type — that type is now "just" a notification, and someone who holds a Notified is never responsible for cleaning up the task.

To implement this, I make the following changes:

Remove the bind method from the Schedule trait.
Change the various schedule methods to just consume and drop the Notified when the runtime is shutting down. Previously they reported an error so that a newly spawned task could be shut down immediately.
Introduce an UnboundTask type, which is the type used by a task until it is bound to a scheduler.
Introduce a bind method on OwnedTasks that takes an UnboundTask and returns a Notified for that task.
Make an OwnedTasks closeable to prevent binding new tasks to it when the runtime is shutting down.
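To illustrate the eager binding and the closeable collection described in the list above, here is a minimal sketch. It is a toy model, not Tokio's implementation: `ToyTask`, `ToyNotified`, `ToyOwnedTasks`, and the `Option` return type of `bind` are all hypothetical names and choices made for this example.

```rust
use std::sync::Mutex;

/// Stand-in for a spawned task (hypothetical; the real task type is far more involved).
struct ToyTask {
    id: u32,
}

/// Stand-in for `Notified`: only a notification that the task should be polled.
struct ToyNotified {
    id: u32,
}

struct Inner {
    closed: bool,
    tasks: Vec<ToyTask>,
}

/// Toy version of a closeable collection of owned tasks.
struct ToyOwnedTasks {
    inner: Mutex<Inner>,
}

impl ToyOwnedTasks {
    fn new() -> Self {
        ToyOwnedTasks {
            inner: Mutex::new(Inner { closed: false, tasks: Vec::new() }),
        }
    }

    /// Binds the task immediately, as part of spawning. Returning `None` when
    /// the collection is closed is an assumption of this sketch; the task is
    /// simply dropped in that case.
    fn bind(&self, task: ToyTask) -> Option<ToyNotified> {
        let mut inner = self.inner.lock().unwrap();
        if inner.closed {
            return None;
        }
        let id = task.id;
        // The collection now owns the task and is responsible for cleanup.
        inner.tasks.push(task);
        Some(ToyNotified { id })
    }

    /// Prevents any further tasks from being bound.
    fn close(&self) {
        self.inner.lock().unwrap().closed = true;
    }

    fn len(&self) -> usize {
        self.inner.lock().unwrap().tasks.len()
    }
}
```

In this model, a `bind` call that happens before `close` hands ownership to the collection, and a `bind` call after `close` is rejected, so whoever holds a `ToyNotified` never has to clean up the task.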

Darksonn · 2021-07-12T16:38:43Z

This is marked as a draft because it seems like I broke spawn_blocking, which I need to look into.

Darksonn · 2021-07-13T11:11:11Z

The reason spawn_blocking was broken was that the blocking tasks were not bound to the NoopScheduler, so polling them failed. Now all conversions from UnboundTask involve binding a scheduler.

Darksonn

The review below mostly contains comments that are intended to be helpful for people reviewing this PR.

Darksonn · 2021-07-13T12:35:58Z

tokio/src/runtime/task/state.rs

 /// the `JoinHandle`. As the task starts with a `JoinHandle`, `JOIN_INTEREST` is
 /// set. A new task is immediately pushed into the run queue for execution and
 /// starts with the `NOTIFIED` flag set.
-const INITIAL_STATE: usize = (REF_ONE * 2) | JOIN_INTEREST | NOTIFIED;
+const INITIAL_STATE: usize = (REF_ONE * 3) | JOIN_INTEREST | NOTIFIED;


Before this change, the third ref-count is incremented when lazily binding the executor. Since we eagerly bind the executor now, that ref-count is here from the beginning.
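A small sketch of the ref-count arithmetic may help here. The flag bit positions and `REF_COUNT_SHIFT` value below are assumptions picked for illustration, not Tokio's actual layout, and `lazily_bind` is a hypothetical helper that models the old lazy increment.

```rust
// Toy model of a packed task state word. The bit positions are assumptions
// chosen for this example only.
const NOTIFIED: usize = 0b0001; // hypothetical bit position
const JOIN_INTEREST: usize = 0b0010; // hypothetical bit position
const REF_COUNT_SHIFT: usize = 4; // hypothetical: the low 4 bits hold flags
const REF_ONE: usize = 1 << REF_COUNT_SHIFT;

// Before the change: two references at creation, a third added on first poll.
const OLD_INITIAL_STATE: usize = (REF_ONE * 2) | JOIN_INTEREST | NOTIFIED;
// After the change: all three references exist from the start.
const INITIAL_STATE: usize = (REF_ONE * 3) | JOIN_INTEREST | NOTIFIED;

/// Extracts the reference count from the upper bits of the state word.
fn ref_count(state: usize) -> usize {
    state >> REF_COUNT_SHIFT
}

/// Models the lazy bind that used to add the third reference on first poll.
fn lazily_bind(state: usize) -> usize {
    state + REF_ONE
}
```

Under these assumptions, applying `lazily_bind` to the old initial state yields exactly the new initial state, which is why the extra increment is no longer needed.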

Darksonn · 2021-07-13T12:36:55Z

tokio/src/runtime/task/state.rs

-    pub(super) fn transition_to_running(&self, ref_inc: bool) -> UpdateResult {
+    pub(super) fn transition_to_running(&self) -> UpdateResult {


This ref-count parameter was used only when lazily binding the scheduler.

Darksonn · 2021-07-13T12:38:21Z

tokio/src/runtime/tests/mod.rs

@@ -1,21 +1,27 @@
-#[cfg(not(all(tokio_unstable, feature = "tracing")))]


The changes in this file are to construct a method like tokio::runtime::task::joinable, except that it immediately binds the task to a NoopScheduler and returns a Notified rather than an UnboundTask. The method is used in tests of the inject queue only.

Darksonn · 2021-07-13T12:40:15Z

tokio/src/runtime/tests/task.rs

@@ -1,43 +1,209 @@
-use crate::runtime::task::{self, OwnedTasks, Schedule, Task};


This file has miri tests. Miri will catch memory leaks.

Darksonn · 2021-07-13T12:46:15Z

tokio/src/runtime/thread_pool/mod.rs

-            // is shutting down. The task must be explicitly shutdown at this point.
-            task.shutdown();
-        }
+        worker::Shared::bind_new_task(&self.shared, task);


The self: &Arc<Self> calling convention is not stable on MSRV, so this uses a free-standing function instead.
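As a rough illustration of that calling convention, the sketch below passes the `Arc` as an ordinary first argument instead of using a `self` receiver. `ToyShared` and its fields are hypothetical, and returning a cloned handle is just one plausible reason for needing the `Arc` rather than a plain reference.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the worker's shared state.
struct ToyShared {
    queue: Mutex<Vec<u32>>,
}

impl ToyShared {
    /// No `self` receiver: the `Arc` is an ordinary argument, so the call
    /// site is written `ToyShared::bind_new_task(&shared, task)`.
    fn bind_new_task(me: &Arc<Self>, task: u32) -> Arc<Self> {
        me.queue.lock().unwrap().push(task);
        // Having the `Arc` itself (not just `&Self`) lets us hand out a new
        // owning handle, for example for the task to keep as its scheduler.
        Arc::clone(me)
    }
}
```

The call `ToyShared::bind_new_task(&shared, 7)` mirrors the shape of the `worker::Shared::bind_new_task(&self.shared, task)` line in the diff.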

Darksonn · 2021-07-13T12:55:11Z

tokio/tests/support/mock_file.rs

@@ -211,7 +211,7 @@ impl Read for &'_ File {
                assert!(dst.len() >= data.len());
                assert!(dst.len() <= 16 * 1024, "actual = {}", dst.len()); // max buffer

-                &mut dst[..data.len()].copy_from_slice(&data);
+                dst[..data.len()].copy_from_slice(&data);


This just fixes a warning from the nightly compiler.

Darksonn · 2021-07-13T13:00:25Z

tokio/src/runtime/thread_pool/worker.rs

+        // Close the OwnedTasks to prevent spawning new tasks during shutdown.
+        worker.shared.owned.close();
+


Should we do this when closing the inject queue?

Probably? I would guess we should close at the same point.

Also, having two "closed" states to update makes me uncomfortable. It seems like a potential source of bugs. Is there a way to have a single closed flag (either the owned set or the inject queue)?

What would happen if we only had shared.owned track closed instead of the inject queue? Then, when closing, no more tasks can be spawned, all existing tasks are "shutdown", and then the queues are drained? The inject queue should no longer be used, I think.

Please see the comment in deefadc for why we need both close bits.

Darksonn · 2021-07-13T14:55:12Z

tokio/src/runtime/task/mod.rs

+    // This method is used by the blocking spawner.
+    pub(crate) fn into_notified(self, scheduler: S) -> Notified<S>
+    where
+        T: Send,


An unfortunate consequence of this is that we break the fast path in the destructor of join handles for blocking tasks.

Could you point to the fast path you are referencing?

Yes. The destructor of JoinHandle is here. The implementation of drop_join_handle_fast is here. This CAS will fail because the into_notified conversion decrements the ref-count, meaning that the task will no longer be in the initial state when the JoinHandle is returned to the user.
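The failure mode can be sketched with a toy state type. This is a simplified model under assumed bit positions, with hypothetical names (`ToyState`); it only shows why a compare-and-swap against the initial state stops succeeding once a reference has been dropped.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Bit positions are assumptions for this example, not Tokio's actual layout.
const NOTIFIED: usize = 0b0001;
const JOIN_INTEREST: usize = 0b0010;
const REF_ONE: usize = 1 << 4;
const INITIAL_STATE: usize = (REF_ONE * 3) | JOIN_INTEREST | NOTIFIED;

/// Hypothetical stand-in for the task's atomic state word.
struct ToyState(AtomicUsize);

impl ToyState {
    fn new() -> Self {
        ToyState(AtomicUsize::new(INITIAL_STATE))
    }

    /// Drops one reference, as the `into_notified` conversion does.
    fn ref_dec(&self) {
        self.0.fetch_sub(REF_ONE, Ordering::AcqRel);
    }

    /// Fast path: a single CAS that only succeeds if the state is still
    /// exactly the initial state. On success it clears the join interest and
    /// releases the handle's reference in one step.
    fn drop_join_handle_fast(&self) -> Result<(), ()> {
        let next = (INITIAL_STATE - REF_ONE) & !JOIN_INTEREST;
        self.0
            .compare_exchange(INITIAL_STATE, next, Ordering::Release, Ordering::Relaxed)
            .map(|_| ())
            .map_err(|_| ())
    }
}
```

In this model a freshly created state takes the fast path, while a state that has already gone through `ref_dec` fails the CAS and would have to fall back to a slower path.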

carllerche · 2021-07-13T23:11:51Z

tokio/src/runtime/task/mod.rs

@@ -50,20 +50,68 @@ unsafe impl<S> Sync for Task<S> {}
 #[repr(transparent)]
 pub(crate) struct Notified<S: 'static>(Task<S>);

+/// A task not yet bound to an executor. This object holds two ref-counts to
+/// the task to enable it to be split into two for free.
+pub(crate) struct UnboundTask<T, S: 'static> {


Could you remind me why we have UnboundTask vs. immediately bind in spawn?

It is the more direct change to do it like this, but I will think about whether I can remove it without changing too much code.

carllerche · 2021-07-14T23:47:16Z

tokio/src/runtime/task/inject.rs

-    /// if** it is a newly spawned task.
-    pub(crate) fn push(&self, task: task::Notified<T>) -> Result<(), task::Notified<T>> {
+    /// This does nothing if the queue is closed.
+    pub(crate) fn push(&self, task: task::Notified<T>) {


Why is the Result return not needed anymore?

I think I know, but I would like you to confirm it :)

It's because the pushed task may be a new task, and if the push fails for a new task, the caller needs to know so it can call shutdown on the task. With this PR, that's no longer necessary as binding the new task happens before pushing the notification, and if binding the task succeeded, then the runtime has been given responsibility for cleaning the new task up, even if submitting a notification for the task fails.

It is what I described with the following in the original PR text:

The PR also introduces a conceptual change for the meaning of the Notified type — that type is now "just" a notification, and someone who holds a Notified is never responsible for cleaning up the task.
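A minimal sketch of a `push` with this shape, assuming a toy queue guarded by a mutex. `ToyInject` and `ToyNotified` are hypothetical names; the point is only that a closed queue can drop the notification silently because ownership of the task already lives elsewhere.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Hypothetical stand-in for `Notified`: it carries no cleanup duty.
struct ToyNotified(u32);

/// Toy inject queue: a closed flag plus the pending notifications.
struct ToyInject {
    state: Mutex<(bool, VecDeque<ToyNotified>)>,
}

impl ToyInject {
    fn new() -> Self {
        ToyInject { state: Mutex::new((false, VecDeque::new())) }
    }

    /// Does nothing if the queue is closed. There is no `Result` to return
    /// because dropping a notification never leaks the task it refers to.
    fn push(&self, task: ToyNotified) {
        let mut state = self.state.lock().unwrap();
        if state.0 {
            return; // `task` is dropped here
        }
        state.1.push_back(task);
    }

    fn close(&self) {
        let mut state = self.state.lock().unwrap();
        state.0 = true;
    }

    fn pop(&self) -> Option<ToyNotified> {
        let mut state = self.state.lock().unwrap();
        state.1.pop_front()
    }
}
```

Pushing after `close` leaves the queue unchanged, and the caller has nothing to handle.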

carllerche · 2021-07-14T23:49:23Z

tokio/src/runtime/task/core.rs

@@ -107,7 +107,7 @@ impl<T: Future, S: Schedule> Cell<T, S> {
            },
            core: Core {
                scheduler: Scheduler {
-                    scheduler: UnsafeCell::new(None),
+                    scheduler: UnsafeCell::new(Some(scheduler)),


Is this ever None anymore?

If it is never None, you don't need to remove Option in this PR but we should clean it up in a follow up.

I intentionally did not include that in this PR because the diff is already really long, and it wasn't that simple to remove the None case.

carllerche · 2021-07-14T23:54:42Z

I will probably need to keep reviewing tomorrow. Could you write a bit about why task::joinable_local is no longer needed and how LocalOwnedTasks ensures the task can only be polled on the thread that owns the tasks?

Darksonn · 2021-07-15T07:34:42Z

The only difference between joinable and joinable_local is that the latter does not have a Send bound. In this PR, the Send vs. non-Send distinction now lives on the OwnedTasks::bind and LocalOwnedTasks::bind methods instead.

The PR currently does not enforce that local tasks are polled on the right thread, since polling happens through Notified, which is always Send. However, the follow-up PR that makes OwnedTasks::remove safe by remembering which collection owns a task will also make polling of non-Send tasks safe using the same method (locking the mutex is not necessary to make this check). The distinction between OwnedTasks and LocalOwnedTasks will be necessary to make this part safe.
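To make the placement of the bound concrete, here is a sketch with two toy collections whose `bind` methods differ only in their `Send` requirements. `ToyOwnedTasks` and `ToyLocalOwnedTasks` are hypothetical, and the methods return a label instead of a real notification. As noted above, this only restricts what can be bound; it does not by itself enforce which thread polls the task.

```rust
use std::future::Future;

/// Hypothetical stand-in for the multi-threaded collection.
struct ToyOwnedTasks;

/// Hypothetical stand-in for the thread-local collection.
struct ToyLocalOwnedTasks;

impl ToyOwnedTasks {
    /// Only futures that are `Send` (with `Send` outputs) can be bound here.
    fn bind<T>(&self, _task: T) -> &'static str
    where
        T: Future + Send + 'static,
        T::Output: Send + 'static,
    {
        "bound to OwnedTasks"
    }
}

impl ToyLocalOwnedTasks {
    /// No `Send` bound: futures holding `Rc` or other non-`Send` data are accepted.
    fn bind<T>(&self, _task: T) -> &'static str
    where
        T: Future + 'static,
        T::Output: 'static,
    {
        "bound to LocalOwnedTasks"
    }
}
```

With these signatures, passing a future that captures an `Rc` to `ToyOwnedTasks::bind` is a compile error, while `ToyLocalOwnedTasks::bind` accepts it.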

carllerche · 2021-07-15T20:14:02Z

tokio/src/runtime/handle.rs

@@ -213,7 +213,7 @@ impl Handle {
        #[cfg(not(all(tokio_unstable, feature = "tracing")))]
        let _ = name;

-        let (task, handle) = task::joinable(fut);
+        let (task, handle) = task::joinable(fut, NoopSchedule);


It looks like this is the only place task::joinable(...) is still used. Is that because there is no set of owned tasks here?

Yes, it's only used for blocking tasks and tests.

carllerche · 2021-07-15T20:16:40Z

tokio/src/runtime/task/mod.rs

        S: Schedule,
    {
-        let raw = RawTask::new::<_, S>(task);
+        let raw = RawTask::new::<_, S>(task, scheduler);
+        raw.header().state.ref_dec();


We should add a comment explaining what this ref_dec() is for. My guess is it is because there is no task set owning this task?

I would probably rename the joinable function to something like unowned(...) to make it more obvious what the fn is for now.

carllerche · 2021-07-15T20:26:20Z

tokio/src/runtime/task/state.rs

@@ -54,11 +54,11 @@ const REF_ONE: usize = 1 << REF_COUNT_SHIFT;

 /// State a task is initialized with
 ///
-/// A task is initialized with two references: one for the scheduler and one for
+/// A task is initialized with three references: two for the scheduler and one for


We can probably be more explicit here. "One for OwnedTasks (held by the scheduler), one for Notified<_> (used to poll the task), and one for JoinHandle."

or something.

carllerche · 2021-07-15T21:05:06Z

tokio/src/runtime/task/list.rs

+        T: Future + Send + 'static,
+        T::Output: Send + 'static,
+    {
+        let raw = RawTask::new::<T, S>(task, scheduler);


Looks like there is a bunch of duplication between the two bind fns (and task::joinable). If possible, it would be nice to unify some.

carllerche

Looks good! A nice step in the right direction. I couldn't find anything major. I left notes inline. I know there are follow-up PRs planned to further improve things.

carllerche

👍

runtime: rework binding of new tasks

fea12c5

Darksonn added the A-tokio (Area: The main tokio crate) and M-runtime (Module: tokio/runtime) labels Jul 12, 2021

Darksonn added 7 commits July 12, 2021 18:46

Fix runtime/tests

d74748d

Fix miri tests and rustfmt

935ee11

Fix InstrumentedFuture vs std::Future with tracing feature

7597e85

Bind scheduler on into_notified

48b6a80

Fix loom tests

dfbc538

rustfmt

75d5a00

Fix imports

17699f6

Darksonn marked this pull request as ready for review July 13, 2021 11:46

Improve comments

8f24c4e

Darksonn commented Jul 13, 2021

carllerche requested review from hawkw and udoprog July 13, 2021 22:12

carllerche reviewed Jul 13, 2021

Remove UnboundTask

2e39e4d

carllerche reviewed Jul 14, 2021

Darksonn added 2 commits July 15, 2021 10:03

fix miri tests

283298e

rustfmt

0bc8f20

carllerche reviewed Jul 15, 2021

carllerche approved these changes Jul 15, 2021

Darksonn added 2 commits July 16, 2021 11:41

Address reviews

deefadc

Add cfg_rt_multi_thread on is_closed

5fbdb24

carllerche approved these changes Jul 16, 2021

Merge branch 'master' into rework-task-binding

7f27166

Darksonn merged commit 2087f3e into master Jul 20, 2021

Darksonn deleted the rework-task-binding branch July 20, 2021 14:43

This was referenced Jul 21, 2021

Make scheduler non-optional #3980

Merged

Add owner id for tasks in OwnedTasks #3979

Merged

Prepare Tokio v1.9.0 #3961

Merged

softdevca mentioned this pull request Jul 22, 2021

Bump tokio from 1.8.2 to 1.9.0 softdevca/mootranscode#27

Merged
