
rt: Allow concurrent block_on's with basic_scheduler #2804

Merged
merged 12 commits into master from lucio/refactor-basic-shell on Sep 23, 2020

Conversation

@LucioFranco (Member) commented Aug 31, 2020

This PR changes how Runtime::block_on works when the basic_scheduler is selected. Before this change, we used a mutex to guard entering block_on with a basic_scheduler, which meant that we could never call block_on concurrently even though this was totally possible. This PR changes that to allow us to call block_on concurrently, and also lets multiple block_on calls "steal" the driver/parker. This works by sharing the ParkThread condvar, which lets us notify a waiting thread to check whether the parker is available so it can move forward.

The stealing works as follows: the first thread to call Runtime::block_on acquires the driver. Every other call to block_on made while the first block_on is still running does not acquire the driver; instead it enters the runtime context and attempts to poll the passed-in future. The parker is slightly modified for this case so that we can provide our own Condvar, which is used once the initial block_on finishes to notify the other threads that they can now steal the driver.
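A minimal sketch of just the driver hand-off described above, using hypothetical names (Shared, Driver, slot) rather than the actual basic_scheduler internals; in the real PR, threads that do not own the driver also keep polling their own futures while they wait:

```rust
use std::sync::{Condvar, Mutex};

// Hypothetical stand-in for the parker plus the I/O/time driver that only
// one block_on may own at a time.
struct Driver;

struct Shared {
    // `Some(driver)` whenever no block_on currently owns the driver.
    slot: Mutex<Option<Driver>>,
    // Signaled when the owning block_on hands the driver back, so a waiting
    // block_on can "steal" it.
    condvar: Condvar,
}

impl Shared {
    fn new(driver: Driver) -> Self {
        Shared {
            slot: Mutex::new(Some(driver)),
            condvar: Condvar::new(),
        }
    }

    fn block_on<T>(&self, run: impl FnOnce(&Driver) -> T) -> T {
        let mut slot = self.slot.lock().unwrap();
        // Wait until the driver is free. In the real scheduler a non-owning
        // block_on keeps polling its own future instead of only blocking here.
        while slot.is_none() {
            slot = self.condvar.wait(slot).unwrap();
        }
        let driver = slot.take().unwrap();
        drop(slot);

        // Run "the future" while this thread owns the driver/parker.
        let out = run(&driver);

        // Hand the driver back and notify one waiter so it can steal it.
        *self.slot.lock().unwrap() = Some(driver);
        self.condvar.notify_one();
        out
    }
}
```

With this shape, concurrent block_on calls only serialize on ownership of the driver rather than on the entire block_on call.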

Related to #2720

@Darksonn added the A-tokio (Area: The main tokio crate), C-enhancement (Category: A PR with an enhancement or bugfix), and M-runtime (Module: tokio/runtime) labels on Aug 31, 2020
@LucioFranco force-pushed the lucio/refactor-basic-shell branch 4 times, most recently from 6d0e01b to db92967 on August 31, 2020 at 21:38
This allows us to concurrently call `Runtime::block_on` with the basic_scheduler and allows other threads to steal the dedicated parker.
@LucioFranco marked this pull request as ready for review on August 31, 2020 at 21:43
@LucioFranco requested review from a team and jonhoo on August 31, 2020 at 21:43
@hawkw (Member) commented Sep 2, 2020

The stealing works as follows: the first thread to call Runtime::block_on acquires the driver. Every other call to block_on made while the first block_on is still running does not acquire the driver; instead it enters the runtime context and attempts to poll the passed-in future. The parker is slightly modified for this case so that we can provide our own Condvar, which is used once the initial block_on finishes to notify the other threads that they can now steal the driver.

At one point when we talked about this, I thought you said that the solution was to just force the initial block_on to continue running the driver until all other block_ons finish, and then return that future's output — am i misremembering, or just wrong?

Comment on lines 163 to 164
// TODO: Consider using an atomic load here instead of locking
// the mutex.
Member:

I think this shouldn't be too tricky if we just ensure that the state is only changed while inside a lock.

Member Author:

I would like to punt this to the next PR.

Member:

sure, that's fine, i was just mentioning this for whenever you get to it.

Member:

Not for this PR, but when we do this, we probably want to define some "cell" type in src/util.
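Not part of this PR, but a hedged sketch of what such a util "cell" could look like under the assumption discussed above (the state is only ever written while holding a lock, so readers can get away with a plain atomic load); StateCell and its methods are made-up names:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

// State is only *written* while holding `lock`, so reads never need the
// mutex: a plain atomic load is enough.
struct StateCell {
    lock: Mutex<()>,    // serializes all writes to `state`
    state: AtomicUsize, // readable without taking the lock
}

impl StateCell {
    fn new(initial: usize) -> Self {
        StateCell {
            lock: Mutex::new(()),
            state: AtomicUsize::new(initial),
        }
    }

    // Cheap read path: no mutex, just an atomic load.
    fn load(&self) -> usize {
        self.state.load(Ordering::Acquire)
    }

    // Writes still go through the mutex, so there is a single writer at a
    // time and readers observe consistent state transitions.
    fn store(&self, value: usize) {
        let _guard = self.lock.lock().unwrap();
        self.state.store(value, Ordering::Release);
    }
}
```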

@LucioFranco (Member Author)

At one point when we talked about this, I thought you said that the solution was to just force the initial block_on to continue running the driver until all other block_ons finish, and then return that future's output — am i misremembering, or just wrong?

No, we wanted to actually move the driver between threads. I think that would be odd to have the block_on run longer than expected? That feels like an implicit dependency that would be pretty surprising.

@LucioFranco (Member Author)

@carllerche and I had a chat last week, and I have some tweaks I'm going to make here. I want to see if we can get away with a LinkedList-style approach for notifying other threads instead of going via the condvar/mutex pair.

@hawkw (Member) commented Sep 8, 2020

At one point when we talked about this, I thought you said that the solution was to just force the initial block_on to continue running the driver until all other block_ons finish, and then return that future's output — am i misremembering, or just wrong?

No, we wanted to actually move the driver between threads. I think that would be odd to have the block_on run longer than expected? That feels like an implicit dependency that would be pretty surprising.

Yeah, I think this is better, I just wasn't sure if I remembered our chat from earlier. Carry on!

@LucioFranco (Member Author)

Okay, I've refactored this to use a VecDeque<Waker> instead of a Condvar/Mutex pair. I've also addressed issues around double panics.
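A hedged sketch of what a VecDeque<Waker>-based wait list might look like; the names and shape here are illustrative rather than the actual basic_scheduler code:

```rust
use std::collections::VecDeque;
use std::sync::Mutex;
use std::task::Waker;

// Instead of parking on a Condvar, a block_on that could not get the driver
// registers its Waker; the thread releasing the driver wakes the next
// waiter in FIFO order so it can attempt to steal the driver.
#[derive(Default)]
struct WaitList {
    wakers: Mutex<VecDeque<Waker>>,
}

impl WaitList {
    // Called by a block_on that is waiting for the driver to become free.
    fn register(&self, waker: &Waker) {
        self.wakers.lock().unwrap().push_back(waker.clone());
    }

    // Called when the driver is handed back: wake one waiting block_on.
    fn wake_one(&self) {
        if let Some(waker) = self.wakers.lock().unwrap().pop_front() {
            waker.wake();
        }
    }
}
```

Using wakers means a waiting block_on can be notified through its normal future-polling machinery instead of a dedicated condvar wait, and FIFO order hands the driver to the longest-waiting caller first.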

@hawkw (Member) left a comment

i think we need to rework the way we're handling mutex poisoning here. i think we just want to always put everything back regardless of whether the mutex is poisoned. we can use PoisonError::into_inner to get a usable guard out of a poisoned mutex, ignoring the poison. i think we should do that here (see the sketch after this comment).

otherwise, this all seems fine to me.
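A hedged sketch of the PoisonError::into_inner suggestion above, using a hypothetical Mutex<Option<Driver>> slot; the point is that the driver is always put back even if another thread panicked while holding the lock:

```rust
use std::sync::{Mutex, PoisonError};

struct Driver; // placeholder for the real driver type

// Always put the driver back: recover the guard from the PoisonError
// instead of propagating the poison as a panic.
fn return_driver(slot: &Mutex<Option<Driver>>, driver: Driver) {
    let mut guard = slot.lock().unwrap_or_else(PoisonError::into_inner);
    *guard = Some(driver);
}
```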

@carllerche (Member) left a comment

Thanks for the work. I left comments.

I mentioned in the original issue that I thought this could be done mostly by working with a Park layer, but we can punt that to another PR.

@carllerche (Member) left a comment

Looking good. I think we are almost there. Left some notes.

@carllerche (Member) left a comment

Thanks 👍

@LucioFranco merged commit f25f12d into master on Sep 23, 2020
@LucioFranco deleted the lucio/refactor-basic-shell branch on September 23, 2020 at 18:35