Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rt: unhandled panic config for current thread rt #4518

Closed
wants to merge 5 commits into from

Conversation

carllerche
Copy link
Member

Allows the user to configure the runtime's behavior when a spawned task
panics. Currently, the panic is propagated to the JoinHandle and the
runtime resumes. This patch lets the user set the runtime to shutdown on
unhandled panic.

So far, this is only implemented for the current-thread runtime.

I will work on documentation next.

Refs: #4516

Allows the user to configure the runtime's behavior when a spawned task
panics. Currently, the panic is propagated to the JoinHandle and the
runtime resumes. This patch lets the user set the runtime to shutdown on
unhandled panic.

So far, this is only implemented for the current-thread runtime.

Refs: #4516
@github-actions github-actions bot added the R-loom Run loom tests on this PR label Feb 18, 2022
tokio/tests/rt_basic.rs Outdated Show resolved Hide resolved
@@ -473,13 +477,20 @@ fn poll_future<T: Future>(core: &CoreStage<T>, cx: Context<'_>) -> Poll<()> {
let output = match output {
Ok(Poll::Pending) => return Poll::Pending,
Ok(Poll::Ready(output)) => Ok(output),
Err(panic) => Err(JoinError::panic(panic)),
Err(panic) => {
scheduler.unhandled_panic();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's only unhandled if the JoinHandle gets dropped, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"unhandled" refers to unhandled by the spawn task. As per #4516, two separate behaviors are to always shutdown if the task panics or to only shutdown if the JoinHandle is dropped. I implemented the first one here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I think this option will be most useful for tests where the application is expected to gracefully handle all possible user inputs.

@Darksonn Darksonn added A-tokio Area: The main tokio crate M-runtime Module: tokio/runtime labels Feb 19, 2022
@carllerche carllerche marked this pull request as ready for review February 19, 2022 21:48
@carllerche
Copy link
Member Author

This should be ready to review.

@carllerche
Copy link
Member Author

Unless there are objections or known changes that should be made, I'd like to merge this as it is an unstable API.

@carllerche
Copy link
Member Author

carllerche commented Feb 22, 2022

I added initial support for LocalSet unhandled panics.

This raises some questions. One, LocalSet has no builder API. We probably should add one. Also, I have follow up things I want to investigate about LocalSet behavior, but I want to do that after this initial exploration PR lands.

For example, should LocalSet be UnwindSafe?

Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall this looks correct.

one thing i noticed that i'm not the biggest fan of is that unhandled panics are propagated by creating a new panic, rather than continuing the original panic. this feels not great to me, personally --- although the original panic will still be printed out (by the default panic handler) it won't be the last panic printed before the program exited/the test failed.

i understand the reason we don't continue the original panic is that it has to be stored in the task's JoinHandle. but, it seems like we could potentially come up with a scheme where the panic is stored in a cell that can be taken by either the JoinHandle or the runtime's unhandled panic handler, if the joinhandle doesn't consume the panic. this does seem a bit more complex though, and i would be fine with saving something like that for a follow-up PR.

self.owned.close_and_shutdown_all();
}
}
_ => panic!("runtime core not set in CURRENT thread-local"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, this will only happen in the event of a Tokio bug, so this should probably be

Suggested change
_ => panic!("runtime core not set in CURRENT thread-local"),
_ => unreachable!("runtime core not set in CURRENT thread-local"),

Some(ret) => ret,
None => {
// `block_on` panicked.
panic!("a spawned task panicked and the runtime is configured to shutdown on unhandled panic");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's kind of a bummer that we just create a new panic here, rather than continuing the original panic...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but I don't think its a big deal. The panic is still printed to stderr by the panic handler.

Comment on lines +640 to +642
// TODO: This should be set as a builder
Arc::get_mut(&mut self.context.shared)
.expect("TODO: we shouldn't panic")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to address this TODO before merging, or is it being saved for later?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm saving it for later. It is tracked as an "open question" in the original issue (related to LocalSet::builder().

tokio/src/task/local.rs Outdated Show resolved Hide resolved
@carllerche
Copy link
Member Author

@hawkw I considered this, but unless we never propagate the panic to the JoinHandle, the behavior becomes racy, which seems worse. I would think consistent behavior would be preferable. In this specific case, because the runtime always shuts down even if the JoinHandle is still outstanding, I could see an argument for saying the panic is propagated. However, how do you handle concurrent calls to block_on? In that case, multiple threads will panic but we can only forward the original panic to one thread.

Anyway, I will track the question in the original issue.

Co-authored-by: Eliza Weisman <eliza@buoyant.io>
@hawkw
Copy link
Member

hawkw commented Feb 24, 2022

However, how do you handle concurrent calls to block_on?

hmm, yeah, that's a good point; I was considering this specifically in the case of the current thread runtime; it does get a lot messier when the runtime is shared...

carllerche added a commit that referenced this pull request Jun 14, 2022
Extracts the refactor from #4518.

The basic scheduler takes many configuration options as arguments to the
constructor. This cleans it up a bit by defining a `Config` struct and
using that to pass arguments to the constructor.
carllerche added a commit that referenced this pull request Jun 15, 2022
Extracts the refactor from #4518.

The basic scheduler takes many configuration options as arguments to the
constructor. This cleans it up a bit by defining a `Config` struct and
using that to pass arguments to the constructor.
@carllerche
Copy link
Member Author

Closed in favor of #4770

@carllerche carllerche closed this Jun 16, 2022
@Darksonn Darksonn deleted the unhandled-panics branch June 16, 2022 20:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate M-runtime Module: tokio/runtime R-loom Run loom tests on this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants