Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Soundness issues and edge cases for scoped_async_spawn #6

Open
Lej77 opened this issue Nov 5, 2022 · 4 comments
Open

Soundness issues and edge cases for scoped_async_spawn #6

Lej77 opened this issue Nov 5, 2022 · 4 comments

Comments

@Lej77
Copy link
Contributor

Lej77 commented Nov 5, 2022

I reviewed the scoped_async_spawn crate after reading the blog post Scoped threads in Rust, and why its async counterpart would be unsound. I found a couple of soundness issues related to panics and some things that probably should be documented.

I will briefly describe each issue I found and suggest what should be done about it. You can find the code for the test cases at the end of this post.

incorrect_not_in_scope test case

This problem does not cause any soundness issues, but it is somewhat unexpected. If a SpawnGuard is not held over an await point then it will be stored on the current thread's stack instead of inside a future. This causes the SpawnGuard::convert_future method to panic at line 46 of async_ui/guard.rs since the check_scope function returns false.

Suggested solution: document this issue or try to allow futures stored on the current thread's stack (,store a stack address when entering a scope and check if a pointer is between that stored stack address and the current stack address).

failed_abort_recursion test case

<PinCell<Inner<F>> as InnerTrait>::abort can fail when calling RefCell::borrow_mut at line 61 of async_ui/common.rs. This causes a panic which prevents the SpawnGuard from aborting other futures .

Suggested solution: abort the process if SpawnGuard's Drop implementation panics.

failed_abort_panic test case

<PinCell<Inner<F>> as InnerTrait>::abort can fail when calling Pin::set at line 63 of async_ui/common.rs if the overwritten future type's Drop implementation panics. This panic prevents the SpawnGuard from aborting other futures.

Suggested solution: abort the process if SpawnGuard's Drop implementation panics.

forgettable_stack_allocations test case

The soundness of the scoped_async_spawn crate relies on the fact that you can't forget values that are pinned to the stack. This assumption is true for normal stack pinning via pin_mut and is also upheld by pin projections (AFAIK). Still the assumption isn't guaranteed by std's documentation and I think it would be sound for another crate to implement something like the StackAlloc type in my test case which allows creating Box-like types that store their data on the stack. Such crates would break the assumption that the scoped_async_spawn crate relies on and cause soundness issues.

Suggested solution: document that the scoped_async_spawn crate is incompatible with stack allocations that can be forgotten.

Code for test cases

Click here to show/hide the code
use std::cell::{Cell, UnsafeCell};
use std::future::Future;
use std::marker::PhantomPinned;
use std::mem::{ManuallyDrop, MaybeUninit};
use std::ops::{Deref, DerefMut};
use std::pin::Pin;
use std::sync::{mpsc, Arc};
use std::task::{Context, Poll, Wake, Waker};

use pin_utils::pin_mut;
use scoped_async_spawn::{GiveUnforgettableScope, SpawnGuard};

fn poll_once<T>(f: impl Future<Output = T>) -> Option<T> {
    struct NullWaker;
    impl Wake for NullWaker {
        fn wake(self: Arc<Self>) {}
    }

    let waker = Waker::from(Arc::new(NullWaker));
    let mut context = Context::from_waker(&waker);
    pin_mut!(f);
    match f.poll(&mut context) {
        Poll::Ready(v) => Some(v),
        Poll::Pending => None,
    }
}

fn with_context<R>(f: impl Unpin + FnMut(&mut Context<'_>) -> Poll<R>) -> impl Future<Output = R> {
    struct WithContext<F>(F);
    impl<F, R> Future for WithContext<F>
    where
        F: FnMut(&mut Context<'_>) -> Poll<R> + Unpin,
    {
        type Output = R;

        fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
            (self.0)(cx)
        }
    }
    WithContext(f)
}

#[derive(Debug)]
struct IsDropped {
    was_dropped: bool,
}
impl IsDropped {
    fn new() -> Self {
        Self { was_dropped: false }
    }
}
impl Drop for IsDropped {
    fn drop(&mut self) {
        self.was_dropped = true;
    }
}

/// This will panic with "Not in scope." despite the fact that the guard is
/// inside a scope.
///
/// The reason for this is because there is no yield/await points inside the
/// future so the `guard` is stored on the thread's stack instead of inside the
/// future's memory. Therefore the `Pointer::contains` method will return
/// `false` since the thread stack is not stored inside the
/// `GiveUnforgettableScope` future.
#[test]
#[should_panic]
fn incorrect_not_in_scope() {
    poll_once(GiveUnforgettableScope::new_static(async move {
        let guard = SpawnGuard::new();
        pin_mut!(guard);
        let _ = guard.convert_future(async {});
    }));
}

/// When `SpawnGuard` is dropped it attempts to abort all "converted" futures.
/// Unfortunately this can fail if those futures are being polled when
/// `SpawnGuard` is dropped. This will cause a panic which allows the caller to
/// observe dropped state by catching that panic or by doing things in `Drop`
/// impls.
#[test]
fn failed_abort_recursion() {
    poll_once(async {
        let (tx_static, rx_static) = mpsc::channel();
        let (tx_scoped, rx_scoped) = mpsc::channel();
        let scoped = GiveUnforgettableScope::new_static(async move {
            let data = String::from("some data that might be freed");
            let data2 = IsDropped::new();
            {
                let guard = SpawnGuard::new();
                pin_mut!(guard);
                let static_fut = guard.convert_future(async {
                    let scoped = std::panic::AssertUnwindSafe(
                        rx_scoped.try_recv().expect("failed to recv scope future"),
                    );

                    println!("{data2:?}");
                    assert!(!data2.was_dropped, "not dropped yet, all is well!");
                    assert!(
                        std::panic::catch_unwind(|| drop(scoped)).is_err(),
                        "the SpawnGuard should have panicked when it tried to abort this future."
                    );

                    println!("{data}");
                    println!("{data2:?}");
                    assert!(
                        data2.was_dropped,
                        "observing data after it was dropped, not good!"
                    );
                });
                tx_static.send(static_fut).expect("failed to send future");
                // Yield to delay drop of `guard`:
                with_context::<()>(|_| Poll::Pending).await;
            }
            // Ensure the data is dropped when this future is dropped:
            drop((data, data2));
        });
        let mut scoped = Box::pin(scoped) as Pin<Box<dyn Future<Output = ()>>>;
        poll_once(&mut scoped);
        tx_scoped.send(scoped).expect("failed to send scope future");
        let static_fut = rx_static.try_recv().expect("failed to recv future");
        static_fut.await;
    });
}

/// When `SpawnGuard` is dropped it attempts to abort all "converted" futures.
/// Unfortunately this can fail if the `SpawnGuard` needs to drop multiple
/// futures and one of those futures panic when it is dropped.
#[test]
fn failed_abort_panic() {
    struct EvilWasDropped;

    struct Evil;
    impl Drop for Evil {
        fn drop(&mut self) {
            // Custom panic payload so we can catch it later:
            std::panic::resume_unwind(Box::new(EvilWasDropped));
        }
    }

    struct OnDrop<S, F: FnMut(&mut S)>(S, F);
    impl<S, F: FnMut(&mut S)> Drop for OnDrop<S, F> {
        fn drop(&mut self) {
            (self.1)(&mut self.0)
        }
    }

    match std::panic::catch_unwind(|| {
        poll_once(GiveUnforgettableScope::new_static(async move {
            let delay_drop;
            let delay_evil_drop;
            {
                let data = String::from("some data that might be freed");
                let data2 = IsDropped::new();
                {
                    let guard = SpawnGuard::new();
                    pin_mut!(guard);

                    // Store an evil future first into the guard:
                    println!("adding evil future");
                    let evil = Evil;
                    delay_evil_drop = guard.as_mut().convert_future(async move {
                        drop(evil);
                    });

                    // Then a future that will therefore not be aborted:
                    println!("adding future that should be leaked");
                    delay_drop = OnDrop(
                        guard.convert_future(async {
                            println!("{data}");
                            println!("{data2:?}");
                            assert!(
                                data2.was_dropped,
                                "observing data after it was dropped, not good!"
                            );
                        }),
                        |static_fut| {
                            // Called when `delay_drop` is dropped (even if we are panicking!).
                            println!("static future was dropped");
                            assert!(
                                std::thread::panicking(),
                                "the current thread should already be panicking!"
                            );
                            poll_once(static_fut);
                        },
                    );

                    // Yield to ensure guard is stored inside parent future:
                    with_context(|_| Poll::Ready(())).await;
                    println!("SpawnGuard is about to be dropped");
                }
                // Ensure the data is stored in the outer future:
                drop((data, data2));
            }
            drop((delay_drop, delay_evil_drop));
        }))
    }) {
        Ok(Some(())) => panic!("should have dropped Evil"),
        Ok(None) => panic!("should have completed all async tasks"),
        Err(e) if e.is::<EvilWasDropped>() => (),
        Err(e) => std::panic::resume_unwind(e),
    }
}

/// This crate relies on the assumption that stack allocated values can't be
/// forgotten. This is not documented as a guarantee for the `Pin` type but it
/// seems to be a property that is upheld by most crates.
///
/// This test demonstrates how this crate is unsound if forgettable stack
/// allocations are allowed.
///
/// # Forgettable stack allocation
///
/// Pin a value to the stack and then forget it. We don't free/reuse/invalidate
/// the memory of the value though so this should be allowed! It is the same
/// thing that `Box::pin` and `Box::leak` allows after all.
#[test]
fn forgettable_stack_allocations() {
    struct StackAlloc<const N: usize> {
        data: UnsafeCell<[MaybeUninit<u8>; N]>,
        used_length: Cell<usize>,
        pinned_count: Cell<usize>,
        _phantom_pin: PhantomPinned,
    }
    impl<const N: usize> StackAlloc<N> {
        #[inline(always)]
        pub const fn new() -> Self {
            Self {
                data: UnsafeCell::new([MaybeUninit::uninit(); N]),
                used_length: Cell::new(0),
                pinned_count: Cell::new(0),
                _phantom_pin: PhantomPinned,
            }
        }
        fn ptr(&self) -> *mut u8 {
            self.data.get().cast()
        }
        pub fn alloc<T>(&self, value: T) -> StackBox<'_, T> {
            let used_len = self.used_length.get();
            let first_free = unsafe { self.ptr().add(used_len) };

            let align_fix = first_free.align_offset(std::mem::align_of::<T>());
            let new_used_len = used_len
                .saturating_add(align_fix)
                .saturating_add(std::mem::size_of::<T>());
            if new_used_len > N || new_used_len >= isize::MAX as usize {
                panic!("StackAlloc: out of memory");
            }

            let value_ptr = unsafe { first_free.add(align_fix) }.cast::<T>();
            unsafe { value_ptr.write(value) };
            self.used_length.set(new_used_len);
            StackBox {
                data: unsafe { &mut *value_ptr.cast::<ManuallyDrop<T>>() },
                pinned_count: None,
            }
        }
        pub fn alloc_pinned<T>(self: Pin<&Self>, value: T) -> Pin<StackBox<'_, T>> {
            let mut value = self.get_ref().alloc(value);

            // Track pinned allocations, so that we don't free the backing memory if
            // they are leaked:
            self.pinned_count
                .set(self.pinned_count.get().checked_add(1).unwrap());
            value.pinned_count = Some(&self.get_ref().pinned_count);

            // Safety: we don't move the allocation and are careful to uphold the
            // drop guarantees.
            unsafe { Pin::new_unchecked(value) }
        }
    }
    impl<const N: usize> Drop for StackAlloc<N> {
        fn drop(&mut self) {
            if self.pinned_count.get() != 0 {
                struct AbortBomb;
                impl Drop for AbortBomb {
                    fn drop(&mut self) {
                        std::process::abort();
                    }
                }

                let _bomb = AbortBomb;
                panic!(
                    "Aborting process because {} pinned stack allocation(s) were leaked.",
                    self.pinned_count.get()
                );
            }
        }
    }

    struct StackBox<'a, T> {
        data: &'a mut ManuallyDrop<T>,
        pinned_count: Option<&'a Cell<usize>>,
    }
    impl<T> Deref for StackBox<'_, T> {
        type Target = T;

        fn deref(&self) -> &Self::Target {
            self.data
        }
    }
    impl<T> DerefMut for StackBox<'_, T> {
        fn deref_mut(&mut self) -> &mut Self::Target {
            self.data
        }
    }
    impl<T> Drop for StackBox<'_, T> {
        fn drop(&mut self) {
            unsafe { ManuallyDrop::drop(self.data) };
            if let Some(pinned_count) = self.pinned_count {
                pinned_count.set(pinned_count.get() - 1);
            }
        }
    }

    let scoped = GiveUnforgettableScope::new_static(async move {
        let buffer = StackAlloc::<1024>::new();
        pin_mut!(buffer);

        let mut static_fut = None;
        {
            let data = String::from("some data that might be freed");
            let data2 = IsDropped::new();

            let mut inner = buffer.as_ref().alloc_pinned(async {
                {
                    let guard = SpawnGuard::new();
                    pin_mut!(guard);
                    static_fut = Some(guard.convert_future(async {
                        println!("{data}");
                        println!("{data2:?}");
                        assert!(
                            data2.was_dropped,
                            "observing data after it was dropped, not good!"
                        );
                    }));
                    with_context::<()>(|_| Poll::Pending).await
                }
            });
            poll_once(inner.as_mut());
            std::mem::forget(inner);
        }
        static_fut
            .expect("converted a future to static lifetime")
            .await;
        // Yield forever so that `buffer` isn't dropped which would abort the process:
        with_context::<()>(|_| Poll::Pending).await;
    });
    let mut scoped = Box::pin(scoped);
    poll_once(scoped.as_mut());
    // Leak the future so that `buffer` isn't dropped which would abort the
    // process
    std::mem::forget(scoped);
}
wishawa added a commit that referenced this issue Nov 6, 2022
@wishawa
Copy link
Owner

wishawa commented Nov 6, 2022

Thanks so much for the review!

incorrect_not_in_scope test case

This is indeed unexpected. I wasn't even aware async behaves this way.

I'll document this issue. Since this library is only being used for Async UI for now, I don't think we need the stack location check yet. But if a use case where the check would be useful comes up it can be added.

failed_abort_recursion test case

forgettable_stack_allocations test case

Good catch! As apparent I've completely neglected panicking behaviors. I think it is fair that we abort if drop panic as you suggested, since panic on drop often leads to double panic and aborting anyway. I updated the code to do this.

forgettable_stack_allocations test case

I've been thinking about cases like this. It seems that scoped_async_spawn and stack allocation are each sound in isolation, but unsound when combined.

Figuring out who is correct is complicated. The pin doc mention that leaking pinned item is okay, but it also say that

Memory can be “invalidated” by deallocation, but also by replacing a Some(v) by None, or calling Vec::set_len to “kill” some elements off of a vector.

To me, leaking in stack allocation feels like Vec::set_len. The question is how exactly does set_len invalidate memory? I've asked on IRLO for documentation clarification.

@Lej77
Copy link
Contributor Author

Lej77 commented Nov 6, 2022

I think that forgetting stack allocations is allowed by the pin docs as long as the stack memory is never re-used after it was forgotten. My example stack allocator does this by aborting the process but I guess the thread could also sleep forever or prevent re-use in other ways.

Slightly unrelated I actually investigated if there was an existing arena allocator that could give out pinned references. I found bumpalo::boxed::Box::pin_in which I think is actually unsound since if the returned Box is leaked then the backing memory is freed when Bump is dropped. So the pin guarantees are definitely easy to get wrong!

@wishawa
Copy link
Owner

wishawa commented Nov 8, 2022

From the IRLO thread RalfJung mentioned that criteria like Vec::set_len can be loosened further. So indeed this crate is assuming more than what Pin provide and is thus unsound.

I don't think there's anything that could be done to fix this. I'll try to implement a truly sound (based on FuturesUnordered) version of fragment for Async UI. I'll probably keep this subtly unsound version as the default though, as it provides better performance and is really hard (requires additional unsafe, abort/infinite loop, manual poll, ...) to break.

You should open an issue in bumpalo about their unsoundness!

@Lej77
Copy link
Contributor Author

Lej77 commented Nov 8, 2022

It does seem like forgettable stack pinning isn't used much so scoped_async_spawn should be compatible with most async crates and therefore it should be okay to keep using it.

Also, I have opened an issue with bumpalo about its unsound pinning.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants