wasi-io: Reimplement wasi-io/poll using a Pollable trait #7812

badeend · 2024-01-24T20:15:30Z

Prior discussion: https://bytecodealliance.zulipchat.com/#narrow/stream/217126-wasmtime/topic/Change.20Subscribe.20trait

Renamed the existing Pollable struct to PollableResource.
Reimplemented wasi-io/poll. This introduces a new Pollable trait which is lower level, doesn't require heap allocations to poll, has mutable access to the WasiView, and can be used as a standalone resource without a parent. The Subscribe trait is kept intact, but it is now a utility interface implemented in terms of Pollable.
Eliminated the (now) unnecessary surrogate parent resource of clock pollables.
Added ResourceTable take & restore as a general-purpose replacement for iter_entries, which was used only by the old poll implementation.

Additionally:

@pchickey Forbid empty poll list. Fixes: Clarify poll with empty list WebAssembly/wasi-io#67

github-actions · 2024-01-24T21:45:10Z

Subscribe to Label Action

cc @peterhuene

This issue or pull request has been labeled: "wasi", "wasmtime:api"

Thus the following users have been cc'd because of the following labels:

peterhuene: wasmtime:api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

pchickey

Thanks, this is excellent work. The new internal interface is definitely superior to the old one, and I appreciate the new tests as well. The lease system for resource table is a much better interface than iter_entries.

I believe that all of the Slot and SlotIdentity implementation makes sense and is correct (especially because, surprisingly to me, the only unsafe is in unsafe impl Send), but I wanted to tag @alexcrichton to double-check that part because I do not feel super confident in my ability to assess that code. If he is happy with that, this can land.

pchickey · 2024-01-24T23:44:43Z

crates/wasi-http/wit/deps/io/poll.wit

-    /// value, this function traps.
+    /// This function traps if either:
+    /// - the list is empty, or:
+    /// - the list contains more elements than can be indexed with a `u32` value.


I agree we need this change in the wits; let's just be sure to upstream these to the spec repo as well. We will come up with some process for how we keep the docs evolving and improving while ensuring that the interface itself doesn't change.
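
The two trap conditions in the updated doc comment can be sketched as a small length check. This is only an illustration: `PollListError` and `validate_poll_list_len` are hypothetical names, and the real implementation reports the failure through its own trap path.

```rust
use std::convert::TryFrom;

/// Hypothetical error type for this sketch; not part of wasmtime.
#[derive(Debug, PartialEq)]
pub enum PollListError {
    /// The guest passed an empty list of pollables.
    Empty,
    /// The list has more elements than a `u32` can index.
    TooLong,
}

/// Validates the length of a poll list and returns it as a `u32`.
/// In the real host both error cases would become a trap.
pub fn validate_poll_list_len(len: usize) -> Result<u32, PollListError> {
    if len == 0 {
        return Err(PollListError::Empty);
    }
    u32::try_from(len).map_err(|_| PollListError::TooLong)
}
```

Returning an error value rather than panicking keeps the failure scoped to the calling instance.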

WebAssembly/wasi-io#69

pchickey · 2024-01-24T23:55:58Z

crates/wasi/src/preview2/poll.rs

    Ok(table.push_child(pollable, &resource)?)
 }

+/// A host representation of the `wasi:io/poll.pollable` resource.
+pub struct PollableResource {


This is a very bikesheddy suggestion, so feel free to disregard it, but is BoxPollable a better name for this? That way Resource<PollableResource> isn't repeating the word.

The design has changed in the meantime. Now it's not just a box anymore, so I went with PollableHandle.

alexcrichton

Thanks for this! I've left a few comments but it's getting a bit later here so I'm going to head out. I want to read more about the Lease<T> though as that looks quite subtle and I need to think more about it.

alexcrichton · 2024-01-25T01:15:00Z

crates/wasi/src/preview2/poll.rs

+impl<T: Subscribe> Pollable for T {
+    fn poll_ready(&mut self, cx: &mut Context<'_>, _view: &mut dyn WasiView) -> Poll<()> {
+        self.ready().as_mut().poll(cx)
+    }
+}


I think this is actually subtly incorrect: it drops the future right after the call to poll, which signals that the future should be cancelled rather than kept alive until the whole poll is done. That means we may not be guaranteed to get wakeups from cancelled futures, although we might get them for now given how the code is currently constructed. I think this will need to be a bit fancier, with an adapter instead of a blanket impl.
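
A simplified model of the problem, using only the standard library: a future that is recreated and dropped on every poll never makes progress, while one that is kept alive does. `ReadyOnSecondPoll`, `NoopWake` and the helper functions are hypothetical names for this sketch. The model shows lost state rather than lost wakeups, but the cause is the same: the future does not outlive a single poll call.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Becomes ready on its second poll and counts how often it is dropped.
struct ReadyOnSecondPoll {
    polls: usize,
    drops: Arc<AtomicUsize>,
}

impl Future for ReadyOnSecondPoll {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        self.polls += 1;
        if self.polls >= 2 { Poll::Ready(()) } else { Poll::Pending }
    }
}

impl Drop for ReadyOnSecondPoll {
    fn drop(&mut self) {
        self.drops.fetch_add(1, Ordering::SeqCst);
    }
}

/// A waker that does nothing; enough to build a `Context` for manual polling.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Mimics the blanket impl: a fresh future per call, dropped right after one poll.
fn poll_with_fresh_future(drops: &Arc<AtomicUsize>, cx: &mut Context<'_>) -> Poll<()> {
    let mut fut = Box::pin(ReadyOnSecondPoll { polls: 0, drops: Arc::clone(drops) });
    fut.as_mut().poll(cx)
}

/// Polls three times with fresh futures; returns (ever ready?, number of drops).
pub fn fresh_future_never_completes() -> (bool, usize) {
    let drops = Arc::new(AtomicUsize::new(0));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut ready = false;
    for _ in 0..3 {
        ready |= poll_with_fresh_future(&drops, &mut cx).is_ready();
    }
    (ready, drops.load(Ordering::SeqCst))
}

/// Keeps a single future alive across polls; it completes on the second poll.
pub fn kept_future_completes() -> bool {
    let drops = Arc::new(AtomicUsize::new(0));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(ReadyOnSecondPoll { polls: 0, drops });
    let first = fut.as_mut().poll(&mut cx).is_ready();
    let second = fut.as_mut().poll(&mut cx).is_ready();
    !first && second
}
```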

The tests were failing precisely because of that. To fix it, I ended up roughly back at the Subscribe design, but this time returning a custom WasiFuture type that has an additional WasiView parameter on its poll method. To prevent scope creep of this PR, I kept Subscribe alive for now (as PollableAsync). But in the long run, I don't think there's much value in having them both, as Subscribe can be trivially converted to Pollable from:

    #[async_trait::async_trait]
    impl PollableAsync for HostFutureIncomingResponse {
        async fn ready(&mut self) {
            if let Self::Pending(handle) = self {
                *self = Self::Ready(handle.await);
            }
        }
    }

to:

    impl Pollable for HostFutureIncomingResponse {
        fn ready<'a>(&'a mut self) -> Pin<Box<dyn WasiFuture<Output = ()> + Send + 'a>> {
            Box::pin(async {
                if let Self::Pending(handle) = self {
                    *self = Self::Ready(handle.await);
                }
            })
        }
    }

One obvious downside is the added visual noise.

alexcrichton

Ok I've gotten a chance now to take a closer look at Lease<T> and the changes to ResourceTable. Given this new API design for the pollable trait something along these lines is required (e.g. iter_children can't work any more). The specific implementation here I think has a drawback where it's very "panicky" if you get it wrong. Most other aspects of WASI are "error-y" in that they try to return traps if anything is gotten wrong. This I think is pretty important for not accidentally becoming a DoS vector for embeddings. For example a panicking Drop implementation means that an early-return in an embedder function might accidentally take down the whole process where a wasm trap would only take down a single instance.

Given that, my main thought on this is that this should ideally not ever panic and instead should switch to returning errors where possible. I might also recommend going a little bit further, perhaps with a scheme such as:

Leave TableEntry::entry as Box<dyn Any>.
Change take to returning Box<T>. This would replace entry with something like Box::new(Tombstone) which is a private type to this module.
Change restore to taking Box<T> and Resource<T>.

That way it's largely up to embedders to "get everything right" but they'd already be required to do so with this current API. Additionally, any failed downcast could check for Tombstone and return a more precise error than ResourceTableError::WrongType, via a new variant such as TakenValue.
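
The scheme above could look roughly like the following. This is a minimal sketch under stated assumptions, not the wasmtime implementation: `Table`, `TableError` and the `NotTaken` variant are hypothetical, and a plain `usize` index stands in for `Resource<T>`.

```rust
use std::any::Any;

/// Hypothetical error type; `TakenValue` is the new variant proposed above.
#[derive(Debug, PartialEq)]
pub enum TableError {
    NotPresent,
    WrongType,
    TakenValue,
    NotTaken,
}

/// Private marker left in a slot while its value is taken out.
struct Tombstone;

#[derive(Default)]
pub struct Table {
    entries: Vec<Box<dyn Any + Send>>,
}

impl Table {
    pub fn push<T: Any + Send>(&mut self, value: T) -> usize {
        let boxed: Box<dyn Any + Send> = Box::new(value);
        self.entries.push(boxed);
        self.entries.len() - 1
    }

    /// Moves the value out, leaving a tombstone. Never panics.
    pub fn take<T: Any + Send>(&mut self, index: usize) -> Result<Box<T>, TableError> {
        let slot = self.entries.get_mut(index).ok_or(TableError::NotPresent)?;
        if slot.is::<Tombstone>() {
            return Err(TableError::TakenValue);
        }
        if !slot.is::<T>() {
            return Err(TableError::WrongType);
        }
        let tombstone: Box<dyn Any + Send> = Box::new(Tombstone);
        let taken = std::mem::replace(slot, tombstone);
        taken.downcast::<T>().map_err(|_| TableError::WrongType)
    }

    /// Puts a previously taken value back; only valid on a tombstoned slot.
    pub fn restore<T: Any + Send>(&mut self, index: usize, value: Box<T>) -> Result<(), TableError> {
        let slot = self.entries.get_mut(index).ok_or(TableError::NotPresent)?;
        if !slot.is::<Tombstone>() {
            return Err(TableError::NotTaken);
        }
        let value: Box<dyn Any + Send> = value;
        *slot = value;
        Ok(())
    }
}
```

Note that `restore` in this sketch cannot verify that the restored type matches the one originally taken; that check is left to the caller.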

badeend · 2024-01-26T20:57:42Z

Thanks for the feedback.
I agree on the "panicky" point. I'll add an error type and remove the panics.

One thing that's not in this PR, but I assume will most likely be added at some point, are untyped take_any and restore_any variants. The drawback of reverting the Lease & SlotIdentity design is that the restore(_any) API becomes (even) easier to misuse, because the consumer can then restore any value at the index of a previously differently typed entry. I'm worried about the developer experience of this, as the corruption would happen silently and the place where they encounter the WrongType errors could be miles away from where the problem actually is.

Anyway, I'm fine with your suggestions. I just wanted to make sure the trade-offs are known.

alexcrichton · 2024-01-26T21:25:26Z

Hm ok, for that I think you have a good point about accidentally messing up these APIs. I think this may still be surmountable perhaps with some trickery, but I'd also need to see the usage of take_any and restore to know better. Want to discuss on a future PR with that implemented or hash it out here? (I'm fine either way)

badeend · 2024-02-01T16:38:27Z

I chose to go with a hybrid approach. For the public API, I changed it to what you suggested. Internally, I removed Lease & SlotIdentity. But I kept Slot to perform the resource type check. Also, as part of the updated design (see above) I needed take_any and restore_any so I included those as well.

alexcrichton · 2024-02-05T17:12:57Z

Reading over this and seeing how this all turned out, I'm personally starting to get second thoughts on this. We're effectively reimplementing our own Future trait and, as I'm sure you've seen, we start implementing our own primitive functions (e.g. poll_fn) and we can't use standard things like async fn or #[async_trait]. I'm a bit worried that the direction this is taking us is straying off the path of maintainability for async support as things get more advanced over time.

Now that's all easy to say but this PR is still solving a concrete problem which is letting implementations access resources while polling, so I don't think simply closing this PR is an option. That being said after having read over this I wonder if there's perhaps an alternative implementation route that we can take.

Originally when designing Future-the-trait we ran into this issue of situations wanting to pass more context along through the poll method but the context doesn't survive longer than a single call to poll. To do that we ended up creating task-local variables which are like thread locals but instead stick with a task. That doesn't solve the immediate problem at hand though since you want mutable access, not just readable access.

To solve the mutability problem I realized that the take/restore bits look like Option<T> and so they've already got runtime state associated with them. One alternative would be to use a RefCell<T> instead and effectively repurpose that runtime state. That would enable acquiring &mut T from &ResourceTable so long as it's done "correctly" which is basically already the situation we have today (make sure you restore after you take).

How would you feel about something like that? We could still preserve get_mut as a method which has no runtime overhead (apart from storage space) but I'm imagining that a borrow_mut() method would be added. I'll note that get would have to go away in this world and be replaced with borrow() as a consequence, which likely affects code we have today.
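
A rough sketch of what such a table could look like, assuming each entry is wrapped in a `RefCell`. `CellTable` and `BorrowError` are hypothetical names, not wasmtime types. `get_mut` needs no runtime check because `&mut self` already proves exclusive access, while `borrow_mut` works through `&self` and reports a conflicting borrow as an error instead of panicking.

```rust
use std::any::Any;
use std::cell::{RefCell, RefMut};

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum BorrowError {
    NotPresent,
    WrongType,
    AlreadyBorrowed,
}

#[derive(Default)]
pub struct CellTable {
    entries: Vec<RefCell<Box<dyn Any>>>,
}

impl CellTable {
    pub fn push<T: Any>(&mut self, value: T) -> usize {
        let boxed: Box<dyn Any> = Box::new(value);
        self.entries.push(RefCell::new(boxed));
        self.entries.len() - 1
    }

    /// Statically checked access: bypasses the RefCell's runtime borrow flag.
    pub fn get_mut<T: Any>(&mut self, index: usize) -> Result<&mut T, BorrowError> {
        self.entries
            .get_mut(index)
            .ok_or(BorrowError::NotPresent)?
            .get_mut()
            .downcast_mut::<T>()
            .ok_or(BorrowError::WrongType)
    }

    /// Dynamically checked access: `&mut T` obtained from `&self`.
    pub fn borrow_mut<T: Any>(&self, index: usize) -> Result<RefMut<'_, T>, BorrowError> {
        let cell = self.entries.get(index).ok_or(BorrowError::NotPresent)?;
        let guard = cell
            .try_borrow_mut()
            .map_err(|_| BorrowError::AlreadyBorrowed)?;
        // Narrow the guard from the boxed `dyn Any` to the concrete type.
        RefMut::filter_map(guard, |boxed| boxed.downcast_mut::<T>())
            .map_err(|_| BorrowError::WrongType)
    }
}
```

Holding one `RefMut` while requesting another for the same entry yields `AlreadyBorrowed`, which is the runtime analogue of forgetting to restore after a take.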

While RefCell is unlikely to win any award for being the most ergonomic thing in the world this feels like it might provide a better tradeoff because we wouldn't fall off the well-trodden-path of Rust async into custom traits and such. I would want to make sure it works for your use case though.

I also realize though that you've probably already put in a great deal of work to this PR with 2 versions now so I'm hesitant to ask for a third. I'd be happy to help sketch this out and do some of the refactoring work to see if I feel like it's going to pay off.

badeend · 2024-02-05T22:57:32Z

I think you mean changing

async fn ready(&mut self);

to

async fn ready(&mut self, table: &ResourceTable);

right?

That doesn't work because &ResourceTable is not Send, as ResourceTable is not Sync.
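
The underlying rule is that `&T` is `Send` exactly when `T` is `Sync`. It can be checked at compile time with marker-bound helpers; `assert_send`, `assert_sync` and `check_bounds` are hypothetical names for this sketch, and `RefCell<u32>` and `RwLock<u32>` stand in for a non-`Sync` and a `Sync` table respectively.

```rust
use std::cell::RefCell;
use std::sync::RwLock;

/// Compiles only if `T: Send`.
fn assert_send<T: Send>() {}
/// Compiles only if `T: Sync`.
fn assert_sync<T: Sync>() {}

pub fn check_bounds() -> bool {
    // A Sync type can be shared by reference across threads,
    // so a reference to it is Send.
    assert_sync::<RwLock<u32>>();
    assert_send::<&RwLock<u32>>();

    // RefCell<u32> is Send (it can be moved to another thread) ...
    assert_send::<RefCell<u32>>();
    // ... but it is not Sync, so the next line would fail to compile:
    // assert_send::<&RefCell<u32>>();
    true
}
```

A future returned by an `async fn` that captures a `&T` argument inherits this restriction, which is why the extra parameter conflicts with a `Send` bound on the future.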

alexcrichton · 2024-02-09T11:13:29Z

Good point, yes, I'm more-or-less saying we should do that. (either that or use a task-local but I think that still captures &T).

Mind trying to make ResourceTable implement Sync? I think that's probably the addition of a few trait bounds in its internal trait objects. I think everything we put in there is already Sync, although if things aren't currently Sync that'll pose a larger problem.

badeend · 2024-02-11T13:23:03Z

I understand your concerns, yet I'd rather not go for round three right now, which would include reverting #7802. So instead, I've changed the questionable types to be private to the poll.rs module. That way, all the iffy-ness is contained to just a single file that we can iterate on later. From the outside nothing significant has changed, except that now I can use poll_ready_fn, which is what I personally was after.

Hope that's OK for you.

alexcrichton · 2024-02-13T21:02:51Z

Wanted to say I have not forgotten about this, I have been looking for time to write up something longer-form, which I hope to get to by tomorrow. Is this blocking anything though that it would be prudent to land now rather than later? If so I think it's good to go as-is, but otherwise I'd like to take some more time to write up longer-form thoughts.

badeend · 2024-02-13T21:12:15Z

There's no immediate rush from my side, so feel free to take your time.

alexcrichton · 2024-02-14T18:57:05Z

Ok thanks again for your patience here, very much appreciated!

I've gotten some time to think and work on this. I was leaning towards merging this, but then I realized that I'd prefer to avoid a situation where we land this and then later revert most of it towards a different strategy. In that sense I wanted, time permitting, to take a moment and figure out if alternative strategies would work. I'm getting a growing sense of unease with this direction as it's more-or-less a custom Future trait and is something we'd ideally avoid.

So assuming that the main goal of this PR is to get access to ResourceTable during async fn ready, I originally suggested the borrow/borrow_mut idea above using RefCell. I tried implementing that and it turns out it doesn't work. With that design, ResourceTable contains RefCell and async fn ready closes over &ResourceTable (e.g. it's a new function argument). In such a situation the returned future, which must be Send, closes over &ResourceTable. That type is not Send because it requires ResourceTable: Sync, which is not satisfied with RefCell. So that cans the idea of using RefCell.

After talking a bit more with @pchickey, however, I'm growing more fond of the idea of using RwLock<T> here instead of RefCell<T>. Not for the actual blocking aspect but instead only for the "it's Sync" aspect. To that end I implemented this on a branch and got tests passing with it. The changes are:

Replace ResourceTable::get with ResourceTable::{borrow, borrow_mut} that return RwLock{Read,Write}Guard<T>.
Remove table methods returning Any
Add &ResourceTable as an argument to the ready async function.
Replace most usage of table().get() with table().get_mut() (avoids locks)
Use u32 indices in Pollable's make_future internals instead of Any
Rewrite headers in wasi-http to avoid needing Any by representing headers as Resource<Resource<hyper::HeaderMap>>

The major consequences of this decision, however, are:

ResourceTable::{borrow, borrow_mut} require atomic manipulations. No blocking, but it's atomics for something that's not contended 99.9% of the time.
The std::sync::RwLock type cannot be used because std::sync::RwLock{Read,Write}Guard is not Send. I temporarily added a tokio dependency to wasmtime-the-crate and used tokio::sync::RwLock instead. Long-term I would like to avoid a tokio dep in the wasmtime crate.
There are a few minor cleanups still to be done in terms of threading a few more errors through a few more places.
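
The shape of the lock-based table, including the "cannot borrow_mut twice" hazard, can be sketched as follows. To stay self-contained this sketch uses `std::sync::RwLock` with the non-blocking `try_read`/`try_write`, not the `tokio::sync::RwLock` mentioned above, so its guards are not `Send`. `LockTable` and `LockError` are hypothetical names, and entries are plain `u32` values for brevity.

```rust
use std::sync::{RwLock, RwLockReadGuard, RwLockWriteGuard, TryLockError, TryLockResult};

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum LockError {
    NotPresent,
    AlreadyBorrowed,
    Poisoned,
}

/// Converts a non-blocking lock attempt into this sketch's error type.
fn map_try<G>(result: TryLockResult<G>) -> Result<G, LockError> {
    result.map_err(|e| match e {
        TryLockError::WouldBlock => LockError::AlreadyBorrowed,
        TryLockError::Poisoned(_) => LockError::Poisoned,
    })
}

#[derive(Default)]
pub struct LockTable {
    entries: Vec<RwLock<u32>>,
}

impl LockTable {
    pub fn push(&mut self, value: u32) -> usize {
        self.entries.push(RwLock::new(value));
        self.entries.len() - 1
    }

    /// Shared access through `&self`; fails instead of blocking.
    pub fn borrow(&self, index: usize) -> Result<RwLockReadGuard<'_, u32>, LockError> {
        map_try(self.entries.get(index).ok_or(LockError::NotPresent)?.try_read())
    }

    /// Exclusive access through `&self`; a second overlapping call is an error.
    pub fn borrow_mut(&self, index: usize) -> Result<RwLockWriteGuard<'_, u32>, LockError> {
        map_try(self.entries.get(index).ok_or(LockError::NotPresent)?.try_write())
    }

    /// Lock-free path: `&mut self` already guarantees exclusivity.
    pub fn get_mut(&mut self, index: usize) -> Result<&mut u32, LockError> {
        self.entries
            .get_mut(index)
            .ok_or(LockError::NotPresent)?
            .get_mut()
            .map_err(|_| LockError::Poisoned)
    }
}
```

Because every entry is behind a `Sync` lock, the table itself is `Sync`, which is the property the `RefCell` variant lacked.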

Personally I'm inclined to take a route that looks like this, namely threading arguments through async fn rather than threading arguments through fn poll. This is a foundational change to how things work though, especially around a new footgun of not being able to borrow_mut twice. In that sense I'd like to get feedback along the lines of:

@pchickey does this all sound reasonable enough to you?
@badeend does this still solve your original use case, and if so what do you think about this approach vs the poll approach?

alexcrichton · 2024-02-14T18:57:46Z

also cc @elliottt since you've touched a lot of WASI internals and you probably want to take a look too

badeend added 15 commits January 18, 2024 22:24

Remove + Sync constraints from preview2 implementation

6b4d634

Only use mutable references in WasiView to guarantee unique access to…

a2de6ec

… Preview2 resources. Removed the _mut suffixes to align with WasiHttpView.

Always use Descriptor::Directory for directories, regardless of wheth…

1e766c2

…er they were preopened or opened using open_at. This fixes build errors regarding overlapping mutable lifetimes introduced in the previous commit.

Merge branch 'main' of https://github.com/bytecodealliance/wasmtime i…

75ce698

…nto no-sync2

Remove some more + Syncs

942769e

Remove one more

0b839b5

typo

16eaf07

Fix build errors on Rust <= 1.73. Code already compiled fine on >= 1.74

562f83d

ResourceTable take+restore

3ac45de

Rename Pollable -> PollableResource

b376c31

Remove ResourceTable::iter_entries. It was used only by the old `poll…

fcc3e87

…` implementation. And its is now superseded by take&restore

Eliminate the (now) unnecessary surrogate parent resource of clock po…

d9a9842

…llables

Forbid empty poll list.

537b3f6

Fixes: WebAssembly/wasi-io#67

Merge branch 'main' of https://github.com/bytecodealliance/wasmtime i…

59e379f

…nto pollable

badeend requested review from a team as code owners January 24, 2024 20:15

badeend requested review from fitzgen and removed request for a team January 24, 2024 20:15

badeend mentioned this pull request Jan 24, 2024

UDP override rylev/wasmtime#4

Merged

github-actions bot added wasi Issues pertaining to WASI wasmtime:api Related to the API of the `wasmtime` crate itself labels Jan 24, 2024

fitzgen requested review from sunfishcode and removed request for fitzgen January 24, 2024 22:01

pchickey self-requested a review January 24, 2024 22:51

pchickey reviewed Jan 25, 2024

pchickey requested review from alexcrichton and removed request for sunfishcode January 25, 2024 00:08

alexcrichton reviewed Jan 25, 2024

badeend added 3 commits January 25, 2024 08:55

Test for specific error

cf3d161

Simplify ready() and pending()

6a713a8

Typo

f0f5209

alexcrichton reviewed Jan 26, 2024

badeend added 7 commits January 27, 2024 12:15

Replace panics with errors

d6905b6

Remove Lease and SlotIdentity types.

53349a8

Add take_any & restore_any variants

8d54bd3

Redesign Pollable interface to not drop read() Futures in between pol…

f7d8c4c

…l calls.

Rename Subscribe -> PollableAsync. Because Pollable and PollableAsync…

96b79f0

… are now nearly the same.

PollableResource -> PollableHandle

bd67f42

Merge branch 'main' of https://github.com/bytecodealliance/wasmtime i…

3692f9d

…nto pollable

badeend added 4 commits February 11, 2024 11:58

Merge branch 'main' of https://github.com/bytecodealliance/wasmtime i…

45ce638

…nto pollable

Rename Pollable -> PollableInternal, PollableAsync -> Subscribe, Poll…

2130c43

…ableHandle -> Pollable

Make the internals actually internal.

6e12a8a

Update docs

742987b
