Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add WASM UDF implementation in Rust #11351

Merged
merged 10 commits into from Jan 9, 2023

Conversation

wmitros
Copy link
Contributor

@wmitros wmitros commented Aug 22, 2022

This series adds the implementation and usage of rust wasmtime bindings.

The WASM UDFs introduced by this patch are interruptable and use memory allocated using the seastar allocator.

This series includes #11102 (the first two commits) because #11102 required disabling wasm UDFs completely. This patch disables them in the middle of the series, and enables them again at the end.
After this patch, libwasmtime.a can be removed from the toolchain.
This patch also removes the workaround for ##9387 but it hasn't been tested with ARM yet - if the ARM test causes issues I'll revert this part of the change.

@wmitros
Copy link
Contributor Author

wmitros commented Aug 22, 2022

During the development of this patch I realized an issue that isn't linked to this patch, but to Rust as a whole - the behavior on OOM errors.
Stable Rust aborts() on such an error and does not provide methods that allow writing code with a fallible allocations - there are try_reserve methods for some Collections but this method is missing for e.g. the Box type, which is crucial for us. This could at least partially be helped by nightly Rust - the Box::try_new is available there, but I would still need to check if the remaining code can be made safe then

@wmitros wmitros force-pushed the rust-wasmtime branch 2 times, most recently from e7e8bf9 to d14861d Compare August 22, 2022 14:05
configure.py Outdated
@@ -1893,7 +1892,7 @@ def configure_abseil(build_dir, mode, mode_config):
pool = console
description = TEST {mode}
rule rust_lib.{mode}
command = CARGO_HOME=build/{mode}/rust/.cargo cargo build --release --manifest-path=rust/Cargo.toml --target-dir=build/{mode}/rust -p ${{pkg}}
command = CARGO_HOME=build/{mode}/.cargo RUSTFLAGS="--cfg=rustix_use_libc" cargo build --manifest-path=rust/Cargo.toml --target-dir=build/{mode} --profile=rust-{mode}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This RUSTFLAGS value fixes a weird issue where rustix - a rust crate that wasmtime depends on - returns 0 as the system page size when using linux_raw instead of libc as the backend for systemcalls

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case it helps, I believe this problem is fixed by bytecodealliance/rustix#385, which is included in rustix 0.35.9.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the info. I updated the dependencies and confirmed that everything works now without this flag

@scylladb-promoter
Copy link
Contributor

@scylladb-promoter
Copy link
Contributor

Copy link
Contributor

@psarna psarna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it! @piodul FYI, a fair portion of Rust code to review

@@ -9,3 +9,30 @@ inc = { path = "inc", version = "0.1.0" }

[lib]
crate-type = ["staticlib"]

[profile.rust-dev]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: maybe the prefix should indicate that this is scylla-specific - scylla-dev instead of rust-dev? rust- might suggest that it's some kind of an official standard, and it isn't.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rust- has a benefit that the directory that contains the build files also has rust in its name (currently it's $builddir/{mode}/rust-{mode}), but I think something like builddir/{}/rust/scylla-{mode} would be just as good. On the other hand, maybe a comment would suffice?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a comment would suffice for me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment added in the rebase

if let Ok(new_layout) = std::alloc::Layout::from_size_align(new_size, 64 * 1024) {
let new_ptr = match self.layout {
Some(layout) => unsafe { std::alloc::realloc(self.ptr, layout, new_layout.size()) },
None => unsafe { std::alloc::alloc(new_layout) },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to confirm - is this whole thing done simply to force wasm to use host malloc/free for allocations, so that it gets automatically handled by the Seastar allocator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, otherwise the memory will be allocated with mmap that we don't override in seastar

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, let's explicitly explain it in the comment, along with mentioning mmap. Technically, nothing guarantees that std::alloc doesn't switch to mmap one day

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(or, just use an unsafe malloc/free pair directly to be sure)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, it's an unsafe block anyway. In this case I strongly suggest you just forward-declare malloc, realloc, free, whatever's needed, and use them directly. Then we're 100% sure there won't be any mmap sneaking in.

And in case we have strict layout requirements, like a 64k page for wasm, perhaps instead of malloc we should just go with aligned_alloc (https://linux.die.net/man/3/aligned_alloc)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ref: https://github.com/rust-lang/rust/blob/c3605f8c8020dbbe8f0d1961c7b33c4c4b78ad0d/library/std/src/sys/unix/alloc.rs#L6-L24

with this implementation it uses aligned_alloc (or malloc for small allocations), which we properly override in Seastar indeed, and it will most likely stay that way due to the 64k alignment, because mmap doesn't exactly let you choose a multi-page alignment anyway. But in theory on on certain hugepage configurations it may be beneficial to use mmap over aligned_alloc, so in my opinion we should be prepared for that

Copy link
Contributor Author

@wmitros wmitros Aug 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I used aligned_alloc and free in the new rebase. I've also added a comment about mmap.
I removed the std::alloc::realloc because it should probably not be used here anyway, considering it doesn't keep the alignment

if (eptr) {
co_await coroutine::return_exception_ptr(eptr);
}
co_await coroutine::maybe_yield();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😌

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you do yield here. Why are the others different? Need comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't yield in other places mostly because they should finish quickly, the other cases are also not used in future functions so it would require quite a bit more changes to yield there - I'll add comments.

@wmitros wmitros requested a review from piodul August 23, 2022 12:53
@wmitros wmitros force-pushed the rust-wasmtime branch 2 times, most recently from eb8d1c0 to acdb049 Compare August 23, 2022 16:52
@scylladb-promoter
Copy link
Contributor

@wmitros wmitros force-pushed the rust-wasmtime branch 2 times, most recently from 4bc21e3 to 7185307 Compare August 29, 2022 11:50
@wmitros
Copy link
Contributor Author

wmitros commented Aug 29, 2022

I've rebased onto master, added a comment and removed some unused code.

@scylladb-promoter
Copy link
Contributor

@psarna
Copy link
Contributor

psarna commented Sep 5, 2022

@wmitros for starters, please also run cargo clippy (ref: https://github.com/rust-lang/rust-clippy#step-2-install-clippy) on the Rust source code, it will provide a few tips on how to make the code more idiomatic:

warning: explicit lifetimes given in parameter types where they could be elided (or replaced with `'_` if needed by type declaration)
  --> src/lib.rs:15:39
   |
15 |   #[cxx::bridge(namespace = "wasmtime")]
   |  _______________________________________^
16 | | mod ffi {
17 | |     enum ValKind {
18 | |         I32,
...  |
77 | |         type Fut<'a>;
78 | |         unsafe fn resume<'a>(self: &mut Fut<'a>) -> Result<bool>;
   | |________________________________________________________________^
   |
   = note: `#[warn(clippy::needless_lifetimes)]` on by default
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_lifetimes

warning: explicit lifetimes given in parameter types where they could be elided (or replaced with `'_` if needed by type declaration)
  --> src/lib.rs:78:9
   |
78 |         unsafe fn resume<'a>(self: &mut Fut<'a>) -> Result<bool>;
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_lifetimes

warning: explicit lifetimes given in parameter types where they could be elided (or replaced with `'_` if needed by type declaration)
  --> src/lib.rs:84:31
   |
84 |         ) -> Result<Box<Fut<'a>>>;
   |                               ^
   |
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_lifetimes

warning: this expression creates a reference which is immediately dereferenced by the compiler
   --> src/lib.rs:100:39
    |
100 |     let mut ctx = Context::from_waker(&waker);
    |                                       ^^^^^^ help: change this to: `waker`
    |
    = note: `#[warn(clippy::needless_borrow)]` on by default
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#needless_borrow

warning: the function `wasmtime::Store::new` doesn't need a mutable reference
   --> src/lib.rs:136:42
    |
136 |     let mut store = wasmtime::Store::new(&mut engine.wasmtime_engine, wasi);
    |                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: `#[warn(clippy::unnecessary_mut_passed)]` on by default
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_mut_passed

warning: `wasmtime_bindings` (lib) generated 5 warnings
    Finished dev [unoptimized + debuginfo] target(s) in 58.43s

"errno-dragonfly",
"libc",
"winapi",
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does rust have the equivalent of ccache that can store cached build results outside the build directory? So that rm -rf build; ninja will be fast?

I'm thinking of our builders which start from a clean directory each time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mozilla's sccache works with rust: https://github.com/mozilla/sccache . Disclaimer: I have never used it personally so I can't tell how well it works in practice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Let's start using it when we see rust compilation times having an impact.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The incremental option is supposed to help with rebuild times, but it does not help with the rm -rf build; ninja case and I only enabled it for the dev mode, as it might make the final binary less optimized.
I might just try this sccache because in the case of rm -rf build; ninja the rust compilation is very impactful (although this could have been expected, as rust is the only thing that is being rebuilt then)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried sccache and it helps a lot with the rm -rf build; ninja case. I don't expect we want to add it as one of our dependencies, but explain how to use it in some docs. I'll need to research it a bit more so maybe we can do this as a follow-up

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We install ccache into the dbuild environment so we may as well install sscscscscache. Certainly it could be in a separate pull request.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see it even supports C++, so if it's better in some way, we can replace ccache.

}
Poll::<Result<wasmtime::Instance>>::Ready(Err(e)) => {
return Err(anyhow!("Failed to instantiate module: {:?}", e));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does instantiate_async() return a future?

If it's in order to yield, then aren't we wasting the opportunity by looping? We should return to the caller and let them yield too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(instantiating an instance is less important than execution, since it's rarer, but still)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the documentation, we return a future here only to potentially yield when we "asynchronously invoke the wasm start function in case it calls any imported function which is an asynchronous host function". The only imports come from WASI, but I don't think they are used in the start function (actually I couldn't find the start function implementation - it is also possible that there is none).
If we also go with #11351 (comment) no imports will be async at all, so it won't be possible that this future yields in the middle

self.wasmtime_memory.size(&store.wasmtime_store)
}
fn grow(self: &mut Memory, store: &mut Store, delta: u64) -> Result<u64> {
self.wasmtime_memory.grow(&mut store.wasmtime_store, delta)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does wasmtime use to allocate memory?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Need to understand if it will be accounted for in seastar memory management)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from the WebAssembly Memory allocator that uses the seastar aligned_alloc, the rust code uses the seastar allocator as its global allocator (though currently this is an implementation detail that could possibly change), but wasmtime (the rust crate) still mostly uses mmap - in particular to allocate memory during module compilation.

Copy link
Contributor

@psarna psarna Sep 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I skimmed through wasmtime code and it indeed mmaps a lot. Some of it is actual file-backed mapping, but often it's used as glorified malloc with MAP_ANONYMOUS.

We could technically add an opt-in Seastar feature which makes extrernal mmap-based libraries more Seastar friendly:

  1. Override mmap to fall back to original behavior, except when MAP_ANONYMOUS is set, in which case we just call aligned_alloc with page size boundary, which effectively gives us pages. Then, we can apply requested protection with a separate mprotect call.
  2. munmap gets a little complicated, because I don't know any reliable way of checking if a page is file-backed or anonymous, except for examining /proc/self/maps output. But if we assume that the number of anonymous mappings is relatively small, perhaps adding a hashmap overhead is acceptable if it means that we regain the ability to allocate everything with a Seastar allocator. Then, we could just track our anonymous pages ourselves and override munmap behavior to a regular free if the page is tracked as "ours".

@avikivity Does it sound too obscure to you, or is it feasible? The main question is how many more mmap-ed libraries we're likely to integrate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't file-backed mmap() cause blocking memory accesses and stalls?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not just stalls, also OOM since we already allocated all memory for our own use.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overriding ::mmap is likely to end up with a lot of tears, it's too fundamental and too not designed to be overridden.

Can we patch wasmtime (upstream) not to use mmap? It seems like premature optimization. The system memory allocator will call mmap when the conditionals call for it.

Perhaps the issue is the need to make the memory executable later with mprotect() (typical jit operation is to first mmap with PROT_WRITE, then mprotect to PROT_EXEC.

If so, perhaps we can pass callbacks to wasmtime for mmap/mprotect/munmap so we can intercept them. The default callbacks will just call the system mmap/mprotect/munmap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely removing mmap from wasmtime seems difficult, but maybe it is not essential after all.
I inspected the wasmtime code and I found that the mmaps are used in:

  • the default Linear Memory allocator - which we have replaced
  • the Pooling Memory Allocator (where we preallocate all the memory for all the instances that we can later use) - which we do not use
  • in the copy-on-write implementation of the initialization of Linear Memories - which is only used in the before-mentioned Pooling Memory Allocator
  • for allocation of the stacks for asynchronous calls - this could probably be replaced with malloc
  • for allocation of the alternate signal stack - this could also probably be replaced with malloc
  • in compilation of a module from a file - which we currently do not use
  • for allocation of for the jit code to be later mprotected with PROT_EXEC - the mmaps are all with MAP_ANONYMOUS but I'm not sure we can help it (according to the mprotect manual mprotect() is unspecified if it is applied to a of memory that was not obtained via mmap())

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for allocation of for the jit code to be later mprotected with PROT_EXEC - the mmaps are all with MAP_ANONYMOUS but I'm not sure we can help it (according to the mprotect manual mprotect() is unspecified if it is applied to a of memory that was not obtained via mmap())

All memory in the seastar allocator is obtained via mmap(). So we should be able to mprotect memory coming from it. But we have to be very careful to make sure the result of mprotect doesn't spill over to memory used by other objects and to remove it after not needed.

if new_size == 0 {
new_ptr = ptr::null_mut()
} else {
new_ptr = unsafe { aligned_alloc(WASM_PAGE_SIZE, new_size) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This answers my previous question.

aligned_alloc isn't defined if new_size isn't a multiple of WASM_PAGE_SIZE, is it guaranteed by the caller?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I look at our aligned_alloc overwrite it looks like we align the size up when necessary, but I see that in the linux specification it's as you described.
I'd expect the size here to be aligned, but I don't see it specified directly, so I think I'll just use memalign here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that memalign is deprecated

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(you could just align the new size yourself)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll align the size myself then

@@ -6,8 +6,6 @@
* SPDX-License-Identifier: AGPL-3.0-or-later
*/

#ifdef SCYLLA_ENABLE_WASMTIME

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check that the series continues building without wasmtime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasmtime is now downloaded as part of the compilation of the .rs files, so it should never be missing, but it may not work correctly in ARM (it didn't work when we used libwasmtime.a) - I'll try to check this soon. If it doesn't work with ARM I'll bring the SCYLLA_ENABLE_WASMTIME back

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait what? Didn't we add it to the toolchain?

In any case it may not be available on all architectures.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We did add Rust to the toolchain, but we didn't add all the crates that will be used in this patch.
Actually, we should probably discuss how do we want to deal with downloading the new crates.
Currently, the dependency will be downloaded if it's missing during the compilation (it won't be re-downloaded if it's already present).
If we don't want that, we could update the toolchain every time a dependency is changed.
We can also update the toolchain only periodically, and some dependencies could ever need to be downloaded during compilation.

I didn't really see benefits to the latter options because after the first build, there basically is no difference between them

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, where do we want to keep the downloaded crates? Cargo uses $HOME/.cargo by default.
To test if everything works with ARM this would help a lot

lang/wasm.cc Outdated
rets->push_i32(0);

auto fut = wasmtime::get_func_future(store, *malloc_func, *argv, *rets);
while(!fut->resume());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code style: space after while, it isn't a function.

Also a comment indicating that we need to yield here.

If we set need_thread, then it may be enough to call seastar::thread::maybe_yield() here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can modify the operator() in the init(_nullable)_arg_visitor so that it returns a future, then we can use the maybe_yield like when executing the UDF

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is just pushing arguments, maybe unneeded. But comments are definitely needed.

lang/wasm.cc Outdated
argv->push_i32((int32_t)val.i64());
auto rets = wasmtime::get_val_vec();
auto free_fut = wasmtime::get_func_future(store, *free_func, *argv, *rets);
while(!free_fut->resume());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

@avikivity
Copy link
Member

Looks good. Please add tests that prove we yield on a long loop.

configure.py Outdated
@@ -675,6 +675,14 @@ def find_headers(repodir, excluded_dirs):
'raft/log.cc',
Copy link
Contributor

@piodul piodul Sep 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in the first commit message: "in this pach" -> "in this patch"

configure.py Outdated
'rust/Cargo.lock',
'rust/inc/src/lib.rs',
'rust/inc/Cargo.toml',
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if the detection of the *.rs files were automatic. When working with a rust-only project, cargo usually detects dependent source files automatically and knows what and when to recompile. IMO we should consider it as an improvement in a followup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did also consider this but had a problem with incorporating it into the ninja build rules.
But now that I look at it, maybe we can simply add | always to the build rule for librust_combined.a and remove this list whatsoever - this doesn't cost us because if cargo determines no files changed, it will not compile them again
It even looks like this modification doesn't make rust build literally always, but only when there's a dependency on librust_combined.a


loop {
match inst_fut.as_mut().poll(&mut ctx) {
Poll::<Result<wasmtime::Instance>>::Pending => {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the turbofish needed here? It makes the line quite long and verbose. Would Poll::Pending work here instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works without it. I'll also remove it for the Poll::Ready case

[dependencies]
cxx = { version = "1.0.73", features = ["c++20"] }
wasmtime = { version = "0.40.0", features = ["async"] }
wasmtime-wasi = { version = "0.40.0", features = ["tokio"] }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need the tokio feature? We aren't running a tokio runtime, and we would rather avoid code which assumes running in the tokio runtime context (it will panic if it tries to access stuff from tokio).

Copy link
Contributor Author

@wmitros wmitros Sep 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tokio feature was the only alternative to the sync feature, so it looked like using it was the only option for the async wasi, but now that I think about it, we might not need it here and just use the sync WASI - I'll test if the yielding works without it

}

fn get_abi(instance: &Instance, store: &mut Store, memory: &Memory) -> Result<u32> {
match instance
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style suggestion: the function returns a Result<>, so you can use the ? operator and some methods on Option to make the code flatter. For example:

let export = instance
    .wasmtime_instance
    .get_export(&mut store.wasmtime_store, "_scylla_abi")
    .ok_or_else(|| anyhow!("ABI version export not found - please export `_scylla_abi` in the wasm module"))?;

let global = global
    .get(&mut store.wasmtime_store)
    .ok_or_else(|| anyhow!("Exported object _scylla_abi is not a global"))?;

// ...

Other functions would benefit from this conversion, too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also added some map_err in the same style in the rebase

@@ -361,7 +361,7 @@ database::database(const db::config& cfg, database_config dbcfg, service::migrat
, _feat(feat)
, _shared_token_metadata(stm)
, _sst_dir_semaphore(sst_dir_sem)
, _wasm_engine(std::make_unique<wasm::engine>())
, _wasm_engine(std::make_unique<rust::Box<wasmtime::Engine>>(wasmtime::create_engine()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't rust::Box something like a std::unique_ptr already? Would it be possible to just use rust::Box directly and remove the double indirection?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried using the rust::Box alone in the rebase and looks like it works for now

Comment on lines 197 to +205
"v128",
"externref",
"funcref",
"externref",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was the order changed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It matched the order of enum elements in lang/wasmtime.hh

scylladb/lang/wasmtime.hh

Lines 492 to 497 in 41b91e3

/// WebAssembly's `v128` type from the simd proposal
V128,
/// WebAssembly's `externref` type from the reference types
ExternRef,
/// WebAssembly's `funcref` type from the reference types
FuncRef,
and now it matches the order in the wasmtime crate https://docs.rs/wasmtime/latest/wasmtime/enum.Val.html

wasmtime_engine: wasmtime::Engine,
}

fn create_engine() -> Result<Box<Engine>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

am I right to assume that Result gets translated by cxx to produce a function that returns a T if the result is Ok, and throws an exception in case of Err?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scylladb-promoter
Copy link
Contributor

@nyh
Copy link
Contributor

nyh commented Jan 4, 2023

There seems to be a build error:

2023-01-04T17:57:21.586Z] FAILED: build/debug/main.o 
[2023-01-04T17:57:21.586Z] clang++ -MD -MT build/debug/main.o -MF build/debug/main.o.d -I/jenkins/workspace/releng/Scylla-CI/scylla/seastar/include -I/jenkins/workspace/releng/Scylla-CI/scylla/build/debug/seastar/gen/include -U_FORTIFY_SOURCE -DSEASTAR_SSTRING -Werror=unused-result -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -DSEASTAR_API_LEVEL=6 -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_DEBUG -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_TYPE_ERASE_MORE -DFMT_SHARED -DSEASTAR_BROKEN_SOURCE_LOCATION -I/usr/include/p11-kit-1  -DDEBUG -DSANITIZE -DDEBUG_LSA_SANITIZER -DSCYLLA_ENABLE_ERROR_INJECTION -Og -DSCYLLA_BUILD_MODE=debug -g -gz -iquote. -iquote build/debug/gen --std=gnu++20  -ffile-prefix-map=/jenkins/workspace/releng/Scylla-CI/scylla=.  -march=westmere -DBOOST_TEST_DYN_LINK   -DNOMINMAX -DNOMINMAX -fvisibility=hidden  -Wall -Werror -Wno-mismatched-tags -Wno-tautological-compare -Wno-parentheses-equality -Wno-c++11-narrowing -Wno-sometimes-uninitialized -Wno-return-stack-address -Wno-missing-braces -Wno-unused-lambda-capture -Wno-overflow -Wno-noexcept-type -Wno-error=cpp -Wno-ignored-attributes -Wno-overloaded-virtual -Wno-unused-command-line-argument -Wno-defaulted-function-deleted -Wno-redeclared-class-member -Wno-unsupported-friend -Wno-unused-variable -Wno-delete-non-abstract-non-virtual-dtor -Wno-braced-scalar-init -Wno-implicit-int-float-conversion -Wno-delete-abstract-non-virtual-dtor -Wno-uninitialized-const-reference -Wno-psabi -Wno-narrowing -Wno-array-bounds -Wno-nonnull -Wno-uninitialized -Wno-error=deprecated-declarations -DXXH_PRIVATE_API -DSEASTAR_TESTING_MAIN -DFMT_DEPRECATED_OSTREAM -DHAVE_LZ4_COMPRESS_DEFAULT  -c -o build/debug/main.o main.cc
[2023-01-04T17:57:21.587Z] main.cc:463:13: error: no member named 'reserve_additional_memory' in 'seastar::app_template::config'
[2023-01-04T17:57:21.587Z]     app_cfg.reserve_additional_memory = 50 * 1024 * 1024;
[2023-01-04T17:57:21.587Z]     ~~~~~~~ ^
[2023-01-04T17:57:21.587Z] 1 error generated.

Weird, is this something you even changed in your patch?

@wmitros
Copy link
Contributor Author

wmitros commented Jan 4, 2023

Weird, is this something you even changed in your patch?

Yes, this is something I added to seastar and it became usable only recently, with the last seastar update.
Looks like I didn't use the latest master branch for the CI. It should build when I rebase

@scylladb-promoter
Copy link
Contributor

@piodul
Copy link
Contributor

piodul commented Jan 5, 2023

CI state FAILURE - https://jenkins.scylladb.com/job/releng/job/Scylla-CI/3707/

This time it seems to be a legitimate failure in the test_fib_called_on_null.

thread '<unnamed>' panicked at 'attempt to subtract with overflow', wasmtime_bindings/src/lib.rs:181:9
...
  12:         0x1262fba3 - wasmtime$cxxbridge1$Module$remove_user
                               at /jenkins/workspace/releng/Scylla-CI/scylla/rust/wasmtime_bindings/src/lib.rs:47:42
  13:         0x11892e86 - _ZN4wasm14instance_cache17remove_module_refERN8wasmtime6ModuleE
                               at ./lang/wasm_instance_cache.cc:240:12
...

Looks like the reference count for a module went below zero. In debug mode, Rust panics on under/overflow.

@wmitros
Copy link
Contributor Author

wmitros commented Jan 5, 2023

CI state FAILURE - https://jenkins.scylladb.com/job/releng/job/Scylla-CI/3707/

This time it seems to be a legitimate failure in the test_fib_called_on_null.

thread '<unnamed>' panicked at 'attempt to subtract with overflow', wasmtime_bindings/src/lib.rs:181:9
...
  12:         0x1262fba3 - wasmtime$cxxbridge1$Module$remove_user
                               at /jenkins/workspace/releng/Scylla-CI/scylla/rust/wasmtime_bindings/src/lib.rs:47:42
  13:         0x11892e86 - _ZN4wasm14instance_cache17remove_module_refERN8wasmtime6ModuleE
                               at ./lang/wasm_instance_cache.cc:240:12
...

Looks like the reference count for a module went below zero. In debug mode, Rust panics on under/overflow.

Yes, I found that while adding the "handle" for modules there was one place where I didn't remove the manual reference counting. I removed it and now the references are counted only in the handle constructor/destructor.

Copy link
Contributor

@piodul piodul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last suggestion from my side, otherwise LGTM.

Comment on lines +133 to +162
// Wasmtime instances hold references to modules, so the module can only be dropped
// when all instances are dropped. For a given module, we can have at most one
// instance for each scheduling group.
// This function is called each time a new instance is created for a given module.
// If there were no instances for the module before, i.e. this module was not
// compiled, the module is compiled and the size of the compiled code is added
// to the total size of compiled code. If the total size of compiled code exceeds
// the maximum size as a result of this, the function will evict modules until
// there is enough space for the new module. If it is not possible, the function
// will throw an exception. If this function succeeds, the counter of instances
// for the module is increased by one.
void track_module_ref(wasmtime::Module& module, wasmtime::Engine& engine);

// This function is called each time an instance for a given module is dropped.
// If the counter of instances for the module reaches zero, the module is dropped
// and the size of the compiled code is subtracted from the total size of compiled code.
void remove_module_ref(wasmtime::Module& module) noexcept;

// When a WASM UDF is executed, a separate stack is first allocated for it.
// This stack is used by the WASM code and it is not tracked by the seastar allocator.
// This function will evict cached modules until the stack can be allocated. If enough
// memory can't be freed, the function will throw an exception.
void reserve_wasm_stack();

// This function should be called after a WASM UDF finishes execution. Its stack is then
// destroyed and this function accounts for the freed memory.
void free_wasm_stack() noexcept;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Is there a reason why those methods are public and not private? They seem to be internal to how the instance cached works. The methods track_module_ref/remove_module_ref now should probably be exclusively used by module_handle, so they could also be made private and module_handle could become a friend of instance_cache.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, added in rebase

@scylladb-promoter
Copy link
Contributor

@scylladb-promoter
Copy link
Contributor

avikivity and others added 10 commits January 6, 2023 14:05
Rust's cargo caches downloaded sources in ~/.cargo. However dbuild
won't provide access to this directory since it's outside the source
directory.

Prepare for sharing the cargo cache between the host and the dbuild
environment by:
 - Creating the cache if it doesn't already exist. This is likely if
   the user only builds in a dbuild environment.
 - Propagating the cache directory as a mounted volume.
 - Respecting the CARGO_HOME override.
Currently, the rust build system in Scylla creates a separate
static library for each incuded rust package. This could cause
duplicate symbol issues when linking against multiple libraries
compiled from rust.

This issue is fixed in this patch by creating a single static library
to link against, which combines all rust packages implemented in
Scylla.

The Cargo.lock for the combined build is now tracked, so that all
users of the same scylla version also use the same versions of
imported rust modules.

Additionally, the rust package implementation and usage
docs are modified to be compatible with the build changes.

This patch also adds a new header file 'rust/cxx.hh' that contains
definitions of additional rust types available in c++.
A cargo profile is created for each of build modes: dev, debug,
sanitize, realease and coverage. The names of cargo profiles are
prefixed by "rust-" because cargo does not allow separate "dev"
and "debug" profiles.

The main difference between profiles are their optimization levels,
they correlate to the levels used in configure.py. The debug info
is stripped only in the dev mode, and only this mode uses
"incremental" compilation to speed it up.
The C++ bindings provided by wasmtime are lacking a crucial
capability: asynchronous execution of the wasm functions.
This forces us to stop the execution of the function after
a short time to prevent increasing the latency. Fortunately,
this feature is implemented in the native language
of Wasmtime - Rust. Support for Rust was recently added to
scylla, so we can implement the async bindings ourselves,
which is done in this patch.

The bindings expose all the objects necessary for creating
and calling wasm functions. The majority of code implemented
in Rust is a translation of code that was previously present
in C++.

Types exported from Rust are currently required to be defined
by the  same crate that contains the bridge using them, so
wasmtime types can't be exported directly. Instead, for each
class that was supposed to be exported, a wrapper type is
created, where its first member is the wasmtime class. Note
that the members are not visible from C++ anyway, the
difference only applies to Rust code.

Aside from wasmtime types and methods, two additional types
are exported with some associated methods.
- The first one is ValVec, which is a wrapper for a rust Vec
of wasmtime Vals. The underlying vector is required by
wasmtime methods for calling wasm functions. By having it
exported we avoid multiple conversions from a Val wrapper
to a wasmtime Val, as would be required if we exported a
rust Vec of Val wrappers (the rust Vec itself does not
require wrappers if the type it contains is already wrapped)
- The second one is Fut. This class represents an computation
tha may or may not be ready. We're currently using it
to control the execution of wasm functions from C++. This
class exposes one method: resume(), which returns a bool
that signals whether the computation is finished or not.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
This patch replaces all dependencies on the wasmtime
C++ bindings with our new ones.
The wasmtime.hh and wasm_engine.hh files are deleted.
The libwasmtime.a library is no longer required by
configure.py. The SCYLLA_ENABLE_WASMTIME macro is
removed and wasm udfs are now compiled by default
on all architectures.
In terms of implementation, most of code using
wasmtime was moved to the Rust source files. The
remaining code uses names from the new bindings
(which are mostly unchanged). Most of wasmtime objects
are now stored as a rust::Box<>, to make it compatible
with rust lifetime requirements.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
The new implementation for WASM UDFs allows executing the UDFs
in pieces. This commit adds a test asserting that the UDF is in fact
divided and that each of the execution segments takes no longer than
1ms.
Different users may require different limits for their UDFs. This
patch allows them to configure the size of their cache of wasm,
the maximum size of indivitual instances stored in the cache, the
time after which the instances are evicted, the fuel that all wasm
UDFs are allowed to consume before yielding (for the control of
latency), the fuel that wasm UDFs are allowed to consume in total
(to allow performing longer computations in the UDF without
detecting an infinite loop) and the hard limit of the size of UDFs
that are executed (to avoid large allocations)
The wasmtime runtime allocates memory for the executable code of
the WASM programs using mmap and not the seastar allocator. As
a result, the memory that Scylla actually uses becomes not only
the memory preallocated for the seastar allocator but the sum of
that and the memory allocated for executable codes by the WASM
runtime.
To keep limiting the memory used by Scylla, we measure how much
memory do the WASM programs use and if they use too much, compiled
WASM UDFs (modules) that are currently not in use are evicted to
make room.
To evict a module it is required to evict all instances of this
module (the underlying implementation of modules and instances uses
shared pointers to the executable code). For this reason, we add
reference counts to modules. Each instance using a module is a
reference. When an instance is destroyed, a reference is removed.
If all references to a module are removed, the executable code
for this module is deallocated.
The eviction of a module is actually acheved by eviction of all
its references. When we want to free memory for a new module we
repeatedly evict instances from the wasm_instance_cache using its
LRU strategy until some module loses all its instances. This
process may not succeed if the instances currently in use (so not
in the cache) use too much memory - in this case the query also
fails. Otherwise the new module is added to the tracking system.
This strategy may evict some instances unnecessarily, but evicting
modules should not happen frequently, and any more efficient
solution requires an even bigger intervention into the code.
The main source of big allocations in the WASM UDF implementation
is the WASM Linear Memory. We do not want Scylla to crash even if
a memory allocation for the WASM Memory fails, so we assert that
an exception is thrown instead.

The wasmtime runtime does not actually fail on an allocation failure
(assuming the memory allocator does not abort and returns nullptr
instead - which our seastar allocator does). What happens then
depends on the failed allocation handling of the code that was
compiled to WASM. If the original code threw an exception or aborted,
the resulting WASM code will trap. To make sure that we can handle
the trap, we need to allow wasmtime to handle SIGILL signals, because
that what is used to carry information about WASM traps.

The new test uses a special WASM Memory allocator that fails after
n allocations, and the allocations include both memory growth
instructions in WASM, as well as growing memory manually using the
wasmtime API.

Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
Before the changes intorducing the new wasmtime bindings we relied
on an downloaded static library libwasmtime.a. Now that the bindings
are introduced, we do not rely on it anymore, so all references to
it can be removed.
@wmitros
Copy link
Contributor Author

wmitros commented Jan 6, 2023

The last rebase only fixed a conflict with master.

Assuming I didn't mess up the last nitpick, we got a LGTM from @piodul, so can you take a (maybe final) look @avikivity?

@scylladb-promoter
Copy link
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants