Add WASM UDF implementation in Rust #11351
Conversation
During the development of this patch I noticed an issue that isn't tied to this patch itself, but to Rust as a whole: the behavior on OOM errors.
Force-pushed from e7e8bf9 to d14861d
configure.py
Outdated
@@ -1893,7 +1892,7 @@ def configure_abseil(build_dir, mode, mode_config):
pool = console
description = TEST {mode}
rule rust_lib.{mode}
command = CARGO_HOME=build/{mode}/rust/.cargo cargo build --release --manifest-path=rust/Cargo.toml --target-dir=build/{mode}/rust -p ${{pkg}}
command = CARGO_HOME=build/{mode}/.cargo RUSTFLAGS="--cfg=rustix_use_libc" cargo build --manifest-path=rust/Cargo.toml --target-dir=build/{mode} --profile=rust-{mode}
This RUSTFLAGS value fixes a weird issue where rustix - a rust crate that wasmtime depends on - returns 0 as the system page size when using linux_raw instead of libc as the backend for system calls.
In case it helps, I believe this problem is fixed by bytecodealliance/rustix#385, which is included in rustix 0.35.9.
Thanks for the info. I updated the dependencies and confirmed that everything now works without this flag.
CI state
Love it! @piodul FYI, a fair portion of Rust code to review
@@ -9,3 +9,30 @@ inc = { path = "inc", version = "0.1.0" }

[lib]
crate-type = ["staticlib"]

[profile.rust-dev]
Nitpick: maybe the prefix should indicate that this is scylla-specific - scylla-dev instead of rust-dev? rust- might suggest that it's some kind of official standard, and it isn't.
rust- has the benefit that the directory containing the build files also has rust in its name (currently it's $builddir/{mode}/rust-{mode}), but I think something like builddir/{}/rust/scylla-{mode} would be just as good. On the other hand, maybe a comment would suffice?
a comment would suffice for me
Comment added in the rebase
if let Ok(new_layout) = std::alloc::Layout::from_size_align(new_size, 64 * 1024) {
    let new_ptr = match self.layout {
        Some(layout) => unsafe { std::alloc::realloc(self.ptr, layout, new_layout.size()) },
        None => unsafe { std::alloc::alloc(new_layout) },
just to confirm - is this whole thing done simply to force wasm to use host malloc/free for allocations, so that it gets automatically handled by the Seastar allocator?
Yes, otherwise the memory will be allocated with mmap, which we don't override in seastar.
In this case, let's explicitly explain it in the comment, along with mentioning mmap. Technically, nothing guarantees that std::alloc doesn't switch to mmap one day.
(or, just use an unsafe malloc/free pair directly to be sure)
Oh wait, it's an unsafe block anyway. In this case I strongly suggest you just forward-declare malloc, realloc, free, whatever's needed, and use them directly. Then we're 100% sure there won't be any mmap sneaking in.
And in case we have strict layout requirements, like a 64k page for wasm, perhaps instead of malloc we should just go with aligned_alloc (https://linux.die.net/man/3/aligned_alloc)
With this implementation it uses aligned_alloc (or malloc for small allocations), which we properly override in Seastar indeed, and it will most likely stay that way due to the 64k alignment, because mmap doesn't exactly let you choose a multi-page alignment anyway. But in theory, on certain hugepage configurations it may be beneficial to use mmap over aligned_alloc, so in my opinion we should be prepared for that.
Ok, I used aligned_alloc and free in the new rebase. I've also added a comment about mmap.
I removed the std::alloc::realloc because it should probably not be used here anyway, considering it doesn't keep the alignment.
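The approach settled on above can be sketched roughly as follows (hypothetical names, not the PR's actual allocator code): forward-declare the C allocator and call it directly, so the memory is guaranteed to come from functions Seastar overrides, with no chance of std::alloc routing to mmap behind our backs.

```rust
use std::os::raw::c_void;

const WASM_PAGE_SIZE: usize = 64 * 1024;

// Forward-declare the host C allocator; calling it directly guarantees
// the allocations go through functions Seastar overrides.
extern "C" {
    fn aligned_alloc(alignment: usize, size: usize) -> *mut c_void;
    fn free(ptr: *mut c_void);
}

// Allocate a whole number of 64 KiB wasm pages. The size is a multiple
// of the alignment, as C11 aligned_alloc requires.
fn alloc_wasm_pages(pages: usize) -> *mut c_void {
    unsafe { aligned_alloc(WASM_PAGE_SIZE, pages * WASM_PAGE_SIZE) }
}

fn free_wasm_pages(ptr: *mut c_void) {
    unsafe { free(ptr) }
}
```

Note there is deliberately no realloc here: as observed above, realloc does not preserve alignment, so growing means allocate, copy, free.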
if (eptr) {
    co_await coroutine::return_exception_ptr(eptr);
}
co_await coroutine::maybe_yield();
😌
I see you do yield here. Why are the others different? Need comments.
I don't yield in other places mostly because they should finish quickly; the other cases are also not used in future-returning functions, so yielding there would require quite a few more changes - I'll add comments.
Force-pushed from eb8d1c0 to acdb049
CI state
Force-pushed from 4bc21e3 to 7185307
I've rebased onto master, added a comment and removed some unused code.
CI state
@wmitros for starters, please also run
"errno-dragonfly",
"libc",
"winapi",
]
Does rust have the equivalent of ccache that can store cached build results outside the build directory, so that rm -rf build; ninja will be fast? I'm thinking of our builders, which start from a clean directory each time.
Mozilla's sccache works with rust: https://github.com/mozilla/sccache . Disclaimer: I have never used it personally so I can't tell how well it works in practice.
Ok. Let's start using it when we see rust compilation times having an impact.
The incremental option is supposed to help with rebuild times, but it does not help with the rm -rf build; ninja case, and I only enabled it for the dev mode, as it might make the final binary less optimized.
I might just try sccache, because in the rm -rf build; ninja case the rust compilation is very impactful (although this could have been expected, as rust is the only thing being rebuilt then).
I tried sccache and it helps a lot with the rm -rf build; ninja case. I don't expect we want to add it as one of our dependencies, but rather explain how to use it in some docs. I'll need to research it a bit more, so maybe we can do this as a follow-up.
We install ccache into the dbuild environment so we may as well install sccache. Certainly it could be in a separate pull request.
I see it even supports C++, so if it's better in some way, we can replace ccache.
}
Poll::<Result<wasmtime::Instance>>::Ready(Err(e)) => {
    return Err(anyhow!("Failed to instantiate module: {:?}", e));
}
Why does instantiate_async() return a future?
If it's in order to yield, then aren't we wasting the opportunity by looping? We should return to the caller and let them yield too.
(instantiating an instance is less important than execution, since it's rarer, but still)
According to the documentation, we return a future here only to potentially yield when we "asynchronously invoke the wasm start function in case it calls any imported function which is an asynchronous host function". The only imports come from WASI, but I don't think they are used in the start function (actually I couldn't find the start function implementation - it is also possible that there is none).
If we also go with #11351 (comment) no imports will be async at all, so it won't be possible that this future yields in the middle
self.wasmtime_memory.size(&store.wasmtime_store)
}
fn grow(self: &mut Memory, store: &mut Store, delta: u64) -> Result<u64> {
    self.wasmtime_memory.grow(&mut store.wasmtime_store, delta)
What does wasmtime use to allocate memory?
(Need to understand if it will be accounted for in seastar memory management)
Aside from the WebAssembly Memory allocator that uses the seastar aligned_alloc, the rust code uses the seastar allocator as its global allocator (though currently this is an implementation detail that could possibly change), but wasmtime (the rust crate) still mostly uses mmap - in particular to allocate memory during module compilation.
I skimmed through wasmtime code and it indeed mmaps a lot. Some of it is actual file-backed mapping, but often it's used as a glorified malloc with MAP_ANONYMOUS.
We could technically add an opt-in Seastar feature which makes external mmap-based libraries more Seastar friendly:
- Override mmap to fall back to the original behavior, except when MAP_ANONYMOUS is set, in which case we just call aligned_alloc with a page size boundary, which effectively gives us pages. Then, we can apply the requested protection with a separate mprotect call.
- munmap gets a little complicated, because I don't know any reliable way of checking if a page is file-backed or anonymous, except for examining /proc/self/maps output. But if we assume that the number of anonymous mappings is relatively small, perhaps adding a hashmap overhead is acceptable if it means that we regain the ability to allocate everything with the Seastar allocator. Then, we could just track our anonymous pages ourselves and override the munmap behavior to a regular free if the page is tracked as "ours".

@avikivity Does it sound too obscure to you, or is it feasible? The main question is how many more mmap-ed libraries we're likely to integrate.
Won't file-backed mmap() cause blocking memory accesses and stalls?
Not just stalls, also OOM since we already allocated all memory for our own use.
Overriding ::mmap is likely to end up with a lot of tears; it's too fundamental and not designed to be overridden.
Can we patch wasmtime (upstream) not to use mmap? It seems like premature optimization. The system memory allocator will call mmap when the conditionals call for it.
Perhaps the issue is the need to make the memory executable later with mprotect() (the typical jit operation is to first mmap with PROT_WRITE, then mprotect to PROT_EXEC).
If so, perhaps we can pass callbacks to wasmtime for mmap/mprotect/munmap so we can intercept them. The default callbacks would just call the system mmap/mprotect/munmap.
Completely removing mmap from wasmtime seems difficult, but maybe it is not essential after all.
I inspected the wasmtime code and found that the mmaps are used in:
- the default Linear Memory allocator - which we have replaced
- the Pooling Memory Allocator (where we preallocate all the memory for all the instances that we can later use) - which we do not use
- the copy-on-write implementation of the initialization of Linear Memories - which is only used in the aforementioned Pooling Memory Allocator
- the allocation of the stacks for asynchronous calls - this could probably be replaced with malloc
- the allocation of the alternate signal stack - this could also probably be replaced with malloc
- the compilation of a module from a file - which we currently do not use
- the allocation of the jit code to be later mprotected with PROT_EXEC - these mmaps are all with MAP_ANONYMOUS but I'm not sure we can help it (according to the mprotect manual, the behavior of mprotect() is unspecified if it is applied to a region of memory that was not obtained via mmap())
the allocation of the jit code to be later mprotected with PROT_EXEC - these mmaps are all with MAP_ANONYMOUS but I'm not sure we can help it (according to the mprotect manual, the behavior of mprotect() is unspecified if it is applied to a region of memory that was not obtained via mmap())
All memory in the seastar allocator is obtained via mmap(). So we should be able to mprotect memory coming from it. But we have to be very careful to make sure the result of mprotect doesn't spill over to memory used by other objects, and to remove it when no longer needed.
if new_size == 0 {
    new_ptr = ptr::null_mut()
} else {
    new_ptr = unsafe { aligned_alloc(WASM_PAGE_SIZE, new_size) };
This answers my previous question.
aligned_alloc isn't defined if new_size isn't a multiple of WASM_PAGE_SIZE, is it guaranteed by the caller?
When I look at our aligned_alloc override, it looks like we align the size up when necessary, but I see that in the linux specification it's as you described.
I'd expect the size here to be aligned, but I don't see it specified directly, so I think I'll just use memalign here.
note that memalign is deprecated
(you could just align the new size yourself)
I'll align the size myself then
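Rounding the size up by hand is a one-liner for power-of-two alignments; a sketch (`align_up` is an illustrative name, not code from the PR):

```rust
const WASM_PAGE_SIZE: usize = 64 * 1024;

// Round `size` up to the next multiple of `align` (a power of two), so
// the value passed to aligned_alloc satisfies the C11 requirement that
// the requested size be a multiple of the alignment.
fn align_up(size: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    (size + align - 1) & !(align - 1)
}
```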
@@ -6,8 +6,6 @@
 * SPDX-License-Identifier: AGPL-3.0-or-later
 */

#ifdef SCYLLA_ENABLE_WASMTIME
Please check that the series continues building without wasmtime.
Wasmtime is now downloaded as part of the compilation of the .rs files, so it should never be missing, but it may not work correctly on ARM (it didn't work when we used libwasmtime.a) - I'll try to check this soon. If it doesn't work on ARM I'll bring the SCYLLA_ENABLE_WASMTIME back.
Wait what? Didn't we add it to the toolchain?
In any case it may not be available on all architectures.
We did add Rust to the toolchain, but we didn't add all the crates that will be used in this patch.
Actually, we should probably discuss how we want to deal with downloading the new crates.
Currently, a dependency will be downloaded if it's missing during the compilation (it won't be re-downloaded if it's already present).
If we don't want that, we could update the toolchain every time a dependency is changed.
We could also update the toolchain only periodically, and some dependencies might still need to be downloaded during compilation.
I didn't really see benefits to the latter options, because after the first build there is basically no difference between them.
Actually, where do we want to keep the downloaded crates? Cargo uses $HOME/.cargo by default.
To test if everything works with ARM this would help a lot.
lang/wasm.cc
Outdated
rets->push_i32(0);

auto fut = wasmtime::get_func_future(store, *malloc_func, *argv, *rets);
while(!fut->resume());
Code style: space after while, it isn't a function.
Also a comment indicating that we need to yield here.
If we set need_thread, then it may be enough to call seastar::thread::maybe_yield() here.
I can modify the operator() in the init(_nullable)_arg_visitor so that it returns a future; then we can use the maybe_yield like when executing the UDF.
If this is just pushing arguments, maybe unneeded. But comments are definitely needed.
lang/wasm.cc
Outdated
argv->push_i32((int32_t)val.i64());
auto rets = wasmtime::get_val_vec();
auto free_fut = wasmtime::get_func_future(store, *free_func, *argv, *rets);
while(!free_fut->resume());
Ditto.
Looks good. Please add tests that prove we yield on a long loop.
configure.py
Outdated
@@ -675,6 +675,14 @@ def find_headers(repodir, excluded_dirs):
'raft/log.cc',
Typo in the first commit message: "in this pach" -> "in this patch"
configure.py
Outdated
'rust/Cargo.lock',
'rust/inc/src/lib.rs',
'rust/inc/Cargo.toml',
]
It would be nice if the detection of the *.rs files were automatic. When working with a rust-only project, cargo usually detects dependent source files automatically and knows what and when to recompile. IMO we should consider it as an improvement in a followup.
I did also consider this but had a problem with incorporating it into the ninja build rules.
But now that I look at it, maybe we can simply add | always to the build rule for librust_combined.a and remove this list altogether - this doesn't cost us anything, because if cargo determines no files changed, it will not compile them again.
It even looks like this modification doesn't make rust build literally always, but only when there's a dependency on librust_combined.a.
rust/wasmtime_bindings/src/lib.rs
Outdated
loop {
    match inst_fut.as_mut().poll(&mut ctx) {
        Poll::<Result<wasmtime::Instance>>::Pending => {}
Is the turbofish needed here? It makes the line quite long and verbose. Would Poll::Pending work here instead?
It works without it. I'll also remove it for the Poll::Ready case.
rust/wasmtime_bindings/Cargo.toml
Outdated
[dependencies]
cxx = { version = "1.0.73", features = ["c++20"] }
wasmtime = { version = "0.40.0", features = ["async"] }
wasmtime-wasi = { version = "0.40.0", features = ["tokio"] }
Why do we need the tokio feature? We aren't running a tokio runtime, and we would rather avoid code which assumes running in the tokio runtime context (it will panic if it tries to access stuff from tokio).
The tokio feature was the only alternative to the sync feature, so it looked like using it was the only option for the async wasi, but now that I think about it, we might not need it here and can just use the sync WASI - I'll test if the yielding works without it.
rust/wasmtime_bindings/src/lib.rs
Outdated
}

fn get_abi(instance: &Instance, store: &mut Store, memory: &Memory) -> Result<u32> {
    match instance
Style suggestion: the function returns a Result<>, so you can use the ? operator and some methods on Option to make the code flatter. For example:
let export = instance
    .wasmtime_instance
    .get_export(&mut store.wasmtime_store, "_scylla_abi")
    .ok_or_else(|| anyhow!("ABI version export not found - please export `_scylla_abi` in the wasm module"))?;
let global = global
    .get(&mut store.wasmtime_store)
    .ok_or_else(|| anyhow!("Exported object _scylla_abi is not a global"))?;
// ...
Other functions would benefit from this conversion, too.
I also added some map_err in the same style in the rebase.
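The flattening suggested above generalizes to any chain of Option/Result steps: convert each with ok_or_else/map_err and bail out early with ?. A standalone sketch (hypothetical types standing in for the wasmtime objects, and String errors standing in for anyhow):

```rust
use std::collections::HashMap;

// Stand-in for the instance's export table; in the real bindings this
// would be a wasmtime lookup returning typed export objects.
fn get_abi(exports: &HashMap<String, String>) -> Result<u32, String> {
    // Each fallible step converts its Option/Result into an error and
    // short-circuits with `?`, keeping the happy path flat.
    let raw = exports
        .get("_scylla_abi")
        .ok_or_else(|| {
            "ABI version export not found - please export `_scylla_abi` in the wasm module"
                .to_string()
        })?;
    raw.parse::<u32>()
        .map_err(|e| format!("Exported object _scylla_abi is not a number: {e}"))
}
```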
replica/database.cc
Outdated
@@ -361,7 +361,7 @@ database::database(const db::config& cfg, database_config dbcfg, service::migrat
, _feat(feat)
, _shared_token_metadata(stm)
, _sst_dir_semaphore(sst_dir_sem)
, _wasm_engine(std::make_unique<wasm::engine>())
, _wasm_engine(std::make_unique<rust::Box<wasmtime::Engine>>(wasmtime::create_engine()))
Isn't rust::Box something like a std::unique_ptr already? Would it be possible to just use rust::Box directly and remove the double indirection?
I tried using the rust::Box alone in the rebase, and it looks like it works for now.
"v128",
"externref",
"funcref",
"externref",
Why was the order changed here?
It matched the order of enum elements in lang/wasmtime.hh
Lines 492 to 497 in 41b91e3
/// WebAssembly's `v128` type from the simd proposal
V128,
/// WebAssembly's `externref` type from the reference types
ExternRef,
/// WebAssembly's `funcref` type from the reference types
FuncRef,
rust/wasmtime_bindings/src/lib.rs
Outdated
wasmtime_engine: wasmtime::Engine,
}

fn create_engine() -> Result<Box<Engine>> {
am I right to assume that Result gets translated by cxx to produce a function that returns a T if the result is Ok, and throws an exception in case of Err?
CI state
There seems to be a build error:
Weird, is this something you even changed in your patch?
Yes, this is something I added to seastar and it became usable only recently, with the last seastar update.
CI state
This time it seems to be a legitimate failure in the
Looks like the reference count for a module went below zero. In debug mode, Rust panics on under/overflow.
Yes, I found that while adding the "handle" for modules there was one place where I didn't remove the manual reference counting. I removed it and now the references are counted only in the handle constructor/destructor.
One last suggestion from my side, otherwise LGTM.
// Wasmtime instances hold references to modules, so the module can only be dropped
// when all instances are dropped. For a given module, we can have at most one
// instance for each scheduling group.
// This function is called each time a new instance is created for a given module.
// If there were no instances for the module before, i.e. this module was not
// compiled, the module is compiled and the size of the compiled code is added
// to the total size of compiled code. If the total size of compiled code exceeds
// the maximum size as a result of this, the function will evict modules until
// there is enough space for the new module. If it is not possible, the function
// will throw an exception. If this function succeeds, the counter of instances
// for the module is increased by one.
void track_module_ref(wasmtime::Module& module, wasmtime::Engine& engine);

// This function is called each time an instance for a given module is dropped.
// If the counter of instances for the module reaches zero, the module is dropped
// and the size of the compiled code is subtracted from the total size of compiled code.
void remove_module_ref(wasmtime::Module& module) noexcept;

// When a WASM UDF is executed, a separate stack is first allocated for it.
// This stack is used by the WASM code and it is not tracked by the seastar allocator.
// This function will evict cached modules until the stack can be allocated. If enough
// memory can't be freed, the function will throw an exception.
void reserve_wasm_stack();

// This function should be called after a WASM UDF finishes execution. Its stack is then
// destroyed and this function accounts for the freed memory.
void free_wasm_stack() noexcept;
Nit: Is there a reason why those methods are public and not private? They seem to be internal to how the instance cache works. The methods track_module_ref/remove_module_ref should now probably be used exclusively by module_handle, so they could also be made private and module_handle could become a friend of instance_cache.
Makes sense, added in rebase
CI state
Rust's cargo caches downloaded sources in ~/.cargo. However, dbuild won't provide access to this directory since it's outside the source directory. Prepare for sharing the cargo cache between the host and the dbuild environment by: - Creating the cache if it doesn't already exist. This is likely if the user only builds in a dbuild environment. - Propagating the cache directory as a mounted volume. - Respecting the CARGO_HOME override.
Currently, the rust build system in Scylla creates a separate static library for each included rust package. This could cause duplicate symbol issues when linking against multiple libraries compiled from rust. This issue is fixed in this patch by creating a single static library to link against, which combines all rust packages implemented in Scylla. The Cargo.lock for the combined build is now tracked, so that all users of the same scylla version also use the same versions of imported rust modules. Additionally, the rust package implementation and usage docs are modified to be compatible with the build changes. This patch also adds a new header file 'rust/cxx.hh' that contains definitions of additional rust types available in c++.
A cargo profile is created for each of the build modes: dev, debug, sanitize, release and coverage. The names of the cargo profiles are prefixed with "rust-" because cargo does not allow separate "dev" and "debug" profiles. The main difference between the profiles is their optimization levels, which correlate to the levels used in configure.py. The debug info is stripped only in the dev mode, and only this mode uses "incremental" compilation to speed it up.
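Such per-mode profiles could look roughly like the fragment below. This is a hypothetical sketch of the scheme described in the commit message; the `Cargo.toml` in the tree is authoritative for the actual names, opt-levels, and settings.

```toml
# Illustrative per-mode Cargo profiles; values are assumptions, not the
# patch's actual settings.
[profile.rust-dev]
inherits = "dev"
opt-level = 1
strip = "debuginfo"   # debug info stripped only in the dev mode
incremental = true    # only dev uses incremental compilation

[profile.rust-release]
inherits = "release"
opt-level = 3
incremental = false
```

Prefixing the names ("rust-dev" instead of "dev") sidesteps cargo's restriction that "dev" and "release" are built-in profiles with reserved semantics.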
The C++ bindings provided by wasmtime are lacking a crucial capability: asynchronous execution of the wasm functions. This forces us to stop the execution of the function after a short time to prevent increasing the latency. Fortunately, this feature is implemented in the native language of Wasmtime - Rust. Support for Rust was recently added to Scylla, so we can implement the async bindings ourselves, which is done in this patch. The bindings expose all the objects necessary for creating and calling wasm functions. The majority of code implemented in Rust is a translation of code that was previously present in C++. Types exported from Rust are currently required to be defined by the same crate that contains the bridge using them, so wasmtime types can't be exported directly. Instead, for each class that was supposed to be exported, a wrapper type is created, where its first member is the wasmtime class. Note that the members are not visible from C++ anyway, the difference only applies to Rust code. Aside from wasmtime types and methods, two additional types are exported with some associated methods. - The first one is ValVec, which is a wrapper for a rust Vec of wasmtime Vals. The underlying vector is required by wasmtime methods for calling wasm functions. By having it exported we avoid multiple conversions from a Val wrapper to a wasmtime Val, as would be required if we exported a rust Vec of Val wrappers (the rust Vec itself does not require wrappers if the type it contains is already wrapped) - The second one is Fut. This class represents a computation that may or may not be ready. We're currently using it to control the execution of wasm functions from C++. This class exposes one method: resume(), which returns a bool that signals whether the computation is finished or not. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
This patch replaces all dependencies on the wasmtime C++ bindings with our new ones. The wasmtime.hh and wasm_engine.hh files are deleted. The libwasmtime.a library is no longer required by configure.py. The SCYLLA_ENABLE_WASMTIME macro is removed and wasm udfs are now compiled by default on all architectures. In terms of implementation, most of the code using wasmtime was moved to the Rust source files. The remaining code uses names from the new bindings (which are mostly unchanged). Most wasmtime objects are now stored as a rust::Box<>, to make them compatible with rust lifetime requirements. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
The new implementation for WASM UDFs allows executing the UDFs in pieces. This commit adds a test asserting that the UDF is in fact divided and that each of the execution segments takes no longer than 1ms.
Different users may require different limits for their UDFs. This patch allows them to configure the size of their wasm cache, the maximum size of individual instances stored in the cache, the time after which the instances are evicted, the fuel that all wasm UDFs are allowed to consume before yielding (for the control of latency), the fuel that wasm UDFs are allowed to consume in total (to allow performing longer computations in the UDF without detecting an infinite loop), and the hard limit on the size of UDFs that are executed (to avoid large allocations).
The wasmtime runtime allocates memory for the executable code of the WASM programs using mmap and not the seastar allocator. As a result, the memory that Scylla actually uses becomes not only the memory preallocated for the seastar allocator but the sum of that and the memory allocated for executable code by the WASM runtime. To keep limiting the memory used by Scylla, we measure how much memory the WASM programs use, and if they use too much, compiled WASM UDFs (modules) that are currently not in use are evicted to make room. To evict a module it is required to evict all instances of this module (the underlying implementation of modules and instances uses shared pointers to the executable code). For this reason, we add reference counts to modules. Each instance using a module is a reference. When an instance is destroyed, a reference is removed. If all references to a module are removed, the executable code for this module is deallocated. The eviction of a module is actually achieved by eviction of all its references. When we want to free memory for a new module we repeatedly evict instances from the wasm_instance_cache using its LRU strategy until some module loses all its instances. This process may not succeed if the instances currently in use (so not in the cache) use too much memory - in this case the query also fails. Otherwise the new module is added to the tracking system. This strategy may evict some instances unnecessarily, but evicting modules should not happen frequently, and any more efficient solution requires an even bigger intervention into the code.
The main source of big allocations in the WASM UDF implementation is the WASM Linear Memory. We do not want Scylla to crash even if a memory allocation for the WASM Memory fails, so we assert that an exception is thrown instead. The wasmtime runtime does not actually fail on an allocation failure (assuming the memory allocator does not abort and returns nullptr instead - which our seastar allocator does). What happens then depends on the failed allocation handling of the code that was compiled to WASM. If the original code threw an exception or aborted, the resulting WASM code will trap. To make sure that we can handle the trap, we need to allow wasmtime to handle SIGILL signals, because that is what is used to carry information about WASM traps. The new test uses a special WASM Memory allocator that fails after n allocations, and the allocations include both memory growth instructions in WASM, as well as growing memory manually using the wasmtime API. Signed-off-by: Wojciech Mitros <wojciech.mitros@scylladb.com>
Before the changes introducing the new wasmtime bindings we relied on a downloaded static library, libwasmtime.a. Now that the bindings are introduced, we do not rely on it anymore, so all references to it can be removed.
The last rebase only fixed a conflict with master. Assuming I didn't mess up the last nitpick, we got an LGTM from @piodul, so can you take a (maybe final) look @avikivity? |
CI state
This series adds the implementation and usage of rust wasmtime bindings.
The WASM UDFs introduced by this patch are interruptable and use memory allocated using the seastar allocator.
This series includes #11102 (the first two commits) because #11102 required disabling wasm UDFs completely. This patch disables them in the middle of the series, and enables them again at the end.
After this patch, `libwasmtime.a` can be removed from the toolchain. This patch also removes the workaround for #9387, but it hasn't been tested with ARM yet - if the ARM test causes issues I'll revert this part of the change.