I was profiling my serde deserialization library and went down the rabbit hole of benchmarking different KV container storage types. I got down to Vec<(&'a [u8], &'a [u8])> as the fastest, and I wanted to see if my code would be faster with those string slices stored on the stack. So I tried SmallVec, and it was actually about 20% slower than Vec. Should this be expected? I wasn't counting on memory gains either way, but the deserialization slowdown doesn't make sense to me.
I extracted my benchmark to run inside of rust-smallvec:
#[macro_use]
extern crate smallvec;
extern crate test;
extern crate bincode;
extern crate serde;
#[macro_use]
extern crate serde_derive;

use self::test::Bencher;
use smallvec::SmallVec;

#[bench]
fn smallvec_i32_benchmark(b: &mut Bencher) {
    use self::bincode::{deserialize, serialize};
    let data = {
        let sv: SmallVec<[i32; 5]> = smallvec![1, 2, 3, 4, 5];
        serialize(&sv).unwrap()
    };
    b.iter(|| {
        // Return the result so the deserialization isn't optimized away.
        let sv: SmallVec<[i32; 5]> = deserialize(&data[..]).unwrap();
        sv
    });
}

#[bench]
fn vec_i32_benchmark(b: &mut Bencher) {
    use self::bincode::{deserialize, serialize};
    let data = {
        let v: Vec<i32> = vec![1, 2, 3, 4, 5];
        serialize(&v).unwrap()
    };
    b.iter(|| {
        let v: Vec<i32> = deserialize(&data[..]).unwrap();
        v
    });
}

#[bench]
fn smallvec_tuple_benchmark(b: &mut Bencher) {
    use self::bincode::{deserialize, serialize};
    let data = {
        let sv: SmallVec<[(&str, &str); 5]> = smallvec![
            ("hey", "now"),
            ("you're", "an"),
            ("all", "star"),
            ("get", "your"),
            ("game", "one"),
        ];
        serialize(&sv).unwrap()
    };
    b.iter(|| {
        let sv: SmallVec<[(&str, &str); 5]> = deserialize(&data[..]).unwrap();
        sv
    });
}

#[bench]
fn vec_tuple_benchmark(b: &mut Bencher) {
    use self::bincode::{deserialize, serialize};
    let data = {
        let v: Vec<(&str, &str)> = vec![
            ("hey", "now"),
            ("you're", "an"),
            ("all", "star"),
            ("get", "your"),
            ("game", "one"),
        ];
        serialize(&v).unwrap()
    };
    b.iter(|| {
        let v: Vec<(&str, &str)> = deserialize(&data[..]).unwrap();
        v
    });
}
On my laptop this spits out these numbers:
running 4 tests
test smallvec_i32_benchmark ... bench: 84 ns/iter (+/- 10)
test smallvec_tuple_benchmark ... bench: 222 ns/iter (+/- 23)
test vec_i32_benchmark ... bench: 45 ns/iter (+/- 4)
test vec_tuple_benchmark ... bench: 186 ns/iter (+/- 20)
test result: ok. 0 passed; 0 failed; 0 ignored; 4 measured; 0 filtered out
Thanks for the benchmark! May I add it to the smallvec repo?
SmallVec's deserialize implementation is almost identical to Vec's: it creates an empty vector with capacity based on the sequence's size_hint (or zero if the hint is not available), then pushes to it in a loop until the sequence is exhausted. The SmallVec version is presumably slower because its push has extra branches (inline buffer vs. heap spill) that aren't optimized out.
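For reference, the shared strategy looks roughly like this (a minimal sketch using Vec as a stand-in; `next_element` is a hypothetical substitute for serde's `SeqAccess::next_element`, not the real API):

```rust
// Sketch of the deserialize strategy both Vec and SmallVec use:
// reserve capacity from the size hint, then push in a loop.
fn visit_seq_sketch<T>(
    size_hint: Option<usize>,
    mut next_element: impl FnMut() -> Option<T>,
) -> Vec<T> {
    // Pre-allocate based on the hint, or start empty if none is given.
    let mut values = Vec::with_capacity(size_hint.unwrap_or(0));
    // Push until the sequence is exhausted. For SmallVec, each push also
    // branches on whether the buffer is still inline or has spilled to
    // the heap, which is the extra cost that may not be optimized out.
    while let Some(value) = next_element() {
        values.push(value);
    }
    values
}

fn main() {
    let mut it = vec![5, 4, 3, 2, 1].into_iter();
    let out = visit_seq_sketch(Some(5), || it.next());
    assert_eq!(out, vec![5, 4, 3, 2, 1]);
}
```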
In the case where size_hint is available, we can instead use unsafe code that writes to a raw pointer in a loop (similar to SmallVec::extend) to eliminate most of the performance difference.
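That optimization might look something like the sketch below (again using Vec and a hypothetical `next_element` for illustration, not SmallVec's actual internals). The per-iteration capacity check is what guards against a malformed size_hint that promises fewer elements than the sequence actually delivers:

```rust
// Sketch: write deserialized elements through a raw pointer when the
// size hint pre-allocated enough room, falling back to a safe push
// when the hint undercounted.
fn visit_seq_fast<T>(
    size_hint: usize,
    mut next_element: impl FnMut() -> Option<T>,
) -> Vec<T> {
    // Note: real code should cap a hostile, huge size_hint before
    // allocating; omitted here for brevity.
    let mut values: Vec<T> = Vec::with_capacity(size_hint);
    while let Some(value) = next_element() {
        let len = values.len();
        if len < values.capacity() {
            // Fast path: write directly and bump the length, skipping
            // push's grow check (and SmallVec's inline-vs-heap branch).
            unsafe {
                values.as_mut_ptr().add(len).write(value);
                values.set_len(len + 1);
            }
        } else {
            // The hint undercounted (e.g. malformed input): reallocate
            // via the safe push instead of writing out of bounds.
            values.push(value);
        }
    }
    values
}

fn main() {
    // Hint of 3 but 8 elements: exercises both the fast and slow paths.
    let mut it = 0..8;
    let out = visit_seq_fast(3, || it.next());
    assert_eq!(out, (0..8).collect::<Vec<_>>());
}
```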
Unfortunately this is no longer as fast as Vec after I fixed a bug where it could panic on malformed input. It's only a 5–10% improvement (still about 25–50% slower than Vec), and I'm not sure that's worth adding more unsafe code.