
Support fallible eq and hasher in raw API #456

Open

udoprog opened this issue Aug 23, 2023 · 8 comments


udoprog commented Aug 23, 2023

Hi!

I want to use the raw API of hashbrown as the underlying table implementation in Rune.

One aspect that has proven awkward is that the hasher I need to pass into RawTable::find_or_find_insert_slot must be infallible. That works well for a statically typed language, but less so in Rune, where I want to store and work with dynamically typed entries directly.

Note: I've implemented a prototype of this in my raw-infallible-context branch.

What I need to do is something like this:

let mut table: RawTable<(Value, Value)> = todo!();
let entry: (Value, Value) = todo!();

// NB: might fail if the key can't be hashed.
let hash = Value::hash(&entry.0)?;

// Comparison might fail.
let eq = |other: &(Value, Value)| -> Result<bool, MyError> { Value::eq(&entry.0, &other.0) };
// Hashing might fail.
let hasher = |key: &(Value, Value)| -> Result<u64, MyError> { Value::hash(&key.0) };

let result = table.find_or_find_insert_slot(hash, eq, hasher);
// deal with my custom result

The change needed would be fairly straightforward (I think) but a bit tedious: the API would be refactored into a pair of methods like this (ignore the naming):

use std::convert::Infallible;

#[inline(always)]
fn into_ok<T>(result: Result<T, Infallible>) -> T {
    match result {
        Ok(value) => value,
        Err(error) => match error {},
    }
}

#[inline(always)]
fn infallible_eq<T>(mut f: impl FnMut(&T) -> bool) -> impl FnMut(&T) -> Result<bool, Infallible> {
    move |value| Ok::<_, Infallible>(f(value))
}

#[inline(always)]
fn infallible_hasher<T>(f: impl Fn(&T) -> u64) -> impl Fn(&T) -> Result<u64, Infallible> {
    move |value| Ok::<_, Infallible>(f(value))
}

impl<T> RawTable<T> {
    /// Backwards compatibility
    #[inline(always)]
    pub fn find_or_find_insert_slot(
        &mut self,
        hash: u64,
        eq: impl FnMut(&T) -> bool,
        hasher: impl Fn(&T) -> u64,
    ) -> Result<Bucket<T>, InsertSlot> {
        into_ok(self.find_or_find_insert_slot2(hash, infallible_eq(eq), infallible_hasher(hasher)))
    }

    /// New method
    pub fn find_or_find_insert_slot2<E>(
        &mut self,
        hash: u64,
        mut eq: impl FnMut(&T) -> Result<bool, E>,
        hasher: impl Fn(&T) -> Result<u64, E>,
    ) -> Result<Result<Bucket<T>, InsertSlot>, E> {
        todo!()
    }
}

The signatures leave something to be desired, and such a change would have to be propagated through the rest of the API, the majority of which has proven to be internal. I would also hope that the compiler can easily prove the infallible variants are no-ops, but that is something that should probably be investigated.


udoprog commented Aug 23, 2023

Another feature I'd appreciate is being able to pass a mutable environment through to both closures, since I need to reference the virtual machine being used. This currently can't be done because eq and hasher are separate closures and can't both capture the same variables mutably, so I have to resort to thread locals.
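
For concreteness, here is a minimal sketch of that workaround; Vm and the plain u64 keys are hypothetical stand-ins for Rune's virtual machine and Value type:

use std::cell::RefCell;

use hashbrown::raw::RawTable;

// Hypothetical stand-in for Rune's virtual machine; equality and hashing need `&mut self`.
#[derive(Default)]
struct Vm;

impl Vm {
    fn value_eq(&mut self, a: u64, b: u64) -> bool {
        a == b
    }

    fn value_hash(&mut self, v: u64) -> u64 {
        v
    }
}

thread_local! {
    // The workaround: park the VM in a thread local so that both the `eq` and
    // `hasher` closures can reach it, since two closures can't both capture `&mut Vm`.
    static VM: RefCell<Vm> = RefCell::new(Vm::default());
}

fn insert(table: &mut RawTable<(u64, u64)>, entry: (u64, u64)) {
    let hash = VM.with(|vm| vm.borrow_mut().value_hash(entry.0));

    match table.find_or_find_insert_slot(
        hash,
        |other| VM.with(|vm| vm.borrow_mut().value_eq(entry.0, other.0)),
        |val| VM.with(|vm| vm.borrow_mut().value_hash(val.0)),
    ) {
        // The key already exists; overwrite the value.
        Ok(bucket) => unsafe { bucket.as_mut().1 = entry.1 },
        // The key is missing; insert into the slot the table reserved for it.
        Err(slot) => unsafe {
            table.insert_in_slot(hash, slot, entry);
        },
    }
}

With a context parameter this indirection would go away, since the table would simply hand the mutable environment back to both closures.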

This could be accomplished either by passing an environment into the closures, as sketched below, or by using a common trait for both equality checks and hashing (a sketch of that option follows as well).

impl<T> RawTable<T> {
    pub fn find_or_find_insert_slot2<C: ?Sized, E>(
        &mut self,
        ctx: &mut C,
        hash: u64,
        mut eq: impl FnMut(&mut C, &T) -> Result<bool, E>,
        hasher: impl Fn(&mut C, &T) -> Result<u64, E>,
    ) -> Result<Result<Bucket<T>, InsertSlot>, E> {
        todo!()
    }
}
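
And a sketch of the trait option in the same spirit (names hypothetical): a single object implements both equality and hashing, so only one mutable borrow is needed and errors can surface through an associated type.

trait EqHasher<T> {
    type Error;

    fn hash(&mut self, value: &T) -> Result<u64, Self::Error>;

    fn eq(&mut self, lhs: &T, rhs: &T) -> Result<bool, Self::Error>;
}

impl<T> RawTable<T> {
    pub fn find_or_find_insert_slot_using<H: EqHasher<T>>(
        &mut self,
        eq_hasher: &mut H,
        hash: u64,
        key: &T,
    ) -> Result<Result<Bucket<T>, InsertSlot>, H::Error> {
        todo!()
    }
}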

EDIT: Here are benchmarks and size comparisons for my branch, which includes a fallible eq and hasher as well as passing around a mutable context:

Benches without the changes
test clone_from_large            ... bench:       6,256 ns/iter (+/- 29)
test clone_from_small            ... bench:          65 ns/iter (+/- 0)
test clone_large                 ... bench:       6,509 ns/iter (+/- 40)
test clone_small                 ... bench:         111 ns/iter (+/- 0)
test grow_insert_ahash_highbits  ... bench:      20,852 ns/iter (+/- 154)
test grow_insert_ahash_random    ... bench:      20,862 ns/iter (+/- 159)
test grow_insert_ahash_serial    ... bench:      20,816 ns/iter (+/- 161)
test grow_insert_std_highbits    ... bench:      37,082 ns/iter (+/- 711)
test grow_insert_std_random      ... bench:      36,962 ns/iter (+/- 214)
test grow_insert_std_serial      ... bench:      36,871 ns/iter (+/- 169)
test insert_ahash_highbits       ... bench:      19,712 ns/iter (+/- 253)
test insert_ahash_random         ... bench:      19,947 ns/iter (+/- 84)
test insert_ahash_serial         ... bench:      20,242 ns/iter (+/- 91)
test insert_erase_ahash_highbits ... bench:      21,603 ns/iter (+/- 620)
test insert_erase_ahash_random   ... bench:      21,112 ns/iter (+/- 57)
test insert_erase_ahash_serial   ... bench:      21,103 ns/iter (+/- 85)
test insert_erase_std_highbits   ... bench:      34,403 ns/iter (+/- 273)
test insert_erase_std_random     ... bench:      35,150 ns/iter (+/- 228)
test insert_erase_std_serial     ... bench:      34,370 ns/iter (+/- 225)
test insert_std_highbits         ... bench:      25,299 ns/iter (+/- 294)
test insert_std_random           ... bench:      25,439 ns/iter (+/- 410)
test insert_std_serial           ... bench:      25,238 ns/iter (+/- 166)
test iter_ahash_highbits         ... bench:         812 ns/iter (+/- 5)
test iter_ahash_random           ... bench:         812 ns/iter (+/- 6)
test iter_ahash_serial           ... bench:         835 ns/iter (+/- 15)
test iter_std_highbits           ... bench:         811 ns/iter (+/- 1)
test iter_std_random             ... bench:         818 ns/iter (+/- 4)
test iter_std_serial             ... bench:         819 ns/iter (+/- 8)
test lookup_ahash_highbits       ... bench:       4,433 ns/iter (+/- 181)
test lookup_ahash_random         ... bench:       4,357 ns/iter (+/- 15)
test lookup_ahash_serial         ... bench:       4,072 ns/iter (+/- 70)
test lookup_fail_ahash_highbits  ... bench:       4,278 ns/iter (+/- 62)
test lookup_fail_ahash_random    ... bench:       4,236 ns/iter (+/- 11)
test lookup_fail_ahash_serial    ... bench:       3,914 ns/iter (+/- 14)
test lookup_fail_std_highbits    ... bench:      10,352 ns/iter (+/- 165)
test lookup_fail_std_random      ... bench:      10,337 ns/iter (+/- 24)
test lookup_fail_std_serial      ... bench:      10,248 ns/iter (+/- 46)
test lookup_std_highbits         ... bench:      10,319 ns/iter (+/- 48)
test lookup_std_random           ... bench:      10,570 ns/iter (+/- 61)
test lookup_std_serial           ... bench:      10,356 ns/iter (+/- 65)
test rehash_in_place             ... bench:     176,636 ns/iter (+/- 1,843)
test insert                      ... bench:       8,853 ns/iter (+/- 29)
test insert_unique_unchecked     ... bench:       6,084 ns/iter (+/- 41)
Benches with the changes
test clone_from_large            ... bench:       6,220 ns/iter (+/- 73)
test clone_from_small            ... bench:          66 ns/iter (+/- 0)
test clone_large                 ... bench:       6,504 ns/iter (+/- 74)
test clone_small                 ... bench:         112 ns/iter (+/- 1)
test grow_insert_ahash_highbits  ... bench:      20,686 ns/iter (+/- 323)
test grow_insert_ahash_random    ... bench:      20,665 ns/iter (+/- 166)
test grow_insert_ahash_serial    ... bench:      20,583 ns/iter (+/- 157)
test grow_insert_std_highbits    ... bench:      37,383 ns/iter (+/- 523)
test grow_insert_std_random      ... bench:      37,318 ns/iter (+/- 391)
test grow_insert_std_serial      ... bench:      37,240 ns/iter (+/- 499)
test insert_ahash_highbits       ... bench:      19,885 ns/iter (+/- 210)
test insert_ahash_random         ... bench:      19,944 ns/iter (+/- 131)
test insert_ahash_serial         ... bench:      20,343 ns/iter (+/- 132)
test insert_erase_ahash_highbits ... bench:      21,533 ns/iter (+/- 192)
test insert_erase_ahash_random   ... bench:      21,163 ns/iter (+/- 382)
test insert_erase_ahash_serial   ... bench:      21,108 ns/iter (+/- 477)
test insert_erase_std_highbits   ... bench:      34,268 ns/iter (+/- 989)
test insert_erase_std_random     ... bench:      35,408 ns/iter (+/- 1,840)
test insert_erase_std_serial     ... bench:      34,213 ns/iter (+/- 1,801)
test insert_std_highbits         ... bench:      25,093 ns/iter (+/- 141)
test insert_std_random           ... bench:      25,007 ns/iter (+/- 159)
test insert_std_serial           ... bench:      25,403 ns/iter (+/- 85)
test iter_ahash_highbits         ... bench:         812 ns/iter (+/- 4)
test iter_ahash_random           ... bench:         812 ns/iter (+/- 9)
test iter_ahash_serial           ... bench:         828 ns/iter (+/- 13)
test iter_std_highbits           ... bench:         812 ns/iter (+/- 4)
test iter_std_random             ... bench:         821 ns/iter (+/- 11)
test iter_std_serial             ... bench:         813 ns/iter (+/- 10)
test lookup_ahash_highbits       ... bench:       4,456 ns/iter (+/- 34)
test lookup_ahash_random         ... bench:       4,363 ns/iter (+/- 34)
test lookup_ahash_serial         ... bench:       4,074 ns/iter (+/- 32)
test lookup_fail_ahash_highbits  ... bench:       4,284 ns/iter (+/- 41)
test lookup_fail_ahash_random    ... bench:       4,235 ns/iter (+/- 14)
test lookup_fail_ahash_serial    ... bench:       3,914 ns/iter (+/- 28)
test lookup_fail_std_highbits    ... bench:      10,314 ns/iter (+/- 62)
test lookup_fail_std_random      ... bench:      10,411 ns/iter (+/- 218)
test lookup_fail_std_serial      ... bench:      10,241 ns/iter (+/- 21)
test lookup_std_highbits         ... bench:      10,371 ns/iter (+/- 48)
test lookup_std_random           ... bench:      10,575 ns/iter (+/- 86)
test lookup_std_serial           ... bench:      10,453 ns/iter (+/- 189)
test rehash_in_place             ... bench:     176,986 ns/iter (+/- 1,336)
test insert                      ... bench:       8,747 ns/iter (+/- 204)
test insert_unique_unchecked     ... bench:       5,987 ns/iter (+/- 41)
Binary size differences when applying my branch to the [gimli-rs/object](https://github.com/gimli-rs/object) repo and doing a release build
--- without.txt 2023-08-23 20:46:02.666016800 +0200
+++ with.txt    2023-08-23 20:46:11.031017000 +0200
@@ -1,28 +1,28 @@
-target/release/ar 4715408
+target/release/ar 4715472
 target/release/ar.d 3315
 target/release/build 4096
 target/release/deps 4096
-target/release/dyldcachedump 4723616
+target/release/dyldcachedump 4723664
 target/release/dyldcachedump.d 3337
-target/release/elfcopy 4888528
+target/release/elfcopy 4899232
 target/release/elfcopy.d 3325
-target/release/elftoefi 4739656
+target/release/elftoefi 4739704
 target/release/elftoefi.d 3327
 target/release/examples 4096
 target/release/incremental 4096
 target/release/libobject.d 2965
-target/release/libobject.rlib 11035614
+target/release/libobject.rlib 11051504
 target/release/libobject_examples.d 3287
-target/release/libobject_examples.rlib 3826960    
-target/release/nm 4867848
+target/release/libobject_examples.rlib 3828668    
+target/release/nm 4882424
 target/release/nm.d 3315
-target/release/objcopy 5067632
+target/release/objcopy 5077600
 target/release/objcopy.d 3325
-target/release/objdump 5046328
+target/release/objdump 5046240
 target/release/objdump.d 3325
-target/release/objectmap 4859896
+target/release/objectmap 4859768
 target/release/objectmap.d 3329
-target/release/pecopy 4724840
+target/release/pecopy 4724912
 target/release/pecopy.d 3323
-target/release/readobj 5334536
+target/release/readobj 5334816
 target/release/readobj.d 3325


Amanieu commented Aug 25, 2023

I don't think supporting a fallible hasher is worth adding this level of complexity to the API. In most cases this isn't needed, and you can always emulate it by returning a hash value of 0 and recording the error in an Option that you check after the hash table operation is complete.
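
For illustration, a minimal sketch of that emulation; fallible_hash, HashError, and the plain u64 keys are hypothetical stand-ins:

use std::cell::RefCell;

use hashbrown::raw::{Bucket, InsertSlot, RawTable};

#[derive(Debug)]
struct HashError;

// Hypothetical fallible hash; stands in for something like Rune's Value::hash.
fn fallible_hash(key: &u64) -> Result<u64, HashError> {
    Ok(*key)
}

// On failure the hasher records the error and returns a dummy hash of 0; the
// caller checks the recorded error once the table operation has returned.
fn find_or_slot(
    table: &mut RawTable<(u64, u64)>,
    hash: u64,
    key: u64,
) -> Result<Result<Bucket<(u64, u64)>, InsertSlot>, HashError> {
    let hash_error = RefCell::new(None);

    let result = table.find_or_find_insert_slot(
        hash,
        |other| other.0 == key,
        |val| match fallible_hash(&val.0) {
            Ok(h) => h,
            Err(e) => {
                // Keep the first error and return a dummy hash.
                hash_error.borrow_mut().get_or_insert(e);
                0
            }
        },
    );

    match hash_error.into_inner() {
        Some(e) => Err(e),
        None => Ok(result),
    }
}

Note that the table may already have grown or moved elements using the dummy hash by the time the error is observed (see the reproduction further down).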

However, I do think there is value in find_or_find_insert_slot_with_context: it is very common to want both eq and hasher to have access to the same object. It just isn't an issue in practice, since mutable access is not usually required (shared access is sufficient).
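
For example, both closures can already capture the same object by shared reference; it is only capturing the same object mutably from both closures that the borrow checker rejects. A sketch with a hypothetical Interner as the shared object and plain u64 keys:

use hashbrown::raw::{Bucket, InsertSlot, RawTable};

// Hypothetical shared state that both closures need; `&self` access is enough.
struct Interner {
    salt: u64,
}

impl Interner {
    fn hash(&self, key: u64) -> u64 {
        key ^ self.salt
    }
}

fn probe(
    table: &mut RawTable<(u64, u64)>,
    interner: &Interner,
    key: u64,
) -> Result<Bucket<(u64, u64)>, InsertSlot> {
    // Both `eq` and `hasher` use the same `&Interner` without any conflict.
    table.find_or_find_insert_slot(
        interner.hash(key),
        |other| other.0 == key,
        |val| interner.hash(val.0),
    )
}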


udoprog commented Aug 25, 2023

> I don't think supporting a fallible hasher is worth adding this level of complexity to the API. In most cases this isn't needed, and you can always emulate it by returning a hash value of 0 and recording the error in an Option that you check after the hash table operation is complete.

I considered this initially, but wouldn't that result in potentially ill-defined downstream behaviors in the fallible cases (same for eq)? Preferably I'd want to ensure that any error prevents the modification from taking place, to achieve my version of exception safety.


Amanieu commented Aug 25, 2023

The hasher absolutely cannot be allowed to fail; we rely on this when rehashing the table. See rehash_in_place, where if the hasher panics we drop all elements in the table. If this is not something you can guarantee, then you may want to consider caching hash values separately so they are always available.

eq failing is less serious since you can always just return false, and it doesn't modify the table.


udoprog commented Aug 25, 2023

> The hasher absolutely cannot be allowed to fail; we rely on this when rehashing the table. See rehash_in_place, where if the hasher panics we drop all elements in the table.

Right, so the behavior I'd want to emulate is what would happen if the hash implementation panics today (which is how I believe this PR works). Think of this error propagation as my own "structured panics".

> If this is not something you can guarantee, then you may want to consider caching hash values separately so they are always available.

What do you mean? Are raw tables not exception safe?

> eq failing is less serious since you can always just return false, and it doesn't modify the table.

Doesn't that depend on where it fails? If it fails during insertion (e.g. find_or_find_insert_slot) and eq returns false, wouldn't the table allocate another bucket, since two values are not considered equal but have the same hash?


udoprog commented Aug 25, 2023

I couldn't produce a table modification from a failing eq, but here is one that is avoided when the hasher fails:

use hashbrown::raw::RawTable;

fn main() {
    let mut table = RawTable::<()>::new();

    unsafe {
        // Insert three elements (with the same hash) to fill up initial capacity:
        for _ in 0..3u64 {
            let Err(slot) = table.find_or_find_insert_slot(0, |_| false, |_| 0) else {
                panic!();
            };

            table.insert_in_slot(0, slot, ());
        }

        dbg!(table.capacity()); // 3

        // Table does not grow when `hasher` fails like this:
        let Err(()) = table.find_or_find_insert_slot_with(&mut (), 0, |_, _| Ok(false), |_, _| Err(())) else {
            panic!();
        };

        dbg!(table.capacity()); // 3

        // Table does grow when a dummy value is returned instead:
        let Err(..) = table.find_or_find_insert_slot(0, |_| false, |_| 0) else {
            panic!();
        };

        dbg!(table.capacity()); // 7
    }
}


Amanieu commented Aug 26, 2023

> What do you mean? Are raw tables not exception safe?

Not completely: because we don't store the full hash in the table, we rely on the hasher when rehashing the table in place. If the hasher panics then we have no way of knowing where to place elements in the table. It's exception-safe in the sense that you won't get UB, but you may lose elements in the table.

hashbrown/src/raw/mod.rs

Lines 2277 to 2280 in bb6521e

// If the hash function panics then properly clean up any elements
// that we haven't rehashed yet. We unfortunately can't preserve the
// element since we lost their hash and have no way of recovering it
// without risking another panic.

> Doesn't that depend on where it fails? If it fails during insertion (e.g. find_or_find_insert_slot) and eq returns false, wouldn't the table allocate another bucket, since two values are not considered equal but have the same hash?

find_or_find_insert_slot first calls reserve(1) to ensure there is enough space in the table in case an insertion is needed, before doing anything else. This function doesn't actually perform any insertion into the table; that is done by insert_in_slot.


udoprog commented Aug 26, 2023

Right, thanks for clarifying; that is what I gathered from reading the implementation as well. The only extent to which I expect the table to be exception safe is that nothing memory-unsafe happens if we do end up catching a panic, but the exact end state isn't guaranteed to be the same as before the attempted modification.

I am still interested in ensuring that my "errors" exhibit the same behaviour as a caught Rust panic, mainly because I want as much feature parity between Rust and Rune as possible. So I still want the methods to be fallible.

It's nice that you're interested in passing the context around, though (possibly the bigger change). But until fallible methods are supported, or the exact behavior of returning "dummy values" is understood and tested*, I'll probably end up maintaining a private fork of the raw table with those minor changes.

*: To the degree that it's possible to emulate the behavior of an equivalent Rust panic.

udoprog mentioned this issue Aug 27, 2023