Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hash struct lacks an Ord implementation #202

Open
sofia-snow opened this issue Oct 8, 2021 · 12 comments · May be fixed by #267
Open

Hash struct lacks an Ord implementation #202

sofia-snow opened this issue Oct 8, 2021 · 12 comments · May be fixed by #267

Comments

@sofia-snow
Copy link

Without an Ord implementation (or even a Hash implementation) blake3::Hash values cannot be used as keys in BTreeMaps or HashMaps, among many other valid uses for (vartime) orderings.

To be safe; these implementations ought to be constant-time. Either way, the subtle crate provides traits which are explicitly intended for constant time equality and ordering.

As a workaround, I'm using the newtype pattern to implement these traits on my own OrdHash type.

If a PR would be welcome for this addition, should the implementation be constant time or variable?

@oconnor663
Copy link
Member

Interesting question. My instinct is that for most applications where constant-time ordering isn't an issue, the easiest thing is to convert to [u8; 32] with .into(). Given that option, I think if we were going to add an Ord/PartialOrd impl in this crate, it would make sense for it to be constant time. That way we maintain the general rule that "you can convert to [u8; 32] when you don't care about constant-time stuff."

@oconnor663
Copy link
Member

For what it's worth, I've often thought about splitting out the Hash type into its own crate. Especially now that const generics are stabilized, it might make sense to revisit that. I wonder if the RustCrypto folks would be interested in sharing a single type here.

@sofia-snow
Copy link
Author

My understanding is not all of const-generics have landed, and RustCrypto need more than currently offered even by nightly.

RustCrypto/traits#238 (comment)

@ppamorim
Copy link

ppamorim commented Feb 7, 2022

@oconnor663 Hi, is there any work done on that? I am trying to use Hash on my BtreeMap but I need to convert it to hex every time. :(

@oconnor663
Copy link
Member

I haven't done any work on this myself, no. It looks like the right way to do it will be to replace our dependency on constant_time_eq with a dependency on subtle, which can do both Eq and Ord. If anyone's interested, this is probably a good beginner PR.

In the meantime, you might want to convert Hash value to bytes rather than hex, either with .as_bytes() or with .into(). That'll be the same size as the original Hash, which might be important if you're storing lots of them.

@Eitu33
Copy link

Eitu33 commented Mar 4, 2022

Shouldn't #203 be merged before eventually finding a better alternative?

Also why is a constant-time Ord implementation required?

After thinking a bit about how it could be done when comparing 2 Hash I realised that having a constant-time implementation for this would require to always go through the whole byte array. This would be the equivalent of the worst case scenario of the default implementation which stops the comparison whenever the bytes aren't equal, so I don't see why that would make sense.

@imuli
Copy link

imuli commented Nov 11, 2022

Same reason Hash has a constant time Eq implementation. Though I can't think of an attack off-hand, I would bet there's some use case out there that leaks information though the time it takes to compare (probably many) hashes - especially if you could control what hashes it's being compared to...

Anyway, I'm glad there's already an issue open here with something of a plan, I'll take a look at subtle.

@elichai
Copy link
Contributor

elichai commented Nov 12, 2022

Basically whenever Blake3 is used as an HMAC you really really want CT comparison https://threatpost.com/timing-attack-google-keyczar-library-060209/72738/

I'd also say that I doubt this could be a bottleneck anywhere, it's 4 words xoring them instead of branching over them might even be faster

@oconnor663
Copy link
Member

One issue to keep in mind here is that, even if Hash provided a constant-time comparison, you could still leak information by using it in a data structure that doesn't make higher-level constant-time guarantees. For example, suppose we have a search tree of MACs, and our application takes a MAC from the user and asks "is this in the tree?" Maybe the tree looks like this:

    mac2
   /    \
mac1    mac4
       /    \
     mac3  mac5

Even if comparing individual MACs is constant time, the tree structure itself still has a timing leak. If I query it with a MAC that's less than mac2, it's going to do 2 comparisons. But if I query it with a MAC that's greater than mac2, it's going to do 3. If I can measure that timing difference, I can work from the high bits down to the low bits and recover mac2 in ~256 queries! (In general, an oracle that answers "is your MAC equal to the valid one?" is normal and no problem, but an oracle that answers "is your MAC greater than the valid one?" is broken and quickly leaks the valid MAC.)

We could probably come up with some way to fix the tree. We could add a requirement that all the leaves are at the same depth, and we could design some sort of "empty node" concept to make that work. I'm sure someone somewhere has done this before, but it's a very exotic problem to be solving, and no ordinary tree-map implementation is going to do anything like this.

So one worry I have about a constant-time Ord implementation, is that it might inspire false confidence that putting MACs inside a BTreeSet is safe. But that's not safe, and there's nothing Hash can do to make it safe. I'm curious what other use cases folks here have in mind.

@imuli imuli linked a pull request Nov 12, 2022 that will close this issue
@imuli
Copy link

imuli commented Nov 12, 2022

So one worry I have about a constant-time Ord implementation, is that it might inspire false confidence that putting MACs inside a BTreeSet is safe. But that's not safe, and there's nothing Hash can do to make it safe.

I agree with that concern. Even in a non-MAC scenario it could be a problem - if you can fetch data based on it's hash, the time until cache miss could reveal blocks that are present in the cache.

I'm curious what other use cases folks here have in mind.

I don't think I have an actual use case for a constant-time Ord implementation so much as any Ord implementation. Mainly for things like sorting keys during serialization to remove hash map leaks.

@imuli
Copy link

imuli commented Nov 21, 2022

This #267 (comment) seems relevant over here.

Actually I should've said that it's not clear (to me) that we want any Ord implementation at all. In my mind, forcing the caller to convert hashes to bytes to sort them has some value, because it makes it clear that the comparison semantics are the regular [u8] comparison semantics. Given that we do encourage users to rely on our constant-time Eq impl, I worry it would be confusing to expose an Ord impl without making a similar guarantee.

So the recommendation for types that want an Ord implementation and to contain a Hash would be to either make a wrapper, write it manually, or embed it as a [u8; 32] - which I guess is what everyone is doing now anyway :)

@oconnor663
Copy link
Member

Yeah if you need to sort hashes or something like that, my instinct is to convert them to [u8; 32]. In --release mode that conversion is almost certainly a no-op, and it makes it maximally clear what's going on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants