feature request: ahash without length prefixing #198

Open
nwalfield opened this issue Jan 12, 2024 · 3 comments

@nwalfield

In Sequoia, we currently use xxhash to compare streams. ahash appears to be a better fit than xxhash because, as discussed in your README, it is faster, and it is already used by our other dependencies. The problem is that ahash appears to automatically add length prefixes.

First, perhaps I'm holding it wrong. In that case, I apologize in advance for the noise, but would appreciate any tips.

As an aside, I'm a bit confused by this note in the Rust documentation for Hasher::write:

Note to Implementers

You generally should not do length-prefixing as part of implementing this method. It’s up to the Hash implementation to call Hasher::write_length_prefix before sequences that need it.

I understand that to mean that an implementation of Hash should do the length prefixing, whereas an implementation of Hasher, like ahash's, should not. Yet ahash's implementation seems to. Is this correct?
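To make the distinction in that note concrete, here is a minimal sketch using the standard library's `DefaultHasher` rather than ahash. The helper names (`raw_write`, `via_hash_impl`, `manual_prefix`) are made up for illustration. The `manual_prefix` equivalence reflects how the slice impl behaves in current std as far as I can tell; it is an implementation detail, not a documented guarantee.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash bytes by calling `Hasher::write` directly.
/// Nothing is added by a `Hash` impl on this path.
fn raw_write(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(bytes);
    h.finish()
}

/// Hash bytes through the `Hash` impl for `[u8]`,
/// which writes the slice length before the contents.
fn via_hash_impl(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

/// Reproduce the slice impl by hand: length first, then the raw bytes.
/// Assumption: this mirrors current std behavior and may change.
fn manual_prefix(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write_usize(bytes.len());
    h.write(bytes);
    h.finish()
}
```

With `DefaultHasher`, `raw_write(b"abc")` and `via_hash_impl(b"abc")` give different values, because only the second path includes the length prefix contributed by the `Hash` impl.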

Assuming ahash's implementation is okay, I'd like to suggest a variant that can work on streams by not doing length prefixing and not padding short writes.

@tkaitchuck
Owner

When this PR merges into the standard library, we will be able to remove the length prefixing:
rust-lang/rust#96762
Until then, the length prefixing is needed by any hasher that does not work the way SipHash does, where the result depends only on the byte sequence rather than on how it is split across calls. That is, if `h.hash_slice(&[a, b]); h.hash_slice(&[c]);` is not guaranteed to give the same result as `h.hash_slice(&[a]); h.hash_slice(&[b, c]);`, then without the prefix the hasher would be vulnerable to a DoS attack. This includes XXHash.
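The point about call boundaries can be sketched with a toy hasher. `PaddedBlockHasher` below is hypothetical and is not ahash's or XXHash's algorithm; it only pads short writes the way the issue describes. The helpers `hash_parts`, `sip_parts`, and `padded_parts` are likewise made up for this example. The zero-padding collision it exhibits is one way a call-sensitive hasher can go wrong without a length prefix; whether that is exactly the attack meant above is my assumption.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Hypothetical toy hasher (not ahash or XXHash): every `write` call is
/// split into 8-byte blocks, and a short final block is zero-padded.
struct PaddedBlockHasher(u64);

impl Hasher for PaddedBlockHasher {
    fn write(&mut self, bytes: &[u8]) {
        for chunk in bytes.chunks(8) {
            let mut block = [0u8; 8];
            block[..chunk.len()].copy_from_slice(chunk);
            let word = u64::from_le_bytes(block);
            // Simple rotate/xor/multiply mix, for illustration only.
            self.0 = (self.0.rotate_left(5) ^ word).wrapping_mul(0x9E37_79B9_7F4A_7C15);
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// Feed each part to the hasher as a separate `write` call.
fn hash_parts<H: Hasher>(mut hasher: H, parts: &[&str]) -> u64 {
    for part in parts {
        hasher.write(part.as_bytes());
    }
    hasher.finish()
}

/// Same parts through the standard SipHash-based `DefaultHasher`.
fn sip_parts(parts: &[&str]) -> u64 {
    hash_parts(DefaultHasher::new(), parts)
}

/// Same parts through the toy padded-block hasher.
fn padded_parts(parts: &[&str]) -> u64 {
    hash_parts(PaddedBlockHasher(0), parts)
}
```

Here `sip_parts(&["ab", "c"])` equals `sip_parts(&["a", "bc"])`, since only the byte sequence matters. `padded_parts` gives different values for those two splits, and it also maps `["a"]` and `["a\0"]` to the same value because the padding hides the length.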

@tkaitchuck
Owner

tkaitchuck commented Feb 11, 2024

In the meantime, if you want to avoid the extra call, you can call `write` on the hasher directly rather than `hash` on the object.
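For the stream-comparison use case, calling `write` directly might look like the sketch below. `hash_stream` and the 4096-byte block size are assumptions for illustration, not part of ahash. Each block is filled completely before it is passed to `write`, so two streams with the same content produce the same sequence of `write` calls even when the underlying reads return different amounts. That matters for any hasher whose result depends on call boundaries.

```rust
use std::hash::Hasher;
use std::io::{self, Read};

/// Hypothetical helper: hash a stream by passing fixed-size blocks to
/// `Hasher::write`, bypassing any `Hash` impl and its length prefix.
fn hash_stream<R: Read, H: Hasher>(mut reader: R, mut hasher: H) -> io::Result<u64> {
    const BLOCK: usize = 4096;
    let mut buf = [0u8; BLOCK];

    loop {
        // Fill the block completely unless the stream ends first, so the
        // boundaries of the `write` calls do not depend on short reads.
        let mut filled = 0;
        while filled < BLOCK {
            match reader.read(&mut buf[filled..]) {
                Ok(0) => break, // end of stream
                Ok(n) => filled += n,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            }
        }

        if filled > 0 {
            hasher.write(&buf[..filled]);
        }
        if filled < BLOCK {
            // A partial (or empty) block means the stream is exhausted.
            break;
        }
    }

    Ok(hasher.finish())
}
```

Both sides of a comparison need to use the same block size and a hasher built with the same keys, otherwise equal streams will not hash to equal values.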

@nwalfield
Author

Thanks for the explanation, and the tip!
