Improve documentation around keys (#139)
* Provide a flag to opt-out of keyed hashing.
* Document much more closely where keys come from.
* Explain each of the constructors for RandomState.
* Provide more code examples.

Signed-off-by: Tom Kaitchuck <Tom.Kaitchuck@gmail.com>
tkaitchuck committed Nov 1, 2022
1 parent 2c1a088 commit 2f39c89
Showing 6 changed files with 113 additions and 62 deletions.
4 changes: 4 additions & 0 deletions Cargo.toml
@@ -36,6 +36,10 @@ runtime-rng = ["getrandom"]
# If this is disabled and runtime-rng is unavailable constant keys are used.
compile-time-rng = ["const-random"]

# Do not use any random number generator (either at compile time or runtime)
# If either runtime-rng or compile-time-rng are enabled this does nothing.
no-rng = []

# in case this is being used on an architecture lacking core::sync::atomic::AtomicUsize and friends
atomic-polyfill = [ "dep:atomic-polyfill", "once_cell/atomic-polyfill"]

8 changes: 2 additions & 6 deletions README.md
@@ -61,12 +61,8 @@ This allows for DOS resistance even if there is no random number generator avail
This makes the binary non-deterministic. (If non-determinism is a problem see [constrandom's documentation](https://github.com/tkaitchuck/constrandom#deterministic-builds))

If both `runtime-rng` and `compile-time-rng` are enabled the `runtime-rng` will take precedence and `compile-time-rng` will do nothing.

**NOTE:** If both `runtime-rng` and `compile-time-rng` are disabled, a source of randomness may be provided by the application on startup
using the [ahash::random_state::set_random_source](https://docs.rs/ahash/latest/ahash/random_state/fn.set_random_source.html) method.
If neither flag is set and this is not done, aHash will fall back on using the numeric value of memory addresses as a source of randomness.
This is somewhat strong if ASLR is turned on (it is by default) but for embedded platforms this will result in weak keys.
As a result, it is recommended to use `compile-time-rng` anytime random numbers will not be available at runtime.
If neither flag is set, seeds can be supplied by the application. [Multiple APIs](https://docs.rs/ahash/latest/ahash/random_state/struct.RandomState.html)
are available to do this.

## Comparison with other hashers

24 changes: 18 additions & 6 deletions src/hash_map.rs
@@ -14,6 +14,7 @@ use serde::{
};

use crate::RandomState;
use crate::random_state::RandomSource;

/// A [`HashMap`](std::collections::HashMap) using [`RandomState`](crate::RandomState) to hash the items.
/// (Requires the `std` feature to be enabled.)
@@ -51,12 +52,16 @@ impl<K, V> Into<HashMap<K, V, crate::RandomState>> for AHashMap<K, V> {
}

impl<K, V> AHashMap<K, V, RandomState> {
/// This creates a hashmap using [RandomState::new], which obtains its keys from [RandomSource].
/// See the documentation in [RandomSource] for notes about key strength.
pub fn new() -> Self {
AHashMap(HashMap::with_hasher(RandomState::default()))
AHashMap(HashMap::with_hasher(RandomState::new()))
}

/// This creates a hashmap with the specified capacity using [RandomState::new].
/// See the documentation in [RandomSource] for notes about key strength.
pub fn with_capacity(capacity: usize) -> Self {
AHashMap(HashMap::with_capacity_and_hasher(capacity, RandomState::default()))
AHashMap(HashMap::with_capacity_and_hasher(capacity, RandomState::new()))
}
}

@@ -340,13 +345,16 @@ where
}
}

impl<K, V, S> FromIterator<(K, V)> for AHashMap<K, V, S>
impl<K, V> FromIterator<(K, V)> for AHashMap<K, V, RandomState>
where
K: Eq + Hash,
S: BuildHasher + Default,
{
/// This creates a hashmap from the provided iterator using [RandomState::new].
/// See the documentation in [RandomSource] for notes about key strength.
fn from_iter<T: IntoIterator<Item = (K, V)>>(iter: T) -> Self {
AHashMap(HashMap::from_iter(iter))
let mut inner = HashMap::with_hasher(RandomState::new());
inner.extend(iter);
AHashMap(inner)
}
}
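
The `from_iter` change above constructs the inner map with an explicitly built hasher and then extends it, instead of relying on an `S: Default` bound. A minimal sketch of the same pattern using only the standard library is shown here. `SeededMap` is a hypothetical wrapper, and std's `RandomState` stands in for `ahash::RandomState` purely as an assumption so the snippet compiles without the crate.

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashMap;
use std::hash::Hash;
use std::iter::FromIterator;

/// Hypothetical new-type wrapper, similar in shape to `AHashMap`.
pub struct SeededMap<K, V>(HashMap<K, V, RandomState>);

impl<K, V> FromIterator<(K, V)> for SeededMap<K, V>
where
    K: Eq + Hash,
{
    fn from_iter<T: IntoIterator<Item = (K, V)>>(iter: T) -> Self {
        // Build the hasher explicitly rather than going through `Default`,
        // so the key source is visible at the construction site.
        let mut inner = HashMap::with_hasher(RandomState::new());
        inner.extend(iter);
        SeededMap(inner)
    }
}

impl<K: Eq + Hash, V> SeededMap<K, V> {
    /// Look up a value by key.
    pub fn get(&self, key: &K) -> Option<&V> {
        self.0.get(key)
    }

    /// Number of entries in the map.
    pub fn len(&self) -> usize {
        self.0.len()
    }
}
```

With this shape, `collect()` into the wrapper always goes through the explicit constructor, which is the property the doc comment in the diff points at.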

@@ -397,10 +405,14 @@ where
}
}

/// NOTE: For safety this trait impl is only available if either of the flags `runtime-rng` (on by default) or
/// `compile-time-rng` is enabled. This is to prevent weakly keyed maps from being accidentally created. Instead, one of
/// the constructors for [RandomState] must be used.
#[cfg(any(feature = "compile-time-rng", feature = "runtime-rng", feature = "no-rng"))]
impl<K, V> Default for AHashMap<K, V, RandomState> {
#[inline]
fn default() -> AHashMap<K, V, RandomState> {
AHashMap::new()
AHashMap(HashMap::default())
}
}

34 changes: 23 additions & 11 deletions src/hash_set.rs
@@ -1,4 +1,5 @@
use crate::RandomState;
use crate::random_state::RandomSource;
use std::collections::{hash_set, HashSet};
use std::fmt::{self, Debug};
use std::hash::{BuildHasher, Hash};
@@ -14,10 +15,10 @@ use serde::{
/// A [`HashSet`](std::collections::HashSet) using [`RandomState`](crate::RandomState) to hash the items.
/// (Requires the `std` feature to be enabled.)
#[derive(Clone)]
pub struct AHashSet<T, S = crate::RandomState>(HashSet<T, S>);
pub struct AHashSet<T, S = RandomState>(HashSet<T, S>);

impl<T> From<HashSet<T, crate::RandomState>> for AHashSet<T> {
fn from(item: HashSet<T, crate::RandomState>) -> Self {
impl<T> From<HashSet<T, RandomState>> for AHashSet<T> {
fn from(item: HashSet<T, RandomState>) -> Self {
AHashSet(item)
}
}
@@ -40,19 +41,23 @@ where
}
}

impl<T> Into<HashSet<T, crate::RandomState>> for AHashSet<T> {
fn into(self) -> HashSet<T, crate::RandomState> {
impl<T> Into<HashSet<T, RandomState>> for AHashSet<T> {
fn into(self) -> HashSet<T, RandomState> {
self.0
}
}

impl<T> AHashSet<T, RandomState> {
/// This creates a hashset using [RandomState::new].
/// See the documentation in [RandomSource] for notes about key strength.
pub fn new() -> Self {
AHashSet(HashSet::with_hasher(RandomState::default()))
AHashSet(HashSet::with_hasher(RandomState::new()))
}

/// This creates a hashset with the specified capacity using [RandomState::new].
/// See the documentation in [RandomSource] for notes about key strength.
pub fn with_capacity(capacity: usize) -> Self {
AHashSet(HashSet::with_capacity_and_hasher(capacity, RandomState::default()))
AHashSet(HashSet::with_capacity_and_hasher(capacity, RandomState::new()))
}
}

@@ -237,14 +242,17 @@ where
}
}

impl<T, S> FromIterator<T> for AHashSet<T, S>
impl<T> FromIterator<T> for AHashSet<T, RandomState>
where
T: Eq + Hash,
S: BuildHasher + Default,
{
/// This creates a hashset from the provided iterator using [RandomState::new].
/// See the documentation in [RandomSource] for notes about key strength.
#[inline]
fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> AHashSet<T, S> {
AHashSet(HashSet::from_iter(iter))
fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> AHashSet<T> {
let mut inner = HashSet::with_hasher(RandomState::new());
inner.extend(iter);
AHashSet(inner)
}
}

@@ -286,6 +294,10 @@ where
}
}

/// NOTE: For safety this trait impl is only available if either of the flags `runtime-rng` (on by default) or
/// `compile-time-rng` is enabled. This is to prevent weakly keyed maps from being accidentally created. Instead, one of
/// the constructors for [RandomState] must be used.
#[cfg(any(feature = "compile-time-rng", feature = "runtime-rng", feature = "no-rng"))]
impl<T> Default for AHashSet<T, RandomState> {
/// Creates an empty `AHashSet<T, S>` with the `Default` value for the hasher.
#[inline]
86 changes: 52 additions & 34 deletions src/lib.rs
@@ -1,7 +1,5 @@
//! AHash is a high performance keyed hash function.
//!
//! It is a DOS resistant alternative to `FxHash` or a faster alternative to `SipHash`.
//!
//! It quickly provides a high quality hash where the result is not predictable without knowing the Key.
//! AHash works with `HashMap` to hash keys, but without allowing for the possibility that a malicious user can
//! induce a collision.
@@ -11,65 +9,85 @@
//! When it is available aHash uses the hardware AES instructions to provide a keyed hash function.
//! When it is not, aHash falls back on a slightly slower alternative algorithm.
//!
//! AHash does not have a fixed standard for its output. This allows it to improve over time.
//! But this also means that different computers or computers using different versions of ahash will observe different
//! hash values.
//! Because aHash does not have a fixed standard for its output, it is able to improve over time.
//! But this also means that different computers or computers using different versions of ahash may observe different
//! hash values for the same input.
#![cfg_attr(
feature = "std",
all(feature = "std", any(feature = "compile-time-rng", feature = "runtime-rng", feature = "no-rng")),
doc = r##"
# Usage
AHash is a drop in replacement for the default implementation of the Hasher trait. To construct a HashMap using aHash as its hasher do the following:
# Basic Usage
AHash provides an implementation of the [Hasher] trait.
To construct a HashMap using aHash as its hasher do the following:
```
use ahash::{AHasher, RandomState};
use std::collections::HashMap;
let mut map: HashMap<i32, i32, RandomState> = HashMap::default();
map.insert(12, 34);
```
"##
)]
#![cfg_attr(
feature = "std",
doc = r##"
For convenience, both new-type wrappers and type aliases are provided.
The new type wrappers are called `AHashMap` and `AHashSet`.
These do the same thing with slightly less typing. (For convenience `From`, `Into`, and `Deref` are provided).
```
use ahash::AHashMap;
### Randomness
let mut map: AHashMap<i32, i32> = AHashMap::new();
map.insert(12, 34);
```
The above requires a source of randomness to generate keys for the hashmap. By default this is obtained from the OS.
It is also possible to have randomness supplied via the `compile-time-rng` flag, or manually.
### If randomness is not available
For even less typing and better interop with existing libraries which require a `std::collections::HashMap` (such as rayon),
the type aliases [HashMap], [HashSet] are provided. These alias the `std::HashMap` and `std::HashSet` using aHash as the hasher.
[AHasher::default()] can be used to hash using fixed keys. This works with
[BuildHasherDefault](std::hash::BuildHasherDefault). For example:
```
use ahash::{HashMap, HashMapExt};
use std::hash::BuildHasherDefault;
use std::collections::HashMap;
use ahash::AHasher;
let mut map: HashMap<i32, i32> = HashMap::new();
map.insert(12, 34);
let mut m: HashMap<_, _, BuildHasherDefault<AHasher>> = HashMap::default();
# m.insert(12, 34);
```
Note the import of [HashMapExt]. This is needed for the constructor.
It is also possible to instantiate [RandomState] directly:
# Directly hashing
```
use ahash::HashMap;
use ahash::RandomState;
Hashers can also be instantiated with `RandomState`. For example:
let mut m = HashMap::with_hasher(RandomState::with_seed(42));
# m.insert(1, 2);
```
Or for uses besides a hashmap:
```
use std::hash::BuildHasher;
use ahash::RandomState;
let hash_builder = RandomState::with_seed(42);
let hash = hash_builder.hash_one("Some Data");
```
### Randomness
There are several constructors for [RandomState] with different ways to supply seeds.
# Convenience wrappers
To ensure that each map has a unique set of keys aHash needs a source of randomness.
Normally this is just obtained from the OS. (Or via the `compile-time-rng` flag)
For convenience, both new-type wrappers and type aliases are provided.
If for some reason (such as fuzzing) an application wishes to supply all random seeds manually, this can be done via:
[random_state::set_random_source].
The new type wrappers are called `AHashMap` and `AHashSet`.
```
use ahash::AHashMap;
let mut map: AHashMap<i32, i32> = AHashMap::new();
map.insert(12, 34);
```
This avoids the need to type "RandomState". (For convenience `From`, `Into`, and `Deref` are provided).
# Aliases
For even less typing and better interop with existing libraries (such as rayon) which require a `std::collections::HashMap`,
the type aliases [HashMap], [HashSet] are provided.
```
use ahash::{HashMap, HashMapExt};
let mut map: HashMap<i32, i32> = HashMap::new();
map.insert(12, 34);
```
Note the import of [HashMapExt]. This is needed for the constructor.
"##
)]
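
The fixed-key path described in the docs above goes through `BuildHasherDefault`, which builds every hasher via `Default::default()` and therefore carries no per-map key material. A small sketch of what that implies is given here, using std's `DefaultHasher` as a stand-in for `AHasher` (an assumption made so the snippet compiles without the crate). `FixedState`, `fixed_key_hash`, and `fixed_key_map` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{BuildHasher, BuildHasherDefault, Hash, Hasher};

/// Builder that always produces a default-constructed hasher (fixed keys).
pub type FixedState = BuildHasherDefault<DefaultHasher>;

/// Hash a value with a freshly built fixed-key hasher.
pub fn fixed_key_hash<T: Hash>(value: &T) -> u64 {
    let mut hasher = FixedState::default().build_hasher();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Build a map whose hasher has no per-instance randomness.
pub fn fixed_key_map() -> HashMap<i32, i32, FixedState> {
    let mut m = HashMap::default();
    m.insert(12, 34);
    m
}
```

Because every builder yields identically keyed hashers, hashing the same input twice within one build of the program gives the same value. The trade-off is that the keys are predictable, so this form does not give the collision resistance that randomly keyed maps provide.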
19 changes: 14 additions & 5 deletions src/random_state.rs
@@ -121,9 +121,12 @@ cfg_if::cfg_if! {
///
/// If [set_random_source] is not called, aHash will default to the best available source of randomness.
/// In order this is:
/// 1. OS provided random number generator (available if the `runtime-rng` flag is enabled which it is by default)
/// 2. Strong compile time random numbers used to permute a static "counter". (available if `compile-time-rng` is enabled. __Enabling this is recommended if `runtime-rng` is not possible__)
/// 3. A static counter that adds the memory address of each [RandomState] created permuted with fixed constants. (Similar to above but with fixed keys)
/// 1. OS provided random number generator (available if the `runtime-rng` flag is enabled which it is by default) - This should be very strong.
/// 2. Strong compile time random numbers used to permute a static "counter". (available if `compile-time-rng` is enabled.
/// __Enabling this is recommended if `runtime-rng` is not possible__)
/// 3. A static counter that adds the memory address of each [RandomState] created permuted with fixed constants.
/// (Similar to above but with fixed keys) - This is the weakest option. The strength of this heavily depends on whether or not ASLR is enabled.
/// (Rust enables ASLR by default)
pub trait RandomSource {
fn gen_hasher_seed(&self) -> usize;
}
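
The `RandomSource` trait shown above has a single method, so a custom source (for example for fuzzing, which the crate docs mention) can be quite small. The following is only a sketch: the trait is re-declared locally so the snippet stands alone, and `CounterSource` is a hypothetical type, not part of aHash. How such a source is registered, and any extra bounds the crate requires of it, are not shown in this diff and are not assumed here.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Local copy of the trait signature from the diff, so this compiles on its own.
pub trait RandomSource {
    fn gen_hasher_seed(&self) -> usize;
}

/// Hypothetical deterministic source that hands out start, start + 1, ...
pub struct CounterSource {
    next: AtomicUsize,
}

impl CounterSource {
    pub fn new(start: usize) -> Self {
        CounterSource {
            next: AtomicUsize::new(start),
        }
    }
}

impl RandomSource for CounterSource {
    fn gen_hasher_seed(&self) -> usize {
        // `fetch_add` returns the previous value, so successive calls
        // yield distinct seeds until the counter wraps around.
        self.next.fetch_add(1, Ordering::Relaxed)
    }
}
```

A counter like this is fully predictable, so it is suited to reproducible tests rather than production keys.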
@@ -207,7 +210,7 @@ cfg_if::cfg_if! {
/// | Constructor | Dynamically random? | Seed |
/// |---------------|---------------------|------|
/// |`new` | Each instance unique|_[RandomSource]_|
/// |`generate_with`| Each instance unique|`u64` x 4 + static counter|
/// |`generate_with`| Each instance unique|`u64` x 4 + [RandomSource]|
/// |`with_seed` | Fixed per process |`u64` + static random number|
/// |`with_seeds` | Fixed |`u64` x 4|
///
@@ -229,7 +232,8 @@ impl RandomState {

/// Create a new `RandomState` `BuildHasher` using random keys.
///
/// (Each instance will have a unique set of keys).
/// Each instance will have a unique set of keys derived from [RandomSource].
///
#[inline]
pub fn new() -> RandomState {
let src = get_src();
@@ -363,6 +367,11 @@ impl RandomState {
/// can be used to create many hashers each of which will have the same keys.)
///
/// This is the same as [RandomState::new()]
///
/// NOTE: For safety this trait impl is only available if either of the flags `runtime-rng` (on by default) or
/// `compile-time-rng` is enabled. This is to prevent weakly keyed maps from being accidentally created. Instead, one of
/// the constructors for [RandomState] must be used.
#[cfg(any(feature = "compile-time-rng", feature = "runtime-rng", feature = "no-rng"))]
impl Default for RandomState {
#[inline]
fn default() -> Self {