Improve documentation around keys #139

Merged
merged 10 commits into from Nov 1, 2022
4 changes: 4 additions & 0 deletions Cargo.toml
@@ -36,6 +36,10 @@ runtime-rng = ["getrandom"]
# If this is disabled and runtime-rng is unavailable constant keys are used.
compile-time-rng = ["const-random"]

# Do not use any random number generator (either at compile time or runtime)
# If either runtime-rng or compile-time-rng are enabled this does nothing.
no-rng = []

# in case this is being used on an architecture lacking core::sync::atomic::AtomicUsize and friends
atomic-polyfill = [ "dep:atomic-polyfill", "once_cell/atomic-polyfill"]

8 changes: 2 additions & 6 deletions README.md
@@ -61,12 +61,8 @@ This allows for DOS resistance even if there is no random number generator avail
This makes the binary non-deterministic. (If non-determinism is a problem see [constrandom's documentation](https://github.com/tkaitchuck/constrandom#deterministic-builds))

If both `runtime-rng` and `compile-time-rng` are enabled the `runtime-rng` will take precedence and `compile-time-rng` will do nothing.

**NOTE:** If both `runtime-rng` and `compile-time-rng` are disabled, a source of randomness may be provided by the application on startup
using the [ahash::random_state::set_random_source](https://docs.rs/ahash/latest/ahash/random_state/fn.set_random_source.html) method.
If neither flag is set and this is not done, aHash will fall back on using the numeric value of memory addresses as a source of randomness.
This is somewhat strong if ASLR is turned on (it is by default) but for embedded platforms this will result in weak keys.
As a result, it is recommended to use `compile-time-rng` anytime random numbers will not be available at runtime.
If neither flag is set, seeds can be supplied by the application. [Multiple APIs](https://docs.rs/ahash/latest/ahash/random_state/struct.RandomState.html)
are available to do this.

## Comparison with other hashers

24 changes: 18 additions & 6 deletions src/hash_map.rs
@@ -14,6 +14,7 @@ use serde::{
};

use crate::RandomState;
use crate::random_state::RandomSource;

/// A [`HashMap`](std::collections::HashMap) using [`RandomState`](crate::RandomState) to hash the items.
/// (Requires the `std` feature to be enabled.)
@@ -51,12 +52,16 @@ impl<K, V> Into<HashMap<K, V, crate::RandomState>> for AHashMap<K, V> {
}

impl<K, V> AHashMap<K, V, RandomState> {
/// This creates a hashmap using [RandomState::new] which obtains its keys from [RandomSource].
/// See the documentation in [RandomSource] for notes about key strength.
pub fn new() -> Self {
AHashMap(HashMap::with_hasher(RandomState::default()))
AHashMap(HashMap::with_hasher(RandomState::new()))
}

/// This creates a hashmap with the specified capacity using [RandomState::new].
/// See the documentation in [RandomSource] for notes about key strength.
pub fn with_capacity(capacity: usize) -> Self {
AHashMap(HashMap::with_capacity_and_hasher(capacity, RandomState::default()))
AHashMap(HashMap::with_capacity_and_hasher(capacity, RandomState::new()))
}
}

@@ -340,13 +345,16 @@ where
}
}

impl<K, V, S> FromIterator<(K, V)> for AHashMap<K, V, S>
impl<K, V> FromIterator<(K, V)> for AHashMap<K, V, RandomState>
where
K: Eq + Hash,
S: BuildHasher + Default,
{
/// This creates a hashmap from the provided iterator using [RandomState::new].
/// See the documentation in [RandomSource] for notes about key strength.
fn from_iter<T: IntoIterator<Item = (K, V)>>(iter: T) -> Self {
AHashMap(HashMap::from_iter(iter))
let mut inner = HashMap::with_hasher(RandomState::new());
inner.extend(iter);
AHashMap(inner)
}
}
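
Reviewer note on the `FromIterator` change above: `HashMap::from_iter` in the standard library requires `S: BuildHasher + Default`, and this PR makes `Default` for `RandomState` conditional on feature flags, which is presumably why the impl now builds the map with `with_hasher` and then calls `extend`. The pattern can be sketched with the standard library alone. `SeededState` and `WrappedMap` below are hypothetical stand-ins (not aHash types), and the fixed seed is an assumption made only to keep the sketch deterministic.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{BuildHasher, Hash, Hasher};
use std::iter::FromIterator;

/// Hypothetical stand-in for `RandomState`: it has an explicit constructor
/// but deliberately does NOT implement `Default`.
#[derive(Clone)]
pub struct SeededState {
    seed: u64,
}

impl SeededState {
    pub fn new() -> Self {
        // Fixed seed for the sketch; a real builder would draw from a random source.
        SeededState { seed: 0x5EED }
    }
}

impl BuildHasher for SeededState {
    type Hasher = DefaultHasher;

    fn build_hasher(&self) -> DefaultHasher {
        // Feed the seed into the hasher before any key data is written.
        let mut hasher = DefaultHasher::new();
        hasher.write_u64(self.seed);
        hasher
    }
}

/// Hypothetical newtype wrapper, analogous in shape to `AHashMap`.
pub struct WrappedMap<K, V>(HashMap<K, V, SeededState>);

impl<K: Eq + Hash, V> FromIterator<(K, V)> for WrappedMap<K, V> {
    fn from_iter<T: IntoIterator<Item = (K, V)>>(iter: T) -> Self {
        // `HashMap::from_iter` would need `SeededState: Default`,
        // so construct the map explicitly and extend it instead.
        let mut inner = HashMap::with_hasher(SeededState::new());
        inner.extend(iter);
        WrappedMap(inner)
    }
}
```

With this shape, `collect()` into the wrapper works even when the hasher builder has no `Default` impl.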

@@ -397,10 +405,14 @@ where
}
}

/// NOTE: For safety this trait impl is only available if either of the flags `runtime-rng` (on by default) or
/// `compile-time-rng` is enabled. This is to prevent weakly keyed maps from being accidentally created. Instead, one of the
/// constructors for [RandomState] must be used.
#[cfg(any(feature = "compile-time-rng", feature = "runtime-rng", feature = "no-rng"))]
impl<K, V> Default for AHashMap<K, V, RandomState> {
#[inline]
fn default() -> AHashMap<K, V, RandomState> {
AHashMap::new()
AHashMap(HashMap::default())
}
}

34 changes: 23 additions & 11 deletions src/hash_set.rs
@@ -1,4 +1,5 @@
use crate::RandomState;
use crate::random_state::RandomSource;
use std::collections::{hash_set, HashSet};
use std::fmt::{self, Debug};
use std::hash::{BuildHasher, Hash};
@@ -14,10 +15,10 @@ use serde::{
/// A [`HashSet`](std::collections::HashSet) using [`RandomState`](crate::RandomState) to hash the items.
/// (Requires the `std` feature to be enabled.)
#[derive(Clone)]
pub struct AHashSet<T, S = crate::RandomState>(HashSet<T, S>);
pub struct AHashSet<T, S = RandomState>(HashSet<T, S>);

impl<T> From<HashSet<T, crate::RandomState>> for AHashSet<T> {
fn from(item: HashSet<T, crate::RandomState>) -> Self {
impl<T> From<HashSet<T, RandomState>> for AHashSet<T> {
fn from(item: HashSet<T, RandomState>) -> Self {
AHashSet(item)
}
}
@@ -40,19 +41,23 @@ where
}
}

impl<T> Into<HashSet<T, crate::RandomState>> for AHashSet<T> {
fn into(self) -> HashSet<T, crate::RandomState> {
impl<T> Into<HashSet<T, RandomState>> for AHashSet<T> {
fn into(self) -> HashSet<T, RandomState> {
self.0
}
}

impl<T> AHashSet<T, RandomState> {
/// This creates a hashset using [RandomState::new].
/// See the documentation in [RandomSource] for notes about key strength.
pub fn new() -> Self {
AHashSet(HashSet::with_hasher(RandomState::default()))
AHashSet(HashSet::with_hasher(RandomState::new()))
}

/// This creates a hashset with the specified capacity using [RandomState::new].
/// See the documentation in [RandomSource] for notes about key strength.
pub fn with_capacity(capacity: usize) -> Self {
AHashSet(HashSet::with_capacity_and_hasher(capacity, RandomState::default()))
AHashSet(HashSet::with_capacity_and_hasher(capacity, RandomState::new()))
}
}

@@ -237,14 +242,17 @@ where
}
}

impl<T, S> FromIterator<T> for AHashSet<T, S>
impl<T> FromIterator<T> for AHashSet<T, RandomState>
where
T: Eq + Hash,
S: BuildHasher + Default,
{
/// This creates a hashset from the provided iterator using [RandomState::new].
/// See the documentation in [RandomSource] for notes about key strength.
#[inline]
fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> AHashSet<T, S> {
AHashSet(HashSet::from_iter(iter))
fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> AHashSet<T> {
let mut inner = HashSet::with_hasher(RandomState::new());
inner.extend(iter);
AHashSet(inner)
}
}

@@ -286,6 +294,10 @@ where
}
}

/// NOTE: For safety this trait impl is only available if either of the flags `runtime-rng` (on by default) or
/// `compile-time-rng` is enabled. This is to prevent weakly keyed maps from being accidentally created. Instead, one of the
/// constructors for [RandomState] must be used.
#[cfg(any(feature = "compile-time-rng", feature = "runtime-rng", feature = "no-rng"))]
impl<T> Default for AHashSet<T, RandomState> {
/// Creates an empty `AHashSet<T, S>` with the `Default` value for the hasher.
#[inline]
86 changes: 52 additions & 34 deletions src/lib.rs
@@ -1,7 +1,5 @@
//! AHash is a high performance keyed hash function.
//!
//! It is a DOS resistant alternative to `FxHash` or a faster alternative to `SipHash`.
//!
//! It quickly provides a high quality hash where the result is not predictable without knowing the Key.
//! AHash works with `HashMap` to hash keys, but without allowing for the possibility that a malicious user can
//! induce a collision.
@@ -11,65 +9,85 @@
//! When it is available aHash uses the hardware AES instructions to provide a keyed hash function.
//! When it is not, aHash falls back on a slightly slower alternative algorithm.
//!
//! AHash does not have a fixed standard for its output. This allows it to improve over time.
//! But this also means that different computers or computers using different versions of ahash will observe different
//! hash values.
//! Because aHash does not have a fixed standard for its output, it is able to improve over time.
//! But this also means that different computers or computers using different versions of ahash may observe different
//! hash values for the same input.
#![cfg_attr(
feature = "std",
all(feature = "std", any(feature = "compile-time-rng", feature = "runtime-rng", feature = "no-rng")),
doc = r##"
# Usage
AHash is a drop in replacement for the default implementation of the Hasher trait. To construct a HashMap using aHash as its hasher do the following:
# Basic Usage
AHash provides an implementation of the [Hasher] trait.
To construct a HashMap using aHash as its hasher do the following:
```
use ahash::{AHasher, RandomState};
use std::collections::HashMap;

let mut map: HashMap<i32, i32, RandomState> = HashMap::default();
map.insert(12, 34);
```
"##
)]
#![cfg_attr(
feature = "std",
doc = r##"
For convenience, both new-type wrappers and type aliases are provided.

The new type wrappers are called `AHashMap` and `AHashSet`.
These do the same thing with slightly less typing. (For convenience `From`, `Into`, and `Deref` are provided).
```
use ahash::AHashMap;
### Randomness

let mut map: AHashMap<i32, i32> = AHashMap::new();
map.insert(12, 34);
```
The above requires a source of randomness to generate keys for the hashmap. By default this is obtained from the OS.
It is also possible to have randomness supplied via the `compile-time-rng` flag, or manually.

### If randomness is not available

For even less typing and better interop with existing libraries which require a `std::collections::HashMap` (such as rayon),
the type aliases [HashMap], [HashSet] are provided. These alias the `std::HashMap` and `std::HashSet` using aHash as the hasher.
[AHasher::default()] can be used to hash using fixed keys. This works with
[BuildHasherDefault](std::hash::BuildHasherDefault). For example:

```
use ahash::{HashMap, HashMapExt};
use std::hash::BuildHasherDefault;
use std::collections::HashMap;
use ahash::AHasher;

let mut map: HashMap<i32, i32> = HashMap::new();
map.insert(12, 34);
let mut m: HashMap<_, _, BuildHasherDefault<AHasher>> = HashMap::default();
# m.insert(12, 34);
```
Note the import of [HashMapExt]. This is needed for the constructor.
It is also possible to instantiate [RandomState] directly:

# Directly hashing
```
use ahash::HashMap;
use ahash::RandomState;

Hashers can also be instantiated with `RandomState`. For example:
let mut m = HashMap::with_hasher(RandomState::with_seed(42));
# m.insert(1, 2);
```
Or for uses besides a hashmap:
```
use std::hash::BuildHasher;
use ahash::RandomState;

let hash_builder = RandomState::with_seed(42);
let hash = hash_builder.hash_one("Some Data");
```
### Randomness
There are several constructors for [RandomState] with different ways to supply seeds.

# Convenience wrappers

To ensure that each map has a unique set of keys aHash needs a source of randomness.
Normally this is just obtained from the OS. (Or via the `compile-time-rng` flag)
For convenience, both new-type wrappers and type aliases are provided.

If for some reason (such as fuzzing) an application wishes to supply all random seeds manually, this can be done via:
[random_state::set_random_source].
The new type wrappers are called `AHashMap` and `AHashSet`.
```
use ahash::AHashMap;

let mut map: AHashMap<i32, i32> = AHashMap::new();
map.insert(12, 34);
```
This avoids the need to type "RandomState". (For convenience `From`, `Into`, and `Deref` are provided).

# Aliases

For even less typing and better interop with existing libraries (such as rayon) which require a `std::collections::HashMap`,
the type aliases [HashMap], [HashSet] are provided.

```
use ahash::{HashMap, HashMapExt};

let mut map: HashMap<i32, i32> = HashMap::new();
map.insert(12, 34);
```
Note the import of [HashMapExt]. This is needed for the constructor.

"##
)]
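
Reviewer note on the "If randomness is not available" section above: the fixed-key behaviour of `BuildHasherDefault` can be illustrated without any external crate. In this sketch `DefaultHasher` from the standard library is an assumed stand-in for `AHasher`, and `FixedState`, `fixed_hash`, and `build_map` are hypothetical names used only for illustration. The point is that every hasher comes from `Default::default()`, so the keys are fixed and the same input hashes to the same value each time within a build.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{BuildHasher, BuildHasherDefault};

/// Stand-in for `BuildHasherDefault<AHasher>`; `DefaultHasher` is used here
/// only so that the sketch needs no external crates.
pub type FixedState = BuildHasherDefault<DefaultHasher>;

/// Hashes `data` with a freshly built fixed-key hasher.
pub fn fixed_hash(data: &str) -> u64 {
    FixedState::default().hash_one(data)
}

/// Builds a map whose hasher uses fixed keys rather than random ones.
pub fn build_map() -> HashMap<i32, i32, FixedState> {
    let mut map = HashMap::default();
    map.insert(12, 34);
    map
}
```

The trade-off is the one the documentation warns about: fixed keys mean the hash values are predictable to anyone who knows the algorithm, so this is only appropriate where DOS resistance is not needed.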
19 changes: 14 additions & 5 deletions src/random_state.rs
@@ -121,9 +121,12 @@ cfg_if::cfg_if! {
///
/// If [set_random_source] is not used, aHash will default to the best available source of randomness.
/// In order this is:
/// 1. OS provided random number generator (available if the `runtime-rng` flag is enabled which it is by default)
/// 2. Strong compile time random numbers used to permute a static "counter". (available if `compile-time-rng` is enabled. __Enabling this is recommended if `runtime-rng` is not possible__)
/// 3. A static counter that adds the memory address of each [RandomState] created permuted with fixed constants. (Similar to above but with fixed keys)
/// 1. OS provided random number generator (available if the `runtime-rng` flag is enabled which it is by default) - This should be very strong.
/// 2. Strong compile time random numbers used to permute a static "counter". (available if `compile-time-rng` is enabled.
/// __Enabling this is recommended if `runtime-rng` is not possible__)
/// 3. A static counter that adds the memory address of each [RandomState] created permuted with fixed constants.
/// (Similar to above but with fixed keys) - This is the weakest option. The strength of this heavily depends on whether or not ASLR is enabled.
/// (Rust enables ASLR by default)
pub trait RandomSource {
fn gen_hasher_seed(&self) -> usize;
}
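
Reviewer note on option 3 in the list above: the idea of a counter advanced by a memory address and then permuted with fixed constants might look roughly like the following. This is a minimal sketch under stated assumptions, not aHash's actual implementation. `CounterSource`, `mix`, the multiplier, and the shift are all hypothetical, and a stack-local address stands in for the address of each `RandomState`. Only the trait shape (`gen_hasher_seed(&self) -> usize`) is taken from the diff.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Same shape as the trait shown in the diff above.
pub trait RandomSource {
    fn gen_hasher_seed(&self) -> usize;
}

/// Hypothetical fallback source: a counter advanced by a memory address.
pub struct CounterSource {
    counter: AtomicUsize,
}

impl CounterSource {
    pub const fn new() -> Self {
        CounterSource { counter: AtomicUsize::new(0) }
    }
}

/// Illustrative permutation with fixed constants (assumed, not aHash's).
/// Multiplying by an odd constant and xor-shifting are both invertible,
/// so distinct counter states always produce distinct seeds.
fn mix(x: usize) -> usize {
    let x = x.wrapping_mul(0x9E37_79B9);
    x ^ (x >> 15)
}

impl RandomSource for CounterSource {
    fn gen_hasher_seed(&self) -> usize {
        // Address of a stack local; it varies between runs only when ASLR
        // is enabled, which is why this option is the weakest.
        let local = 0u8;
        let addr = &local as *const u8 as usize;
        // Force the stride to be odd so the counter always moves.
        let stride = addr | 1;
        let state = self
            .counter
            .fetch_add(stride, Ordering::Relaxed)
            .wrapping_add(stride);
        mix(state)
    }
}
```

Without ASLR every run starts from the same addresses and the same constants, so the seed sequence is reproducible by an attacker, which matches the caveat in the doc comment.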
@@ -207,7 +210,7 @@ cfg_if::cfg_if! {
/// | Constructor | Dynamically random? | Seed |
/// |---------------|---------------------|------|
/// |`new` | Each instance unique|_[RandomSource]_|
/// |`generate_with`| Each instance unique|`u64` x 4 + static counter|
/// |`generate_with`| Each instance unique|`u64` x 4 + [RandomSource]|
/// |`with_seed` | Fixed per process |`u64` + static random number|
/// |`with_seeds` | Fixed |`u64` x 4|
///
@@ -229,7 +232,8 @@ impl RandomState {

/// Create a new `RandomState` `BuildHasher` using random keys.
///
/// (Each instance will have a unique set of keys).
/// Each instance will have a unique set of keys derived from [RandomSource].
///
#[inline]
pub fn new() -> RandomState {
let src = get_src();
@@ -363,6 +367,11 @@ impl RandomState {
/// can be used to create many hashers each of which will have the same keys.)
///
/// This is the same as [RandomState::new()]
///
/// NOTE: For safety this trait impl is only available if either of the flags `runtime-rng` (on by default) or
/// `compile-time-rng` is enabled. This is to prevent weakly keyed maps from being accidentally created. Instead, one of the
/// constructors for [RandomState] must be used.
#[cfg(any(feature = "compile-time-rng", feature = "runtime-rng", feature = "no-rng"))]
impl Default for RandomState {
#[inline]
fn default() -> Self {