Skip to content

How aHash is resists DOS attacks

Tom Kaitchuck edited this page Jan 20, 2021 · 5 revisions

Overview

Previous attacks on hashmaps, such as from malware written and distributed by the authors of SipHash to attack Murmur hash have relied on several steps to make the attack effective:

  1. Reversing "premixing" so that only a fixed bit can be changed.
  2. Targeting a differential to change one bit that will not subsequently mix in a way that affects other bits.
  3. Sending a second difference in input which cancels out the effect the earlier change had on the intermediate state.
  4. Chaining many such sequences to turn a pair of colliding values into a vast number.

aHash is designed to prevent each of these steps.

Avoiding reversible premixing

When mixing data in a hash function, it is desirable to not prematurely throw away bits, because to do so is to loose entropy and will inevitably lead to collisions. This however has an unfortunate consequence, it means that such operations are (at least in principal) reversible. If by no other way, then by building up a table of what every input maps to.

For example murmurHash did the following to mix new input in the variable data:

        data = data.wrapping_mul(0xc6a4a7935bd1e995);
        data = data ^ (data >> 47);
        data = data.wrapping_mul(0xc6a4a7935bd1e995);
        state ^= data;

This is undoubtedly a great mixing function, it doesn't throw away bits, most bits end up being affect by most others, it is portable, and even fairly fast. The problem it that it is not at all key dependent. So someone just wrote some code to reverse this first part, then it is easy to control exactly what bits in state are changed. If murmur hash had reversed the order of these operations. IE:

        state ^= data;
        state = state.wrapping_mul(0xc6a4a7935bd1e995);
        state = state ^ (state >> 47);
        state = state.wrapping_mul(0xc6a4a7935bd1e995);

The attacks used against it would not have been possible because the mixing is done after the data is combined with information the attacker does not know.

aHash avoids this by always immediately combining the incoming data with something derived from a secret key to prevent this problem.

Additionally aHash has the advantage that it is explicitly not committed to a fixed standard. If such a bug were found in it, it would be immediately patched, where as with most hashing algorithms, we would just be stuck with the vulnerability until all code is ported to a new hashing algorithm.

Preventing chaining

The final stage in the SipHash's authors' DDOS attack on services running murmur hash was to 'chain' pairs of sequences which collided with each other to produce many values which all have the same hash.

In the code above you can see exactly why murmur was vulnerable to this, the intermediate state and the output were of the same length. So if there was a collision in the output, then the pair of inputs which produced the collision could be collected. Then these colliding pairs could be combined with each other to produce exponentially more collisions.

aHash avoids this by having an intermediate state which is significantly larger than its output. So if pairs of inputs are found that produce a collision in the output hash, they cannot be combined to produce additional collisions.

Ultimately this means that even if collisions can be found for aHash, they will not be easily weaponized.

Differential analysis

The middle two and most important steps in an attack are based on Differential analysis. The basic idea is, given two inputs which are identical except in one particular place, track the subsequent changes that occur. In the case of hashing algorithms this is done to try to a way to provide input that will affect only a small number of bits in the intermediate state, in the hope that this can lead to a collision later.

Lets walk through such an attack. Below is a diagram of the starting state of the hasher. It consists of 2 u128 values which are randomized and assumed to be unknown to an attacker.

Empty array

In the first step new block of data (which may be up to 128 bits) is xored into the top u128 and added into the lower one. So there will be a modification to two of the 32 bytes in the state. (possibly more if the addition carries)

Step 1

This immediately shows the first aspect of aHash's strategy which is that the input provided to the hasher is always used twice and in different ways. This ensures that if there ends up being an attack against one half of the state, it would not affect the other half of the state. (This provides resistance to Integral cryptanalysis)

There is no location or input that can only effect a single bit or byte of the intermediate state.

The next step is that "upper" portion of the state is fed through the AES 'SubBytes', 'ShiftRows', and 'MixColumns' steps. This is a single CPU instruction thanks to aes-ni.

Step 2

This has two important effects, first because the supplied data was xored with an unknown value, the result of the SubBytes step make the resulting output unpredictable. (Through there are still only 256 possibilities for that byte). Then the ShiftRows and MixColumns steps mix the state. In effect each input byte difference will end up affecting 4 adjacent bytes in the output.

Meanwhile (Actually in parallel thanks to using different CPU ports) the other half of the input state is shuffled according to this pattern:

Step 3

This is not random. In particular the pattern has the property that any differential regardless of its location will not be moved into a position that corresponds to one of the locations where where the differential would affect the AES output. (IE: in the diagram above there is no location where the bright red square will align vertically with the part of the dark red box.)

At this point the next block of input is added just as the first one was. However there is no input which can cancel out the previous differential because whatever input that is supplied will affect both the upper and lower parts of the state at the same offsets. But there are differences at non-overlapping offsets. So even if the keys were known, there is no input that can be supplied which alters the upper portion of the state in a way so at to cancel out the change but does not have a corresponding change in the lower portion of the state.

Given this an attacker would at best be forced to have the next block of input following the initial differential be identical and hope to force a collision later. After that the state would look like this:

step 4

Now the differential is fully defused across all bits of the upper portion of the state, and every bit in the upper portion of the state is dependent on every bit in the initial input. So there is no way to 'cancel' out the differential by any way short of guessing or knowing the key for the upper portion of the state. Furthermore there is at most a 1 in 256 chance that a such a pattern even exists at this stage.

This is certainly encouraging, but the situation is not yet air-tight. If the attacker were to observe the state at this point, the AES operations could be reversed and the original key obtained. Once this was known it would be possible to force a collision. But of course we ensure that this intermediate state cannot be observed.

To do this we perform finalization. This consists of first running a round of AES decryption (inverse 'MixColumns', 'ShiftRows', and 'SubBytes') on the lower half of the state. This spreads the any lingering differential from the last step out, across the lower half of the state. (Decryption is chosen because it won't collide with the encryption just performed on the upper portion of the state). Then the two halves of the state are xored together. This combined value is saved and sent through two more rounds of AES encryption the first using a new (never before used) random value as the key, and the second using the previous combined state. (The combined state is used rather than a fixed key because it prevents to "square attack" style attempts to obtain the key used in the prior round)

Finally the result is truncated to 64bits.

If we trace a differential through this finalize stage. Suppose we have the following differential going into the finalization:

Finalize 1

After the decryption we would have:

Finalize 2

Then the two halves are xored together:

Finalize 3

This is run through an encryption round and xored with a secret key:

Finalize 4

It is run through another encryption round and xored with the input from earlier:

Finalize 5

Finally the result is truncated:

Finalize 6

When the truncation is performed each of the remaining bytes is dependent on a 4 bytes of output from the previous step (After being combined with the key). Because before the final encryption round any differential in the input is already defused over the whole of the output, the truncate is throwing away half the data but each remaining bit is dependent on the whole of the data. This is not feasible to reverse because half of the information is simply missing.

One might wonder about the feasibility of getting values to collide "by chance". Here partial collisions are prevented by AES's high quality mixing and non-linear s-box. But AES has a far more important property: It is reversible. Because AES can be decrypted, for a given key each plain test will have exactly one encrypted text and vice versa. So for a given sequence of input, there should be a unique 'upper' and 'lower' halves of state going into the finalization. Similarly it can be state that prior to the final truncation there can be no collisions between two different upper states for the same lower state and vice versa. So an canceling out a differential in one but not the other can't be used as a means to produce collisions.

Finally this means for any given combined state in the finalization and 64 bit hash output, there are exactly 2^64 keys in the final step which could have produced that output. Similarly for any given key and hash output there are exactly 2^64 combined states which could have resulted in the output. (Of course an attacker wouldn't know even this much, because it isn't just the 128 bit key at the end which is unknown, the initial values of the 256bit state are unknown, and at no point prior to finalization are any of that entropy lost.)