 Basic constructions and general results

General model for iterated hash functions

Most unkeyed hash functions h are designed as iterative processes which hash arbitrary- length inputs by processing successive fixed-size blocks of the input, as illustrated in Figure 9.2. Figure 9.2: General model for an iterated hash function.

A hash input x of arbitrary finite length is divided into fixed-length / -bit blocks ж*. This preprocessing typically involves appending extra bits (padding) as necessary to attain an overall bitlength which is a multiple of the blocklength r, and often includes (for security reasons - e.g., see Algorithm 9.26) a block or partial block indicating the bitlength of the unpadded input. Each block ж, then serves as input to an internal fixed-size hash function /, the compression function of h, which computes a new intermediate result of bitlength n for some fixed n, as a function of the previous n-bit intermediate result and the next input block rVi. Letting Я, denote the partial result after stage i, the general process for an iterated hash function with input x = xX2 ... xt can be modeled as follows: II,. | serves as the n-bit chaining variable between stage i — 1 and stage i, and #0 is a pre-defined starting value or initializing value (IV). An optional output transformation д (see Figure 9.2) is used in a final step to map the n-bit chaining variable to an m-bit result g{Ht) g is often the identity mapping g(Ht) = Ht.

Particular hash functions are distinguished by the nature of the preprocessing, compression function, and output transformation.

General constructions and extensions

To begin, an example demonstrating an insecure construction is given. Several secure general constructions are then discussed.

9.23 Example (insecure trivial extension of OWHF to CRHF) In the case that an iterated

OWHF h yielding n-bit hash-values is not collision resistant (e.g., when a 2"/     birthday collision attack is feasible - see §9.7.1) one might propose constructing from h a CRHF using as output the concatenation of the last two гг-bit chaining variables, so that a t-block message has hash-value Ht-iHt rather than Ht. This is insecure as the final message block xt can be held fixed along with Ht, reducing the problem to finding a collision on Ht-1 for h.

Extending compression functions to hash functions

Fact 9.24 states an important relationship between collision resistant compression functions and collision resistant hash functions. Not only can the former be extended to the latter, but this can be done efficiently using Merkle’s meta-method of Algorithm 9.25 (also called the Merkle-Damgard construction). This reduces the problem of finding such a hash function to that of finding such a compression function.

• 9.24 Fact (extending compression functions) Any compression function / which is collision resistant can be extended to a collision resistant hash function h (taking arbitrary length inputs).
• 9.25 Algorithm Merkle’s meta-method for hashing

The proof that the resulting function h is collision resistant follows by a simple argument that a collision for h would imply a collision for / for some stage i. The inclusion of the length-block, which effectively encodes all messages such that no encoded input is the tail end of any other encoded input, is necessary for this reasoning. Adding such a length- block is sometimes called Merkle-Damgard strengthening (MD-strengthening), which is now stated separately for future reference.

9.26 Algorithm MD-strengthening

Before hashing a message x = xX2 ... xt (where x* is a block of bitlength r appropriate for the relevant compression function) of bitlength b, append a final length-block, xt+i, containing the (say) right-justified binary' representation of b. (This presumes b < 2' .)