Unkeyed hash functions (MDCs)

A move from general properties and constructions to specific hash functions is now made, and in this section the subclass of unkeyed hash functions known as modification detection codes (MDCs) is considered. From a structural viewpoint, these may be categorized based on the nature of the operations comprising their internal compression functions. From this viewpoint, the three broadest categories of iterated hash functions studied to date are hash functions based on block ciphers, customized hash functions, and hash functions based on modular arithmetic. Customized hash functions are those designed specifically for hashing, with speed in mind and independent of other system subcomponents (e.g., block cipher or modular multiplication subcomponents which may already be present for non-hashing purposes).

Table 9.3 summarizes the conjectured security of a subset of the MDCs subsequently discussed in this section. Similar to the case of block ciphers for encryption (e.g. 8- or 12- round DES vs. 16-round DES), security of MDCs often comes at the expense of speed, and tradeoffs are typically made. In the particular case of block-cipher-based MDCs, a provably secure scheme of Merkle (see page 378) with rate 0.276 (see Definition 9.40) is known but little-used, while MDC-2 is widely believed to be (but not provably) secure, has rate = 0.5, and receives much greater attention in practice.

Hash functions based on block ciphers

A practical motivation for constructing hash functions from block ciphers is that if an efficient implementation of a block cipher is already available within a system (either in hardware or software), then using it as the central component for a hash function may provide

(Hash function










for keylength = n

MDC-2 (with DES)b



2 • 282

2 • 254

rate 0.5

MDC-4 (with DES)




4 • 254

rate 0.25

Merkle (with DES)




2 50

rate 0.276





2 20

Remark 9.50





2 04

Remark 9.52





2 04






2 80


“The same strength is conjectured for Davies-Meyer and Miyaguchi-Preneel hash functions. ^Strength could be increased using a cipher with keylength equal to cipher blocklength.

Table 9.3: Upper bounds on strength of selected hash functions, n-bit message blocks are processed to produce m-bit hash-values. Number of cipher or compression function operations currently believed necessary to find preimages and collisions are specified, assuming no underlying weabtesses for block ciphers (figures for MDC-2 and MDC-4 account for DES complementation and weak key properties). Regarding rate, see Definition 9.40.

the latter functionality at little additional cost. The (not always well-founded) hope is that a good block cipher may serve as a building block for the creation of a hash function with properties suitable for various applications.

Constructions for hash functions have been given which are “provably secure” assuming certain ideal properties of the underlying block cipher. However, block ciphers do not possess the properties of random functions (for example, they are invertible - see Remark 9.14). Moreover, in practice block ciphers typically exhibit additional regularities or weaknesses (see §9.7.4). For example, for a block cipher E, double encryption using an encrypt-decrypt (E-D) cascade with keys K, K2 results in the identity mapping when Кi = K2. In summary, while various necessary conditions are known, it is unclear exactly what requirements of a block cipher are sufficient to construct a secure hash function, and properties adequate for a block cipher (e.g., resistance to chosen-text attack) may not guarantee a good hash function.

In the constructions which follow, Definition 9.39 is used.

9.39 Definition An (n,r) block cipher is a block cipher defining an invertible function from n-bit plaintexts to n-bit ciphertexts using an r-bit key. If E is such a cipher, then Ek{x) denotes the encryption of x under key к.

Discussion of hash functions constructed from n-bit block ciphers is divided between those producing single-length (n-bit) and double-length (2n-bit) hash-values, where single and double are relative to the size of the block cipher output. Under the assumption that computations of 264 operations are infeasible,[1] the objective of single-length hash functions is to provide a OWHF for ciphers of blocklength near n = 64, or to provide CRHFs for cipher blocklengths near n = 128. The motivation for double-length hash functions is that many n-bit block ciphers exist of size approximately n = 64, and single-length hash-codes of this size are not collision resistant. For such ciphers, the goal is to obtain hash-codes of bitlength 2n which are CRHFs.

In the simplest case, the size of the key used in such hash functions is approximately the same as the blocklength of the cipher (i.e., n bits). In other cases, hash functions use larger (e.g., double-length) keys. Another characteristic to be noted in such hash functions is the number of block cipher operations required to produce a hash output of blocklength equal to that of the cipher, motivating the following definition.

9.40 Definition Let h be an iterated hash function constructed from a block cipher, with compression function / which perfonns s block encryptions to process each successive n-bit message block. Then the rate of h is 1/s.

The hash functions discussed in this section are summarized in Table 9.4. The Matyas- Meyer-Oseas and MDC-2 algorithms are the basis, respectively, of the two generic hash functions in ISO standard 10118-2, each allowing use of any n-bit block cipher E and providing hash-codes of bitlength m < n and m < 2n, respectively.

Hash function

(n, к, m)


Matyas-Meyer-Oseas Davies-Meyer Miyaguchi-Preneel MDC-2 (with DES) MDC-4 (with DES)

  • (n, к, n) (n, к, n) (n, к, n)
  • (64.56.128)
  • (64.56.128)



  • 1
  • 1/2
  • 1/4

Table 9.4: Summary of selected hash functions based on n-bit block ciphers, k = key bitsize (approximate): function yields m-bit hash-values.

(i) Single-length MDCs of rate 1

The first three schemes described below, and illustrated in Figure 9.3, are closely related single-length hash functions based on block ciphers. These make use of the following predefined components:

  • 1. a generic n-bit block cipher Ek parametrized by a symmetric key К;
  • 2. a function g which maps n-bit inputs to keys К suitable for E (if keys for E are also of length n, g might be the identity function); and
  • 3. a fixed (usually n-bit) initial value IV, suitable for use with E.
Three single-length, rate-one MDCs based on block ciphers

Figure 9.3: Three single-length, rate-one MDCs based on block ciphers.

9.41 Algorithm Matyas-Meyer-Oseas hash

INPUT: bitstring x.

OUTPUT: n-bit hash-code of x.

  • 1. Input x is divided into n-bit blocks and padded, if necessary, to complete last block. Denote the padded message consisting of t n-bit blocks: xX2 ■ ■. xt. A constant n- bit initial value IV must be pre-specified.
  • 2. The output is Ht defined by: Ho = IV II, = Eg(Hi-i){xi)(&Xi, 1 < i < t.
  • 9.42 Algorithm Davies-Meyer hash

INPUT: bitstring x.

OUTPUT: n-bit hash-code of x.

  • 1. Input x is divided into Л -bit blocks where к is the keysize, and padded, if necessary, to complete last block. Denote the padded message consisting of t А -bit blocks: xX2 ... xt. A constant n-bit initial value IV must be pre-specified.
  • 2. The output is Ht defined by: //() = IV II, = EXi (#;_1)ф#,_1. 1 < i < t.
  • 9.43 Algorithm Miyaguchi-Preneel hash

This scheme is identical to that of Algorithm 9.41, except the output //,_ i from the previous stage is also XORed to that of the current stage. More precisely, H, is redefined as: #0 =

IV: Hi = Eg(Hi_1)(xi)(Bxi(bHi-i, 1 < i < t.

  • 9.44 Remark (dual schemes) The Davies-Meyer hash may be viewed as the ‘dual’ of the Mat- yas-Meyer-Oseas hash, in the sense that хг and i play reversed roles. When DES is used as the block cipher in Davies-Meyer, the input is processed in 56-bit blocks (yielding rate 56/64 < 1), whereas Matyas-Meyer-Oseas and Miyaguchi-Preneel process 64-bit blocks.
  • 9.45 Remark (black-box security) Aside from heuristic arguments as given in Example 9.13, it appears that all three of Algorithms 9.41, 9.42, and 9.43 yield hash functions which are provably secure under an appropriate “black-box” model (e.g., assuming E has the required randomness properties, and that attacks may not make use of any special properties or internal details of E). “Secure” here means that finding preimages and collisions (in fact, pseudo-preimages and pseudo-collisions - see §9.7.2) require on the order of 2" and 2n/2 n-bit block cipher operations, respectively. Due to their single-length nature, none of these three is collision resistant for underlying ciphers of relatively small blocklength (e.g., DES, which yields 64-bit hash-codes).

Several double-length hash functions based on block ciphers are considered next.

(ii) Double-length MDCs: MDC-2 and MDC-4

MDC-2 and MDC-4 are manipulation detection codes requiring 2 and 4, respectively, block cipher operations per block of hash input. They employ a combination of either 2 or 4 iterations of the Matyas-Meyer-Oseas (single-length) scheme to produce a double-length hash. When used as originally specified, using DES as the underlying block cipher, they produce 128-bit hash-codes. The general construction, however, can be used with other block ciphers. MDC-2 and MDC-4 make use of the following pre-specified components:

  • 1. DES as the block cipher Ek of bitlength n = 64 parameterized by a 56-bit key К;
  • 2. two functions g and g which map 64-bit values U to suitable 56-bit DES keys as follows. For U = UU2 . ■ ■ u64, delete every' eighth bit starting with щ, and set the 2nd and 3rd bits to ‘10’ for g, and ‘0Г for g:

(The resulting values are guaranteed not to be weak or semi-weak DES keys, as all such keys have bit 2 = bit 3; see page 375. Also, this guarantees the security requirement that g(IV) ± g(IV).)

MDC-2 is specified in Algorithm 9.46 and illustrated in Figure 9.4.

Compression function of MDC-2 hash function. E = DES

Figure 9.4: Compression function of MDC-2 hash function. E = DES.

9.46 Algorithm MDC-2 hash function (DES-based)

INPUT: string x of bitlength r = 64f for t > 2.

OUTPUT: 128-bit hash-code of x.

  • 1. Partition x into 64-bit blocks xp. x = xX2 . ■ .xt.
  • 2. Choose the 64-bit non-secret constants IV, IV (the same constants must be used for MDC verification) from a set of recommended prescribed values. A default set of prescribed values is (in hexadecimal): IV = 0x52 52 52 52 52 52 52 52, IV = 0x2525252525252525.

3. Let || denote concatenation, and Cl, C[{ the left and right 32-bit halves of C,. The output is h(x) = Ht || Ht defined as follows (for 1 < i < t):

hi Algorithm 9.46, padding may be necessary to meet the bitlength constraint on the input x. In this case, an unambiguous padding method may be used (see Remark 9.31), possibly including MD-strengthening (see Remark 9.32).

MDC-4 (see Algorithm 9.47 and Figure 9.5) is constructed using the MDC-2 compression function. One iteration of the MDC-4 compression function consists of two sequential executions of the MDC-2 compression function, where:

  • 1. the two 64-bit data inputs to the first MDC-2 compression are both the same next 64-bit message block;
  • 2. the keys for the first MDC-2 compression are derived from the outputs (chaining variables) of the previous MDC-4 compression;
  • 3. the keys for the second MDC-2 compression are derived from the outputs (chaining variables) of the first MDC-2 compression; and
  • 4. the two 64-bit data inputs for the second MDC-2 compression are the outputs (chaining variables) from the opposite sides of the previous MDC-4 compression.
  • 9.47 Algorithm MDC-4 hash function (DES-based)

INPUT: string x of bitlength r = 64i for t > 2. (See MDC-2 above regarding padding.) OUTPUT: 128-bit hash-code of x.

  • 1. As in step 1 of MDC-2 above.
  • 2. As in step 2 of MDC-2 above.
  • 3. With notation as in MDC-2, the output is h(x) = Gt || Gt defined as follows (for

1 < i < t):

  • [1] The discussion here is easily altered for a more conservative bound, e.g., 280 operations as used in §9.3.5.Here '264 is more convenient for discussion, due to the omnipresence of 64-bit block ciphers.
< Prev   CONTENTS   Source   Next >