Customized hash functions based on MD4
Customized hash functions are those which are specifically designed “from scratch” for the explicit purpose of hashing, with optimized performance in mind, and without bemg constrained to reusing existing system components such as block ciphers or modular arithmetic. Those havmg received the greatest attention in practice are based on the MD4 hash function.
Number 4 in a series of hash functions (Message Digest algorithms), MD4 was designed specifically for software implementation on 32bit machines. Security concerns motivated the design of MD5 shortly thereafter, as a more conservative variation of MD4.
Figure 9.5: Compression function ofMDC4 hash function
Other important subsequent variants include the Secure Hash Algorithm (SHA1), the hash function RIPEMD, and its strengthened variants RIPEMD128 and RIPEMD160. Parameters for these hash functions are summarized in Table 9.5. “Rounds x Steps per round” refers to operations performed on input blocks within the corresponding compression function. Table 9.6 specifies test vectors for a subset of these hash functions.
Notation for description of MD4family algorithms
Table 9.7 defines the notation for the description of MD4family algorithms described below. Note 9.48 addresses the implementation issue of converting strings of bytes to words in an unambiguous manner.
 9.48 Note (littleendian vs. bigendian) For interoperable implementations involving byteto word conversions on different processors (e.g., converting between 32bit words and groups of four 8bit bytes), an unambiguous convention must be specified. Consider a stream of bytes B, with increasing memory addresses i, to be interpreted as a 32bit word with numerical value W. In littleendian architectures, the byte with the lowest memory address (Si) is the least significant byte: W = 2^{24}B_{4} + 2^{1G}S_{3} + 2^{&}B2 + B. In bigendian architectures, the byte with the lowest address (Si) is the most significant byte: W = 2^{24}Bi + 2^{1C}B_{2} + 2 ^{8}B_{3} + B_{4}.
 (i) MD4
MD4 (Algorithm 9.49) is a 128bit hash function. The original MD4 design goals were that breaking it should require roughly bruteforce effort: finding distinct messages with the same hashvalue should take about 2^{G4} operations, and finding a message yielding a
Name 
Bitlength 
Rounds x Steps per round 
Relative speed 
MD4 
128 
3 x 16 
1.00 
MD5 
128 
4 x 16 
0.68 
RIPEMD128 
128 
4x16 twice (in parallel) 
0.39 
SHA1 
160 
4 x 20 
0.28 
RIPEMD160 
160 
5x16 twice (in parallel) 
0.24 
Table 9.5: Summary of selected hash functions based on MD4.
Name 
String 
Hash value (as a hex byte string) 
MD4 
**55 “a” “abc” “abcdefghijklnmopqrstuvwxyz” 
3 Id6cfe0dl6ae93 Ib73c59d7e0c089c0 bde52cb3Ide33e46245e05fbdbd6fb24 a448017aaf21d8525fcl0ae87aa6729d d79e1c308aa5bbcdeea8ed63df412da9 
MD5 
“a” “abc” “abcdefghijklnmopqrstuvwxyz” 
d41 d8cd98f00b204e9800998ecf8427e 0ccl75b9c0flb6a831c399e269772661 900150983cd24fb0d6963f7 d28e 17f72 c3fcd3d76192e4007dfb496cca67e13b 
SHA1 
<*55 “a” “abc” “abcdefghijklnmopqrstuvwxyz” 
da39a3ee5e6b4b0d3255bfef95601890afd80709 86f7e437faa5a7fcel5dlddcb9eaeaea377667b8 a9993e364706816aba3e25717850c26c9cd0d89d 32dl0c7b8cf96570ca04ce37f2al9d84240d3a89 
RIPEMD160 
<*55 “a” “abc” “abcdefghijklnmopqrstuvwxyz” 
f71c27109c692clb56bbdceb5b9d2865b3708dbc 
Table 9.6: Test vectors for selected hash functions.
Notation 
Meaning 
variables representing 32bit quantities hexadecimal 32bit integer (least significant byte: 01) addition modulo 2^{32 }bitwise complement result of rotating и left through s positions bitwise AND bitwise inclusiveOR bitwise exclusiveOR uv V йк uv V uw V vw ифуфю 

simultaneous assignments (Xi <— Yi), where (Yi,... , Yj) is evaluated prior to any assignments 
Table 9.7: Notation for MD4family algorithms.
prespecified hashvalue about 2^{128} operations. It is now known that MD4 fails to meet this goal (Remark 9.50). Nonetheless, a full description of MD4 is included as Algorithm 9.49 for historical and cryptanalytic reference. It also serves as a convenient reference for describing, and allowing comparisons between, other hash functions in this family.
9.49 Algorithm MD4 hash function
INPUT: bitstring x of arbitrary bitlength 6 > 0. (For notation see Table 9.7.)
OUTPUT: 128bit hashcode of x. (See Table 9.6 for test vectors.)
1. Definition of constants. Define four 32bit initial chaining values (IVs):
hi = 0x67452301, /12 = 0xefcdab89, h_{3} = 0x98badcfe, /14 = 0x10325476.
Define additive 32bit constants: y[j] = 0,0 < j < 15;
j/[j] = 0x5a827999,16 < j < 31; (constant = squareroot of 2) j/[j] = 0x6ed9ebal, 32 < j < 47; (constant = squareroot of 3)
Define order for accessing source words (each list contains 0 through 15):
.г[0..15] = [0,1.2,3,4,5,6,7,8,9,10.11.12.13.14.15],
2[16..31] = [0,4,8,12,1,5,9,13,2,6,10,14,3,7,11,15], г[32..47] = [0,8,4,12,2,10,6,14,1,9,5,13,3,11,7,15].
Finally define the number of bit positions for left shifts (rotates): s[0..15] = [3,7,11,19,3,7,11,19,3,7,11,19,3,7,11,19], s[16..31] = [3,5,9,13,3,5,9,13,3,5,9,13,3,5,9,13], s[32..47] = [3,9,11,15,3,9,11,15,3,9,11,15,3,9.11.15].
 2. Preprocessing. Pad x such that its bitlength is a multiple of 512, as follows. Append a single 1bit, then append r — 1 (> 0) 0bits for the smallest r resulting in a bitlength 64 less than a multiple of 512. Finally append the 64bit representation of b mod 2^{G4}, as two 32bit words with least significant word first. (Regarding converting between streams of bytes and 32bit words, the convention is littleendian; see Note 9.48.) Let m be the number of 512bit blocks in the resulting string (6 + r + 64 = 512m = 32 ■ 16m). The formatted input consists of 16m 32bit words: xoxi... aiicmi • Initialize: (Hi, H_{2}, ff_{3}, Hf) i (hi,h_{2}, /13, h_{4}).
 3. Processing. For each i from 0 to m  1, copy the i^{th} block of 16 32bit words into temporary storage: Xj] t— хш+j, 0 < j < 15, then process these as below in three 16step rounds before updating the chaining variables:
 (initialize working variables) (А, В, C, D) t— (Hi, Я_{2}, H_{3}, I If.
 (Round 1) For j from 0 to 15 do the following:
t < (A + f(B, C, D) + X[z[j}} + yj]), (А. В, C, D) <(D,t sj], В, C).
(Round 2) For j from 16 to 31 do the following:
t < (A + д(В, C. D) + X[zj}} + y[j]). (А В, C, D) < (D,t ^ sj],B, C).
(Round 3) For j from 32 to 47 do the folio whig:
t < (A + h(B, C, D) + X[z[j}} + y[j]), (А, В, C, D) < (D,t ^ s[j],B, C). (update chaining values) (Яь Я_{2}. H_{3}, Hf) < (Hi +A, H_{2} + B, H_{3} + C, II +1)).
 4. Completion. The final hashvalue is the concatenation: ff_{1}ff_{2}#_{3}#_{4 }(with first and last bytes the low and highorder bytes of Hi, H_{4}, respectively).
 9.50 Remark (MD4 collisions) Collisions have been found for MD4 in 2^{20} compression function computations (cf. Table 9.3). For this reason, MD4 is no longer recommended for use as a collisionresistant hash function. While its utility as a oneway function has not been studied in light of this result, it is prudent to expect a preimage attack on MD4 requiring fewer than 2^{128} operations will be found.
(ii) MD5
MD5 (Algorithm 9.51) was designed as a strengthened version of MD4, prior to actual MD4 collisions being found. It has enjoyed widespread use in practice. It has also now been found to have weaknesses (Remark 9.52).
The changes made to obtam MD5 from MD4 are as follows:
 1. addition of a fourth round of 16 steps, and a Round 4 function
 2. replacement of the Round 2 function by a new function
 3. modification of the access order for message words in Rounds 2 and 3
 4. modification of the shift amounts (such that shifts differ in distinct rounds)
 5. use of unique additive constants in each of the 4x16 steps, based on the integer part of 2^{32} • sin(j) for step j (requiring overall, 256 bytes of storage)
 6. addition of output from the previous step into each of the 64 steps.
 9.51 Algorithm MD5 hash function
INPUT: bitstring x of arbitrary bitlength 6 > 0. (For notation, see Table 9.7.)
OUTPUT: 128bit hashcode of x. (See Table 9.6 for test vectors.)
MD5 is obtained from MD4 by making the following changes.
1. Notation. Replace the Round 2 function by: g(u,v,w)^{d}= uw V vw.
Define a Round 4 function: k(u, v, tv) '= v ® (и V w).
2. Definition of constants. Redefine unique additive constants:
yj = first 32 bits of binary value abs(sin(j +1)), 0 < j <63, where j is in radians and “abs” denotes absolute value. Redefine access order for words in Rounds 2 and 3, and define for Round 4:
z[16..31] = [1,6,11,0,5,10,15,4,9,14,3,8,13,2,7,12], z[32..47] = [5,8,11,14,1,4,7,10,13,0,3,6,9,12,15,2], z[48..63] = [0,7,14,5,12,3,10,1,8,15,6,13,4.11,2,9].
Redefine number of bit positions for left shifts (rotates): s[0..15] = [7,12,17,22,7,12.17,22,7,12,17,22,7,12,17,22], s[16..31] = [5,9,14,20,5,9,14,20,5,9,14,20,5,9.14,20], s[32..47] = [4,11,16,23,4,11,16,23,4,11,16,23,4,11,16,23], s[48..63] = [6,10,15,21,6.10,15,21,6,10,15,21,6,10,15,21].
 3. Preprocessing. As in MD4.
 4. Processing. In each of Rounds 1, 2, and 3, replace “J5 (t ^ s[j])” by “B < В [t t—^ s [)])”. Also, immediately following Round 3 add:
 (Round4) For j from 48 to 63 do the following:
t < (A+k(B,C,D)+X[zj}]+yj}),(A,B,C,D) < (Д B+(t s[j}), В. C).
 5. Completion. As in MD4.
 9.52 Remark (MD5 compression function collisions) While no collisions for MD5 have yet been found (cf. Table 9.3), collisions have been found for the MD5 compression function. More specifically, these are called collisions for random IV. (See §9.7.2, and in particular Definition 9.97 and Note 9.98.)
(iii) SHA1
The Secure Hash Algorithm (SHA1), based on MD4, was proposed by the U.S. National Institute for Standards and Technology (NIST) for certain U.S. federal government applications. The main differences of SHA1 from MD4 are as follows:
 1. The hashvalue is 160 bits, and five (vs. four) 32bit chaining variables are used.
 2. The compression function has four rounds instead of three, using the MD4 step functions /, g, and h as follows: / in the first, g in the third, and h in both the second and fourth rounds. Each round has 20 steps instead of 16.
 3. Within the compression function, each 16word message block is expanded to an 80 word block, by a process whereby each of the last 64 of the 80 words is the XOR of 4 words from earlier positions in the expanded block. These 80 words are then input onewordperstep to the 80 steps.
 4. The core step is modified as follows: the only rotate used is a constant 5bit rotate; the fifth working variable is added into each step result; message words from the expanded message block are accessed sequentially; and C is updated as В rotated left 30 bits, rather than simply B.
 5. SHA1 uses four nonzero additive constants, whereas MD4 used three constants only two of which were nonzero.
The byte ordering used for converting between streams of bytes and 32bit words in the official SHA1 specification is bigendian (see Note 9.48); this differs from MD4 which is littleendian.
9.53 Algorithm Secure Hash Algorithm  revised (SHA1)
INPUT: bitstring x of bitlength b > 0. (For notation, see Table 9.7.)
OUTPUT: 160bit hashcode of x. (See Table 9.6 for test vectors.)
SHA1 is defined (with reference to MD4) by making the following changes.
 1. Notation. AsinMD4.
 2. Definition of constants. Define a fifth IV to match those in MD4: /i_{5} = 0xc3d2elf0. Define perround integer additive constants: y = 0x5a827999, y2 = 0x6ed9ebal, ■уз = 0x8flbbcdc, ?/4 = 0xca62cld6. (No order for accessing source words, or specification of bit positions for left shifts is required.)
 3. Overall preprocessing. Pad as in MD4, except the final two 32bit words specifying the bitlength b is appended with most significant word preceding least significant. As in MD4, the formatted input is 16rn 32bit words: xqx ... xwmi Initialize chaining variables: (Hi, H_{2}, H_{3}, Я_{4}, Hr,) < (hi,h_{2}, h_{3}, ii_{4}. hr,).
 4. Processing. For each i from 0 to m  1, copy the i^{th} block of sixteen 32bit words into temporary storage: X[j] t— ziei+j, 0 < j < 15, and process these as below in four 20step rounds before updating the chaining variables:
 (expand 16word block into 80word block; let Xj denote X[j]) for j from 16 to 79, Xj t— (( Xj_3®.Xj_g®.X7j_i4®.Xj_i6 ) i—^{5} 1).
 (initialize working variables) (A, В. C, D, E) <— (H, H_{2}. ff_{3}, Н_{л}. Hr,).
 (Round 1) For j from 0 to 19 do the following: t <— ((A 5) + f(B,C, D) + E + Xj + 2/1),
 (А, В, 67, E), E) — (t, A, В i—^{5} 30, 67, D).
 (Round 2) For j from 20 to 39 do the following: t t— ((A t—’ 5) + h(B, C. D) + E + Xj + ?/2)>
 (A, В, C, D, E) — (t, A, В i—^{5} 30,67, D).
 (.Round 5) For j from 40 to 59 do the following: t 4— ((A p5) + g(B.C. D) + E + Xj + у$),
 (A. 13. C. D. E) — (t. A. В 4—^{>} 30, С. B).
 (Round 4) For j from 60 to 79 do the folio whig: t i— ((A t—’ 5) + h(B, C. D) + E + Xj + 3/4),
 (A. В. (3. В. E) 4— (f, A. В 4—^{5} 30, C. B).
 (update chaining values)
 (Я_{ь}я_{2}. Яз, Я_{4}, Ни) < (Н_{1} + А.Н_{2} + В. Яз + С, Я_{4} + А Я_{5} + Е).
 5. Completion. The hashvalue is: Hi 11Я_{2}11Я_{3}11Я111 Hr,
 (with first and last bytes the high and loworder bytes of Я_{ь} Я_{3}, respectively).
 9.54 Remark (security' of.SHA1) Compared to 128bit hash functions, the 160bit hashvalue of SHA1 provides increased security against bruteforce attacks. SHA1 and RIPEMD 160 (see §9.4.2(iv)) presently appear to be of comparable strength; both are considered stronger than MD5 (Remark 9.52). In SHA1, a significant effect of the expansion of 16 word message blocks to 80 words in the compression function is that any two distinct 16 word blocks yield 80word values which differ in a larger number of bit positions, significantly expanding the number of bit differences among message words input to the compression function. The redundancy added by this preprocessing evidently adds strength.
 (iv) RIPEMD160
RIPEMD160 (Algorithm 9.55) is a hash function based on MD4, taking into account knowledge gamed in the analysis of MD4, MD5, and RIPEMD. The overall RIPEMD160 compression function maps 21word inputs (5word chaining variable plus 16word message block, with 32bit words) to 5word outputs. Each input block is processed in parallel by distinct versions (the left line and right line) of the compression function. The 160bit outputs of the separate lines are combined to give a single 160bit output.
Notation 
Definition 
Table 9.8: RIPEMD160 round function definitions.
The RIPEMD160 compression function differs from MD4 in the number of words of chaining variable, the number of rounds, the round functions themselves (Table 9.8), the order in which the input words are accessed, and the amounts by which results are rotated. The left and and right computation lines differ from each other in these last two items, in their additive constants, and in the order in which the round functions are applied. This design is intended to improve resistance against known attack strategies. Each of the parallel lines uses the same IV as SHA1. When writing the IV as a bitstring, littleendian ordering is used for RIPEMD160 as in MD4 (vs. bigendian in SHA1; see Note 9.48).
9.55 Algorithm RIPEMD160 hash function INPUT: bitstring x of bitlength b > 0.
OUTPUT: 160bit hashcode of x. (See Table 9.6 for test vectors.)
RIPEMD160 is defined (with reference to MD4) by making the following changes.
 1. Notation. See Table 9.7, with MD4 round functions /, g, h redefined per Table 9.8 (which also defines the new round functions k, l).
 2. Definition of constants. Define a fifth IV: /i_{5} = 0xc3d2elf0. In addition:
 (a) Use the MD4 additive constants for the left line, renamed: yij} = 0, 0 < j < 15; y_{L}j] = 0x5a827999,16 < j < 31; y_{L}[j] = 0x6ed9ebal, 32 < j < 47. Define two further constants (square roots of 5,7): ytfj] = 0x8flbbcdc, 48 < j < 63; yb[j] = 0xa953fd4e, 64 < j < 79.
 (b) Define five new additive constants for the right line (cube roots of 2,3,5,7): y_{R}[j] = 0x50a28be6,0 < j < 15; y_{R}[j] = 0x5c4ddl24,16 < j < 31; y_{R}j = 0x6d703ef3, 32 < j < 47; y_{R}[j] = 0x7a6d76e9, 48 < j < 63; VRlj] = 0, 64 < j < 79.
 (c) See Table 9.9 for constants for step j of the compression function: zi[j], z_{R}[j} specify the access order for source words hi the left and right lines; sy[j], s_{R} [j] the number of bit positions for rotates (see below).
 3. Preprocessing. As in MD4, with addition of a fifth chaining variable: Hr, < h_{5}.
 4. Processing. For each i from 0 to m  1, copy the i^{th} block of sixteen 32bit words into temporary storage: X[j] < хш+j, 0 < j < 15. Then:
 (a) Execute five 16step rounds of the left line as follows:
 (A_{L}, B_{l}, C_{l}. D_{l}, E_{l}) < (H_{X}.H_{2}. H_{3}, H_{4}, H_{5}).
ileft Round 1) For j from 0 to 15 do the following: t « (A_{l} + f(B_{L}, C_{l},Dl) + X[z_{L}j] + y_{L}j]),
 (Al, Bl,Cl, DlEl) « (El, El + (t ^ SL[j]), Bl,Cl ^ 10, Dl) (left Round 2) For j from 16 to 31 do the following: t (Al + q(Bl,Cl,Dl) + X[zl[})] + уь(з),
 (Al, Bl,Cl,Dl,El) « (El, El + (t ^ SL[j]), Bl,Cl ^ 10, Dl) (left Round 3) For j from 32 to 47 do the following: t (Al + h(B_{L}, Cl, D_{l}) + X[zL[j]] + yb[j]),
 (Al, Bl,Cl, Dl, El) « (El, El + (t ^ SL[j]), Bl,Cl ^ 10, Dl) (left Round 4) For j from 48 to 63 do the following: t <— (Al + U(Bl, Cl, Dl) + + уьЫ),
 (Al, Bl,Cl,Dl,El) (El, El + (t ^ SL[j]), Bl,Cl ^ 10, Dl) (left Round 5) For j from 64 to 79 do the following: t ■< (Al + 1(Bl,Cl, Dl) + X[zl[j + yL[j}),
 (Al, Bl,Cl,Dl,El) « (El, El + (t ^ SL[j]), Bl,Cl ^ 10, Dl)
 (b) Execute in parallel with the above five rounds an analogous right line with (Ar, B_{r},C_{r}, Dr, Er), y_{R}j], z_{R}[j], s_{R}[j] replacing the corresponding quantities with subscript L; and the order of the round functions reversed so that their order is: /, It, h, g, and /. Start by initializing the right line working variables: (An, Br, Cr, Dr, Er) ^ (Hi. H'2, H_{3}, H_{4}, tfs).
 (c) After executing both the left and right lines above, update the chaining values as follows: t <— Hi, Hi <— H_{3} + Cl + Dr, H2 t— H_{3} + Dl + Er, H_{3} <— H_{4} + El + Ar, Н_{4} <— Я_{5} + Al + Br, H_{3} t + Bl + Cr.
 5. Completion. The final hashvalue is the concatenation: Я_{1}#2#з#4#5 (with first and last bytes the low and highorder bytes of Hi, Hr,, respectively).
Variable 
Value 
Z_{L}[ 0. .15] 
[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,10,11,12,13,14,15] 
Z_{L}[16..31] 
[ 7, 4,13, 1,10, 6,15, 3,12, 0, 9, 5, 2,14,11, 8] 
ZL [32 . .47] 
[ 3,10,14, 4, 9,15, 8, 1, 2, 7, 0, 6,13,11, 5,12] 
Z_{L} [48. .63] 
[ 1, 9,11,10, 0, 8,12, 4,13, 3, 7,15,14, 5, 6, 2] 
ZL [64. .79] 
[ 4, 0, 5, 9, 7,12, 2,10,14, 1, 3, 8,11, 6,15,13] 
Zr[ 0..15] 
[ 5,14, 7, 0, 9, 2,11, 4,13, 6,15, 8, 1,10, 3,12] 
ZR[16..31] 
[ 6,11, 3, 7, 0,13, 5,10,14,15, 8,12, 4, 9, 1, 2] 
Zr [32. .47] 
[15, 5, 1, 3, 7,14, 6, 9,11, 8,12, 2,10, 0, 4,13] 
Zr [48. .63] 
[ 8, 6, 4, 1, 3,11,15, 0, 5,12, 2,13, 9, 7,10,14] 
ZR [64. .79] 
[12,15,10, 4, 1, 5, 8, 7, 6, 2,13,14, 0, 3, 9,11] 
SL [ 0. .15] 
[11,14,15,12, 5, 8, 7, 9,11,13,14,15, 6, 7, 9, 8] 
S_{L} [16. .31] 
[ 7, 6, 8,13,11, 9, 7,15, 7,12,15, 9,11, 7,13,12] 
SL [32 . .47] 
[11,13, 6, 7,14, 9,13,15,14, 8,13, 6, 5,12, 7, 5] 
S_{L} [48. .63] 
[11,12,14,15,14,15, 9, 8, 9,14, 5, 6, 8, 6, 5,12] 
SL [64. .79] 
[ 9,15, 5,11, 6, 8,13,12, 5,12,13,14,11, 8, 5, 6] 
Sr [ 0..15] 
[ 8, 9, 9,11,13,15,15, 5, 7, 7, 8,11,14,14,12, 6] 
SR[16..31] 
[ 9,13,15, 7,12, 8, 9,11, 7, 7,12, 7, 6,15,13,11] 
SR [32. .47] 
[ 9, 7,15,11, 8, 6, 6,14,12,13, 5,14,13,13, 7, 5] 
Sr [48. .63] 
[15, 5, 8,11,14,14, 6,14, 6, 9,12, 9,12, 5,15, 8] 
Sr [64. .79] 
[ 8, 5,12, 9,12, 5,14, 6, 8,13, 6, 5,15,13,11,11] 
Table 9.9: RIPEMD160 wordaccess orders and rotate counts