# Formatting and initialization details

9.28 Note *(data representation)* As hash-values depend on exact bitstrings, different data representations (e.g., ASCII vs. EBCDIC) must be converted to a common format before computing hash-values.

(i) Padding and length-blocks

For block-by-block hashing methods, extra bits are usually appended to a hash input string before hashing, to pad it out to a bitlength that is a multiple of the relevant block size. The padding bits need not be transmitted/stored themselves, provided the sender and recipient agree on a convention.
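To make padding to a block multiple concrete, the following is a minimal sketch of the two conventions this section specifies as Padding Method 1 (Algorithm 9.29) and Padding Method 2 (Algorithm 9.30). It assumes bit strings are represented as Python strings of `'0'` and `'1'` characters; the function names `pad_method_1`, `pad_method_2`, and `unpad_method_2` are illustrative and do not come from the text.

```python
def pad_method_1(x: str, n: int) -> str:
    """Zero-pad bit string x (characters '0'/'1') to a multiple of n bits."""
    return x + "0" * (-len(x) % n)


def pad_method_2(x: str, n: int) -> str:
    """Append a single 1-bit, then zero-pad to a multiple of n bits."""
    marked = x + "1"
    return marked + "0" * (-len(marked) % n)


def unpad_method_2(padded: str) -> str:
    """Invert pad_method_2: drop the trailing 0-bits and the 1-bit marker."""
    return padded[: padded.rindex("1")]


# Method 1: two different inputs become identical after padding.
print(pad_method_1("1010", 8), pad_method_1("101000", 8))  # 10100000 10100000

# Method 2: the same inputs stay distinct, and each can be recovered.
print(pad_method_2("1010", 8), pad_method_2("101000", 8))  # 10101000 10100010
print(unpad_method_2(pad_method_2("101000", 8)))           # 101000

# An input that is already n bits long gains a full extra block.
print(len(pad_method_2("10100000", 8)))                    # 16
```

The first `print` shows two distinct inputs mapping to the same padded string under Method 1, while the remaining lines show that Method 2 is invertible at the cost of an extra block when the input already fills a whole number of blocks.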

9.29 Algorithm Padding Method 1

INPUT: data x; bitlength n giving blocksize of data input to processing stage.

OUTPUT: padded data *x*', with bitlength a multiple of *n.*

1. Append to x as few (possibly zero) 0-bits as necessary to obtain a string x' whose bitlength is a multiple of *n.*

9.30 Algorithm Padding Method 2

INPUT: data x; bitlength n giving blocksize of data input to processing stage.

OUTPUT: padded data *x*', with bitlength a multiple of *n.*

1. Append to x a single 1-bit.
2. Then append as few (possibly zero) 0-bits as necessary to obtain a string *x'* whose bitlength is a multiple of *n.*

9.31 Remark *(ambiguous padding)* Padding Method 1 is *ambiguous*: trailing 0-bits of the original data cannot be distinguished from those added during padding. Such methods are acceptable if the length of the data (before padding) is known by the recipient by other means. Padding Method 2 is not ambiguous: each padded string *x'* corresponds to a unique unpadded string *x.* When the bitlength of the original data *x* is already a multiple of *n,* Padding Method 2 results in the creation of an extra block.

9.32 Remark *(appended length blocks)* Appending a logical length-block prior to hashing prevents collision and pseudo-collision attacks which find second messages of different length, including trivial collisions for random IVs (Example 9.96), long-message attacks (Fact 9.37), and fixed-point attacks (page 374). This further justifies the use of MD-strengthening (Algorithm 9.26).

Trailing length-blocks and padding are often combined. For Padding Method 2, a length field of pre-specified bitlength *w* may replace the final *w* 0-bits padded if padding would otherwise cause *w* or more redundant such bits. By pre-agreed convention, the length field typically specifies the bitlength of the original message. (If used instead to specify the number of padding bits appended, deletion of leading blocks cannot be detected.)

(ii) IVs

Whether the IV is fixed, is randomly chosen per hash function computation, or is a function of the data input, the same IV must be used to generate and verify a hash-value. If not known *a priori* by the verifier, it must be transferred along with the message. In the latter case, this generally should be done with guaranteed integrity (to cut down on the degree of freedom afforded to adversaries, in line with the principle that hash functions should be defined with a fixed or a small set of allowable IVs).
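As a rough illustration of how padding, a trailing length field, and the IV fit together, the sketch below builds a toy iterated hash. Everything named here is an assumption made for the example: the block size `N`, the length-field width `W`, and the helpers `pad_with_length`, `toy_compress`, `toy_hash`, and `verify` are hypothetical, and SHA-256 from Python's `hashlib` merely stands in for a compression function. The padding always reserves the last `W` bits for the original bitlength (the layout used by SHA-style hashes), which is one way of combining Padding Method 2 with a length field rather than the only one.

```python
import hashlib

N = 64  # block size in bits (illustrative choice)
W = 16  # bitlength of the trailing length field (illustrative choice)


def pad_with_length(x: str, n: int = N, w: int = W) -> str:
    """1-bit, then 0-bits, then a w-bit field holding the bitlength of x."""
    if len(x) >= 2 ** w:
        raise ValueError("message too long for the length field")
    marked = x + "1"
    zeros = -(len(marked) + w) % n  # 0-bits needed to reach a multiple of n
    return marked + "0" * zeros + format(len(x), f"0{w}b")


def toy_compress(h: bytes, block: str) -> bytes:
    """Stand-in compression function (an assumption, not from the text)."""
    return hashlib.sha256(h + block.encode("ascii")).digest()


def toy_hash(x: str, iv: bytes) -> bytes:
    """Iterate toy_compress over the padded blocks, starting from iv."""
    padded = pad_with_length(x)
    h = iv
    for i in range(0, len(padded), N):
        h = toy_compress(h, padded[i:i + N])
    return h


def verify(x: str, iv: bytes, hash_value: bytes) -> bool:
    """Recompute the hash-value with the given IV and compare."""
    return toy_hash(x, iv) == hash_value


iv_sender = bytes(32)          # IV used when the hash-value was generated
iv_other = b"\x01" * 32        # a different IV
value = toy_hash("1010", iv_sender)

print(verify("1010", iv_sender, value))  # True: same IV on both sides
print(verify("1010", iv_other, value))   # False: verifier used another IV
```

The second check fails only because the verifier started from a different IV, which is why an IV that is not known in advance has to accompany the message, preferably with its integrity protected.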