 Maurer’s universal statistical test

The basic idea behind Maurer’s universal statistical test is that it should not be possible to significantly compress (without loss of information) the output sequence of a random bit generator. Thus, if a sample output sequence s of a bit generator can be significantly compressed, the generator should be rejected as being defective. Instead of actually compressing the sequence s, the universal statistical test computes a quantity that is related to the length of the compressed sequence.

The universality> of Maurer’s universal statistical test arises because it is able to detect any one of a very general class of possible defects a bit generator might have. This class includes the five defects that are detectable by the basic tests of §5.4.4. A drawback of the universal statistical test over the five basic tests is that it requires a much longer sample output sequence in order to be effective. Provided that the required output sequence can be efficiently generated, this drawback is not a practical concern since the universal statistical test itself is very efficient.

Algorithm 5.33 computes the statistic Xu for a sample output sequence s = s0,si,... , ,s„_i to be used in the universal statistical test. The parameter L is first chosen from the

 L P 1 0.732C495 0.690 2 1.5374383 1.338 3 2.401G0C8 1.901 4 3.3112247 2.358 5 4.25342C6 2.705 6 5.2177052 2.954 7 6.1962507 3.125 8 7.1836656 3.238
 L P 9 8.1764248 3.311 10 9.1723243 3.356 11 10.170032 3.384 12 11.168765 3.401 13 12.168070 3.410 14 13.167693 3.416 15 14.167488 3.419 16 15.167379 3.421

Table 5.3: Mean p and variance a2 of the statistic Xufor random sequences, with parameters L, К as Q » oc. The variance of Xu is a2 = c(L,K)2 - a2/K, where c(L, K) » 0.7 — (0.8/L) + (1.6 + (12.8/L)) • K~4/L for К > 2l.

interval [6,16]. The sequence s is then partitioned into non-overlapping L-bit blocks, with any leftover bits discarded; the total number of blocks is Q+ K, where Q and К are defined below. For each г, 1 let b-t be the integer whose binary representation is the /,h block. The blocks are scanned in order. A table T is maintained so that at each stage T[j] is the position of the last occurrence of the block corresponding to integer j, 0 < j <2L — 1. The first Q blocks of s are used to initialize table T; Q should be chosen to be at least 10 • 2L in order to have a high likelihood that each of the 2L L-bit blocks occurs at least once in the first Q blocks. The remaining К blocks are used to define the statistic Xu as follows. For each i,Q + 1 < i < Q + K, let Ai = i — L[6,]; A, is the number of positions since the last occurrence of block 6*. Then К should be at least 1000 • 2L (and, hence, the sample sequence s should be at least (1010 • 2l ■ L) bits in length). Table 5.3 lists the mean ц and variance er2 of Xu for random sequences for some sample choices of L as Q oc.

5.33 Algorithm Computing the statistic Xu for Maurer’s universal statistical test

INPUT: a binary sequence s = so, si,... , stl-] of length n, and parameters L, Q, K. OUTPUT: the value of the statistic Xu for the sequence s.

• 1. Zero the table T. For j from 0 to 2L - 1 do the following: Tj+-0.
• 2. Initialize the table T. For i from 1 to Q do the following: Т[Ьг]-«-г.
• 3. sumt—0.
• 4. For i from Q + 1 to Q + К do the following:
• 4.1 sunrt-sum + lg(« - Т[Ь*]).
• 4.2 i.
• 5. Xu<-sxnt/K.
• 6. Retum(X„).

Maurer’s universal statistical test uses the computed value of Xu for the sample output sequence s in the manner prescribed by Fact 5.34. To test the sequence s, a two-sided test should be used with a significance level a between 0.001 and 0.01 (see §5.4.2).

5.34 Fact Let Xu be the statistic defined in (5.6) having mean // and variance o2 as given in Table 5.3. Then, for random sequences, the statistic Zu = (Xu - p)jo approximately follows an Лг(0.1) distribution.