bug in my Maurer code; retraction of claim; ref
There was a numerical bug in the C code for Maurer's Univ. Stat. Test for Randomness which I posted. The "float" variable "Sum" should be "double". This will cause an incorrect input-size dependence for large >40Mbyte files. Measuring (with the fixed version) the output of a block cipher in feedback mode, I find that it DOES NOT differ from truly nondeterministic noise, as I had claimed. The difference I had observed was due to a larger sample size for my cipher-noise. Boy do I feel stupid. Needless to say, sorry about any inconvenience. The fixed code follows at the bottom. This version also spits out an early estimate after a million samples; this turns out to be very close to the final value for (homogenous) large samples. Also: I recently found the following amendment to Maurer's work at http://www.eleves.ens.fr:8080/home/coron/escience.html#1 Abstract. Maurer's universal test is a very common randomness test, capable of detecting a wide gamut of statistical defects. The algorithm is simple (a few Java code lines), flexible (a variety of parameter combinations can be chosen by the tester) and fast. Although the test is based on sound probabilistic grounds, one of its crucial parts uses the heuristic approximation: c(L,K) = 0.7 - 0.8/L + (1.6 + 12.8/L)K-4/L In this work we compute the precise value of c(L,K) and show that the inaccuracy due to the heuristic estimate can make the test 2.67 times more permissive than what is theoretically admitted. Moreover, we etablish a new asymptotic relation between the test parameter and the source's entropy. 09/08/98 - Jean-Sébastien Coron ..................................... /* UELI.c 28 Oct 98 This implements Ueli M Maurer's "Universal Statistical Test for Random Bit Generators" using L=16 Accepts a filename on the command line; writes its results, with other info, to stdout. Handles input file exhaustion gracefully. Ref: J. Cryptology v 5 no 2, 1992 pp 89-105 also on the web somewhere, which is where I found it. -David Honig honig@sprynet.com Built with Wedit 2.3, lcc-win32 http://www.cs.virginia.edu/~lcc-win32 26 Sept CP Release Version Notes: This version does L=16. It evolved from an L=8 prototype which I ported from the Pascal in the above reference. I made the memory usage reasonable by replacing Maurer's "block" array with the 'streaming' fgetc() call. 27 Oct 98 compiled under Sun cc OK if C++ stuff taken out.. 28 Oct 98 made "Sum" into a double, from float; fixes a bug found with larger (40M files); reposted with retraction. Usage: ULI filename outputs to stdout */ #define L 16 // bits per block #define V (1<<L) // number of possible blocks #define Q (10*V) // at LEAST 10 * V, to assure each block seen #define K (100*Q) // at LEAST 100 * Q, as large as possible #define MAXSAMP (Q + K) #include <stdio.h> #include <math.h> int main( int argc, char **argv ) { FILE *fptr; int i; int b, c; int table[V]; double sum=0.0; int run; // Human Interface printf("UELI 27 Oct 98 double-precision, with early value\n L=%d %d %d \n", L, V, MAXSAMP); if (argc <2) {printf("Usage: ULI filename\n"); exit(-1); } else printf("Measuring file %s\n", argv[1]); // FILE IO fptr=fopen(argv[1],"rb"); if (fptr == NULL) {printf("Can't find %s\n", argv[1]); exit(-1); } // INIT for (i=0; i<V; i++) table[i]=0; for (i=0; i<Q; i++) { b= fgetc(fptr)<<8 | fgetc(fptr); table[ b ]=i; } printf("Init done\n"); // COMPUTE run=1; for (i=Q; run && i<Q+K; i++) { // COMPOSE A 16-bit quantity b=fgetc(fptr); c=fgetc(fptr); if (c<0 || b<0) run=0; if (run) { b = b<<8 | c; sum += log( (double) ( i-table[b] ) ) ; table[ b ]=i; } if (i==1000000+Q) printf("Early measurement after %d samples = %g\n\n", i-Q, (sum/( (double) (i-Q) ) ) / log(2.0)); } if (!run) printf("Premature end of file; read %d blocks.\n", i-Q); sum = (sum/( (double) (i-Q) ) ) / log(2.0); // i should be K if enough samples printf("%s fTU= %g\n\n", argv[1], sum); printf("Expected value for L=16 is 15.167379 \n"); // Add further interpretation/thresholding of the number of sigmas from expected, // and include the fudge factors explained in the paper. } // end
participants (1)
-
David Honig