bug in my Maurer code; retraction of claim; ref

17 Dec 2003

      There was a numerical bug in the C code for Maurer's Univ. Stat. Test
for Randomness which I posted.  The "float" variable "Sum" 
should be "double".  This will cause an incorrect input-size dependence
for large >40Mbyte files.

Measuring (with the fixed version) the output of a block cipher in feedback
mode, I find that it DOES NOT differ from truly nondeterministic noise, as
I had claimed.  The difference I had observed was due to a larger sample
size for my cipher-noise.  Boy do I feel stupid.

Needless to say, sorry about any inconvenience.

The fixed code follows at the bottom.  This version also spits out
an early estimate after a million samples; this turns out to be
very close to the final value for (homogenous) large samples.

Also:

I recently found the following amendment to Maurer's work
at http://www.eleves.ens.fr:8080/home/coron/escience.html#1

Abstract. Maurer's universal test is a very common randomness test, capable
of detecting a wide gamut of statistical defects. The algorithm is simple
(a few Java code lines), flexible (a variety of parameter combinations can
be chosen by the tester) and fast.
Although the test is based on sound probabilistic grounds, one of its
crucial parts uses the heuristic approximation:
c(L,K) = 0.7 - 0.8/L + (1.6 + 12.8/L)K-4/L
In this work we compute the precise value of c(L,K) and show that the
inaccuracy due to the heuristic estimate can make the test 2.67 times more
permissive than what is theoretically admitted.
Moreover, we etablish a new asymptotic relation between the test parameter
and the source's entropy.
09/08/98 - Jean-Sébastien Coron

.....................................

/*
UELI.c
28 Oct 98

This implements Ueli M Maurer's
"Universal Statistical Test for Random Bit Generators"
using L=16

Accepts a filename on the command line;
writes its results, with other info, to stdout.

Handles input file exhaustion gracefully.

Ref: J. Cryptology v 5 no 2, 1992 pp 89-105
also on the web somewhere, which is where I found it.

-David Honig
honig@sprynet.com

Built with Wedit 2.3, lcc-win32
http://www.cs.virginia.edu/~lcc-win32

26 Sept CP Release
Version Notes:

	This version does L=16.  It evolved from an L=8 prototype
	which I ported from the Pascal in the above reference.

	I made the memory usage reasonable
	by replacing Maurer's "block" array
	with the 'streaming' fgetc() call.

27 Oct 98 compiled under Sun cc OK if C++ stuff taken out..
28 Oct 98 made "Sum" into a double, from float; fixes
		a bug found with larger (40M files); reposted
		with retraction.

Usage:
	ULI filename
	outputs to stdout

*/

#define L 16		// bits per block
#define V (1<<L)	// number of possible blocks
#define Q (10*V)    // at LEAST 10 * V, to assure each block seen
#define K (100*Q)   // at LEAST 100 * Q, as large as possible
#define MAXSAMP (Q + K)

#include <stdio.h>
#include <math.h>

int main( int argc, char **argv )
{
FILE *fptr;
int i;
int b, c;
int table[V];
double sum=0.0;
int run;

// Human Interface
printf("UELI 27 Oct 98 double-precision, with early value\n L=%d %d %d \n",
L, V, MAXSAMP);
if (argc <2)
	{printf("Usage: ULI filename\n"); exit(-1); }
else
	printf("Measuring file %s\n", argv[1]);

// FILE IO
fptr=fopen(argv[1],"rb");
if (fptr == NULL) {printf("Can't find %s\n", argv[1]); exit(-1); }

// INIT
for (i=0; i<V; i++) table[i]=0;
for (i=0; i<Q; i++)	{
	b= fgetc(fptr)<<8 | fgetc(fptr);
	table[ b ]=i;
}

printf("Init done\n");

// COMPUTE
run=1;
for (i=Q;  run && i<Q+K; i++)
	{
	// COMPOSE A 16-bit quantity
	b=fgetc(fptr);
	c=fgetc(fptr); if (c<0 || b<0) run=0;
	if (run) { b = b<<8 | c;
			sum += log( (double) ( i-table[b] ) ) ;
			table[ b ]=i;
		}

	if (i==1000000+Q)
	printf("Early measurement after %d samples = %g\n\n", i-Q, (sum/( (double)
(i-Q) ) ) /  log(2.0));

	}

	if (!run) printf("Premature end of file; read %d blocks.\n", i-Q);
	sum = (sum/( (double) (i-Q) ) ) /  log(2.0);    // i should be K if enough
samples
	printf("%s fTU= %g\n\n", argv[1], sum);
	printf("Expected value for L=16 is 15.167379 \n");

	// Add further interpretation/thresholding of the number of sigmas from
expected,
	// and include the fudge factors explained in the paper.

} // end

David Honig

tags

participants (1)