Re: fast modular reduction (proof?)
In the following pseudocode, B is the radix in which the numbers are represented (2^32 for a 32-bit machine), n is the length of modulus in blocks, U is B^(n+1) mod the modulus, X is the number to be reduced, k+1 is the length of X, and Y is the result.
1. Y = X 2. For i from k down to n+1, repeat steps 3 and 4 3. Y = Y - Y[i] * B^i + Y[i] * U * B^(i-n-1) 4. If Y >= B^i, then Y = Y - B^i + U * B^(i-n-1)
To do a proof I rewrite the algorithm: n = len(modulus) // modulus < B^n Y = X // obviously Y = X mod modulus K = B ^ (n+1) - U // U = B ^ (n+1) mod modulus, // therefore K = 0 mod modulus // furthermore K > 0 for (i=len(Y)-1 ; i>n ; i--) { F = B ^ (i-n-1) * K // F > 0 // F = 0 mod modulus Y -= Y[i] * F // Y shrinking, but // Y still the same mod modulus if ( Y >= B^i ) Y -= F // again shrinking, // still the same mod modulus } This shows that Y was shrinking, but is still equal to X mod modulus. To see whether Y really shrinks enough: Y = sum(i=0..len(Y)-1) Y[i] * B^i In the step Y = Y - Y[i] * B^i the highest block of Y is deleted (what could be done fast by reducing the length of Y). Now Y < B^i Afterwards the same value mod modulus is added to keep Y constant: Y = Y + Y[i] * U * B^(i-n-1). Y[i]<B -> Y[i]+1 <= B U < modulus < B^n , therefore U < B^n -> (Y[i]+1) * U < B * B^n = B^(n+1) -> Y[i] * U < B^(n+1) - U -> Y[i] * U * B^ (i-n-1 ) < F Therefore after doing the addition Y < B^i + F Check of the last step: 0 <= U < B^n therefore B^n < B^(n+1) - U <= B^(n+1) Therefore in every loop B^(i-1) < F <= B^i -> Y-F < B^i Partial Correctness: Y = X [ Y = X mod Y < B^len(X) ] K = B ^ (n+1) - U [ K = 0 mod B^n < K <= B^(n+1) ] for (i=len(X)-1 ; i>n ; i--) { [ Y = X mod , Y < B^(i+1) ] F = B ^ (i-n-1) * K [ F = 0 mod , B^(i-1) < F <= B^i ] [ 0 <= Y[i] < B ] [ Y[i] * F = 0 mod , 0 <= Y[i] * F < B^(i+1) ] [ Y >= Y[i] * B^i -> Y >= Y[i] * F ] Y -= Y[i] * F [ Y = X mod , Y < B^i + F (reason see above) , Y >= 0 ] if ( Y >= B^i ) Y -= F // again shrinking, // still the same mod modulus [ Y = X mod , Y >= 0 , Y < B^i ] } Last i was n+1, therefore Y = X mod , Y >= 0 , Y < B^(n+1) This is not enough, Y < B^n is requested. The loop can't be done once more because i-n-1 would become negative. k+1 was the length of X, and n the length of the modulus. You walk down from k to n+1 . In every loop you remove one block of the number. This means you have to do len(X)-len(modulus) loops. In the pseudocode you do only len(X)-len(modulus)-1 loops. One loop seems to be missing. This may be a result of confusion whether your Y starts with Y[0] or Y[1]. I do understand the algorithm as: n = len(modulus) U = B^n mod modulus K = B^n - U // = 0 mod modulus, 0 < K < B^n Y = X for(i=len(X)-1 ; i>= n ; i--) // squeeze Block i in Number Y { // Y < B ^ (i+1) F = B ^ ( i-n ) * K // F = 0 mod modulus Y -= Y[i] * F // subtract Y[i] * B^i, now Y < B ^ i // add the equivalent Y[i] * B^(i-n)*U <= F // now Y < B^i + F if ( Y >= B[i] ) Y -= F // now Y < B^i } Last i was n, therefore Y < B^n , Y = X mod modulus , but perhaps still Y >= modulus. Ok, algorithm understood and agreed (after modifying the loop counter). Any more agreement or disagreements? Hadmut
participants (1)
-
danisch@ira.uka.de