LUC Public Key Crypto...

17 Dec 2003

      Hi,

In the process of doing some research on Gauss I stumbled across this...

    ____________________________________________________________________
   |                                                                    |
   |      Those who make peaceful revolution impossible will make       |
   |      violent revolution inevitable.                                |
   |                                                                    |
   |                                          John F. Kennedy           |
   |                                                                    |
   |                                                                    | 
   |            _____                             The Armadillo Group   |
   |         ,::////;::-.                           Austin, Tx. USA     |
   |        /:'///// ``::>/|/                     http://www.ssz.com/   |
   |      .',  ||||    `/( e\                                           |
   |  -====~~mm-'`-```-mm --'-                         Jim Choate       |
   |                                                 ravage@ssz.com     |
   |                                                  512-451-7087      |
   |____________________________________________________________________|

Forwarded message:
...
Dr. Dobb's Web Site
LUC PUBLIC-KEY ENCRYPTION
A secure alternative to RSA
Peter Smith
Peter has worked in the computer industry for 15 years as a
   programmer, analyst, and consultant and has served as deputy editor of
   Asian Computer Monthly. Peter's interest in number theory led to the
   invention of LUC in 1991. He can be reached at 25 Lawrence Street,
   Herne Bay, Auckland, New Zealand.
_________________________________________________________________
According to former NSA director Bobby Innman, public-key cryptography
   was discovered by the National Security Agency in the early seventies.
   At the time, pundits remarked that public-key cryptography (PKC) was
   like binary nerve gas--it was potent when two different substances
   were brought together, but quite innocuous in its separate parts.
   Because the NSA promptly classified it, not much was known about PKC
   until the mid-seventies when Martin Hellman and Whitfield Diffie
   independently came up with the notion and published papers about it.
Traditional cryptographic systems like the venerable Data Encryption
   Standard (DES) use the same key at both ends of a message
   transmission. The problem of ensuring correct keys leads to such
   expensive expedients as distributing the keys physically with trusted
   couriers. Diffie and Hellman (and the NSA) had the idea of making the
   keys different at each end. In addition to encryption, they envisioned
   this scheme would also lead to a powerful means of source
   authentication known as digital signatures.
RSA, developed in 1977, was the first reliable method of source
   authentication. The RSA approach (patented in the early eighties)
   initiated intense research in "number theory," one of the most
   recondite areas of mathematics. Although C.F. Gauss studied this topic
   in the early 1800s (referring to it then as "higher arithmetic"), very
   little real progress has been made in solving the problem of factoring
   since then. The means available today are essentially no better than
   exhaustive searching for prime factors. In terms of intractability
   theory, however, no one has yet proved that the problem is
   intractable, although researchers believe it to be so.
The RSA Algorithm
RSA works by raising a message block to a very large power, then
   reducing this modulo N, where N (the product of two large prime
   numbers) is part of the key. Typical systems use an N of 512 bits, and
   the exponent to which blocks are raised in decryption is of the same
   order. An immediate problem in implementing such a system is the
   representation and efficient manipulation of such large integers.
   (Standard microprocessors don't really have the power to handle normal
   integer sizes and functions; even numeric coprocessors are inadequate
   when integers of this size are involved.)
RSA has dominated public-key encryption for the last 15 years as
   research has failed to turn up a reliable alternative--until the
   advent of LUC. Based on the same difficult mathematical problem as
   RSA, LUC uses the calculation of Lucas functions instead of
   exponentiation. (See text box entitled, "How the Lucas Alternative
   Works.")
Because we're working in the area of mathematics, we can formally
   prove that LUC is a true alternative to RSA. Furthermore, we can show
   that a cipher based on LUC will be at least as efficient. More
   importantly, we can show that LUC is a stronger cipher than RSA. The
   reason is that under RSA, the digital signature of a product is the
   product of the signatures making up the product; in mathematical
   terms, M{e}L{e}=(ML){e}. This opens RSA to a cryptographic attack
   known as adaptive chosen-message forgery. Ironically, this is outlined
   in a paper co-authored by Ron Rivest (the "R" in RSA). LUC is not
   multiplicative and therefore not susceptible to this attack. Using
   Lucas functions, V[e](M,1)V[e](L,1) is not equal to V[e](ML,1). In
   other words, the use of exponentiation leads to RSA being
   multiplicative in this way, while LUC's use of Lucas functions avoids
   this weakness.
Choosing the Algorithms
Lucas functions have been studied mainly in relation to primality
   testing, and it was to these sources we turned when researching
   efficient algorithms for implementing LUC. For given parameters, the
   Lucas functions give rise to two series, U[n] and V[n]. The first
   algorithm (see Listing One, page 90) calculated both, even though we
   were only interested in V[n]. It was only in a paper on factoring
   integers that we found a means of calculating V[n] alone (see Listing
   Two, page 90). The pseudocode examples show that both algorithms have
   two phases: The work done when the current bit is a 0 is half the work
   necessary when the current bit is a 1.
More Details.
Typically, in systems like LUC the exponent used for encryption is a
   much smaller integer than that used for decryption. A commonly chosen
   encryption exponent is the prime number 65,537. This is a good choice
   for fast encryption as all but 2 of the 17 bits are 0s. We have no
   such control over the decryption exponent, but there is a way of
   halving the work, and thus, of introducing a limited degree of
   parallelism into the calculation.
Since LUC is a public-key cryptosystem, we can always assume that the
   possessor of the private decrypting keys knows the two primes (p and
   q) which make up the modulus, N. Consequently, we can reduce the
   exponent and message with respect to the two primes, in each case at
   least halving the amount of work. At the end of the calculation with
   respect to the primes, we bring the results together to produce the
   final plain text (see Listing Three, page 90).
Large-integer Arithmetic
There's really only one source of information about large-integer
   arithmetic: Knuth's The Art of Computer Programming. We found that
   almost every time we referred to his book, we came up with some new
   angle or way of tweaking some extra performance out of our code.
We decided to represent the large integers as 256-byte arrays, with
   the low byte giving the length (in bytes) of the integer. For
   instance, the 8-byte hexadecimal number 1234567890ABCDEF would appear
   in a file view as 08 EF CD AB 90 78 56 34 12. These arrays became a
   Pascal-type har (for hexadecimal array). We can store integers of over
   600 decimal digits in our hars, but because the hars must be able to
   hold the results of a multiplication, we are limited to manipulating
   integers up to 300 decimal digits in length.
Implementation of addition, subtraction, and multiplication went quite
   smoothly; implementation of division took more effort. (We took
   comfort in not being the first to encounter problems with division.
   Lady Ada Lovelace, the first computer programmer, said, "I am still
   working at some most entangled notations of division, but see my way
   through them at the expense of heavy labor, from which I shall not
   shrink as long as my head can bear it.") We tried various methods,
   including one based on Newton which calculated the inverse of the
   divisor and then multiplied. (See Knuth's discussion.) We finally
   opted for Knuth's Algorithm D, despite his warning that it contained
   possible discontinuities. At that stage, we were working on a 16-bit
   80286 PC; see Listing Four, page 90.
Of course there was much more than the division routine to consider,
   but we found that it was the critical routine in terms of getting LUC
   to run at a reasonable speed. Once we had upgraded to an 80386, we
   converted to a full 32-bit implementation. The assembler code for the
   division (still Algorithm D) is given in Listing Five (page 91).
   Although space constraints prevent a complete presentation of the
   code, suffice to say that we have been able to achieve a
   signing/decryption speed on a modulus of 512 bits of over 200 bits per
   second (33-MHz 80386, 0 wait states).
Other Issues
Central to any cryptographic system are keys. In LUC, if an adversary
   is able to find p and q, the prime factors of modulus N, then all
   messages sent with N can be either read in the case of encryption or
   forged in the case of signing.
Since the days of Gauss, research on factoring has come up with
   various so-called "aleatoric" methods of factoring some numbers. These
   methods are like cures for poison ivy: numerous, and occasionally
   efficacious. One old method, found by Pierre Fermat, is very quick at
   factoring some types of composite numbers. If N is the product of two
   primes which are close together, then it can be easily factored. For
   example, if p=1949, and q=1951, then N=3802499. Taking the square root
   of N, we find that it is approximately 1949.999. Adding 1 to the
   integral part of this (giving 1950), we square this, giving 3802500.
   If we now subtract N from this square, we get a difference of 1, which
   is the square of itself. This means that N has been expressed as the
   difference of two squares. As we learned in high school, x{2}-y{2} =
   (x-y)(x+y), and so we obtain the two factors.
Fermat's method works whenever the ratio of the factors is close to an
   integer. (Note that the ratio is close to 1 in the above discussion.)
   This attack, as cryptographers call methods used to break a cipher,
   has to be guarded against in generating the modulus N.
Another guard is that neither (p + 1) and (q + 1) nor (p - 1) and (q -
   1) should be made up of small prime factors. There are many other
   guards of varying degrees of importance, but the entire area needs
   consideration depending on the level of security required, and how
   long the keys are meant to last.
The basic idea behind LUC is that of providing an alternative to RSA
   by substituting the calculation of Lucas functions for that of
   exponentiation. While Lucas functions are somewhat more complex
   mathematically than exponentiation, they produce superior ciphers.
This substitution process can be done with systems other than the RSA.
   Among these are the Hellman-Diffie-Merkle key exchange system (U.S.
   Patent number 4,200,770), the El Gamal public-key cryptosystem, the El
   Gamal digital signature, and the recently proposed Digital Signature
   Standard (DSS), all of which use exponentiation.
The nonmultiplicative aspect of Lucas functions carries over, allowing
   us to produce alternatives to all these. In the case of the DSS, Lucas
   functions allow us to dispense with the one-way hashing cited (but not
   specified) in the draft standard.
A New Zealand consortium has been set up to develop and license
   systems based on LUC, which is protected by a provisional patent. For
   more information, contact me or Horace R. Moore, 101 E. Bonita, Sierra
   Madre, California 91024.
References
Athanasiou, Tom. "Encryption Technology, Privacy, and National
   Security." MIT Technology Review (August/September, 1986).
Diffie, W. and M.E. Hellman. "New Directions in Cryptography." IEEE
   Transactions on Information Theory (November, 1976).
El Gamal, Taher. "A Public Key Cryptosystem and a Signature Scheme
   Based on Discrete Logarithms." IEEE Transactions on Information Theory
   (July, 1985).
Gauss, C.F. "Disquisitiones Arithmeticae," Article 329.
Goldwasser, S., S. Micali, and R. Rivest. "A Digital Signature Scheme
   Secure Against Adaptive Chosen Message Attack." SIAM J. COMPUT (April,
   1988).
Kaliski, Burton S., Jr. "Multiple-precision Arithmetic in C." Dr.
   Dobb's Journal (August, 1992).
Knuth, D.E. The Art of Computer Programming: Volume II: Semi-Numerical
   Algorithms, second edition. Reading, MA: Addison-Wesley, 1981.
Schneier, Bruce. "Untangling Public Key Cryptography." Dr. Dobb's
   Journal (May, 1992).
Williams, H.C. "A p + 1 method of factoring." Mathematics of
   Computation (vol. 39, 1982).
How the Lucas Alternative Works
As with RSA encryption, use of the Lucas alternative involves two
   public keys: N and e. The number N is assumed to be the product of two
   large (odd) prime numbers, p and q. Encryption and decryption of a
   message is achieved using Lucas sequences, which may be defined as
   shown in Example 1. Note that P and Q are integers.
If a message P is to be sent, it is encoded as the residue P1 modulo N
   of the eth term of the Lucas sequence V[n](P,1), and then transmitted.
   The receiver uses a secret key d (based on the prime factorization of
   N) to decode the received message P1, by taking the residue modulo N
   of the dth term of the Lucas sequence V[n](P1,1). The secret key d is
   determined so that V[d](V[e](P,1),1) = P modulo N, ensuring the
   decryption of the received message P1 as P. The existence of such a
   key d is based on the following theorem.
Theorem
Suppose N is any odd positive integer, and P is any positive integer,
   such as P{2}-4 is coprime to N. If r is the Lehmer totient function of
   N with respect to D = P{2}-4 (see Example 2), then V[mr+1](P,1)=P
   modulo N for every positive integer m. The condition that P{2}-4 be
   coprime to N is easily checked, as P{2}-4=(P+2)(P-2). Also, because
   V[d](V[e](P,1),1)=V[de](P,1), according to Example 4(e), the key d may
   simply be chosen so that de=1 modulo r.
The Lehmer Totient Function
Suppose P and Q are integers, and a and b are the zeros of X{2}-Px+Q
   (so that P = a+b while Q = ab). Also, let D be the discriminant of
   x{2}-Px+Q. That is, D = P{2}-4Q = (a-b){2}.
The Lucas sequences U[n] = U[n] (P,Q) and V[n] = V[n] (P,Q) are
   defined for n = 0,1,2, and so on by the equation in Example3.
In particular, U[0] = 0, U[1] = 1, and then U[n+1] = PU[n] - QU[n-1]
   (for n = 1,2,3,...), while V[0] = 2, V[1] = P, and similarly V[n+1]=
   PV[n]-QV[n-1] (for n = 1,2,3,...). These sequences satisfy a number of
   identities, including the following which may be simply obtained from
   the definitions in Example 4.
Next, suppose N is any positive integer, and let r be the Lehmer
   totient function of N with respect to D = P{2}-4Q, defined the same
   way as in the statement of the theorem. In the special case where N is
   an odd prime p, the Lehmer totient function of p with respect to D is
   the number given by the equation in Example 5(a). In this case, the
   Lucas-Lehmer theorem states that if p does not divide Q then the
   equation in Example 5(b) holds true.
Example of LUC
Let N = pxq = 1949x2089=4071461, and P = 11111, which equals the
   message to encrypt/decrypt. The public keys will be e and N; the
   private key will be d. First, calculate r, the Lehmer totient function
   of P with respect to N. To do this we need to calculate the Legendre
   of p and q. Let D = p{2}-4; then (D/1949) =-1 and (D/2089)=-1 are the
   two Legendre values. Hence r is the least common multiple of 1949 + 1
   and 2089 + 1; see Example 6(a). Choosing e = 1103 for our public key,
   we use the Extended Euclidean Algorithm to find the secret key d, by
   solving the modular equation ed = 1 mod r. d turns out to equal 24017.
To encrypt the message 11111, we make the calculation shown in Example
   6(b). To decrypt the encrypted message, we calculate as in Example
   6(c). --P.S.
_LUC PUBLIC-KEY ENCRYPTION_
by Peter Smith
[LISTING ONE]
{ To calculate Ve(P,1) modulo N }
 Procedure LUCcalc;
 {Initialise}
 BEGIN
 D := P*P - 4; ut := 1; vt := P; u := ut; v := vt;
 If not odd(e) then BEGIN u := 0; v := 2; END;
 e := e div 2;
 {Start main}
 While e > 0 do
   BEGIN
   ut := ut*vt mod N; vt := vt*vt mod N;
   If vt < 3 then vt := vt + N;
   vt := vt - 2;
   If odd(e) then
     BEGIN
     c := (ut*v + u*vt) mod N;
     v := (vt*v + D*u*ut) mod N;
     If odd(v) then v := v + N; v := v/2;
     If odd(c) then c := c + N; u := c/2;
     END;
   e := e div 2;
   END;
 END;    {LUCcalc}
{ The required result is the value of v.}
[LISTING TWO]
Pseudocode for calculating Lucas Functions
Procedure wiluc  {   V = V(M) Mod N, the Mth Lucas number(P,1) }
Var
                    V,Vb,P,Vf,N,M,NP, Vd, Vf : LargeInteger ;
                    carry, high_bit_set      : boolean ;
                    bz                        : word ;
  BEGIN
  Va := 2 ;   { V[0] }  Vb = P ;   { V[1] }
  NP := N - P; bz := bits(M) -1 ; { test bits from high bit downwards }
  For j := 1 to bz do
      BEGIN
      Vc := Vb * Vb; Vf = Vc ; If Vf < 2 then Vf := Vf + N
      Vf := Vf - 2; Vd := Va * Vb
      {  Vc := V, Vd := V*Vb, Vf := V-2}
     If high_bit_set Then
          BEGIN
          Vb := P * Vc; If Vb < Vd then Vb := Vb + N; Vb := Vb - Vd;
          If Vb < P then Vb := Vb + N; Vb := Vb - P; Va := Vf
          END ;
     Else BEGIN { "even" ie high bit not set }
          Va := Vd; If Va < P then Va := Va + N; Va := Va - P;
          Vb := Vf;
          END ;
     High_bit_set := next_bit_down(M);
     {This boolean function determines the setting of the next bit down}
     Va := Va Mod N; Vb := Vb Mod N
     END ; { for j to bz }
END ; {wiluc}
[LISTING THREE]
{ Pseudocode for splitting decryption/signing over p and q
  (N = p*q) }
Procedure hafluc ( var s,p,q,m,e : LargeInteger ; qix : word ) ;
var                            ep,emq,
                               temp,pi,qi,
                               b,n,pa,qa : LargeInteger ;
{ This procedure applies only to decipherment and signing, where the primes
  making up the modulus N ( = p * q) are known (or can be easily deduced,
  since both keys are known). Applying it allows us to halve the amount of
  work. Encipherment is usually done with a small key - standard is 65537. }
  Begin
  Qpr (pa,qa,p,q,m,qix ) ; {} {assumes qix already calculated }
  ep  = e ;              ep  = ep  Mod pa
  emq = e ;   emq  = emq Mod qa
  mp  = m ;    mp  = mp Mod p
  mq  = m ;    mq  = mq Mod q
  wiluc(q2,mq,emq,q) ;        wiluc(p2,mp,ep,p) ;
  if p2 < q2 then
      Begin
      temp = q         q  = p    p  = temp
      temp = q2        q2 = p2   p2 = temp
      End ;
  temp = p2   temp = temp - q2
  n = p * q
{ Solve with Extended Euclidean algorithm qi = 1/q Mod p. The algorithm
for the Extended Euclidean calculation can be found in Knuth. }
  r = temp * p
  r = r mod N
  s = r * qi
  s = s Mod n
  s = s + p2
End ; { hafluc }
Procedure SignVerify ;
  Begin
  h4 = 4
  p = large prime...
  q = large prime...
  n = p * q
  bz := bits(n) ;
    {write(cf,'  generate 4 keysets (d,e)  for p1,q1') ;}
{
      qix table for T[qix]
     Convention for qix
 This calculation is explained below.
   Lehmer totient      qix   Legendre values for p  and   q
   i.e. T[qix] = LCM
   (p - 1),(q - 1)     1                         1        1
   (p - 1),(q + 1)     2                         1       -1
   (p + 1),(q - 1)     3                        -1        1
   (p + 1),(q + 1)     4                        -1       -1
    e = encryption key,  small prime eg 65537
    mu = message as large integer less than n
    Solve e * d[qix] = 1 Mod T[qix] using Extended Euclidean Algorithm
    where T[qix] is lcm(p1,q1), the Lehmer totient function of N
    with repect to mu, according to the above table.
    This gives 4 possible values of d, the decryption/signing key.
    The particular value used depends on the message mu, as follows:
    Let D = mu2 - 4. Calculate the Legendre values of D with respect to
    both p and q. This value is -1 if D is a quadratic non-residue of
    p (or q), and equal to 1 if D is a quadratic residue of p (or q).
    N.B. This part is the most difficult part of LUC! Take care.
Signing (Deciphering):
    hafluc (a,pu,qu,mu,d,qix)
Verifying (Enciphering):
    Use Wiluc.
End.
[LISTING FOUR]
Algorithm D in 32-bit Intel assmbler
Author: Christopher T. Skinner
Short version of Mod32.Txt with scalings just as comments
               Modulus routine for Large Integers
                        u = u Mod v
Based on:
D.E.Knuth  The Art of Computer Programming
           Vol 2 Semi-Numerical Algorithms 2ed 1981
           Algorithm D page 257
We use a Pascal Type called "har" ( for "hexadecimal array")
Type
            har = Array[0..255] of byte ;
Var         u,v : har ;
Note that u[0] is the length of u and that the
integer begins in u[1]
It is desirable that u[1] is on a double word boundary.
; Turbo Pascal Usage:      ( Turbo Pascal v6.0)
; {$L Mod32a}   { contains mod32 far }
; {$F+}   { far pointers }
; procedure Mod32 ( var u,v : har ) ;
; Turbo Assembler code: (TASM v2.01)--requires 32-bit chip ie 386 or 486
; nb FS and GS can be used as temporary storage. Don't try to use them as
; segment registers because Windows 3.0 restricts their allowed range, even
; after you have finished out of Windows. You will hang for sure, unless you
; have used a well-behaved protected-mode program to reset them, or cold boot.
Data    Segment Word Public Use16
    vdz     dw ?        ; size  v    words
    va  dd ?            ;     hi dword v
    vb  dd ?            ; 2nd     "    v
    vi  dw ?        ; ^v[1]
        savdi   dw ?            ; used in addback
Data    EndS
Code    Segment Word Public  Use16
    Assume  cs:Code, ds:Data ,es:Nothing
        Public  mod32
; Pascal Parameters:
u   Equ DWord Ptr ss:[bp+10]      ; Parameter 1 of 2   (far)
v       Equ DWord Ptr ss:[bp+ 6]      ; parameter 2 of 2
uof     equ word ptr  ss:[bp+10]
vinof   equ word ptr  ss:[bp+ 6]
mod32   Proc    far
    push bp
    mov  bp,sp
        push di
        push si
        push ds          ; save the DS
; Before using Mod32 check that:
        ;     v > 0
        ;     v < u         u <= 125 words
        ;     v[0] is a multiple of 4   and at least 8
        ;     v[top] >= 80h           (may need to scale u & v)
        ;     make u[0] = 0 Mod 4     (add 1..3 if required)
domod:
        ; now point to our v
        mov ax,seg v
        mov ds,ax
        assume ds:Data
        mov si, offset v
        cld
        assume es:Nothing
    xor ah,ah
    mov al,es:[di]   ; ax = size of u in bytes    "uz"
    mov cx,ax        ; cx = uz
    mov bx,ax        ; bx = uz
    mov al,[si]
    mov dx,ax    ; dx  = size v bytes
    shr ax,2
    mov vdz,ax   ; vdz    "     dwords   vz = 0 mod 4
    sub bx,dx        ; bx = uz - vz  difference in bytes
    mov ax,bx        ; ax = uz - vz
    sub ax,3     ; ax = uz - vz - 3     ->  gs
    sub cx,3     ; cx =  uz - 3
    add cx,di        ; cx = ^top dword u
    add ax,di
    mov gs,ax    ; gs = ^(uz-vz-3)  u start   (by -4  down to 1)
        inc di
        mov fs,di    ; fs = uf = ^u[1] , end point
    inc si
    mov vi,si    ; vi = ^v[1]
    add si,dx
    mov eax,[si-4]
    mov va,eax   ;  va = high word of v
    mov eax,[si-8]
    mov vb,eax       ;  vb = 2nd highest word v
    mov di,cx    ; set di to ut , as at bottom of loop
d3:
    mov edx,es:[di]  ; dx is current high dword of u
    sub di,4
        mov eax,es:[di]  ; ax is current 2nd highest dword of u
    mov ecx,va
    cmp edx,ecx
    jae  aa          ; if high word u is 0 , never greater than
    div ecx      ;          mov ebx,eax
        mov esi,edx  ; si = rh
    jmp short ad     ; Normal route -- -- -- -- -->
aa:     mov eax,0FFFFFFFFh
    mov edx,es:[di]  ; 2nd highest wrd u
    jmp short ac
ab: mov eax,ebx      ; q2
    dec eax
    mov edx,esi      ;  rh
ac: mov ebx,eax      ; q3
    add edx,ecx
    jc d4        ; Knuth tests overflow,
    mov esi,edx
; normal route:
 ad:
        mul vb       ; Quotient by 2nd digit of divisor
    cmp edx,esi  ; high word of product : remainder
    jb  d4           ; no correction to quot, drop thru to mulsub
        ja  ab           ; nb unsigned use ja/b not jg/l
    cmp eax,es:[di-4] ; low word of product : 3rd high of u
    ja  ab
d4:          ; Multiply & subtract * * * * * * *
    mov cx,gs
    mov di,cx    ; low start pos in u for subtraction of q * v
        sub cx,4
        mov gs,cx
        xor ecx,ecx
    Mov  cx,vdz  ; word count for q * v
        mov  si,vi   ; si points to v[1]
        xor ebp,ebp      ; carry 14Oct90 bp had problems in mu-lp
        even
;    ** ** ** ** **  **  **  **
ba:     lodsd        ; eax <- ds[si]
    mul ebx      ; dx:ax contains product   carry set if dx > 0
        add eax,ebp
        adc edx,0
    sub es:[di],eax
        adc edx,0
        mov ebp,edx
        add di,4
    loop ba      ; dec cx , jmp if not 0
; .. .. .. . .. .. . .. .. . .. . . ..
        sub es:[di],edx
        jnc d7
mov si,vi    ;  add back (rare)
        mov savdi,di
    mov di,gs
        add di,4
    clc
    mov cx,vdz
bb:     lodsd        ; eax = ds[si]   si + 2
    adc es:[di],eax
        inc di
        inc di
        inc di
        inc di
        loop bb
        xor eax,eax
        mov es:[di],eax
        mov di,savdi
        ; test with:
        ; 1,00000000,00000000,00000001/ 80000000,00000000,00000001
d7:
    mov bx,fs     ; fs ^u[1]
        mov ax,gs     ; gs = current u start position
    cmp ax,bx     ; current - bottom
    jb d8
        sub di,4
    jmp d3
d8:
; here we would scale u down if it had been scaled up
quex:                 ; quick exit if v < u
        cld              ; just in case
        pop ds
        pop si
        pop di
        pop bp
    ret 8       ; 2 pointers = 4 words = 8 bytes
mod32   EndP        ;
Code    Ends
    End
[LISTING FIVE]
Algorithm D in 16-bit Intel assembler
Author: Christopher T. Skinner
   mod16.txt 21 Au8 92     16 bit modulus
; divm  Modulus
Data    Segment Word Public
    vwz     dw ?        ; size  v    words
    va  dw ?            ;     hi word v
    vb  dw ?            ; 2nd    "    v
    vi  dw ?        ; ^v[1]
    uf  dw ?        ; ^u[3]
    uz  dw ?            ; size u byte
    vz  dw ?             ;   "   v  "
    ua      dw ?        ; ^( u[0] + uz - vz -1 ) , mul sub start
    ut  dw ?            ; ^ u[topword]
    qh      dw ?
    uzofs   dw ?        ; ttt
    vzofs   dw ?        ; ttt
Data    EndS
Code    Segment Word Public
    Assume  cs:Code, ds:Data
        Public  diva
u   Equ DWord Ptr [bp+10]       ; ES:DI
v       Equ DWord Ptr [bp+6]        ; DS:SI
    ; NB v Must be Global, DS based...
diva    Proc    far
    push bp
    mov  bp,sp
        push ds
    cld     ; increment lodsw in mulsub
    lds si,v
        les di,u
    xor ah,ah
    mov al,es:[di]  ; ax = uz size of u in bytes N.B. uz is not actually used
    mov cx,ax       ; cx = uz
    mov bx,ax       ; bx = uz
    mov al,ds:[si]
    mov dx,ax   ; dx  = size v bytes
    shr ax,1
    mov vwz,ax  ; vwz    "     words
    sub bx,dx       ; bx = uz - vz  difference in bytes
    mov ax,bx       ; ax = uz - vz
    dec ax      ; ax = uz - vz - 1     ->  ua
    dec cx      ; cx =  uz - 1
    add cx,di       ; cx = ^top word u
    mov ut,cx   ; ut = ^top word u
    add ax,di
    mov ua,ax   ; ua = ^(uz-vz-1)  u start   (by -2  down to 1)
        inc di
    mov uf,di   ; uf = ^u[1] , end point
    inc si
    mov vi,si   ; vi = ^v[1]
    add si,dx
    mov ax,ds:[si-2]
    mov va,ax   ;  va = high word of v
    mov ax,ds:[si-4]
    mov vb,ax       ;  vb = 2nd highest word v
    mov di,cx   ; set di to ut , as at bottom of loop
d3:
    mov dx,es:[di]          ; dx is current high word of u
    dec di
    dec di
    mov ut,di
        mov ax,es:[di]        ; ax is current 2nd highest word of u
    mov cx,va
    cmp dx,cx
    jae  aa   ;if high word u is 0 , never greater than
    div cx          ;
        mov qh,ax
        mov si,dx       ; si = rh
    jmp ad          ; Normal route -- -- -- -- -->
aa:     mov ax,0FFFFh
    mov dx,es:[di]      ; 2nd highest wrd u
    jmp ac
ab: mov ax,qh
    dec ax
    mov dx,si       ;  rh
ac: mov qh,ax
    add dx,cx
    jc d4           ; Knuth tests overflow,
    mov si,dx
ad:     mul vb          ; Quotient by 2nd digit of divisor
    cmp dx,si       ; high word of product : remainder
    jb  d4          ; no correction to quot, drop thru to mulsub
        ja  ab          ; nb unsigned use ja/b not jg/l
    cmp ax,es:[di-2]    ; low word of product : 3rd high of u
    ja  ab
d4:         ; Multiply & subtract * * * * * * *
    mov bx,ua
    mov di,bx   ; low start pos in u for subtraction of q * v
    dec bx
    dec bx      ;
    mov ua,bx
    Mov  cx,vwz ; word count for q * v
        mov  si,vi  ; si points to v[1]
    mov bx,qh
        xor bp,bp
;    ** ** ** ** **  **  **  **
ba:     lodsw       ; ax <- ds[si]   si + 2  preserve carry over mul ?
    mul bx      ; dx:ax contains product   carry set if dx > 0
    add dx,bp
        xor bp,bp
    sub es:[di],ax
    inc di
    inc di
    sbb es:[di],dx
    rcl bp,1
    loop ba     ; dec cx , jmp if not 0
; .. .. .. . .. .. . .. .. . .. . . ..
        rcr bp,1
        jnc d7
mov si,vi   ;  add back (rare)
    mov di,ua
       inc di
    inc di
    clc
    mov cx,vwz
bb:    lodsw        ; ax = ds[si]   si + 2
    adc es:[di],ax
    inc di
    inc di
    loop bb
    mov cx,ut
    add cx,4
    sub cx,di
    shr cx,1        ; word length of u
bc:    mov Word Ptr es:[di],0
       inc di
    inc di
       loop bc  ;
    dec di      ;
    dec di      ;
    clc
d7:
    mov ax,uf
    cmp ua,ax
    jb d8
    dec di      ; New these are suspicious, with an add back and a
    dec di      ; New
    jmp d3
d8:
             cld   ; just in case
       pop ds
    pop bp
    ret 8       ; 2 pointers = 4 words = 8 bytes ???
diva    EndP        ;
Code    Ends
    End
_________________________________________________________________
Copyright © 1998, Dr. Dobb's Journal
   Dr. Dobb's Web Site Home Page -- Top of This Page

Jim Choate

tags

participants (1)