But it is pretty unsatisfying to say that the best algorithm "depends" on half a dozen variables, and that we can't reliably predict (engineer) a solution. It does seem to come down to that though. I've spent a bit of time
playing with a couple of versions of Montgomery Mult code plus other optimisations for modular exponentiation. What works best depends upon the processor (I was doing C with some inline assembler for the multiply and divide ops). I remember that one particular approach worked very well on an HP 9000/730 and was miserable on anything else I tried (Sparc, 80486, MIPS R3000, 68030). There's a really nice survey paper by Cetin Kaya Koc (then of RSADSI) called _High Speed RSA Implementation_ which describes various optimisations. The references in this are also pretty useful. Mark -- Mark Henderson markh@wimsey.bc.ca - RIPEM MD5: F1F5F0C3984CBEAF3889ADAFA2437433 ViaCrypt PGP key fingerprint: 21 F6 AF 2B 6A 8A 0B E1 A1 2A 2A 06 4A D5 92 46 low security key fingerprint: EC E7 C3 A9 2C 30 25 C6 F9 E1 25 F3 F5 AF 92 E3 cryptography archive maintainer -- anon ftp to ftp.wimsey.bc.ca:/pub/crypto