Now that I've got DES running about as fast as it can go on the 486, I did a little analysis on IDEA. The algorithm is definitely more straightforward to implement than DES, but not necessarily that much faster. It uses three primitive operations, all on 16-bit quantities: XOR, ADD and multiplication modulo 65537. Each encryption involves 48 XORs, 34 adds and 34 multiplies, plus a few exchanges. The multiplies are a problem. On the 486, a 16x16 integer multiply takes from 13-26 clocks, depending on how many significant bits there are in the multiplicand. Random data usually has 15-16 significant bits, so this distribution is probably weighted more toward the 26 clock figure. So I count an optimistic total of 990 clocks per 64-bit encryption, assuming plenty of registers (which I don't have), not counting the modular reduction steps for each multiply, and ignoring the memory references for the subkeys. I figure my DES code is currently taking about 1300 clocks per encryption. So IDEA won't be much faster, though it may be more secure. Anybody know the speed of the integer multiply instruction on the various PowerPC chips? Along with modular exponentiation and vocoders, which also do a lot of multiplies, it looks like fast multiplication is becoming rather important in secure communications. Phil
Excerpts from internet.cypherpunks: 7-Aug-94 IDEA vs DES by Phil Karn@unix.ka9q.ampr
Anybody know the speed of the integer multiply instruction on the various PowerPC chips? Along with modular exponentiation and vocoders, which also do a lot of multiplies, it looks like fast multiplication is becoming rather important in secure communications.
PowerPC integer performance is rather impressive, i.e. faster than Pentium by a bit. One craveat, tho, Apple says "No!" to programming in assembly, and I doubt that IBM is all this happy about it either. My guess is that MacOS is approaching the Unix "distribute source, 'cause you're gonna have to do lots of re-compiles" type of thing. Just a guess, though. Anyway, there is one assembly interpreter out for PowerMacs, I don't know about the IBM PowerPCs, though. Back to lurking, jer darklord@cmu.edu | "it's not a matter of rights / it's just a matter of war finger me for my | don't have a reason to fight / they never had one before" Geek Code and | -Ministry, "Hero" PGP public key | http://www.cs.cmu.edu:8001/afs/andrew.cmu.edu/usr25/jbde/
Jeremiah A Blatz writes:
PowerPC integer performance is rather impressive, i.e. faster than Pentium by a bit. One craveat, tho, Apple says "No!" to programming in
Actually, the reverse is true. Pentium integer performance (as measured in SPECints) is somewhat better than 601 PowerPC performance, MHz for Mhz. Thus, a 66 MHz Pentium has slightly better integer performance than a 66 MHz PowerPC. Not by much, but slightly. However, 90 MHz Pentium machines are now available in volume, even for under $2000, while PowerPC is not yet at this level. (Experimental Pentia running at 150 MHz have been shown..601s running at 120 MHz have been shown...and both Intel and IBM/Motorola/Apple have newer designs about to appear--the P6 and the 604.) Floating point is another story, with the PowerPC 601 significantly outperforming the Pentium. The exact numbers for all of these benchmarks are published and republished constantly, so I won't do so here. I happen to use Macs exclusively, but I worked for Intel for 12 years and still own their stock, so make of my comments what you will. --Tim May -- .......................................................................... Timothy C. May | Crypto Anarchy: encryption, digital money, tcmay@netcom.com | anonymous networks, digital pseudonyms, zero 408-688-5409 | knowledge, reputations, information markets, W.A.S.T.E.: Aptos, CA | black markets, collapse of governments. Higher Power: 2^859433 | Public Key: PGP and MailSafe available. "National borders are just speed bumps on the information superhighway."
I'm specifically interested in *fixed point* multiply and divide performance, since these operations appear to be crucial to IDEA and high quality speech coding, not to mention multiple precision modular exponentiation functions. My 486 reference shows 13-42 clocks for a 32x32 multiply and 40 clocks for a 64/32 divide. I've heard that the PowerPC can do a multiply-accumulate (the basic operation of a FIR digital filter) in one clock cycle, which qualifies it as a DSP chip in my mind. If true, then it may become possible to do high quality speech coding (essential for a secure phone) in software on a widely available general purpose computer instead of needing a high performance DSP subsystem that may be costly and/or less readily available. Here are some figures on my latest DES code. I'm placing it into the public domain; how do I go about putting it on soda.berkeley.edu? Measured execution speeds in crypts/sec: 11,488 (C version, 486DX-50, DOS, Borland C++ 3.1 -O2, 16-bit real mode) 39,185 (assembler version, same system) 62,814 (assembler version, 60 Mhz Pentium) 24,172 (C version, 486DX2-66, BSDI 1.1, GCC 1.42 -O, 32-bit prot mode) 64,185 (C version, 50 Mhz Sparc 10, GCC 2.5.8 -O) The C version is essentially identical to Outerbridge's code in Applied Cryptography, with a few extra tricks. The assembler version is the same thing rewritten in assembler, with numerous optimizations that were possible only in assembler. Anybody have a tool for translating Intel 486 assembler code to the Gnu assembler format? --Phil
According to my references, the PowerPC 601 does an integer multiply in 9 cycles (5 if the 2nd operand is 16 bits or less). An integer divide takes 36 cycles. Adds, etc. take 1 cycle. Floating-point multiplies take 1 cycle for single precision, 2 for double. However, they are pipelined, so if you need to use the results of the multiply on the next instruction, they will take 3 cycles. Floating-point adds take 1 cycle, again with the results available in 3. There is a floating-point (but no integer) multiply-and-add instruction. It has the same timing as the multiply. Hal
participants (5)
-
Hal -
Jeremiah A Blatz -
Phil Karn -
Phil Karn -
tcmay@netcom.com