Phil Karn <karn@unix.ka9q.ampr.org> writes:
> I've completely translated the encrypt and decrypt routines to assembler, with no calls or jumps inside either routine. I picked up Richard Outerbridge's seriously clever initial and final permutation algorithm from Schneier, along with a few of his other tricks.

I should confess that I am probably the only person on the list who has not yet read Schneier. So I apologize in advance if the following comments turn out to be redundant.

> What still bugs me is that Schneier lists the speed of one commercial DES implementation as 40,600 encryptions/sec on a 33 MHz 486. I just don't see how that's possible without using a lot more memory for lookup table space (I use only 2K, which is nice in a DOS environment).
Since 2K is exactly what is needed for a precomputed table which combines the S-boxes and the wirecrossing (the P permutation), I will assume this is the approach you used. Given this data structure, there are a number of cute tricks which will get DES down to around 30 machine instructions for each of the 16 rounds on a machine with enough registers and a decent set of addressing modes.

The important trick is to reorder the S-boxes so that you do lookups on the odd numbered ones and the even numbered ones separately. (1,3,5,7,2,4,6,8) works nicely. This permits the results to be ORed together in two groups of four, with all the necessary indexing held in a single 32 bit register which can be appropriately repositioned each time. The precomputed key schedule needs to be adjusted to reflect the new order.

Note that with this ordering, the blocks of six bits used for lookup are byte aligned if you consider the even and odd S-boxes separately. If you store the upper two bits of lookup table addressing in the precomputed key schedule and shift both it and the right hand block left two bits, all explicit table indexing vanishes and you can accumulate the result of a lookup with a single indexed OR instruction.

I'm not sure what 30-something instructions per round translates into for a 33 MHz 486, but 40,600 encryptions per second doesn't sound too outrageous using the above approach.

--
Mike Duvos         $    PGP 2.6 Public Key available     $
mpd@netcom.com     $    via Finger.                      $
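
The byte alignment of the odd and even six-bit blocks described above can be checked with a short sketch. This is only an illustration of the indexing idea, not a DES implementation: it leaves out the S-box/P lookup tables and the rest of the cipher. The function names are hypothetical, the rotation amounts are worked out from the standard expansion table E rather than taken from the post, and the code assumes DES bit 1 is the most significant bit of a 32 bit word.

```python
# Standard DES expansion table E (bit 1 is the most significant bit of R).
E = [32, 1, 2, 3, 4, 5,       4, 5, 6, 7, 8, 9,
     8, 9, 10, 11, 12, 13,    12, 13, 14, 15, 16, 17,
     16, 17, 18, 19, 20, 21,  20, 21, 22, 23, 24, 25,
     24, 25, 26, 27, 28, 29,  28, 29, 30, 31, 32, 1]

MASK32 = 0xFFFFFFFF
SIX_OF_EIGHT = 0xFCFCFCFC  # keeps the upper six bits of every byte


def rol32(x, n):
    """Rotate a 32 bit value left by n bits."""
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32


def groups_reference(r):
    """Eight 6-bit S-box inputs (before key mixing), bit by bit via E."""
    out = []
    for i in range(8):
        v = 0
        for bit in E[6 * i: 6 * i + 6]:
            v = (v << 1) | ((r >> (32 - bit)) & 1)
        out.append(v)
    return out


def packed_groups(r):
    """Return (odd, even) words for S-boxes 1,3,5,7 and 2,4,6,8.

    Each byte holds one 6-bit block in its upper six bits, so the byte
    value is already the block times four (a byte offset into a table
    of 32 bit entries).
    """
    odd = rol32(r, 31) & SIX_OF_EIGHT   # rotate right by one
    even = rol32(r, 3) & SIX_OF_EIGHT
    return odd, even


def pack_subkey(chunks):
    """Pack four 6-bit subkey chunks into the same byte layout."""
    w = 0
    for c in chunks:
        w = (w << 8) | (c << 2)
    return w


def unpack(word):
    """Split a packed word into four 6-bit values, top byte first."""
    return [(word >> s) & 0x3F for s in (26, 18, 10, 2)]
```

With the subkey stored in the same packed layout, one XOR mixes the key into four S-box inputs at once, and each resulting byte can be used as a table offset without any further shifting or masking per block.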