small memoryspace DES?
Hi, Getting back to the cypherpunks-write-code theme, I'm currently attempting to put DES on a u-ctlr that has a whole 1k of memory for things like 'stack'. All the DES implimentations I've found thusfar have traded memory for speed. I can afford a lot of text space, and initialized data can go in the text segment, of course (think eeprom, ok?), but I have to live within a 1k space for all my dynamic memory needs. Has anyone ever done anything like this? Care to share? Jim
The DES implementation I'm most familiar with (the DES library core written by Dennis Ferguson) uses moderately large (by your standards) read-only tables. The IP and FP tables are each 256*4 bytes (2Kbytes total); the expanded S boxes are 8*64*4 bytes (another 2K bytes), and the key schedule code uses a 48*4 byte table and two 4*64*4 byte tables (another 2K+). It's optimized for 32 (and larger) bit processors, but it probably wouldn't be all bad on a 16 bit system. The expanded key schedule uses 8 bytes per round times 16 rounds, or 128 bytes per currently active key (1/8 of your available memory). However, only 48 bits of key schedule are actually needed per round (8 subfields of 6 bits each), so you could (at the cost of some speed, probably a penalty of 50% to 100% or so) compress the schedule down to 96 bytes if it really mattered to you.. You could also compute each round of the key schedule as you needed it (discarding it on the fly) reducing the memory needed per key to 8 bytes of "persistant" storage and a little bit more dynamic memory when actually doing the encryption.. of course, performance goes down the tubes if you do this unless you only use each key once..
Getting back to the cypherpunks-write-code theme, I'm currently attempting to put DES on a u-ctlr that has a whole 1k of memory for things like 'stack'. All the DES implimentations I've found thusfar have traded memory for speed.
You can get by with about 2KB of static table (sbox + permutation), assuming you lookup the sboxes 1 at a time and fold the permuation into the table lookup. 8 sboxes * 64 entries/sbox * 4 bytes/entry = 2KB For the key schedule you'll need 8 bytes/subkey * 16 subkeys = 128 bytes (You can do it with 6 bytes/subkey, but it's easier with 8 bytes/subkey) I suggest that you look at the "descore" implementation by Dana How. You can find it with archie. Also, if the uP that you are using has direct bit addressable memory, the permuations stop being a problem -- say 2 instructions/bit. Obviously, permutations are real easy to do in hardware. In addition, I suggest that you do some back of the envelope calculations to make sure that you're going to get the kind of throughput that you expect. The simplest way is to work out the code for a single round, and then multiply by 16. This will get you in the ball park. Have fun! Eric Blossom
participants (3)
-
Eric Blossom -
jim@Tadpole.COM -
sommerfeld@orchard.medford.ma.us