Bill Stewart <stewarts@ix.netcom.com> writes:
But those designs are for one-at-a-time cracks. An interesting question is whether you can speed up performance substantially by cracking multiple messages at once.
For known plaintext attack on pure RC4 this would work marvelously, should get close to linear speed up I think as the greatest overhead is the key setup. This was discussed some during the netscape SSL break, it didn't apply to 40 bit SSL because it was really 128 bit, just with 88 bits disclosed, so the 88 bits functioned as a salt. But it applies just fine to pure RC4-40, ... or even to ECB DES... This is interesting as applied to DES, does anyone have any banking or funds transfer protocols handy which use DES in ECB mode :-) Perhaps we could get DES down to a manageable number of bits, together with the argument that the attacker wouldn't care who's money he stole.
For instance, if you've got known plaintext, such as a standard header format saying "FooVoice" or "BEGIN DSA-SIGNED..", you can try many keys and compare them with _many_ cyphertexts, which may not slow down the FPGA very much.
Thinking of software attacks and RC4-40, if you were attacking pure RC4-40, you would collect your 16k known-plaintext / ciphertext pairs, xor them, and sort the xored texts and store them in some kind of dictionary lookup structure . Then you'd do the key schedule, then traverse the btree with each byte that the RC4_encrypt_byte would have xored with the text being encrypted. As soon as you took a branch which didn't exist in the btree you'd move on to the next key and keyschedule. [hacking interlude] I got bored so I hacked up a test of this of the overheads of lookups, using bsearch under linux I get lookups / sec against number of known plaintexts: known plaintext/ ciphertext actual avg time to pairs lookups/s keys/s keys/s find a key ======================================================== 16k 71k 23k 376M 24 mins 8k 77k 24k 193M 48 mins 4k 91k 25k 101M 1.5 hrs 2k 100k 25k 52M 2.9 hrs 1k 125k 27k 27M 5.6 hrs 1 - 34k 34k 187 days The tests were done on an AMD 486 dx/4 120 (a 120Mhz i486 clone), the keys/s for pure rc4-40 are from a hand optimised assembly version which I'd been playing with. `actual keys' is the keys from the search space of 2^40. `lookups/s' is the number of bsearches per second for the given sized pre-xored table. (Known plaintext xored with ciphertext allows the check for correct key to be done with memcmp). `keys/s' is the number of keys tested at once * the actual keys/s `avg time..' is the expected time before find a key. So based on one machine, if you had 1000 known plaintexts, you would get a key in around 5 hours. Multiply by 100 machines, some faster some slower and it gets interesting. Our only problem now is to find someone dumb enough to use pure RC4-40, Adam -- #!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj $/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1 lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)