Cryptanalysis of RC4 - Preliminary Results (Repeat)
(This is a repeat because I posted the original 36 hours ago and it still hasn't bounced back to me.) -----BEGIN PGP SIGNED MESSAGE----- Hi c'punks & sci.cryptites About a week ago I posted a message about weak keys in RC4. This is an update on the results of my continued 4am sessions with RC4 and shows that certain weak keys lead to an almost-feasible known plaintext attack on the cipher (well, about as feasible as the differential attack on DES, shall we say). The attack is based on two particularly interesting three-byte key prefixes which have a high probability of producing PRNG sequences which start with a known two-byte sequence. The prefixes are: 1. Keys starting with "00 00 FD" which have a 14% probability of generating sequences which start "00 00". 2. Keys starting with "03 FD FC" which have a 5% probability of generating sequences which start "FF 03". Note that the expected frequency of any two-byte output sequence is 1 in 65536 or about 0.0015%, so these key prefixes are highly unusual. I won't go into the reasons why in this post, since it follows the same reasoning as my last post, but these prefixes are special in that they have a high probability of initializing the RC4 state table in such a way that the first two generated bytes depend only on the first three entries in the state table. This observation is the basis for a simple known-plaintext attack which reduces the effective key space which you need to search to have a 50% probability of discovering a key by about 11.2 bits. The down side is that you need "quite a few" known plaintexts to make the attack feasible. It works as follows: 1. Collect a large number of known plaintexts (and hence known generator sequences). 2. Discard generator sequences which do not start with "00 00" or "FF 03". 3. For generator streams starting "00 00", search all keys which begin with "00 00 FD". 4. For generator streams staring "FF 03", search all keys which begin with "03 FD FC". 5. Keep going until you find a key :-) Clearly this attack will only discover a small fraction of the keys. However since most generator sequences are discarded without being searched, and for those which are searched the search is 2^24 smaller than would be required to search the entire keyspace, the number of trials required to determine a key is significantly lower than for brute force alone. Enough of an intro, here are the relevant results. Forgive my simplistic approach to maths, I'm a philosopher-come-software developer, not a mathematician. I've run the relevant simulations with 40-bit, 64-bit, 80-bit and 128-bit key lengths, and with two different PRNGs. For the sake of consistency with my earlier paper I'll use the figures gathered for 80-bit keys (this seems to be RSA's preferred key length for RC4), but there are no significant differences for other key lengths. The PRNG used for these tests was L'Ecuyer's 32-bit combined linear congruential generator as described in "Applied Cryptography" p. 349. (a) Out of one million trials, keys starting with "00 00 FD" generated sequences starting "00 00" 138217 times, and keys starting with "03 FD FC" generated output sequences starting "FF 03" 50490 times. (b) Out of ten million trials, arbitrary pseudo-random keys generated sequences starting with "00 00" 446 times, and sequences starting with "FF 03" 146 times. (Note the abnormally high incidence of "00 00"; the expected mean is 152.8). Suppose we have the output stream generated by a randomly chosen key. The chance that it will start with either "00 00" or "FF 03", and that we will therefore search it, is: (446 + 146) / 1e7 = 5.92e-5 The chance that it starts with "00 00" and was generated by a key starting with "00 00 FD", or that it starts with "FF 03" and was generated by a key starting "03 FD FC" - i.e. the chance that we will search it and be rewarded for our efforts - is: (138217 + 50490)/(1e6 * 2^24) = 1.12e-8 The total number of plaintexts required for a 50% chance that we will discover one of the keys is: log(0.5)/log(1 - 1.12e-8) = 61 900 000 Well I did say "quite a few" plaintexts would be necessary :-) And the number of plaintexts which you expect to search in order to find the "right" one is: 61 900 000 * 5.92e-5 = 3665 Since the total key length is 80 bits, and we are "guessing" 24 of these, each search requires 2^56 trials. Hence the total number of trials for a 50% chance of discovering a key is: 3665 * 2^56 = 2.64e20 = 2 ^ 67.8 Since brute search alone would require 2^79 trials for a 50% chance of determining the key, this reduces the number of trials by 2^11.2. The results are essentially identical for all the key lengths I have tried, and in each case reduce effective key length by about 11.2 bits. So, for example, a 64-bit key would normally require 2^63 trials for 50% chance of solution; this attack reduces the number of trials to 2^51.8 at the cost of requiring 62 million known plaintexts. I'm still running simulations to check my maths, and although initial results are encouraging, I don't have enough data for it to be statistically relevant yet (generating all these sets of 62 million known streams takes time...) So consider this preliminary (again), and I'll post the results of my simulations when I have enough data. Andrew ________________________________________________________________ Andrew Roos <andrewr@vironix.co.za> // C++ programmers have class (but not much inheritance) PGP Fingerprint: F6 D4 04 6E 4E 16 80 59 3A F2 27 94 8B 9F 40 26 Full key at ftp://ftp.vironix.co.za/PGP-keys/AndrewRoos -----BEGIN PGP SIGNATURE----- Version: 2.6.2i iQCVAwUBMGrlfmatuqa4OR+lAQF1eQP+IBBmSztAYUpq1q/BjzvYDCbb+Ns0Gi1S u9wTaZOCl32fdp7NSUEQBX39nVJkQZginug56BZXzijRvOx6fl4+z7dmW9jwtE5E YNCOhx+/fHX4psszMyEUTrnza7MYDc4HXlgv743LOD/xvEyU0D5OGgB5fg+lyhAK 6xQ/Zy8JpE8= =BdMn -----END PGP SIGNATURE-----
participants (1)
-
Andrew Roos