Time to exhaustively break 40-bit RC4?
The SSL documents say that exhaustively searching 40 bits of RC4 keyspace takes 64 MIPS-years. When I brought this figure up at the cpunks meeting, it was roundly derided. However, I think it might be a sound estimate. The key schedule operation in RC4 does 256 "swap" operations. Let's say it takes four instructions to do each swap. So, it's 2000 instructions per key. A one-MIPS processor can search 500 keys a second. There are about 30 million seconds in a year, so that's 15 billion keys a year. 40 bits is a trillion keys, so it works out to 66 years, which is well within the Pentium-style accuracy of the calculations I've done. Am I missing something here? On the second floor of Soda Hall are about 100 HP Snake workstatations. I think they're about 100 MIPS each. During the winter break, they will be sitting mostly unused. If the math checks out, they should be able to search keyspace in two and a half days. Anyone wanna do some cracking? Raph
Raph Levien says:
The SSL documents say that exhaustively searching 40 bits of RC4 keyspace takes 64 MIPS-years. When I brought this figure up at the cpunks meeting, it was roundly derided. However, I think it might be a sound estimate.
Its not a question of deriding the estimate...
If the math checks out, they should be able to search keyspace in two and a half days.
...its a question of deriding the security of any system that takes so little time to crack, and thats assuming there are no better attacks than brute force (yet to be determined). With optimization, you can do even better than that. With a little bit of hardware (not very much) you can crack open a 40 bit keyspace with the effort normally reserved for opening your bathroom door in the morning. Perry
On Dec 12, 7:31pm, Perry E. Metzger wrote:
...its a question of deriding the security of any system that takes so little time to crack, and thats assuming there are no better attacks than brute force (yet to be determined). With optimization, you can do even better than that. With a little bit of hardware (not very much) you can crack open a 40 bit keyspace with the effort normally reserved for opening your bathroom door in the morning.
Actually, it's a bit more than a "little bit of hardware". One of the interesting realisations of pondering VLSI crackers was how much chip real-estate storing 2048 bits of laregly static internal state required, disregarding the size of a 2048 bit bus (remember "transistors are cheap, wires are expensive".) All transfers would have to be multi-cycle operations, which adds complexity due to the need to time and synchronise these transfers. It's by no means impossible, but the design of such a device is certainly not a trivial exercise in engineering, and I would never call the result a "little piece of hardware". Ian.
-----BEGIN PGP SIGNED MESSAGE----- "Ian Farquhar" <ianf@sydney.sgi.com> writes:
No, because as you're doing an exhaustive keysearch, you can "pipeline" the key generation process in software. Each key requires 256 swaps, certainly, but there are only two swaps difference between the key for "0000000000" and "0000000001" (assuming a 40 bit key). If you recursively generate keys, then you can generate successive keys like this:
This is not true, for a few reasons. First, keys are replicated (reused over and over) until 256*8=2048 bits have been used. So a 40-bit key would get reused about 50 times. Second, the key feeds into a PRNG which is mixed in with the swapping, so once you swap with a different one you will swap differently from then on. And third (and this is the one I find most interesting), SSL does not just use a 40-bit key for the export versions. They use a 128-bit key, but they require 128-40=88 bits to be sent in the clear. So the potential keyspace is much bigger than 2^40. This will make certain attacks (primarily those involving pre-calculation, which actually doesn't apply to your pipeline I guess) impossible. I thought it was interesting that this "128 minus 88" bit key qualified for the export approval. This suggests that NSA has no better attack than brute force (nothing relying on cryptographic weaknesses of 40 bit keys, for example). Hal -----BEGIN PGP SIGNATURE----- Version: 2.6 iQBVAwUBLuz/VBnMLJtOy9MBAQFMQwIAgo6XwroajnfYmRzSasstBSTKFGVeGI5U Kbg4VBG9FU9qFJaZ6hDpFbfZhvSc8OPnK0COWuZsdEZDcl1QDuwELA== =JCls -----END PGP SIGNATURE-----
I notice in the Netscape SSL spec the 40-bit export-approved RC4 key generation is a little more complicated than I would have thought. First a 128 bit "master key" is chosen and 88 bits are revealed, leaving 40 bits secret. Then the RC4 session key is generated as the MD5 hash of this master key plus about 32 bytes of publically known but random information. I'm not clear whether the 128-bit output of the MD5 hash is then used as the RC4 key, or whether only 40 bits are used (and if so, whether there are any public bits in the key besides these 40). If the former, then this extra hash step should really slow down exhaustive search of the key space. If the latter, then it is not clear why the master key is key-size restricted at all since it is not likely to be used in searching the key space. Maybe someone from Netscape could clear up how this is done. Hal
From: Hal <hfinney@shell.portal.com> I notice in the Netscape SSL spec the 40-bit export-approved RC4 key generation is a little more complicated than I would have thought. [The RC4 key is a hash of the external key. Are 40 or 128 bits of this hash used?] If the former, then this extra hash step should really slow down exhaustive search of the key space. If the latter, then it is not clear why the master key is key-size restricted at all since it is not likely to be used in searching the key space. It doesn't really matter, from a crack designer's point of view. It all depends on what keyspace you're actually searching. You can search either the external key (40 bit) or the internal key (larger). Clearly you have to search the external keyspace. In order to search the external keyspace, you have to simulate the whole algorithm, which in this case is not _just_ RC4 but also preliminary key setup phase. It's just another part of the algorithm. To make the distinction precise, what you're searching is not 40-bit RC4 but rather 40-bit RC4-as-used-in-SSL. The compound algorithm is not identical to the underlying algorithm. This is one of the design problems in Weiner's DES-cracking machine (designed and unbuilt), that it can only crack DES as such and not minor modifications to it. The machine uses a little polynomial generator (similar to using CRC) to be able to partition the keyspace among processors and to keep the pipelines full. This is a hard-wired generator. The architectural improvement needed in a practical machine would be an interconnect for key candidate sequencing. This would add to the cost of the machine, but only by, say, 20% at most. It would be expensive as interconnects go because the bandwidth is so high. Suppose an RC4 cracker existed with the above interconnect. In order to crack RC4-SSL, you'd need a second simulator that did all the hashing and spat keys out its interconnect. Such a front end would have to be designed for every particular configuration used. Eric
On Dec 17, 1:49pm, Hal wrote:
Subject: Re: Time to exhaustively break 40-bit RC4? I notice in the Netscape SSL spec the 40-bit export-approved RC4 key generation is a little more complicated than I would have thought. First a 128 bit "master key" is chosen and 88 bits are revealed, leaving 40 bits secret. Then the RC4 session key is generated as the MD5 hash of this master key plus about 32 bytes of publically known but random information. I'm not clear whether the 128-bit output of the MD5 hash is then used as the RC4 key, or whether only 40 bits are used (and if so, whether there are any public bits in the key besides these 40).
128 bits are used. I have cleaned up the spec language to make this more obvious.
If the former, then this extra hash step should really slow down exhaustive search of the key space. If the latter, then it is not clear why the master key is key-size restricted at all since it is not likely to be used in searching the key space. Maybe someone from Netscape could clear up how this is done.
Hopefully it will slow down exhaustive key search. Hope this helps, and thanks again for the comments. -- --------------------------------------------------------------------- Kipp E.B. Hickman Netscape Communications Corp. kipp@mcom.com http://www.mcom.com/people/kipp/index.html
On Dec 12, 3:30pm, Raph Levien wrote:
The key schedule operation in RC4 does 256 "swap" operations. Let's say it takes four instructions to do each swap. So, it's 2000 instructions per key. A one-MIPS processor can search 500 keys a second. There are about 30 million seconds in a year, so that's 15 billion keys a year. 40 bits is a trillion keys, so it works out to 66 years, which is well within the Pentium-style accuracy of the calculations I've done.
No, because as you're doing an exhaustive keysearch, you can "pipeline" the key generation process in software. Each key requires 256 swaps, certainly, but there are only two swaps difference between the key for "0000000000" and "0000000001" (assuming a 40 bit key). If you recursively generate keys, then you can generate successive keys like this: 1. Copy the keystate from iteration n-1 (keep the partial keystates on a stack). 2. Do the swap for this portion of the key, and for 255 out of 256 keys, you will have a new one in 2 swaps. (In reality, it would be faster to undo the last swap rather than copying the key, and keeping the swaps on a stack rather than the keystate on a stack. These are implementation issues I haven't given a huge amount of thought to as yet.) Unless there is some hidden complexity which I have overlooked - in which case I will be delighted to stand corrected - this will produce a key fast enough to allow an average workstation to search the 40-bit keyspace using a known plaintext attack in a couple of hours or less. If this is the case, 40-bit RC4 might as well be crypt(1), and 48-bit RC4 looks pretty shakey too. I was planning to code this over the xmas break, dependent on whatever other commitments fall on me during that period. I realised it was possible a couple of months ago after pondering ways of parallelising the RC4 key generation process in hardware. Ian.
Ian Farquhar <ianf@sydney.sgi.com> wrote:
No, because as you're doing an exhaustive keysearch, you can "pipeline" the key generation process in software. Each key requires 256 swaps, certainly, but there are only two swaps difference between the key for "0000000000" and "0000000001" (assuming a 40 bit key). If you recursively generate keys, then you can generate successive keys like this:
This doesn't quite work. As I understand it, the RC4 key scheduling algorithm repeats the key to fill 256 bytes. For a 128-bit key, this is 16 times. Thus, you can only win on the last repeat. Perry also mentioned some "optimizations" but I believe RC4 is resistant to this sort of thing. The inner loop is about as simple as you're going to get it. Oh, just to clarify one point. 40-bit RC4 in fact uses a 128 bit key, it's just that 88 bits of the key are sent in the clear. Your idea does help in searching the 128-bit keyspace. Unfortunately, it reduces the time needed from about 10^45 to 10^43 operations. Mazel Tov. Raph
From: "Ian Farquhar" <ianf@sydney.sgi.com> No, because as you're doing an exhaustive keysearch, you can "pipeline" the key generation process in software. Each key requires 256 swaps, certainly, but there are only two swaps difference between the key for "0000000000" and "0000000001" (assuming a 40 bit key). Not by my count. The key data length for a forty bit key is only 5. That means that each byte of the key data is used about fifty times in key setup (256/5). Those initial changes in the internal key permutation table then propagate under iteration. Now I haven't looked very closely at how to optimize this search, and it's not even clear that it's possible. There are 256! possible permutations for the internal key, which is a lot more than 2^40 possible (external) keys. It's quite possible that the internal keys are just not particularly close to each other. Close here, say, is the minimum number of swaps needed to take one key to another. It's possible that some arrangement other than incrementing the key yields internal key correlations that speed up software internal key generation. Eric
participants (6)
-
eric@remailer.net -
Hal -
Ian Farquhar -
Kipp E.B. Hickman -
Perry E. Metzger -
raph@netcom.com