Timing Cryptanalysis Attack
pck@netcom.com (Paul C. Kocher) writes:
I've just released details of an attack many of you will find interesting, since quite a few existing cryptography products and systems are potentially at risk. The general idea of the attack is that secret keys can be found by measuring the amount of time used to process messages.
I just read this paper, and while it is somewhat interesting, I don't think the walls of cryptography are in any danger of crumbling. People employing systems like PGP are already advised to use them on private machines, with only one user, and untampered-with binaries. Under such circumstances, collecting the statistics necessary to mount a timing attack would be difficult at best, and anyone doing a "black bag" job on the platform would be better advised to use a direct attack like a passphrase-sniffer rather than a complex statistical approach.

On networked systems with many users, where one is advised not to decrypt with or store one's private key, the situation is of course different. But again, another user with the ability to monitor the timing of specific subroutines in one's cryptographic software, or to feed that software enough chosen data to generate a statistical profile of the key, would doubtless have an opportunity to compromise the system in other ways. In the particular case of RSA used to sign messages or transmit session keys, the values being exponentiated are either highly random or strongly hashed, and the opportunity for an opponent to time numerical routines with data of his own choosing is non-existent.

So while this is a very nice piece of work, and certainly of theoretical interest, I don't think it will modify the way in which people are advised to utilize cryptographic software, or cause companies like Netscape or RSADSI to shed any tears. -Bourbaki 137
On Mon, 11 Dec 1995, Anonymous wrote:
pck@netcom.com (Paul C. Kocher) writes: I just read this paper, and while it is somewhat interesting, I don't think the walls of cryptography are in any danger of crumbling. ... So while this is a very nice piece of work, and certainly of theoretical interest, I don't think it will modify the way in which people are advised to utilize cryptographic software, or cause companies like Netscape or RSADSI to shed any tears.
Read the SKIP spec (SKIP is Sun's IP-level encryption protocol). It uses Diffie-Hellman certificates. That means fixed secret DH keys being used in routers. It is hard to think of a better target for this type of attack. I have not done a complete read of the SKIP specification (only a quick scan) so I could be wrong about SKIP, but DH certificates sound like a very, very bad idea.

The other source for attack would be any networked service that is on a local network. Single-user machines are far better targets than multi-user systems. That Web server sitting idle not doing much: repeatedly hit it with https requests, and if you are on a local network, you should be able to get very good timing information.

I for one will probably add a flag for conditional compilation of my bignumber library so that it will take constant time. This may be a 10% slowdown (using small-windows exponentiation), which is trivial compared to the 30% speedup I will probably get when I implement a faster mod function :-). eric -- Eric Young | Signature removed since it was generating AARNet: eay@mincom.oz.au | more followups than the message contents :-)
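[Editor's note: a minimal sketch of the kind of constant-operation-count exponentiation Eric alludes to, not his actual library code. The Montgomery-ladder structure below performs one squaring and one multiplication per exponent bit whether the bit is 0 or 1, so the number of bignum operations no longer depends on the key:]

```python
def modexp_ladder(base, exp, mod, bits):
    """Montgomery-ladder style modular exponentiation.

    Performs the same sequence of operations (one multiply, one
    square) for every one of the `bits` exponent bits, regardless
    of whether each bit is 0 or 1.
    """
    r0, r1 = 1, base % mod
    for i in reversed(range(bits)):
        if (exp >> i) & 1:
            # bit set: r0 <- r0*r1, r1 <- r1^2
            r0, r1 = (r0 * r1) % mod, (r1 * r1) % mod
        else:
            # bit clear: r1 <- r0*r1, r0 <- r0^2
            r0, r1 = (r0 * r0) % mod, (r0 * r1) % mod
    return r0
```

Note that this only equalizes the *count* of operations; as Tom Weinstein points out below, data-dependent branch and multiplier timing can still leak.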
Eric Young writes:
Read the SKIP spec (SKIP is Sun's IP-level encryption protocol). It uses Diffie-Hellman certificates.
Photuris, which likely will be the standard way to do this sort of thing on top of IPsec, also suffers from the problem, but I suspect the next version of the draft (number 9) will have it fixed. More interesting is the fact that a number of NSA vetted protocols seem to have the flaw. Obviously, they either didn't know or didn't say anything about it to the folks designing such stuff... Perry
Eric Young wrote:
I for one will probably add a flag for conditional compilation of my bignumber library so that it will take constant time. This may be a 10% slowdown (using small-windows exponentiation), which is trivial compared to the 30% speedup I will probably get when I implement a faster mod function :-).
Careful. Even if you can make the number of executed instructions the same, you still have to worry about timing differences due to branches and the way the hardware multiplier handles different operands. -- Sure we spend a lot of money, but that doesn't mean | Tom Weinstein we *do* anything. -- Washington DC motto | tomw@netscape.com
On Mon, 11 Dec 1995, Tom Weinstein wrote:
Careful. Even if you can make the number of executed instructions the same, you still have to worry about timing differences due to branches and the way the hardware multiplier handles different operands.
Granted. For my particular library, there are no major 'if statements', I believe (I'll check), after you get out of the mod_exp function and into the mod and mul sub-parts. As for the multiplier, I just had a look at my old 386 book and yup, it does take an argument-dependent time... I've been around pipelined RISC CPUs too long... eric
I for one will probably add a flag for conditional compilation of my bignumber library so that it will take constant time. This may be a 10% slowdown (using small-windows exponentiation), which is trivial compared to the 30% speedup I will probably get when I implement a faster mod function :-).
Careful. Even if you can make the number of executed instructions the same, you still have to worry about timing differences due to branches and the way the hardware multiplier handles different operands.
No, he's saying to equalize wall-clock time---just pad out beyond the largest possible execution time with a timer. Surely with a sufficient pad the timing-channel leak can be made negligible (though the author seems to claim otherwise---I should read the explanation!). Peter Monta
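[Editor's note: a toy sketch of the padding idea Peter describes, under the assumption that a fixed ceiling above the worst-case running time is known. The helper name and ceiling value are illustrative, not from any real library:]

```python
import time

CEILING = 0.1  # assumed upper bound on the worst-case running time, seconds

def padded(op, *args):
    """Run op(*args), then sleep so the call always appears to take
    CEILING seconds of wall-clock time to an outside observer."""
    start = time.monotonic()
    result = op(*args)
    remaining = CEILING - (time.monotonic() - start)
    if remaining > 0:
        time.sleep(remaining)
    return result
```

As Matt Blaze notes later in the thread, this hides only wall-clock time; a local observer can still watch CPU consumption, which does not accumulate while the process sleeps.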
Tom Weinstein writes:
I for one will probably add a flag for conditional compilation of my bignumber library so that it will take constant time. This may be a 10% slowdown (using small-windows exponentiation), which is trivial compared to the 30% speedup I will probably get when I implement a faster mod function :-).
Careful. Even if you can make the number of executed instructions the same, you still have to worry about timing differences due to branches and the way the hardware multiplier handles different operands.
The trivial way to handle this is simply to check user time with the right system calls and make sure it always comes out the same, with an appropriate number of sleeps. Perry
Perry E. Metzger wrote:
The trivial way to handle this is simply to check user time with the right system calls and make sure it always comes out the same, with an appropriate number of sleeps.
The problem with that approach is that if the system is heavily loaded, it can take an arbitrarily large amount of user time. Somewhat better is to sleep for a random amount of time after you're done. That will smear out the time distribution, making it harder to get a statistically meaningful number of samples. It also increases your latency, but doesn't hurt throughput on a busy system. -- Tom Weinstein | tomw@netscape.com
Tom Weinstein writes:
Perry E. Metzger wrote:
The trivial way to handle this is simply to check user time with the right system calls and make sure it always comes out the same, with an appropriate number of sleeps.
The problem with that approach is that if the system is heavily loaded, it can take an arbitrarily large amount of user time.
Totally untrue. The process can take an arbitrary amount of wall clock time, not user time. In the case of the heavily loaded machine, the problem is gone -- the opponent can't precisely predict this. Provided you take the same amount of process time no matter what, you are okay. (To be technical, user time doesn't pass during sleeps, but that doesn't matter -- the problem gets fixed anyway).
Somewhat better is to sleep for a random amount of time after you're done.
I don't think so. First of all, you can still extract some information. If you have been gone as long as the maximum computation plus the maximum random fudge, you know that you had to have conducted the maximum computation. This means that some bits are indeed leaking. Your approach also has the disadvantage that it is hard to produce good random numbers -- you are perhaps familiar with that problem? Perry
Perry E. Metzger wrote:
Tom Weinstein writes:
The problem with that approach is that if the system is heavily loaded, it can take an arbitrarily large amount of user time.
Totally untrue. The process can take an arbitrary amount of wall clock time, not user time.
Whoops. You are absolutely correct. Pardon my brain-damage. I was thinking wall clock time, as you indicated.
Somewhat better is to sleep for a random amount of time after you're done.
I don't think so. First of all, you can still extract some information. If you have been gone as long as the maximum computation plus the maximum random fudge, you know that you had to have conducted the maximum computation. This means that some bits are indeed leaking. Your approach also has the disadvantage that it is hard to produce good random numbers -- you are perhaps familiar with that problem?
Yes, you are correct. It's better than taking a fixed amount of wall-clock time, but definitely not better than a fixed amount of user time. As Paul mentions in his extended abstract, there is actually an easy way to fix the problem without hurting either latency or throughput much. If you blind and unblind around the modular exponentiation, it appears impossible to perform this attack. Because you don't know the inputs to the exponentiation operation, you can't make any predictions based on those inputs. -- Tom Weinstein | tomw@netscape.com
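[Editor's note: a sketch of the blinding countermeasure Tom describes, using textbook RSA with toy parameters (p=61, q=53). A real implementation would use a bignum library and cryptographic randomness; everything below is illustrative:]

```python
import math
import random

p, q = 61, 53
n = p * q          # 3233, public modulus
e = 17             # public exponent
d = 2753           # private exponent: e*d = 1 mod (p-1)(q-1)

def blinded_decrypt(c):
    """Decrypt c with blinding: the private exponentiation sees
    c * r^e mod n, which the attacker cannot predict, so timing
    measurements no longer correlate with the attacker's input."""
    # Pick a blinding factor r coprime to n.
    while True:
        r = random.randrange(2, n)
        if math.gcd(r, n) == 1:
            break
    masked = (c * pow(r, e, n)) % n
    # (c * r^e)^d = c^d * r (mod n), since r^(e*d) = r (mod n).
    m_masked = pow(masked, d, n)
    # Strip the mask by multiplying by r^-1 mod n.
    return (m_masked * pow(r, -1, n)) % n
```

The unblinding step works because r^(e*d) = r (mod n), so the masked result is just the plaintext times r, and multiplying by the modular inverse of r recovers it exactly.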
The trivial way to handle this is simply to check user time with the right system calls and make sure it always comes out the same, with an appropriate number of sleeps.
Of course, this works against a remote adversary, but not against one on the same machine who can look at actual CPU consumption (which doesn't increase when the target is blocked). -matt
Matt Blaze writes:
The trivial way to handle this is simply to check user time with the right system calls and make sure it always comes out the same, with an appropriate number of sleeps.
Of course, this works against a remote adversary, but not against one on the same machine who can look at actual CPU consumption (which doesn't increase when the target is blocked).
True enough, but using busy loops could handle that. However, I must admit to being far more interested in handling the remote case efficiently, especially given concerns people have about using Photuris like systems on heavily pounded servers. Perry
Matt Blaze writes:
Of course, this works against a remote adversary, but not against one on the same machine who can look at actual CPU consumption (which doesn't increase when the target is blocked).
Maybe this is a good reason to spinwait, rather than sleep, until the timer expires. It would be pretty subtle to distinguish that from "real" computation. Peter Monta
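[Editor's note: Peter's spin-wait idea in sketch form, with an illustrative helper name. Unlike sleep(), a busy loop consumes CPU time, so user time as well as wall-clock time comes out roughly constant:]

```python
import time

def spin_until(deadline):
    """Burn CPU doing throwaway arithmetic until the monotonic
    clock passes `deadline`, rather than sleeping."""
    x = 1
    while time.monotonic() < deadline:
        x = (x * 3) % 65537  # meaningless work to keep the CPU busy
    return x
```

As the follow-up below observes, this is still distinguishable on some hardware: performance counters such as branch-prediction statistics can separate a tight spin loop from real bignum arithmetic.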
In message <199512120056.QAA16055@mage.qualcomm.com>, Peter Monta writes:
Of course, this works against a remote adversary, but not against one on the same machine who can look at actual CPU consumption (which doesn't increase when the target is blocked).
Maybe this is a good reason to spinwait, rather than sleep, until the timer expires. It would be pretty subtle to distinguish that from "real" computation.
Across a net it should be hard. On the same CPU it may be easy. Some CPUs with hardware branch prediction keep track of how many branches were correctly and incorrectly predicted. These registers are not always protected, and not always "made virtual" by the OS. If your spin wait is of the form:

    LOAD #big_number, R1
L1: DEC  R1
    BNE  L1

(a.k.a. "for(i = big_number; i--;) { }"), then the "number of correctly predicted branches" will go up by approximately big_number... (In all honesty the only CPU I am sure "allows" normal user programs to see the performance registers is the AMD29xxx series, and that is only if the OS sets the right bit in the register protection mask. I know the P6 has such performance registers, but don't know if they are protected, and I think the P5 has them, but again I don't know if they are protected. I think some of the Alphas have them, but I seem to remember them being protected (and I used to think it was a dumb idea...))
Hey, don't go for constant time, that's too hard to get perfect. Add a *random* delay. This particular crypto-flaw is pretty easy to fix. (See, I'm not *always* arguing the downside of cryptography!) It is worth noting, however, the extent to which "secure" cryptographic protocols keep needing to get fixed one last time.... -- Nathaniel -------- Nathaniel Borenstein <nsb@fv.com> | (Tense Hot Alien In Barn) Chief Scientist, First Virtual Holdings | VIRTUAL YELLOW RIBBON: FAQ & PGP key: nsb+faq@nsb.fv.com | http://www.netresponse.com/zldf
Nathaniel Borenstein <nsb@nsb.fv.com> writes: Hey, don't go for constant time, that's too hard to get perfect. Add a *random* delay. This particular crypto-flaw is pretty easy to fix. (See, I'm not *always* arguing the downside of cryptography!)
Random delay may be harder to get perfect than constant time. Note that the actual time for the transaction is the minimum of all the transaction times you measure, since you can't add a negative delay to them. It's presumably even easier if the random distribution is known. Adding a random delay means more transactions are required to find each new bit, but information is still leaking.
It is worth noting, however, the extent to which "secure" cryptographic protocols keep needing to get fixed one last time.... -- Nathaniel
Amen... Jim Gillogly Trewesday, 21 Foreyule S.R. 1995, 19:16
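[Editor's note: a toy simulation of Jim's point that a nonnegative random delay cannot hide the underlying time: the minimum over many observations converges to the true, undelayed running time. The function name and parameters are illustrative:]

```python
import random

def min_observed(true_time, trials=2000):
    """Each observation is the true running time plus a uniform
    random delay in [0, 1); return the minimum over many trials.
    Since the delay is never negative, the minimum approaches
    true_time as trials grows."""
    return min(true_time + random.uniform(0.0, 1.0)
               for _ in range(trials))
```

With a few thousand samples the minimum sits just above the true time, so an attacker can still distinguish a key-dependent slow operation from a fast one; the random delay only multiplies the number of transactions needed.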
Jim Gillogly wrote:
| > Nathaniel Borenstein <nsb@nsb.fv.com> writes:
| > Hey, don't go for constant time, that's too hard to get perfect. Add a
| > *random* delay. This particular crypto-flaw is pretty easy to fix.
| > (See, I'm not *always* arguing the downside of cryptography!)
|
| Random delay may be harder to get perfect than constant time. Note that
| the actual time for the transaction is the minimum of all the transaction
| times you measure, since you can't add a negative delay to them. It's
| presumably even easier if the random distribution is known. Adding a
| random delay means more transactions are required to find each new bit,
| but information is still leaking.

Does the delay have to be random, or does the total time for a transaction need to be unrelated to the bits in the secret key? Assume that the time added is pseudo-random (and confidential). Further, for any non-overlapping group of N transactions, the distribution of the times fits some predetermined curve, say a bell curve.

We've added a non-random number, but since those numbers end up being a curve, it would be difficult to determine which transaction got which time added to it. This resembles the 'make them all a constant time' approach, but allows us to send out some in a shorter time than the maximum (although most transactions should probably take longer than the average.)

Adam -- "It is seldom that liberty of any kind is lost all at once." -Hume
Jim Gillogly wrote:
| > Nathaniel Borenstein <nsb@nsb.fv.com> writes:
| > Hey, don't go for constant time, that's too hard to get perfect. Add a
| > *random* delay. This particular crypto-flaw is pretty easy to fix.
| > (See, I'm not *always* arguing the downside of cryptography!)
Does the delay have to be random, or does the total time for a transaction need to be unrelated to the bits in the secret key? Assume that the time added is pseudo-random (and confidential). Further, for any non-overlapping group of N transactions, the distribution of the times fits some predetermined curve, say a bell curve.
Random time won't save you - it just increases the noise, thus reducing the effective bandwidth of the covert channel. To get the time, I only need to do enough repetitions of the same computation to eliminate the effect of the randomness and I have the same resulting information about the key. The only way to completely remove covert channels is by making the measurable time completely independent of the actual time. One way with the RSA might be to do the encryption with the key and the inverse of the key (hence all 0s become 1s and 1s become 0s). -> See: Info-Sec Heaven at URL http://all.net/ Management Analytics - 216-686-0090 - PO Box 1480, Hudson, OH 44236
Nope, I'm wrong, as Fred and Simon point out. The noise makes finding the times more difficult by some small factor, nothing more. I'll stop writing these things in the morning. :)

I wrote:
| Does the delay have to be random, or does the total time for a
| transaction need to be unrelated to the bits in the secret key?
| Assume that the time added is pseudo-random (and confidential).
| Further, for any non-overlapping group of N transactions, the
| distribution of the times fits some predetermined curve, say a bell
| curve.
|
| We've added a non-random number, but since those numbers end
| up being a curve, it would be difficult to determine which transaction
| got which time added to it. This resembles the 'make them all a
| constant time', but allows us to send out some in a shorter time than
| the maximum (although most transactions should probably take longer
| than the average.)
Dr. Frederick B. Cohen wrote:
One way with the RSA might be to do the encryption with the key and the inverse of the key (hence all 0s become 1s and 1s become 0s).
Nope, this doesn't work. -- Tom Weinstein | tomw@netscape.com
Anonymous wrote:
So while this is a very nice piece of work, and certainly of theoretical interest, I don't think it will modify the way in which people are advised to utilize cryptographic software, or cause companies like Netscape or RSADSI to shed any tears.
While an exploit of this attack against our software has not been demonstrated, and there is some debate about whether it will even work, we are taking it very seriously. We've been working with Paul to develop a fix, which we will implement even if the attack is never proven effective against our software. --Jeff PS - I think Paul was a bit surprised when Jim Barksdale pulled out his wallet and handed him 10 crisp $100 bills. :-) -- Jeff Weinstein - Electronic Munitions Specialist Netscape Communication Corporation jsw@netscape.com - http://home.netscape.com/people/jsw Any opinions expressed above are mine.
On Mon, 11 Dec 1995, Jeff Weinstein wrote:
While an exploit of this attack against our software has not been demonstrated, and there is some debate about whether it will even work, we are taking it very seriously. We've been working with Paul to develop a fix, which we will implement even if the attack is never proven effective against our software.
My gut & scribble-on-the-back-of-a-napkin feeling about this class of attack is that it could be a problem for smartcards (almost certainly), and possibly for non-routed networks (possibly - napkin was too small :-), but is not going to be viable on internetworks where routers are in use; if a packet enters a queue at any point in its path, then the transit time will be quantised by the time it takes to drain the queue, which is basically controlled by the time it takes to drain previously queued packets; this will destroy any microsecond-level correlations that may have been leaked.

Ron is supposed to be doing a presentation at WWW IV later this week - hopefully he'll give his opinion on this. Definitely a really neat hack, even if it isn't always practical. Simon

p.s. Someone mentioned adding random timings instead of padding out to a constant time. This won't work (adding noise doesn't destroy a signal - just increases the effort needed to isolate it)
Jeff Weinstein wrote:
| PS - I think Paul was a bit surprised when Jim Barksdale pulled
| out his wallet and handed him 10 crisp $100 bills. :-)

Great. Mention it where the IRS is sure to be listening. :) Adam
Adam Shostack wrote:
Jeff Weinstein wrote:
| PS - I think Paul was a bit surprised when Jim Barksdale pulled
| out his wallet and handed him 10 crisp $100 bills. :-)
Great. mention it where the IRS is sure to be listening. :)
I know the spooks hang out here, but I didn't think the IRS did. Maybe the NSA just forwards them all net traffic that includes the words cash, bills, etc. :-) Since Paul mentioned it on his web page, and it was also in a press release, I figured it was OK. --Jeff
On Mon, 11 Dec 1995, Adam Shostack wrote:
Jeff Weinstein wrote:
| PS - I think Paul was a bit surprised when Jim Barksdale pulled
| out his wallet and handed him 10 crisp $100 bills. :-)
Great. Mention it where the IRS is sure to be listening. :)
Why would the IRS listen? Everyone knows the tax system is voluntary.
--- My preferred and soon to be permanent e-mail address: unicorn@schloss.li "In fact, had Bancroft not existed, potestas scientiae in usu est Franklin might have had to invent him." in nihilum nil posse reverti 00B9289C28DC0E55 E16D5378B81E1C96 - Finger for Current Key Information
Anonymous writes:
I just read this paper, and while it is somewhat interesting, I don't think the walls of cryptography are in any danger of crumbling.
People employing systems like PGP are already advised to use them on private machines, with only one user, and untampered-with binaries.
Timings like the ones listed are trivial to take in establishing things like SSL sessions, or Photuris sessions. The danger is to online protocols, not to PGP. Any reason you felt you had to say this anonymously? Perry
participants (14)
- Adam Shostack
- anon-remailer@utopia.hacktic.nl
- Black Unicorn
- Eric Young
- fc@all.net
- Jeff Weinstein
- Jim Gillogly
- Josh M. Osborne
- Matt Blaze
- Nathaniel Borenstein
- Perry E. Metzger
- Peter Monta
- Simon Spero
- Tom Weinstein