parallel encryption

Ok, so I've got my BeBox and so finally have an SMP of my own again; anyone want to suggest any cool crypto stuff that parallelises well? Rogaways hashs look interesting, and nDES offers an obvious process network for pipelining, but what about things like running multiple interleaved CBC streams in parallel, with each stream starting off from a different IV? I can't think of any practical ways of speeding up a single RSA operation, although twice as many processors obviously gives twice the thruput. Simon ------ "GM spent $3.6 billion giving birth to the Saturn, and it doesn't even go supersonic." - Ben Rich, Skunk Works head (75-91)

ses@tipper.oit.unc.edu (Simon Spero) wrote:
Ok, so I've got my BeBox and so finally have an SMP of my own again; anyone want to suggest any cool crypto stuff that parallelises well? Rogaways hashs look interesting, and nDES offers an obvious process network for pipelining, but what about things like running multiple interleaved CBC streams in parallel, with each stream starting off from a different IV? I can't think of any practical ways of speeding up a single RSA operation, although twice as many processors obviously gives twice the thruput.
One way to speed up RSA is to compute the series m^2, m^4, m^8, m^16... on one processor and then multiply together the values for each one bit in the decryption exponent on the other processor. It's only about a 33% speedup tho. The other possibility is to compute the two 'halves' on seperate processors when doing decryption. I don't know of any way to parallelize it to more than two processors for encrypting or more than four for decrypting. Discrete log systems are a bit more interesting in this respect - you can precompute the series g^2, g^4, g^8... (I think cryptolib does this) then the initial parameter in a Diffie-Hellman exchange is simply the product of some elements of that series. The multiplications can be carried out in parallel in a hierarchial fashion which can be completed in O(log(log(m))) time, where m is the modulus (assuming you have enough processors). However, for the second half of the exchange, you can't precompute anything so you are stuck with the same problem as with RSA.
participants (2)
-
Matthew Ghio
-
Simon Spero