----- Original Message ----- From: "Anonymous" <nobody@paranoici.org> To: <cypherpunks@lne.com> Sent: Wednesday, August 08, 2001 4:48 PM Subject: CDR: Re: Remailer Phases
An Unknown Party wrote:
On Wed, 8 Aug 2001, Anonymous wrote:
We need a good mixmaster net.
working remailer: 1. Average latency less than 5 min
Bad. See the papers done on threats of traffic analysis/spam attacks against remailers.
"Average latency" exists. What do you think it should be?
a) 5 minutes b) 5 hours c) 5 days d) 5 months e) longer
I like a).
As has been pointed out, it's not latency alone but latency relative to message volume that matters. If there are 2 messages a day going through the system, then 5 minutes is simply not enough; traffic will be completely traceable. On the other hand, if there are 5000/sec going through the system, then 5 minutes is arguably overkill. I think that with the current relatively low level of usage, 24 hours is the minimum average latency that should be used. Of course this is across the entire Mixmaster net, where messages are dispersed and can enter at any location and leave at any location.

Based on this I believe that each node should maintain a list l of known other nodes. It should, of course, select a delay time at random, say up to time t. Assuming that the server chooses the next hop uniformly at random from itself (in which case the message exits immediately on timer expiration) and l, this gives an equation for t in hours: f(t) = necessaryDelay, where f(t) = t + ((|l|-1)/|l|)f(t); solving for t gives the necessary average t. I don't have the time to solve this right now, but given a list l of magnitude 100 the value of t will be significantly greater than 5 minutes.

So the remaining question is what value to use for necessaryDelay. This is also dependent on the number of known nodes. All nodes must be equally likely to be chosen for the transfer, for obvious reasons. Based on this I believe that necessaryDelay needs to be greater than the time needed to receive |l| messages. The reason is fairly simple: at the extreme we have only one message going through the system at a time, which is obviously bad; an observer simply watches the output of the system, and whatever comes out is what they are looking for. With at least |l| messages going through a node during the necessary delay time (note that as the magnitude of l increases the entire system slows, which could be bad; I'm sure I'm missing something that will dominate on scaling), each message can be mistaken for the others. Since it is reasonable to expect that remailer usage will increase at least as fast as the size of l, the latency will likely decrease over time. If there is sufficient demand it is entirely reasonable to reduce from |l| down to a value of at least 2, but I don't believe this is reasonable at 100 or even 1000 remailers. If remailer usage increases to the point where more than 20% of email traffic goes through remailers, it may become feasible to lower this limit, but that is probably unnecessary, because that kind of scaling would lower the delays anyway once the values are recomputed.

What is surprising is that this can be calculated automatically in a rather interesting way. If each node still maintains l, it is entirely possible for a remailer to keep a message pool of size |l|; when a new message arrives and the pool is full, it randomly selects 1 entry to be flushed toward its destination _prior_ to the insertion of the new message, with an autoflush happening every sqrt(|l|) hours (perhaps by inserting a null message). This would cause a ripple effect each time a message was sent, which could be seen as a problem by the uninitiated, because there would be a decided pattern of travel: each message entering the system causes activity along a random walk.
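A rough sketch of that pool behaviour, in Python, just to make the mechanism concrete. This is purely illustrative, not existing Mixmaster code, and every name in it (Pool, receive, autoflush and so on) is invented for the example:

import math
import random

class Pool:
    """Toy model of the per-node message pool described above."""

    def __init__(self, known_nodes):
        self.nodes = list(known_nodes)      # the list l of known remailers
        self.size = len(self.nodes)         # pool size = |l|
        self.held = []                      # messages currently pooled
        self.autoflush_hours = math.sqrt(self.size)  # autoflush interval

    def receive(self, msg):
        # New arrival: if the pool is full, flush one randomly chosen
        # entry toward its destination *before* inserting the new message.
        if len(self.held) >= self.size:
            self._flush_one()
        self.held.append(msg)

    def autoflush(self):
        # Run every sqrt(|l|) hours; if the pool happens to be empty,
        # push a null (dummy) message instead so there is still traffic.
        if self.held:
            self._flush_one()
        else:
            self._send("null-message")

    def _flush_one(self):
        victim = self.held.pop(random.randrange(len(self.held)))
        self._send(victim)

    def _send(self, msg):
        # Stand-in for the real forwarding step: next hop chosen
        # uniformly from the known nodes, or the exit.
        next_hop = random.choice(self.nodes + ["exit"])
        print("forwarding %r to %s" % (msg, next_hop))

pool = Pool(["node%d" % i for i in range(100)])
pool.receive("incoming message")
pool.autoflush()

The point is only that the flush-before-insert rule plus the sqrt(|l|)-hour autoflush keeps roughly |l| messages pooled at each node without any explicit per-message timer.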
To an amateur this ripple would appear to be a flaw in the system, except that the message being sent by the ith node is not the message sent by the (i-1)th node, so the risk is non-existent. And since the average path length is going to be k = 2((|l|-1)/|l|), and the random walk is going to choose from |l|^k paths, which we can approximate by |l|^2, this offers a sufficient growth rate to block tracing. If this growth rate is unacceptable we can also add a minimumHops value to the protocol, increasing the number of paths to |l|^minimumHops + |l|^2. minimumHops should be chosen to be a suitable number; based on current assumptions I would recommend minimumHops = log_|l|(2^128), making the |l|^2 only a footnote, as the total would be greater than 2^128, giving an enormous difficulty in even selecting a duplicate path. Mitigating factors are present, however: because each message can only exist in one of |l| locations, the maximum difficulty in guessing is still bounded in that fashion, leaving reasonable values for minimumHops at around 10 for a 100-node network.

Joe
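P.S. A quick arithmetic check of the path counts above, again purely illustrative; the 100-node size and the 2^128 target are just the values assumed in the text:

import math

l = 100                              # |l|, number of known nodes
k = 2 * (l - 1) / l                  # average path length, just under 2
paths = l ** k                       # roughly |l|^2 possible walks
min_hops = math.log(2 ** 128, l)     # hops needed for |l|^hops > 2^128

print("average path length k:", round(k, 2))
print("approximate paths |l|^k:", int(paths))
print("log_|l|(2^128):", round(min_hops, 1), "hops")

The formula works out to roughly 19 hops for a 100-node net; as noted above, the fact that a message can only be in one of |l| places bounds the guessing difficulty anyway, which is why something nearer 10 hops still seems reasonable in practice.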