Re: Latency vs. Reordering (Was: Remailer ideas (Was: Re: Latency vs. Reordering))
In message <9408051716.AA14773@ah.com> Eric Hughes writes:
    Back to the start, I guess.

        Specifically cryptographic elements are easily added to the system
        * packets can be delayed for random intervals

    Let me repeat:

    REORDERING IS OF PRIMARY IMPORTANCE FOR REMAILER SECURITY.
    ADDING LATENCY IS NOT.
No need to shout, we heard you the first time. ;-)

In a system that is carrying continuous traffic, random packet delay is functionally identical to packet reordering. If messages are fragmented, random delays on sending packets out are functionally identical to reordering.

More importantly, RemailerNet as described defeats traffic analysis by more significant techniques than reordering. Reordering is a weak technique. The introduction of noise, 'MIRV'ing of messages, fragmentation of messages, random choice of packet routes, and encyphering of all traffic are stronger techniques.

-- Jim Dixon
--
+-----------------------------------+--------------------------------------+
| Jim Dixon <jdd@aiki.demon.co.uk>  | Compuserve: 100114,1027              |
| AIKI Parallel Systems Ltd + parallel processing hardware & software design|
| voice +44 272 291 316             | fax +44 272 272 015                  |
+-----------------------------------+--------------------------------------+
    In a system that is carrying continuous traffic, random packet
    delay is functionally identical to packet reordering.

OK. Prove it. Here are some difficulties I expect you'll find along the way.

First, "continuous traffic" is the wrong assumption; some sort of multiple Poisson distribution for arrival times is. This is by no means a hypothetical. The backoff algorithms for TCP had to be developed because packet streams are not continuous, but bursty. There is such a thing as too many packets arriving at a router simultaneously. Routers don't swap packets to disk when they run out of RAM; they drop them. So given any relation between arrival interval, processing time, and machine capacity, there is some _percentage_ of the time that the router is going to overflow, exactly because the traffic is not continuous.

Second, the beginnings and endings of operation are special. The idea of "stochastic deconvolution" hits me immediately, throwing out completely any reasoning based only on steady state assumptions.

Third, these two effects interfere with each other, as there are bursts of silence in Poisson arrival times which will tend to reset the deconvolution.

Fourth, the problem is incompletely specified, since the distribution of random added latencies is not made specific. If I assume a flat distribution over a given number of message intervals, that's not the same as assuming a geometrically decreasing distribution, or some other distribution.

I'd guess there are more.

    If messages are fragmented, random delays on sending packets out
    are functionally identical to reordering.

This is false; a system that concentrates on reordering has provably better average latency than one based only on adding latencies. Consider the following. If I send out a message sometime between two messages, I've achieved no more reordering (the significant thing, remember) than if I sent out that same message immediately after the arrival of the first of the two bracketing messages.

So I can take _any_ latency-adding system and reduce its average latency with minimal effect on reordering by the following modification. When a message comes in, each message in the queue is tagged to go out at some time relative to the present. For each of these messages, I can calculate the probability that no other incoming message will arrive before a particular outgoing time. Pick some probability bound close to 1, and send out all messages with probability greater than the cutoff _now_, before waiting for their time to be up. The decrease in reordering can be normalized to zero by lengthening the time scale of the added latencies. You'll then find that the modified system shows lower latency. (A simulation sketch of this modification appears below.)

And that's only the first inequivalency. Latency-adding systems are less efficient at memory usage than reordering systems. Reordering systems can get pretty close to 100% use, since the queue can be kept full, as in Hal's threshold sending scheme. A random-delay system can't have full usage, because there's a maximum to memory; it can't be borrowed like money when you temporarily need more of it. The analysis has similarities to gambler's ruin.

Anyone else care to point out more inequivalencies?

    More importantly, RemailerNet as described defeats traffic analysis
    by more significant techniques than reordering. Reordering is a
    weak technique.

WHAT??

Anyone else listening to this: I believe the above quoted two sentences to be distilled snake oil.
    The introduction of noise, 'MIRV'ing of messages, fragmentation of
    messages, random choice of packet routes, and encyphering of all
    traffic are stronger techniques.

Encyphering is necessary. Reordering of quanta is necessary. "MIRV" messages may actually decrease security; multiple routes may decrease security; fragmentation may decrease security. Noise messages may not be resource effective.

All the above claims require some justification, and I have seen nothing robust yet.

Eric
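Eric's early-release modification is concrete enough to simulate. The sketch below is a minimal illustration, not anything from the thread: it assumes Poisson arrivals (Eric's stated model) and a flat distribution of added latencies, and the rate, cutoff, and delay bound are made-up parameters. The time-scale renormalization step Eric mentions is omitted for brevity.

import math
import random

LAMBDA = 1.0     # assumed Poisson arrival rate (messages per unit time)
CUTOFF = 0.95    # Eric's "probability bound close to 1"
MAX_DELAY = 5.0  # flat added-latency distribution over [0, MAX_DELAY]

def simulate(n_messages=100000, seed=1):
    """Compare mean latency of a plain random-delay queue against the
    early-release modification, under the same Poisson arrival stream."""
    rng = random.Random(seed)
    now = 0.0
    queue = []                    # (scheduled_departure, arrival_time)
    plain, modified = [], []
    for _ in range(n_messages):
        now += rng.expovariate(LAMBDA)       # next Poisson arrival
        remaining = []
        for dep, arr in queue:
            if dep <= now:
                modified.append(dep - arr)   # departed on schedule
            elif math.exp(-LAMBDA * (dep - now)) > CUTOFF:
                # Probability that no further message arrives before dep
                # exceeds the cutoff: holding the message longer cannot
                # add reordering, so release it now.
                modified.append(now - arr)
            else:
                remaining.append((dep, arr))
        queue = remaining
        delay = rng.uniform(0.0, MAX_DELAY)  # flat added latency
        plain.append(delay)                  # plain system keeps the full delay
        queue.append((now + delay, now))
    modified.extend(dep - arr for dep, arr in queue)  # flush the tail
    return sum(plain) / len(plain), sum(modified) / len(modified)

if __name__ == "__main__":
    p, m = simulate()
    print(f"plain mean latency:    {p:.3f}")  # ~2.5 for this flat distribution
    print(f"modified mean latency: {m:.3f}")  # strictly lower

The quantity tested against the cutoff uses the Poisson property that the chance of no arrival in the next interval of length t is e^(-LAMBDA*t).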
I had an interesting thought. Remailer networks are hard to analyze, with messages whizzing this way and that. But Tim pointed out that if you have N messages coming in to the network as a whole and N going out, all that zigging and zagging really can't do much better than N-fold confusion.

This suggests that IF YOU COULD TRUST IT, a single remailer would be just as good as a whole net. Imagine that God offers to run a remailer. It batches messages up and every few hours it shuffles all the outstanding messages and sends them out. It seems to me that this remailer provides all the security that a whole network of remailers would.

If this idea seems valid, it suggests that the real worth of a network of remailers is to try to assure that there are at least some honest ones in your path. It's not to add security in terms of message mixing; a single remailer seems to really provide all that you need.

Hal
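Hal's hypothetical trusted remailer is easy to state as code. A minimal sketch, assuming a uniform shuffle of the whole pool per flush; deliver is a hypothetical stand-in for actual transport:

import random

class BatchMixer:
    """Minimal sketch of Hal's trusted single remailer: pool messages,
    then shuffle and flush the whole batch at once, so an observer sees
    N in and N out with no usable ordering information."""

    def __init__(self, rng=None):
        self.pool = []
        self.rng = rng or random.Random()

    def accept(self, message):
        self.pool.append(message)

    def flush(self):
        # The uniform shuffle severs arrival order from departure order:
        # every input-to-output matching is equally likely.
        batch, self.pool = self.pool, []
        self.rng.shuffle(batch)
        for msg in batch:
            deliver(msg)

def deliver(message):
    # Hypothetical stand-in for actual delivery/transport.
    print("out:", message)

mixer = BatchMixer(random.Random(42))
for m in ["a", "b", "c", "d"]:
    mixer.accept(m)
mixer.flush()  # run "every few hours", per Hal's description

With N messages in the pool, any input maps to any output with probability 1/N, which is exactly the N-fold confusion bound Tim mentions.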
    This suggests that IF YOU COULD TRUST IT, a single remailer would
    be just as good as a whole net.

If you could trust it and if it were large enough. There are scaling reasons to use multiple remailers as well.

Consider a network of mailers running on a private network with link encryptors. Whenever you join two nodes with a full-time link encryptor you remove the information about message arrival and departure, which is to say that you remove all the remaining information not already removed by encryption and reordering. In other words, two remailers (physical) hooked up with link encryptors are almost the _same_ remailer for purposes of traffic analysis, and almost only because of the link latency and relative bandwidth.

Likewise, multiple remailers hooked up with link encryptors all collapse to the same node for traffic analysis. Open links between two remailers which are connected otherwise by a path of encrypted links turn into an edge from the collapsed remailer set back onto itself.

Simulating any of the salient features of a link encryptor over the Internet is an interesting exercise, particularly in regard to price negotiation with your service provider.

Eric
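Eric's collapse argument can be phrased as graph contraction: merge every pair of nodes joined by an encrypted link, then re-express the open links between the resulting super-nodes. A minimal sketch, assuming a simple edge list labeled encrypted/open (the node names are illustrative):

def collapse(nodes, edges):
    """Union-find over encrypted links: remailers joined by full-time
    link encryptors merge into one super-node for traffic analysis;
    open links become edges between (or within) the super-nodes.
    Edges are (a, b, encrypted: bool)."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b, encrypted in edges:
        if encrypted:
            parent[find(a)] = find(b)      # merge the two endpoints

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    # Open links inside one super-node become self-edges: the collapsed
    # remailer set looping back onto itself, as Eric describes.
    open_links = [(find(a), find(b)) for a, b, enc in edges if not enc]
    return groups, open_links

nodes = ["r1", "r2", "r3", "r4"]
edges = [("r1", "r2", True),   # link encryptor
         ("r2", "r3", True),   # link encryptor
         ("r3", "r4", False),  # open link between super-nodes
         ("r1", "r3", False)]  # open link within one super-node
groups, open_links = collapse(nodes, edges)
print(groups)      # {'r3': ['r1', 'r2', 'r3'], 'r4': ['r4']}
print(open_links)  # [('r3', 'r4'), ('r3', 'r3')] -- note the self-edge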
I've left the subject line unchanged, to show an unusual _triple nesting_ of subjects! Also, I just got back after a weekend away, and so am only now seeing these interesting messages about remailers, entropy, etc. A subject of great interest. Hal Finney writes:
    I had an interesting thought. Remailer networks are hard to analyze,
    with messages whizzing this way and that. But Tim pointed out that
    if you have N messages coming in to the network as a whole and N
    going out, all that zigging and zagging really can't do much better
    than N-fold confusion.
Yes, in _principle_, the theory is that Alice could be the only remailer in the universe, and still the "decorrelation" of incoming and outgoing messages would be good. For example, 100 messages go in, 100 leave, and no one can do better than a 1 in 100 chance of matching any single input to any output. From a _legal_ point of view, a wild guess, hence inadmissible, blah blah. (From a RICO point of view, to change subjects, Alice might get her ass sued. Or a subpoena of her logs, etc. All the stuff we speculate about.)

But we can go further: a single remailer node, or mix, that takes in 1 input and produces 2 outputs breaks the correlation capability as well. However, we all "know" that a single remailer doing this operation is in some very basic way less "secure" (less diffusing and confusing, less entropic) than a network of 100 remailers each taking in hundreds of messages and outputting them to other remailers.

Why--or if--this hunch is valid needs much more thinking. And the issues need to be carefully separated: multiple jurisdictions, confidence/reputation with each remailer, etc. (These don't go to the basic mathematical point raised above, but are nonetheless part of why we think N remailers are better than 1.)

By the way, there's a "trick" that may help to get more remailers established. Suppose by some nefarious means a message is traced back to one's own system, and the authorities are about to lower the boom. Point out to them that you are yourself a remailer! This is more than just a legalistic trick. Indeed, as a legalistic trick it may not even work very well. Nonetheless, it helps to break the notion that every message can be traced back to some point of origin. By making all sites, or many sites, into remailers, this helps make the point that a message can never be claimed to have been traced back "all the way."

There are lots of interesting issues here, and I see some vague similarities to the ideas about "first class objects"...in some sense, we want all nodes to be first class objects, capable of being remailers.

(There's an even more potentially interesting parallel to digital banks: admit the possibility of everybody being a digital bank. No artificial distinction between "banks" and "customers." Helps scaling. And helps legally. I'm not saying we'll see this anytime soon, especially since we have no examples of digital banks, period. But a good vision, I think.)
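One way to make the "less entropic" hunch concrete, assuming a uniform batch mix: measure the attacker's uncertainty in bits. A batch of N gives log2(N) bits per message, so the 100-message batch above yields about 6.6 bits, while a single 1-in/2-out split yields only 1 bit:

import math

def anonymity_bits(batch_size):
    """Attacker's uncertainty, in bits, when matching one input of a
    uniform batch mix to its output: log2(N) for a batch of N."""
    return math.log2(batch_size)

print(anonymity_bits(100))  # ~6.64 bits: the 100-in/100-out batch
print(anonymity_bits(2))    # 1.0 bit: a lone 1-in/2-out mix "breaks
                            # correlation", but only barely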
    This suggests that IF YOU COULD TRUST IT, a single remailer would
    be just as good as a whole net. Imagine that God offers to run a
    remailer. It batches messages up and every few hours it shuffles
    all the outstanding messages and sends them out. It seems to me
    that this remailer provides all the security that a whole network
    of remailers would.

    If this idea seems valid, it suggests that the real worth of a
    network of remailers is to try to assure that there are at least
    some honest ones in your path. It's not to add security in terms of
    message mixing; a single remailer seems to really provide all that
    you need.
Yes, which is why increasing N increases the chance that at least one non-colluding remailer is being used. A trick I have long favored--and one I actually used when we played the manual "Remailer Game" at our first meeting--is to *USE ONE'S SELF* as a remailer. This still admits the possibility of others being colluders, but at least you trust yourself and get the benefits described above. (A small sketch of this arithmetic appears below.)

[The alert reader will note that a spoofing attack is possible, as with DC-Nets, in which all traffic into your node is controlled in various ways. The graph partition work Chaum does, and others who followed him do (Pfaltzmann, Boz, etc.), is very important here.]

Practically speaking, we need to see hundreds of remailers, in multiple legal jurisdictions, with various policies. Messages routed through many of these remailers, including one's own remailer, should have very high entropies. I still say that a formal analysis of this would make a nice project for someone.

--Tim May

--
..........................................................................
Timothy C. May         | Crypto Anarchy: encryption, digital money,
tcmay@netcom.com       | anonymous networks, digital pseudonyms, zero
408-688-5409           | knowledge, reputations, information markets,
W.A.S.T.E.: Aptos, CA  | black markets, collapse of governments.
Higher Power: 2^859433 | Public Key: PGP and MailSafe available.
"National borders are just speed bumps on the information superhighway."
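As a tiny start on the formal analysis Tim asks for: assume each remailer in a chain colludes independently with probability c. Full tracing by collusion requires every hop to collude, which happens with probability c^k for k hops, and routing through your own node forces it to zero (spoofing attacks aside). The function and numbers below are illustrative assumptions, not figures from the thread:

def p_fully_colluding(c, hops, include_self=False):
    """Chance that every remailer in a k-hop chain colludes, assuming
    each colludes independently with probability c. One honest hop
    (e.g. your own node) breaks the trace."""
    return 0.0 if include_self else c ** hops

for k in (1, 3, 5, 10):
    print(f"{k:2d} hops: {p_fully_colluding(0.5, k):.6f}")
# 1 hop: 0.5; 10 hops: ~0.001 -- increasing N makes it rapidly less
# likely that no non-colluding remailer sits in the path.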