Travis H. wrote:
> Part of the problem is using a packet-switched network; if we had circuit-based, then thwarting traffic analysis is easy; you just fill the link with random garbage when not transmitting packets. ....

OK so far ...

> There are two problems with this; one, getting enough random data, and two, distinguishing the padding from the real data in a computationally efficient manner on the remote side without giving away anything to someone analyzing your traffic. I guess both problems could be solved by using synchronized PRNGs on both ends to generate the chaff.
This is a poor statement of the problem(s), followed by a "solution" that is neither necessary nor sufficient.

1) Let's assume we are encrypting the messages. If not, the adversary can read the messages without bothering with traffic analysis, so the whole discussion of traffic analysis is moot.

2) Let's assume enough randomness is available to permit encryption of the traffic ... in particular, enough randomness is available _steady-state_ (without stockpiling) to meet even the _peak_ demand. This is readily achievable with available technology.

3) As a consequence of (1) and (2), we can perfectly well use _nonrandom_ chaff. If the encryption (item 1) is working, the adversary cannot tell constants from anything else. If we use chaff so that the steady-state traffic is indistinguishable from the peak traffic, then (item 2) we have enough randomness available; TA-thwarting doesn't require anything more.

4) Let's consider -- temporarily -- the scenario where the encryption is being done using IPsec. This will serve to establish terminology and expose some problems heretofore not mentioned.

  4a) IPsec tunnel mode has "inner headers" that are more than sufficient to distinguish chaff from other traffic. (Addressing the chaff to UDP port 9 will do nicely.)

  4b) What is not so good is that IPsec is notorious for "leaking" information about packet-length. Trying to make chaff with a distribution of packet sizes indistinguishable from your regular traffic is rarely feasible, so we must consider other scenarios, somewhat like IPsec but with improved TA-resistance.

5) Recall that IPsec tunnel mode can be approximately described as IPIP encapsulation carried by IPsec transport mode.
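To make point 4a concrete, here is a minimal Python sketch of how a receiver might recognize chaff after decryption, using nothing but the inner header. It assumes the inner packet is plain IPv4 carrying UDP and that chaff is addressed to port 9 (the "discard" port). The helper names `build_inner_udp` and `is_chaff`, the addresses, and the source port are hypothetical, and the checksums are left at zero because this is an illustration, not a working stack.

```python
import struct

DISCARD_PORT = 9    # UDP "discard" port, as suggested in 4a
IPPROTO_UDP = 17    # IPv4 protocol number for UDP


def build_inner_udp(dst_port: int, payload: bytes) -> bytes:
    """Build a bare-bones IPv4+UDP inner packet (checksums not computed)."""
    udp_len = 8 + len(payload)
    total_len = 20 + udp_len
    ip_header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, total_len,          # version/IHL, TOS, total length
        0, 0,                        # identification, flags/fragment offset
        64, IPPROTO_UDP, 0,          # TTL, protocol, header checksum (left 0)
        bytes([10, 0, 0, 1]),        # hypothetical source address
        bytes([10, 0, 0, 2]),        # hypothetical destination address
    )
    udp_header = struct.pack("!HHHH", 40000, dst_port, udp_len, 0)
    return ip_header + udp_header + payload


def is_chaff(inner: bytes) -> bool:
    """Return True if the decrypted inner packet is UDP addressed to port 9."""
    if len(inner) < 20:
        return False
    ihl = (inner[0] & 0x0F) * 4      # inner IPv4 header length in bytes
    if inner[9] != IPPROTO_UDP or len(inner) < ihl + 8:
        return False
    (dst_port,) = struct.unpack_from("!H", inner, ihl + 2)
    return dst_port == DISCARD_PORT


chaff = build_inner_udp(DISCARD_PORT, bytes(32))   # constant (nonrandom) filler
real = build_inner_udp(53, b"some real query")
print(is_chaff(chaff), is_chaff(real))
```

Note that the chaff payload is all zeros, in line with point 3: if the encryption is doing its job, constant filler is as good as random filler. The check happens only after decryption, so nothing about it is visible on the wire.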
If we abstract away the details, we are left with a packet (called an "envelope") that looks like

  ---------------++++++++++++++++++++++++++
  | outer header | inner header | payload |              [1]
  ---------------++++++++++++++++++++++++++

where the inner header and payload (together called the "contents" of the envelope) are encrypted. (The "+" signs are meant to be opaque to prying eyes.) The same picture can be used to describe not just IPsec tunnel mode (i.e. IPIP over IPsec transport) but also GRE over IPsec transport, and even PPPoE over IPsec transport.

Note: All the following statements apply *after* any necessary fragmentation has taken place.

The problem is that the size of the envelope (as described by the length field in the outer header) is conventionally chosen to be /just/ big enough to hold the contents. This problem is quite fixable ... we just need constant-sized envelopes! The resulting picture is:

  ---------------++++++++++++++++++++++++++++++++++++
  | outer header | inner header | payload | padding |    [2]
  ---------------++++++++++++++++++++++++++++++++++++

where padding is conceptually different from chaff: chaff means packets inserted where there would have been no packet, while padding adjusts the length of a packet that would have been sent anyway.

The padding is not considered part of the contents. The decoding is unambiguous, because the size of the contents is specified by the length field in the inner header, which is unaffected by the padding.

This is a really, really tiny hack on top of existing protocols.

If your plaintext consists primarily of small packets, you should set the MTU of the transporter to be small. This will cause fragmentation of the large packets, which is the price you have to pay. Conversely, if your plaintext consists primarily of large packets, you should make the MTU large. This means that a lot of bandwidth will be wasted on padding if/when there are small packets (e.g. keystrokes, TCP acks, and voice cells) but that's the price you have to pay to thwart traffic analysis. (Sometimes you can have two virtual circuits, one for big packets and one for small packets. This degrades the max performance in both cases, but raises the minimum performance in both cases.)

Remark: FWIW, the MTU (max transmission unit) should just be called the TU in this case, because all transmissions have the same size now!
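The constant-size envelope of picture [2] can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inner header is taken to be a 20-byte IPv4 header whose total-length field (bytes 2-3) gives the size of the contents, the value `TU = 576` is arbitrary, the names `make_inner`, `pad_contents` and `strip_padding` are hypothetical, and the encryption step that would wrap the padded contents is omitted.

```python
import struct

TU = 576             # hypothetical fixed size of the (padded) contents, in bytes
IP_HEADER_LEN = 20   # inner IPv4 header without options


def make_inner(payload: bytes) -> bytes:
    """Build inner header + payload; the header records the true length."""
    total_len = IP_HEADER_LEN + len(payload)
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, total_len, 0, 0, 64, 17, 0,
        bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),   # hypothetical addresses
    )
    return header + payload


def pad_contents(contents: bytes, tu: int = TU) -> bytes:
    """Pad the contents with zeros up to exactly tu bytes (before encryption)."""
    if len(contents) > tu:
        raise ValueError("contents exceed TU; fragment before padding")
    return contents + bytes(tu - len(contents))


def strip_padding(padded: bytes) -> bytes:
    """Recover the contents using the length field in the inner header."""
    (total_len,) = struct.unpack_from("!H", padded, 2)
    if not IP_HEADER_LEN <= total_len <= len(padded):
        raise ValueError("inner length field is inconsistent")
    return padded[:total_len]


keystroke = make_inner(b"x")          # tiny packet
bulk = make_inner(bytes(500))         # large packet
sealed = [pad_contents(p) for p in (keystroke, bulk)]

# Both envelopes now carry the same number of bytes ...
print([len(s) for s in sealed])
# ... and the receiver still recovers each packet exactly.
print(strip_padding(sealed[0]) == keystroke, strip_padding(sealed[1]) == bulk)
```

The padding is all zeros, which is fine for the same reason constant chaff is fine: once encrypted, it cannot be told apart from anything else. The one-byte keystroke costs as much bandwidth as the 500-byte packet, which is exactly the price described above.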