traffic analyzing Chaum's digital mix
-----BEGIN PGP SIGNED MESSAGE----- I have been thinking about the problem of traffic analysis of a remailer. More specifically, the problem is how can Eve trace Bob, who is communicating with Alice through an ideal Chaumian digital mix? (As most of you know, current remailers are missing many of the features of the digital mix Chaum specified in his CACM paper (at ftp://ftp.csua.berkeley.edu/pub/cypherpunks/papers/chaum.digital-mix.gz ), thus making them extremely vulnerable to anyone with non-trivial resources.) The simplifying assumptions I use here are: 1. there is one mix, which is perfectly secure and trustworthy (note that multiple mixes do not increase untracebility over a single mix if it is perfectly secure and trustworthy) 2. anyone can monitor all traffic in and out of the mix, but no one can link an incoming message with an outgoing one The basic approach is to use this raw traffic information to calculate a SCORE for each user of the remailer with respect to Alice, where the user with the highest SCORE is the person Alice is most probably communicating with. The idea is that with a Chaumian mix, every time Alice sends a message to Bob there is always a pattern of Alice sending a message to the mix, followed by Bob receiving a message from the mix during the next batch. By counting the number of such correlations for each user over a period of time, and taking into account the fact that users who receive more messages from the mix will have higher numbers of coincidental correlations, a SCORE can be calculated so that it would be a good indication over the long run of the probability that a particular user is communicating with Alice. For a digital mix that does batching based on a fixed number of incoming messages, the SCORE for a user U can be calculated in the following way: 1. for each mix batch i, calculate P(i)=lesser(# of messages sent by Alice, # of messages subsequently received by user U) 2. after a period of time t, calculate Q=sum(P(i)) 3. calculate the average value of Q of users with similar usage patterns as user U 4. SCORE(U) = Q / average(Q) Now whether or not this approach actually works depends on whether the number of users with SCORE higher than Bob's SCORE converges to 0 as time t increases, and how quickly it converges. Answering these two questions will require modeling the usage patterns of Alice, Bob, and the mix as a whole. I'll try to do this for some simple cases in a later post. Wei Dai -----BEGIN PGP SIGNATURE----- Version: 2.6.2 iQCVAwUBLx9V8Tl0sXKgdnV5AQFg2gQAhEJ1wgf/XaqMOlVcvYfwgOeR2cKPPyQM fitAJdXKkEXvTtUa3biByvVK86SLQmW/0cLME76UsmaMUY+FVncBoKwlRGKJnDci 6b7VtEW2ZkZKntUieTXFaVbSgI5XL/lIqQu2FFS6wuxH1KayxFeDLiTD6HWfa8t6 sedGrTb5f2I= =Vjum -----END PGP SIGNATURE-----
participants (1)
-
Wei Dai