[tor-talk] End-to-end correlation for fun and profit

Maxim Kammerer mk at dee.su
Fri Aug 24 16:51:29 PDT 2012

On Sat, Aug 25, 2012 at 1:12 AM, Mike Perry <mikeperry at torproject.org> wrote:
> The Raccoon has made a believer out of me, but there are some limits to
> both of his/her proofs.. The full proofs can still be found here:
> http://web.archive.org/web/20100416150300/http://archives.seul.org/or/dev/Sep-2008/msg00016.html

With respect to the first proof, it seems to me that the assumed
correlation accuracy rate of 99.9% is incredibly low, and I think that
the Raccoon recognized that by referring to sampling and retention at
the end of his post. With the targeted attack that I described in my
previous comment here, which is similar to Example 3 in the Raccoon's
post and in which one analyzes all exit traffic without missing packets,
I would expect the correlation accuracy (and as a result, match
confidence) to exponentially approach 100% very quickly with the number
of relevant packets seen, and extremely quickly if the traffic is
interactive (i.e., browsing).

Actually, c/n of 30% in Example 3 is close to the 25% that's
discussed in the OP here, so let's redo the example with c/n=25% and
different correlation accuracies (leaving the other numbers intact):

(using bc -l)
ca  = 0.999
pm  = (1/5000) * (0.25)^2
ca*pm / (pm*ca + (1-pm)*(1-ca))

ca = 0.999        ->  1.23%
ca = 0.9999       -> 11.11%
ca = 0.99999      -> 55.56%
ca = 0.999999     -> 92.59%
ca = 0.9999999    -> 99.21%
ca = 0.99999999   -> 99.92%
ca = 0.999999999  -> 99.99%
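
The bc calculation above can also be reproduced with a short Python
sketch. It assumes the formula is read as Bayes' rule, with pm as the
prior probability that a given pair of flows is a real match and ca as
both the true-positive and true-negative rate of the correlator. The
names match_confidence, PM and ACCURACIES are illustrative only and do
not come from the Raccoon's post.

```python
def match_confidence(ca, pm):
    """Posterior probability that a flagged pair is a real match.

    ca: correlation accuracy (assumed equal for matches and non-matches)
    pm: prior probability that a pair is a real match
    """
    true_pos = ca * pm                # real match, correctly flagged
    false_pos = (1 - pm) * (1 - ca)   # non-match, wrongly flagged
    return true_pos / (true_pos + false_pos)


# Prior from the post: 1 in 5000, times (c/n)^2 with c/n = 25%.
PM = (1 / 5000) * 0.25 ** 2

# Accuracies from 0.999 (error 10^-3) up to 0.999999999 (error 10^-9).
ACCURACIES = [1 - 10 ** -k for k in range(3, 10)]

if __name__ == "__main__":
    for ca in ACCURACIES:
        conf = match_confidence(ca, PM)
        print(f"ca = {ca:.9f}  ->  confidence = {conf:.2%}")
```

Because pm is so small (1.25 * 10^-5), the false positives dominate
until the error rate 1-ca drops well below pm, which is why the
confidence only gets close to 100% for the last few values.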

So reducing the correlation accuracy error to 10^-9 will give you 99.99%
confidence in an end-to-end correlation match. I suspect that a few
seconds of interactive traffic will give you a correlation accuracy
with an error much lower than 10^-9.

