Remailer Stats (was: Swiss Bank in a Box)

Ryan Lackey ryan at havenco.com
Mon Dec 24 22:00:04 PST 2001


I just did some statistical analysis on a logfile that an MTA kept by
accident a few months ago; I didn't know it was keeping one.

(I've since deleted the log, and made sure one is not being kept anymore.)

Remailer traffic: approximately 3000-5000 messages/day (mix II + cpunk)

Unique input addresses over 13 days: 539 (which would include case
variations, aliases, etc.)
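
A minimal sketch of how a count like this could be produced, and how
much case variations alone can inflate it.  The input format (one
sender address per line), the function name, and the sample addresses
are all assumptions for illustration; the real log is gone and its
format isn't given here.  Lowercasing the whole address is a
simplification, and aliases are not handled at all.

```python
def count_unique_senders(lines, fold_case=False):
    """Count distinct sender addresses, one address per input line (assumed format)."""
    seen = set()
    for line in lines:
        addr = line.strip()
        if not addr:
            continue  # skip blank lines
        if fold_case:
            addr = addr.lower()  # treat Alice@Example.COM and alice@example.com as one
        seen.add(addr)
    return len(seen)


# Made-up sample data, not taken from the real log.
sample = [
    "alice@example.com",
    "Alice@Example.COM",
    "bob@example.org",
    "",
    "bob@example.org",
]
raw_count = count_unique_senders(sample)
folded_count = count_unique_senders(sample, fold_case=True)
print(raw_count, folded_count)
```

The gap between the raw and case-folded counts is the kind of
overcounting the 539 figure would include.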

This is *not* statistically significant.  At the time the stats were
taken, HavenCo was a new remailer.  Also, if *I* were an end-user
using remailers, I would use havenco as an *exit* remailer, not an
entry remailer, since it is topologically close to no one; one would
wish to get the message into the remailer network as soon as
possible.  HavenCo only has, say, 3000 active tcp sessions at any one
time in or out of the facility, so if someone were monitoring, that's
a small set too.

A completely out-of-ass argument is that this points to a stable
remailer-user population of 1-10k users, with <1% responsible for most
of the traffic (web-based gateways, pingers, spammers, people using
scripts to send abuse messages).
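
As a rough sanity check on the figures quoted above, the sketch below
only does the division: total messages over the 13 days per unique
input address.  It assumes, unrealistically, that all traffic entered
via those addresses (some of them are surely other remailers and
gateways), and the variable names are made up for the example.

```python
DAYS = 13
UNIQUE_INPUT_ADDRESSES = 539
MESSAGES_PER_DAY = (3000, 5000)  # low and high ends of the quoted range


def messages_per_address(per_day, days=DAYS, addresses=UNIQUE_INPUT_ADDRESSES):
    """Average messages per unique input address over the whole window."""
    return per_day * days / addresses


low, high = (messages_per_address(m) for m in MESSAGES_PER_DAY)
print(f"{low:.0f} to {high:.0f} messages per address over {DAYS} days")
```

An average that high per address is at least consistent with a small
number of automated senders carrying most of the load, though an
average alone can't show how the traffic is actually distributed.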

IMO most "real" remailer users are actually using web-based gateways,
rather than initial SMTP, to enter the network.  So these stats are
utterly flawed anyway.

A population of even 10k is certainly well within FBI (or even my
personal) resources to investigate, at least to a cursory level.
Given that most likely a lot of those users are located at large and
"compliant" ISPs or mail providers, gaining access to their normal
outgoing mail feed, connection logs, and identity information is just
not that hard.  Especially if you're willing to break the law.

So, I'd estimate the cost to recover the *average* remailer user's
identity at < USD 500 (assuming repeated use).  Monitor the ~15 big
remailers (knocking off the ones you can't monitor, either through
legal or covert technical means...spamming via them as exit remailers
and complaining to their ISPs might be good enough).  You should be able to
get samples of mail from the same users without much difficulty,
monitoring their non-remailed communications at the originating ISP.
Presumably, the users may be sending some volume of related traffic
via non-remailers; this may be sufficient (could "used an evil
anonymous remailer" be sufficient to get wiretap authority on regular
mail in a terrorist case?)  

Traffic analysis is *hard* to defeat.  Especially when your set is
small enough that you can do out of band analysis on all the participants.
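
One simple way to see why a small set hurts is an intersection
attack.  That technique is named here purely as an illustration; it is
not a claim about what any real adversary does.  The sketch assumes
the observer can list which users sent anything into the network in
each time window in which a message from the target pseudonym came
out.  All names and data are hypothetical.

```python
def intersect_candidates(windows):
    """Return the users who were active in every observed window."""
    candidates = set(windows[0])
    for active in windows[1:]:
        candidates &= active  # drop anyone not seen in this window
    return candidates


# Each set: users seen sending into the network shortly before a message
# from the target pseudonym appeared.  Hypothetical data.
observations = [
    {"u1", "u2", "u3", "u4", "u5"},
    {"u2", "u3", "u5", "u7"},
    {"u3", "u5", "u8"},
    {"u3", "u9"},
]

# Size of the candidate set after 1, 2, 3, ... observations.
remaining = [
    len(intersect_candidates(observations[:n]))
    for n in range(1, len(observations) + 1)
]
suspects = intersect_candidates(observations)
print(remaining, suspects)
```

With only a few thousand users in total, each window starts small, so
repeated use shrinks the candidate set quickly.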

If ZKS had, as some were saying, <500 customers for the Freedom
product, it would have been even worse for them.  With a small enough
number of customers, you can blackbag users, intimidate them, or whatever.

I think unless there is a compelling [non-privacy reason] for end-users
to participate in a mix, it's hopeless to try for >10k users.  It could
work if participation in a mix were incidental to some other activity,
such as posting normal text postings to usenet, browsing a very popular
website which happens to use padded SSL for all communications, using
an encrypted IM system which does a mix of some kind as an automatic
part of operation, or participating in a large-scale p2p system.  I
think if participation in a mix is incidental to *normal* email use,
that would be sufficient as well.

I think there are a few possible routes to doing this:

0) Cypherpunk applications continuing to be few and far between, not
up to "commercial" standards, and not widely deployed.  The default.

1) Cypherpunks writing widely-used applications: this has been tried
in the past and seems to fail.  Few developers of widely-deployed
commercial software are explicitly cypherpunks, even though they may
be libertarians.  They often don't understand crypto at any deep
level; at BEST they use a pre-packaged crypto library like OpenSSL if it
is a mostly drop-in replacement for a non-crypto counterpart, or
otherwise widely documented.  Witness the large number of conventional
security weaknesses in applications today, failure to manage keys
properly in the few crypto applications, etc.

2) Cypherpunks modifying existing open-source applications to
incorporate crypto.  Despite the open source ideal, most developers
don't like taking "outside" patches, especially where security or
network functions are concerned.  Applications designed for modularity
are a lot better.  Many apps are a "moving target".

3) Cypherpunks developing "libcypherpunks" or some other library which
includes cryptographically secure, anonymous, etc. replacements for
existing functions -- the md5 passwords vs. crypt in various unix OSes
is a good example.  Things which have no conventional analogue should
be packaged in a programmer-friendly way.  Some of these could include
distributed operations; general distribution is really hard, but
certain specialized functions are easy, and having a good way to do it
would support experimentation.

4) Cypherpunks producing a "this is how to develop secure, privacy
protecting applications" document and training for conventional developers.
I assume most people who develop worthwhile software already
understand roughly "why" they should do this kind of thing, but
don't know "how".

5) Participation in standards bodies to make security a "MUST" or
"SHOULD" in protocols (with the hope that standard/reference implementations
will be available).  I hate standards bodies, but some people seem to
get off on this, and it's not unhelpful.
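
To make option 3 a bit more concrete, here is a minimal sketch of what
a programmer-friendly drop-in replacement might look like, in the
spirit of the md5-passwords-vs-crypt example.  There is no actual
"libcypherpunks"; the function names hash_password and check_password,
the storage format, and the iteration count are all hypothetical, and
PBKDF2 from Python's standard hashlib is picked only for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def hash_password(password, salt=None):
    """Return 'iterations$salt_hex$digest_hex' using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return f"{ITERATIONS}${salt.hex()}${digest.hex()}"


def check_password(password, stored):
    """Verify a password against a string produced by hash_password."""
    iterations, salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(digest.hex(), digest_hex)


stored = hash_password("correct horse")
ok = check_password("correct horse", stored)
bad = check_password("wrong horse", stored)
print(ok, bad)
```

The point of the two-function interface is that a conventional
developer can swap it in without having to choose salts, iteration
counts, or comparison routines.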

-- 
Ryan Lackey [RL7618 RL5931-RIPE]	ryan at havenco.com
CTO and Co-founder, HavenCo Ltd.	+44 7970 633 277 
the free world just milliseconds away	http://www.havenco.com/
OpenPGP 4096: B8B8 3D95 F940 9760 C64B  DE90 07AD BE07 D2E0 301F





More information about the cypherpunks-legacy mailing list