New subject: Old Hat 2

17 Dec 2003

      Here's what is likely going on right now, though its
availability to LE is questionable:

NSA probably maintains surveillance of all or nearly all
encrypted remailers. They log and archive the content of
all traffic. First arrivals of messages in the remailer
"cloud" provide source identification in many cases, so
a database is gradually built of all the originating
email addresses and IP addresses that have ever sent
messages to a remailer. The same is true for the messages
exiting the cloud, although many of those go to public
lists or newsgroups. Sometimes, perhaps even often, the
remailer cloud is so little utilised that one-hop or even
multi-hop chained remailings are easily identified as to
source and final destination. All goes into the database.
Spook-style stylometers develop profiles of the users.
Over time, more and more of the unidentified profiles
become linked to known users. All it takes is one slip,
or one stroke of bad luck with low traffic levels. Linkage
is also done on a probabilistic basis, many-to-many.
Little human intervention is used except to develop and
tune the correlation mechanisms. Filters aimed at content
that might help correlate messages with senders are more
important to the spooks than Echelon Dictionary filters.
The priorities in getting a handle on something like the
remailer cloud are quite different than the priorities in
scanning ordinary traffic for content. Ultimately, though,
the regular content filters come into play, but in this
ongoing exercise of cat and mouse, the bulk of the effort
is directed at developing the combination of surveillance
and intelligent processing that can provide the basis for
identifying, at least to some knowable probability, the
sources and destinations of given remailer messages.
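
To make the stylometry point concrete, here is a very crude
sketch in Python. It is my own illustration, not a known
spook tool: the word list, the names profile() and cosine(),
and the idea of comparing function-word frequencies are all
assumptions chosen for brevity. Real stylometry would use far
richer features.

```python
import math
from collections import Counter

# Hypothetical, tiny list of function words used as style features.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is", "it", "but"]

def profile(text):
    """Relative frequency of each function word in the text.
    Splitting on whitespace is crude: punctuation stays attached."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words) or 1
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine(u, v):
    """Cosine similarity of two profiles; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

known = profile("it is the case that the cloud is small and the traffic is thin")
unknown = profile("the point is that it is easy to watch the cloud and to wait")
similarity = cosine(known, unknown)
```

A higher similarity only suggests, and never proves, common
authorship. With samples this short the number means little.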

Source and destination correlation can benefit greatly
from knowledge of conventional email correspondent 
relationships. Most nonprofessional-spook people who send 
encrypted remailed email to a non-public destination are 
also likely to correspond with that destination in the 
clear. Archival logging of traffic is like a time machine -
even if one ceases conventional correspondence with
another, any past traffic can reveal the correspondence
relationship.

Mathematical correlation, given a very large amount of raw
data consisting of timestamped message events, can reveal
quite a lot about likely correspondence relationships.
Even given a remailer network full of chaff, with reordered
message pools and such, just running correlations on the
end point data - who sent and when, and who received and when
- can develop likely correspondent pairs.
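
A minimal sketch of that kind of end point correlation, in
Python. Everything here is assumed for illustration: the
function name score_pairs(), the one-hour delay window, and
the sample events. It simply counts how often a receive event
follows a send event within the window.

```python
from collections import Counter

def score_pairs(sends, receives, max_delay=3600):
    """For each (sender, receiver) pair, count the receive events that
    fall within max_delay seconds after a send event by that sender."""
    scores = Counter()
    for sender, t_sent in sends:
        for receiver, t_recv in receives:
            if 0 < t_recv - t_sent <= max_delay:
                scores[(sender, receiver)] += 1
    return scores

# Hypothetical timestamped events, in seconds: (address, time)
sends = [("A", 0), ("A", 10000), ("A", 20000), ("B", 5000)]
receives = [("C", 1800), ("C", 11200), ("C", 21500), ("D", 6000)]

scores = score_pairs(sends, receives)
ranked = scores.most_common()  # most frequently co-occurring pairs first
```

A raw count like this ignores how busy each end point is, so
a real analysis would have to normalize against base rates.
More data and longer observation make the counts more telling.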

Be assured that the manipulation of such data relies heavily
on probabilistic clustering supported by database mechanisms
not seen much in business environments. Instead of storing
firm relationships numbering within an order or two of
magnitude of the terminal node population, I would want
something that allows very large numbers of fuzzy
relationships, many orders of magnitude greater than the node
population. Nodes A, B, C and D send and receive remailer
traffic. Everything that A has ever sent is clearly
related to A, but each item can and likely would also
have a probabilistic relationship to B, C, and D. The
probability assignments would be updated as later analysis
and later data more clearly identify who may have received 
which messages. Content comes into play here, too.
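
One way such probability assignments could be updated is a
plain Bayes update, sketched below in Python. This is my
assumption about a reasonable mechanism, not a description of
any real system. The function name update() and all the
numbers are invented.

```python
def update(prior, likelihood):
    """Bayes update: multiply each candidate's prior by the likelihood of
    the new evidence under that candidate, then renormalize to sum to 1."""
    unnorm = {who: prior[who] * likelihood.get(who, 0.0) for who in prior}
    total = sum(unnorm.values())
    if total == 0:
        return dict(prior)  # evidence rules out everyone; keep the prior
    return {who: p / total for who, p in unnorm.items()}

# One message known to come from A; recipient is B, C or D (hypothetical).
prior = {"B": 1 / 3, "C": 1 / 3, "D": 1 / 3}

# Invented likelihoods: how well later timing data fits each recipient.
timing_evidence = {"B": 0.6, "C": 0.3, "D": 0.1}
posterior = update(prior, timing_evidence)

# Content analysis arrives later and is folded in the same way.
content_evidence = {"B": 0.5, "C": 0.5, "D": 0.1}
posterior2 = update(posterior, content_evidence)
```

Each new piece of evidence shifts the weights without ever
discarding a candidate outright, which is the fuzzy,
many-to-many bookkeeping described above.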

At any time it is possible to query the database for
the likely correspondents of A, and the likely messages
A may have sent to B, C, or D, and get a result ordered
by confidence. Similarly, it is possible to query for
the likely sender(s) and recipient(s) of a given message.
Occasionally, data may be firmed up - meaning the
relationships reach higher confidence levels - by access
to someone's PC, by whatever means, by stylometry, by
specific content of public messages, and by fortuitous
traffic analysis successes owing to the paucity of remailer
traffic.
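
The two queries described above might look something like
this, using Python's built-in sqlite3 module. The table name
link, its columns, the helper functions and every row of data
are hypothetical. A real system at that scale would surely
not be an SQLite file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE link (msg_id INTEGER, sender TEXT, "
    "recipient TEXT, confidence REAL)"
)
# Invented rows: each message from A has a fuzzy link to every candidate.
conn.executemany("INSERT INTO link VALUES (?, ?, ?, ?)", [
    (1, "A", "B", 0.65), (1, "A", "C", 0.30), (1, "A", "D", 0.05),
    (2, "A", "B", 0.20), (2, "A", "C", 0.70), (2, "A", "D", 0.10),
])

def likely_correspondents(conn, sender):
    """Candidate correspondents of sender, ranked by their best
    confidence on any single message."""
    return conn.execute(
        "SELECT recipient, MAX(confidence) AS best FROM link "
        "WHERE sender = ? GROUP BY recipient ORDER BY best DESC",
        (sender,),
    ).fetchall()

def likely_recipients(conn, msg_id):
    """Candidate recipients of one message, ordered by confidence."""
    return conn.execute(
        "SELECT recipient, confidence FROM link "
        "WHERE msg_id = ? ORDER BY confidence DESC",
        (msg_id,),
    ).fetchall()

correspondents = likely_correspondents(conn, "A")
recipients = likely_recipients(conn, 1)
```

Ranking by the single best confidence is only one choice.
Summing or otherwise combining confidences across messages
would give a different, arguably better, ordering.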

It's also certain that to provide realistic cases that can
be fully revealed to measure the efficacy of the various
algorithms and methods involved, spooks use the remailer
network themselves. That's the only way they can be sure
to have any traffic that can be fully analyzed as a yardstick
for the guessing games played by their software.

A few rules of thumb result from even cursory examination
of the likely environment:

1. Do not send clear messages through the remailers 
   except to public lists and newsgroups. Sending clear
   messages to private correspondents provides the 
   watchers with rich style and content material linked
   to a correspondent. It is usually easy then to link
   it back to you, and a firm correspondent pair is then
   established in their database.

2. Do not switch between clear and encrypted, remailed 
   communication with the same correspondent. If your
   correspondent relationships are mapped via clear,
   conventional mail, those mappings can be applied to
   your encrypted, remailed mail to greatly narrow 
   down the possible recipients, and quite likely 
   combine with traffic analysis to correlate your
   outgoing message with one arriving at your
   correspondent. If the only communications you have
   with your deep contacts are by encrypted, chained
   remailer, only fortuitous or statistical correlation
   analysis or access to your or your correspondent's
   host machines will tie the two of you together.

3. Do not send traffic to significant correspondents
   when traffic levels are likely to be low. Good
   luck at guessing this. Probably the best way to
   get a handle on traffic levels is to run a remailer
   yourself.

4. Send lots of chaff. Chain some messages through the
   remailer cloud every day. If you have time and the
   ability, write and release something that will allow
   large numbers of people to send lots of chaff.

5. Ultimately, the only way the remailers will provide
   what might be described as Pretty Good Security will
   be when we have software that maintains a regular
   or random rate of messages to and from the remailer
   cloud, a stream into which the meaningful messages
   can be inserted with no visible change in traffic.
   Until then, the best we can do is try to keep traffic
   levels up, and to send and receive frequently enough
   to frustrate end-to-end traffic analysis.

6. Don't send anything that can have grave consequences.

7. Take names. Always take names. Some day...
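
The software asked for in rule 5 does not exist as far as I
know, but its core idea can be sketched in a few lines of
Python. The class name ConstantRateSender, the slot size and
the zero-byte padding are all my assumptions. The sketch
leaves out encryption, the timer that would call tick() at a
regular or random interval, and the actual hand-off to a
remailer.

```python
import os
from collections import deque

class ConstantRateSender:
    """Emits exactly one fixed-size message per tick: a queued real
    message if there is one, otherwise random chaff of the same size."""

    def __init__(self, size=1024):
        self.size = size
        self.queue = deque()

    def submit(self, payload: bytes):
        """Queue a real message, padded to the fixed slot size."""
        if len(payload) > self.size:
            raise ValueError("payload too large for one slot")
        # Assumption: the padded slot is encrypted before it leaves,
        # so the zero padding is not visible on the wire.
        self.queue.append(payload.ljust(self.size, b"\0"))

    def tick(self) -> bytes:
        """Called once per time slot; always returns size bytes."""
        if self.queue:
            return self.queue.popleft()
        return os.urandom(self.size)  # chaff

sender = ConstantRateSender(size=64)
sender.submit(b"meet at noon")
slots = [sender.tick() for _ in range(3)]  # one real slot, then chaff
```

Because every tick emits the same number of bytes, a watcher
counting messages sees no change when a meaningful message
goes out. The receiving side would need the same treatment.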

FUDBusterMonger

It Ain't FUD til I SAY it's FUD!

Re: Old Hat 2

Anonymous

ichudov＠Algebra.COM
