TCQ ascii version

Sat Oct 10 19:54:13 PDT 1998

(c) 1998 Frank O'Dwyer. Verbatim copying and redistribution in any
medium is permitted provided this copyright notice is preserved on all
copies.

			TCQ ("They Seek You")
		     Mass-market Strong Anonymity
			   Frank O'Dwyer
                            fod at brd.ie

Introduction

This document sketches the design goals and features of "TCQ", a work
in progress with the aim of providing a mix of features offered by
ICQ, IRC and email -- but using cryptography to provide strong
anonymity and privacy. As the name implies, TCQ ("They Seek You") is
similar in purpose to ICQ ("I Seek You"), but with a very different
emphasis. TCQ is motivated by the recognition that people being sought
do not always wish to be found, and that those doing the seeking
(people, organisations, or governments) can be threatening. TCQ is
also motivated by a desire to make the use of strong anonymity easier
and much more common, riding the current wave of popularity of
"anonymous" services like IRC, ICQ and webmail.

Status

TCQ is in the early stages of being designed and prototyped; this
early draft is being made available now to solicit comments (and
assistance) from anyone with an interest in this project.  Some
initial design decisions and the rationale for them are also
documented here. It is planned to make TCQ freely available in source
and binary form, under a liberal licence (such as MozPL, or a
BSD-style licence). If you can contribute ideas, or code on those
terms, you are welcome to.

Background

Basic political background to this, intended for general consumption
(including people who may not be aware of what GAK is, etc.). This is
just a sketch. - FOD

Anonymous communication is hugely popular. Vast numbers of
non-technical people use services such as ICQ (www.mirabilis.com) and
Hotmail (www.hotmail.com) because they want to communicate privately
and anonymously. However, these systems only offer a weak form of
privacy and anonymity. They use unencrypted links (which can easily be
tapped), unencrypted messages on centralised servers (which can easily
be commandeered or shut down) and well known sites (to which access
can easily be blocked). They also use communication links that can be
traced to individuals, albeit requiring some effort.

Systems which offer much better privacy and anonymity are available
(for example PGP, Cypherpunks remailers), but relatively few people
use those systems. This is partly because relatively few people know
about them, but mainly because they are difficult to use. There is a
lot of additional theory on strongly anonymous and strongly private
communications [ref Cypherpunks literature here], but few
implementations of any sort and no mass-market implementations (that
I am aware of - FOD).

In the case of strong anonymity, most services thus far have been
based on email. Email has a strong culture of being suspicious of
"fake names", and this is another factor in the lack of mainstream
take-up of strong anonymity. However, with services such as IRC and
ICQ, "fake names" are the norm -- these cultures are now becoming
mainstream and they may be more receptive to strong anonymity.

At the same time, an individual's ability to communicate with true
privacy and anonymity is under threat from various quarters, including
for example:

o Government Access to Keys. Proposals designed to allow governmental
access to private communications (including encrypted private
communications between individuals) have been circulating for some
time in many jurisdictions including the United States and Europe.
These proposals are typically sold on the back of "hard cases", such
as a claimed need for law enforcement to be able to intercept the
communications of paedophiles and terrorists.

o Physical border inspections. Users who cross borders may be subject
to customs inspection of any computer equipment they may be carrying;
for example, if their laptops contain (or appear to contain) encrypted
information, they may be required to provide the key(s). On the other
hand, if their information is not encrypted, then users' confidential
information (private diaries, letters, business plans) is
exposed. There are rumours that some countries (such as the UK)
already perform such inspections.

o Centralised Services. There is a growing trend to move
security-critical services off users' own PCs and onto the web. Prime
examples include online backup and webmail. Even non-technical people
often instinctively react to this by becoming "anonymous" (e.g. lying
about their names and location). However, this is still a very
hospitable climate for attack by governments and other parties, and in
any case, the level of anonymity is weak. Also, in so far as such
services benefit privacy/anonymity (e.g. webmail), they are easily
monitored, blocked, or shut down.

o Strongly Identifying Protocols. Commercial security products, some
of which apparently offer strong encryption and privacy, are almost
universally based upon "X.509 certificates" and trusted third parties
known as "Certification Authorities" (CAs). This trend is worrying for
a number of reasons:

  o X.509 is typically used as a strongly identifying protocol, and the
  analogy is often made between X.509 certificates and a driving licence
  or passport. Although X.509 can facilitate private communications, it
  effectively prevents anonymous private communications since by design
  it requires you to show your "passport" to strangers.  (X.509 does not
  have to be used this way, but people are often encouraged to use it
  that way.)

  o Use of X.509, and in particular establishment of public CAs, is
  likely to be heavily regulated. Proposals have been made to "license"
  CAs, for example. Regulated management of individuals' encryption keys
  is not a healthy development for online privacy.

  o Extensions for governmental access to keys have been made to X.509,
  and X.509 is at the heart of some governmental strategies to mandate
  key access.

  o If Public CAs take off then they will turn out to be single points
  of failure and an extremely attractive attack target. Anyone
  (individual, corporation or government) who can compromise a public CA
  is in an excellent position to attack the communications of all users
  served by that CA.

o Logical border inspections. Communications that cross cyberspace
borders may be subject to traffic monitoring and/or filtering. This
might occur at corporate firewalls or at well defined national "entry
points" and "exit points".

o Non-repudiation. Some systems for private communications have as a
side effect that it is difficult for a person to plausibly deny
sending or receiving a particular message. This is sometimes welcome,
but it can be a problem if the person thinks he or she is
anonymous. For example, a warm body might be linked to a message just
by being found in physical possession of related cryptographic keys.

Design goals

These goals are listed to give an idea of the features TCQ is intended
to provide. It is likely that TCQ will meet these goals only in a
piecemeal fashion as it evolves, doing the easy bits first and leaving
the hard bits till later. We will see. -FOD

1. Casual Anonymity. TCQ should offer a messaging service based on
disposable cryptographic identities. This level of anonymity is weak,
but it is easiest to implement and it is a start.

2. Relative Anonymity. It can be useful to have multiple identities
(or "nyms") and for some identities to be less anonymous than
others. A nym might be anonymous, identified to a friend, identified
to a select group, or generally identified to everyone. For example,
this allows for applications where the sender is anonymous but the
receiver is not. TCQ should allow people to selectively reveal their
identities (including partial information about themselves) while
remaining fully anonymous to others. TCQ should also allow multiple
identities to be maintained, with varying degrees of anonymity, and
allow these identities to be kept at arm's length from each
other. Lastly, it should be possible for a TCQ sender to selectively
reveal "also known as" relationships between one nym and
another. (Lots of requirements here -- might be worth splitting this
up.)

3. Strong Anonymity. TCQ should be extensible to use, or natively
offer, stronger anonymity services (mixes, remailer-type
functionality, etc.).

4. Privacy. It should not be possible for unintended recipients to
read TCQ messages (but see the caveats below).

5. Integrity. It should not be possible for a third party to tamper
with TCQ messages (again, see the caveats below).

6. Repudiability. It should be possible to plausibly deny having sent
a TCQ message, or at least know when this will not be the case.

7. Ease of use.

   o TCQ should be easy to install and use for average users of ICQ,
   webmail and IRC, and should not require special cryptographic
   expertise.

   o TCQ should also be easy to use for advanced users, and allow them to
   know what is going on "under the hood".

   o It should be easy for average users to become advanced users, and
   they should be guided to documentation of issues that affect their
   privacy.

   o It should be easy for developers to extend TCQ without being
   intimately acquainted with all of its workings (ideally, it should
   offer well-defined extension points wherever possible).

8. Decentralisation. TCQ should not depend upon well-known centralised
servers, since they are easily monitored and shut down. (This is
unfortunate, since many ICQ-style features are very difficult to
provide without central servers; TCQ may have to piggyback on existing
public resources such as IRC and webmail to provide these.)

9. No Special Resources. TCQ should require only resources that are
available to the average dial-up ISP user. (Users should not be
required to set up a server on the Internet for example, though they
should be able to use any public resources that exist, and private
resources if they have them.)

10. Secure Message Store. TCQ should provide encrypted and tamperproof
storage for received messages, and should provide a feature to
securely erase messages from this store. (It is arguable whether
"securely erase" is always useful -- an attacker will usually be able
to record encrypted messages in transit anyhow. However there may be
draft messages in the store that were never sent anywhere.)

11. Mobility. TCQ users will be typical dial-up users. They will not
necessarily have fixed or easily discovered IP addresses. They may
also change ISPs, move from location to location, or switch
computers. While travelling they may cross borders and potentially
they will be subject to inspections of their computing equipment.

12. Channel Independence. It should be possible to extend TCQ (with
code) so that TCQ messages can be sent over whatever channels and
drop-points are available (e.g. direct connections, email, webmail,
IRC, ICQ, ftp sites, newsgroups, mailing lists, remailers, eternity
services, etc.). Initially it should be possible to use at least IRC,
email and webmail, since these are easiest to implement, most people
have easy access to these, and IRC and webmail are relatively
anonymous to begin with. (The expectation of anonymity is also highest
on these services.) After that, use (and ultimately provision) of
anonymous remailer style services is the natural next step.

13. Programmatic Access. It should be possible for programs (robots)
to send messages via TCQ, and to automatically handle received
messages. Command line and API access, coupled with some form of
scriptability and the ability to call out to scripts to handle,
pre-process or send messages is desirable for this.

14. Group communication. Programs such as IRC and ICQ offer the
ability to have N-way chats. If it is to attract users of these
programs, TCQ needs to offer that too. (This is difficult to do
generically and securely, however.)

15. File Transfer. Messages should not be restricted to text messages,
and messages should not be limited in size.

16. Stealth, firewall negotiation. TCQ should be able to avail of
stealthy channels, and should not make its protocols easy to
firewall. Protocols that are easy to firewall are usually easy to
filter and monitor, and always easy to shut down.

17. Persistent pseudonyms, friend location. With IRC and ICQ, it is
straightforward to determine if a friend is online and communicate
with them again (although this is not done securely). Although it has
no central server, TCQ should allow this in a similarly
straightforward fashion, otherwise few will use it.

18. Key Management Independence. TCQ should not presuppose much more
than the availability of a key pair per pseudonym. This allows
different key management systems to be used, including web-of-trust,
reputation capital systems, manual key exchange etc.

19. No patent problems. TCQ should not require the use of patented
techniques or patented algorithms such as RSA and IDEA, as this would
affect its use in some countries.

20. Sugar. Popular chat clients (e.g. mIRC, Vplaces) provide various
"nice to have" features such as the ability to play sound files during
conversations and the ability to have avatars (graphical
representations of participants). TCQ should provide similar features.
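
As a rough illustration of goals 1 and 18 (disposable identities, and
nothing presupposed beyond a key pair per pseudonym), a nym can be
sketched as a freshly generated key pair plus a short fingerprint
derived from the public key. The class name, the choice of EC keys and
SHA-256, and the 8-byte fingerprint are all assumptions made for this
sketch; none of them is a TCQ design decision.

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch: a disposable nym is nothing more than a fresh key pair. */
class DisposableNym {
    private final KeyPair keys;

    private DisposableNym(KeyPair keys) {
        this.keys = keys;
    }

    /** Creates a new identity with no link to any earlier one. EC is an assumed choice. */
    static DisposableNym create() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("EC");
            gen.initialize(256);
            return new DisposableNym(gen.generateKeyPair());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("EC key generation unavailable", e);
        }
    }

    /** Short hex handle derived from the public key; the only "name" the nym has. */
    String fingerprint() {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(keys.getPublic().getEncoded());
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 8; i++) {
                hex.append(String.format("%02x", digest[i]));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        // Two nyms created back to back share nothing but the algorithm.
        System.out.println("nym 1: " + DisposableNym.create().fingerprint());
        System.out.println("nym 2: " + DisposableNym.create().fingerprint());
    }
}
```

Throwing a nym away is then just a matter of discarding the key pair;
persistence and "also known as" links (goal 2) would need more than
this.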

Anonymity and Privacy Caveats

This is just an attempt on my part to work out what is and is not
achievable in terms of privacy and integrity when purely anonymous
participants are involved and relationships are formed wholly online
(i.e. no out-of-band contact or face-to-face meetings). I just wrote
it down here to clarify my own thinking and to capture it
somewhere. Comments on this reasoning are very welcome. - FOD.

Anonymous Receivers

When talking to an anonymous receiver, privacy of communications does
not apply in the usual sense, regardless of how much encryption is
present. This is because an anonymous receiver could be anyone (by
definition). In such a scenario, the sender should assume the worst,
namely that the receiver is not someone the sender wishes to
communicate confidential information to.  For example, the receiver
could be some entity out to determine the sender's identity or exact
location. In addition, the receiver could pass the sender's messages
on to others. Thus, not only could you be talking to anyone, you could
be talking to everyone. In a group where all relationships are
anonymous and formed wholly online, no amount of strong cryptography
or "reputation capital" changes this (see Man in the middle attacks
below). So why bother encrypting to an anonymous receiver? The only
reason seems to be that in an anonymous online relationship, privacy
comes from sender anonymity and sender anonymity might fail. If it
does, then message encryption is a second line of defence against
eavesdroppers but only if the anonymous receiver is trustworthy after
all. There seem to be plenty of threat models in which encryption to
anonymous receivers is worthless.

An example may clarify this: Nym1 wishes to admit some embarrassing
but non-identifying fact about themselves to Nym2. The reason Nym1 is
willing to make the admission is because Nym1 believes itself to be
speaking anonymously, not because it believes it is speaking
privately. After all, Nym2 could be anyone/everyone. However, if
Nym1's anonymity fails somehow, then encrypting to Nym2 can be
beneficial if there is some significant chance that Nym2 is not
revealing Nym1's messages, and Nym2 does not know Nym1 or is otherwise
not an adversary of Nym1. Similarly, if there is a significant chance
that Nym2 does know Nym1, or is an adversary, then encryption to Nym2
is pointless. In both cases, encryption is secondary and anonymity is
the point.

Anonymous Senders

The flip side of the above, from the receiver perspective, is that an
anonymous sender could be anyone. The receiver cannot assume that it
is the only person receiving the sender's messages, even if they are
encrypted. The sender could be broadcasting them. Nor can the receiver
assume that the messages are in any sense "from" any particular
sender, even if integrity protection is applied. They could be from
anyone. It seems at first sight that given a sequence of messages from
the same anonymous sender (same key pair) then each message is from
the same individual, and that reputations can be built based on that
fact. In the wholly-online/wholly-anonymous case, this is not quite
true however (see below).

Man in the middle attacks

In a group based on public keys where all relationships are anonymous
and formed wholly online, all communications within the group are
always subject to (active) man-in-the-middle attacks.  There is a
possible defence against this using scale and publicity, but
approaches such as web-of-trust and reputation capital do not work for
this. This is how it seems to me, anyway, assuming a public key system
is used, and I try to prove it below -- are there techniques other
than public key techniques that do any better? - FOD

This can be proved by informal induction; the first relationship
(between Alice and Bob) is necessarily formed by exchange of naked
public keys. No third party can vouch for the keys; there is no third
party. No reputation capital can attach to the keys; Alice and Bob
have just met for the first time. Since the public keys are naked, an
active attacker can substitute its own keys for those of Alice and Bob
during the key exchange. Thereafter the active attacker can eavesdrop
on the conversation between Alice and Bob, and can also inject,
delete, reorder, delay or modify messages at any time. As further
participants join the population, any new relationships formed can be
similarly attacked. Alice and Bob cannot reliably certify any new
participants since they can both be impersonated, and in any case, by
definition they do not know any new anonymous participants and can
certify nothing about them. For the same reason, new participants
cannot certify Alice and Bob either. Therefore, in principle at least,
the attacker can impersonate or eavesdrop on any member of the
population, no matter how large the population grows and no matter how
many messages are exchanged. Such an elaborate spoof is arguably not
feasible once some size of population is reached, but this is not
terrifically comforting. (Publicity on out-of-band channels,
e.g. face-to-face meetings, could be used to make these systems more
reliable, but these are excluded by construction -- relationships are
formed wholly online.)

Reputation capital does not help, since it
does not matter how much capital Alice accrues if she can be
impersonated. The attacker can damage or abuse Alice's reputation at
any time, for example by injecting or modifying a message and making
it appear to come from Alice. What reputation capital can do, however,
is eventually defeat an attacker who for some reason is only able to
attack some of Alice's exchanges. In that case, Alice will appear to
have two public keys, one real and one fake (supplied by the
attacker). It is then up to Alice to boost the reputation of the real
key and damage that of the fake (assuming she gets to know of it).
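
The base case of the argument above can be acted out in a few lines.
This is only a sketch under assumptions: it uses ECDH key agreement
from the standard Java crypto API as a stand-in for whatever exchange
TCQ would actually use, and "Mallory" is the usual name for the active
attacker. Alice and Bob each believe they have received the other's
naked public key, but both have received Mallory's.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.util.Arrays;
import javax.crypto.KeyAgreement;

/** Hypothetical demo: substituting naked public keys during a first exchange. */
class NakedKeyExchangeDemo {

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("EC");
            gen.initialize(256);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Shared secret between our private key and whatever public key we were handed. */
    static byte[] agree(KeyPair mine, PublicKey theirs) {
        try {
            KeyAgreement ka = KeyAgreement.getInstance("ECDH");
            ka.init(mine.getPrivate());
            ka.doPhase(theirs, true);
            return ka.generateSecret();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair alice = newKeyPair();
        KeyPair bob = newKeyPair();
        KeyPair mallory = newKeyPair(); // active attacker sitting on the channel

        // Mallory intercepts both public keys and forwards her own instead.
        byte[] aliceSecret = agree(alice, mallory.getPublic()); // Alice thinks this is Bob
        byte[] bobSecret = agree(bob, mallory.getPublic());     // Bob thinks this is Alice

        // Mallory can derive both secrets, so she can read and rewrite everything.
        boolean readsAlice = Arrays.equals(aliceSecret, agree(mallory, alice.getPublic()));
        boolean readsBob = Arrays.equals(bobSecret, agree(mallory, bob.getPublic()));
        boolean endToEnd = Arrays.equals(aliceSecret, bobSecret);

        System.out.println("attacker shares key with Alice: " + readsAlice);
        System.out.println("attacker shares key with Bob:   " + readsBob);
        System.out.println("Alice and Bob share a key:      " + endToEnd);
    }
}
```

The first two lines should report true and the last false: neither
party has any way of noticing, because nothing binds a naked key to
the nym it claims to belong to.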

Initial Design Decisions

This is still very incomplete, just a few notes for now - FOD

Programming Language, Platform

o TCQ will be implemented in Java/Swing, targeting Java 1.1/1.2 and
Windows initially.

Rationale: Java code is mobile and can be
digitally signed, which leaves room for the entire TCQ program and its
data to be dumped to cyberspace while travelling, and picked up
securely upon arrival (even on a different computer). Thus a mobile
TCQ user could cross borders with nothing more than a signature
verification key and a memorised passphrase, and still restore a
secure environment in the new location. Java is also cross-platform,
so although most people use Windows a relatively easy port to other
platforms (e.g., Macintosh, UNIX) should be possible, and developers
using other platforms can provide extensions without much in the way
of special tools. It is possible to call out to scripts from Java, and
Java can be driven by script (e.g. TCL script, or E (?)). Java also
has good support for cryptography, and an internationally available
crypto library (Cryptix).

Some problems with Java are that certain aspects of it are difficult
to secure, and Java GUIs are frustrating to port in practice. Secure
random number generation and protection of sensitive information such
as keys are hard problems in Java. However, these problems are not
significantly easier in other languages, where porting the GUI is
much more difficult. If necessary, native code could be used to
provide any features Java can't handle.
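
The "memorised passphrase" half of the border-crossing scenario might
look something like the following. This is a sketch only: the class
name, the salt || IV || ciphertext layout, the iteration count and the
use of PBKDF2 with AES-GCM (from a current Java runtime rather than
Java 1.1/1.2 or Cryptix) are all assumptions. Verifying the signature
on the downloaded TCQ code itself is a separate step not shown here.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch: sealing a TCQ profile under a memorised passphrase. */
class ProfileVault {
    private static final int SALT_LEN = 16;
    private static final int IV_LEN = 12;
    private static final int TAG_LEN = 16;
    private static final int ITERATIONS = 200_000; // assumed work factor

    private static SecretKeySpec deriveKey(char[] passphrase, byte[] salt)
            throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(passphrase, salt, ITERATIONS, 256);
        byte[] raw = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec).getEncoded();
        return new SecretKeySpec(raw, "AES");
    }

    /** Returns salt || iv || ciphertext+tag, which can be parked on a drop-point. */
    static byte[] seal(char[] passphrase, byte[] profile) {
        try {
            SecureRandom random = new SecureRandom();
            byte[] salt = new byte[SALT_LEN];
            byte[] iv = new byte[IV_LEN];
            random.nextBytes(salt);
            random.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, deriveKey(passphrase, salt),
                    new GCMParameterSpec(TAG_LEN * 8, iv));
            byte[] body = cipher.doFinal(profile);
            byte[] out = new byte[SALT_LEN + IV_LEN + body.length];
            System.arraycopy(salt, 0, out, 0, SALT_LEN);
            System.arraycopy(iv, 0, out, SALT_LEN, IV_LEN);
            System.arraycopy(body, 0, out, SALT_LEN + IV_LEN, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses seal(); rejects a wrong passphrase or a blob that was altered. */
    static byte[] open(char[] passphrase, byte[] blob) {
        if (blob.length < SALT_LEN + IV_LEN + TAG_LEN) {
            throw new IllegalArgumentException("blob too short");
        }
        try {
            byte[] salt = Arrays.copyOfRange(blob, 0, SALT_LEN);
            byte[] iv = Arrays.copyOfRange(blob, SALT_LEN, SALT_LEN + IV_LEN);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, deriveKey(passphrase, salt),
                    new GCMParameterSpec(TAG_LEN * 8, iv));
            return cipher.doFinal(blob, SALT_LEN + IV_LEN,
                    blob.length - SALT_LEN - IV_LEN);
        } catch (GeneralSecurityException e) {
            throw new IllegalArgumentException("wrong passphrase or tampered blob", e);
        }
    }

    public static void main(String[] args) {
        byte[] profile = "nym keys, contacts".getBytes(StandardCharsets.UTF_8);
        byte[] blob = seal("memorised phrase".toCharArray(), profile);
        byte[] restored = open("memorised phrase".toCharArray(), blob);
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

The traveller carries nothing but the passphrase; the sealed blob can
sit on any public resource until it is fetched at the other end.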

Generic Delivery Channels

o TCQ should presuppose only generic delivery channels for messages,
allowing new delivery channels to be added easily.

Rationale: To be usable over as many concrete delivery channels as
possible (IRC, sockets, webmail, ftp sites, remailers, etc. -- see
above), TCQ should not assume too much about the channel. In
particular this means that security features should be applied to the
messages themselves, not the channels. If channels can be assumed
insecure, then people adding code for new delivery channels do not
need to implement any security features unless it is to strengthen
anonymity.
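
One way to picture such a generic channel is as a small interface that
moves opaque, already-protected bytes and advertises the two
properties discussed in this section (whether it is online, and how
traceable it is). Every name below is hypothetical, and the three-step
anonymity scale and the selection rule are assumptions made for the
sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of a generic delivery channel; all names are illustrative. */
class ChannelSketch {

    /** Assumed coarse scale, ordered from most to least traceable. */
    enum Anonymity { TRACEABLE, MODERATE, STRONG }

    /** A channel only moves opaque bytes; security is applied to the message itself. */
    interface DeliveryChannel {
        String name();
        boolean isOnline();      // can the other party answer in real time?
        Anonymity anonymity();
        void send(byte[] protectedMessage);
    }

    /** Toy channel that just records what it was asked to deliver. */
    static final class RecordingChannel implements DeliveryChannel {
        private final String name;
        private final boolean online;
        private final Anonymity anonymity;
        final List<byte[]> sent = new ArrayList<>();

        RecordingChannel(String name, boolean online, Anonymity anonymity) {
            this.name = name;
            this.online = online;
            this.anonymity = anonymity;
        }

        @Override public String name() { return name; }
        @Override public boolean isOnline() { return online; }
        @Override public Anonymity anonymity() { return anonymity; }
        @Override public void send(byte[] protectedMessage) { sent.add(protectedMessage); }
    }

    /** Picks the most anonymous channel, preferring an online one on a tie. */
    static Optional<DeliveryChannel> choose(List<DeliveryChannel> channels) {
        return channels.stream().max(
                Comparator.comparing(DeliveryChannel::anonymity)
                        .thenComparing(DeliveryChannel::isOnline));
    }

    public static void main(String[] args) {
        List<DeliveryChannel> channels = new ArrayList<>();
        channels.add(new RecordingChannel("direct socket", true, Anonymity.TRACEABLE));
        channels.add(new RecordingChannel("newsgroup drop", false, Anonymity.MODERATE));
        channels.add(new RecordingChannel("remailer chain", false, Anonymity.STRONG));
        choose(channels).ifPresent(c -> System.out.println("using: " + c.name()));
    }
}
```

A boolean isOnline() is the simplest model; an estimate of channel lag
could replace it without changing the rest of the interface.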

Some properties that TCQ will need to know about each channel type are:

o Whether it is offline or online. That is, with some channels the
other party will be present at that moment and the communication can
be real-time. This is important mainly because on an online channel an
ephemeral key exchange can take place in order to establish perfect
forward secrecy for that particular conversation. A second set of
temporary keys can also be established while online, for later use in
offline communications. For communication that takes place entirely
offline, one-sided ephemeral key exchanges could be used as a form of
key change (this would allow the related nym to persist across key
changes and also improve forward secrecy.) Whether a channel is on or
offline can be modelled as just binary (on/offline) or alternatively
as some estimate of the channel lag.

o Its anonymity characteristics. Some channel types will be more
identifying than others. For example, a direct socket channel is
relatively traceable, whereas a channel through a remailer-type
network is much harder to trace. Channels via drop-points such as
public news groups are somewhere in the middle, but nearer the
"traceable" end of the spectrum.

Message Format

o TCQ messages will consist of an XML header and MIME-typed content.

Rationale: (This is all based on more of a hunch than a firm
conviction.) XML is readable, easy to extend and easy to process with
scripting languages. It's a nice framework for prototyping protocols,
and unrecognised detail can easily be skipped. There are also
canonical formats for XML, which lend themselves to hashing, digital
signature, etc. It's good for assertions (certificates, RDF
possibilities, W3C digsig?) too. There's also a pretty simple mapping
from protocols such as SPKI and SDSI to XML, if necessary.

MIME is a well-established standard for messaging and content
identification, and the Java Activation Framework uses it for
drag-and-drop, etc.
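
To make the idea concrete, here is a sketch of what composing and
reading such a message might involve. No TCQ wire format has been
defined, so the element names (tcq-message, from, to, content), the
"header, blank line, base64 body" layout and the class name are all
invented for illustration. The reader looks only for the element it
understands, which is the "unrecognised detail can easily be skipped"
property in action.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch of an XML header followed by MIME-typed content. */
class MessageSketch {

    /** Builds header + blank line + base64 body. Element names are invented. */
    static String compose(String fromNym, String toNym, String mimeType, byte[] content) {
        String header =
                "<tcq-message version=\"0\">\n"
              + "  <from nym=\"" + escape(fromNym) + "\"/>\n"
              + "  <to nym=\"" + escape(toNym) + "\"/>\n"
              + "  <content type=\"" + escape(mimeType) + "\" encoding=\"base64\"/>\n"
              + "</tcq-message>\n";
        return header + "\n" + Base64.getMimeEncoder().encodeToString(content) + "\n";
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Everything before the first blank line. */
    static String headerOf(String message) {
        return message.substring(0, message.indexOf("\n\n"));
    }

    /** Everything after the first blank line, base64-decoded. */
    static byte[] bodyOf(String message) {
        return Base64.getMimeDecoder().decode(message.substring(message.indexOf("\n\n") + 2));
    }

    /** Reads the MIME type from a header, ignoring any elements it does not know. */
    static String contentType(String headerXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Headers come from strangers: refuse DOCTYPEs to avoid entity tricks.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(headerXml)));
            NodeList nodes = doc.getElementsByTagName("content");
            if (nodes.getLength() == 0) {
                return null;
            }
            return ((Element) nodes.item(0)).getAttribute("type");
        } catch (Exception e) {
            throw new IllegalArgumentException("malformed header", e);
        }
    }

    public static void main(String[] args) {
        String message = compose("nym-a", "nym-b", "text/plain",
                "hello".getBytes(StandardCharsets.UTF_8));
        System.out.print(message);
        System.out.println("type: " + contentType(headerOf(message)));
    }
}
```

Signatures and encryption are deliberately left out; in a real message
they would be applied to a canonical form of the header and to the
body.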

Something like S/MIME or PGP/MIME could be used instead of XML+MIME,
but these do not work so well in the online case and S/MIME uses
X.509. In any case, these formats can still be used if desired, since
they can be embedded as MIME-types in the body of a TCQ message.

Generic "Friend Location"

o TCQ should presuppose only generic methods of locating friends.

Rationale: Locating friends (i.e. finding if they are online,
determining their current IP address or preferred drop-point) is
difficult to do without a central server network. One approach is to
piggyback on some server network that already exists, such as an IRC
or ICQ network, or netphone directories. For example if two people log
onto an IRC network using one of a set of prearranged handles, they
can readily determine each other's online status and IP address.

Another approach is to post a current IP address in a prearranged
location, such as a web or FTP site. There are also some proposals for
"dynamic DNS" which could be useful, since this is a general problem
for mobile IP. Yet another approach is for the requirement to have no
servers to be relaxed to the extent of allowing "nym servers" to be
used just for the purposes of resolving previously exchanged nonces to
current IP addresses, current drop-points, etc. Such servers could
also double as drop-points themselves or as the basis of a remailer
network, if there were enough of them. A variation on the last
approach is to have sets of users collaborating to provide location
services and message forwarding services for one another. This is
probably the most interesting approach, but it is also the most
difficult, it is vulnerable to disruption, and firewalls are a
problem. By using a generic approach, simple initial solutions can be
replaced with other solutions later.
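
The generic approach might be sketched as a locator interface with
interchangeable implementations. The names below are hypothetical. The
in-memory "nym server" merely stands in for a real service that
resolves previously exchanged nonces to current locations, and the
chaining class shows how a simple initial solution could later be
supplemented or replaced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of generic friend location; all names are illustrative. */
class FriendLocationSketch {

    /** Resolves a previously exchanged nonce to a current address or drop-point. */
    interface FriendLocator {
        void announce(String nonce, String location);
        Optional<String> locate(String nonce);
    }

    /** Toy stand-in for a nym server: an in-memory nonce-to-location table. */
    static final class InMemoryNymServer implements FriendLocator {
        private final Map<String, String> table = new ConcurrentHashMap<>();

        @Override public void announce(String nonce, String location) {
            table.put(nonce, location);
        }

        @Override public Optional<String> locate(String nonce) {
            return Optional.ofNullable(table.get(nonce));
        }
    }

    /** Tries several locators in order, so methods can be swapped in and out. */
    static final class FallbackLocator implements FriendLocator {
        private final List<FriendLocator> locators;

        FallbackLocator(List<FriendLocator> locators) {
            this.locators = new ArrayList<>(locators);
        }

        @Override public void announce(String nonce, String location) {
            for (FriendLocator locator : locators) {
                locator.announce(nonce, location);
            }
        }

        @Override public Optional<String> locate(String nonce) {
            for (FriendLocator locator : locators) {
                Optional<String> hit = locator.locate(nonce);
                if (hit.isPresent()) {
                    return hit;
                }
            }
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        FriendLocator server = new InMemoryNymServer();
        server.announce("nonce-123", "192.0.2.7:4000"); // documentation-range address
        System.out.println(server.locate("nonce-123").orElse("offline"));
        System.out.println(server.locate("nonce-999").orElse("offline"));
    }
}
```

An IRC-handle lookup or a "post the address on a web page" scheme
would be further implementations of the same interface.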