TCQ ascii version
(c) 1998 Frank O'Dwyer. Verbatim copying and redistribution in any medium is permitted provided this copyright notice is preserved on all copies.

TCQ ("They Seek You")
Mass-market Strong Anonymity

Frank O'Dwyer
fod@brd.ie

Introduction

This document sketches the design goals and features of "TCQ", a work in progress with the aim of providing a mix of features offered by ICQ, IRC and email -- but using cryptography to provide strong anonymity and privacy.

As the name implies, TCQ ("They Seek You") is similar in purpose to ICQ ("I Seek You"), but with a very different emphasis. TCQ is motivated by the recognition that people being sought do not always wish to be found, and that those doing the seeking (people, organisations, or governments) can be threatening. TCQ is also motivated by a desire to make the use of strong anonymity easier and much more common, riding the current wave of popularity of "anonymous" services like IRC, ICQ and webmail.

Status

TCQ is in the early stages of being designed and prototyped; this early draft is being made available now to solicit comments (and assistance) from anyone with an interest in this project. Some initial design decisions and the rationale for them are also documented here.

It is planned to make TCQ freely available in source and binary form, under a liberal licence (such as MozPL, or a BSD-style licence). If you can contribute ideas or code on those terms, you are welcome to.

Background

Basic political background to this, intended for general consumption (including people who may not be aware of what GAK is, etc.). This is just a sketch. - FOD

Anonymous communication is hugely popular. Vast numbers of non-technical people use services such as ICQ (www.mirabilis.com) and Hotmail (www.hotmail.com) because they want to communicate privately and anonymously. However, these systems only offer a weak form of privacy and anonymity.
They use unencrypted links (which can easily be tapped), unencrypted messages on centralised servers (which can easily be commandeered or shut down) and well known sites (to which access can easily be blocked). They also use communication links that can be traced to individuals, albeit requiring some effort.

Systems which offer much better privacy and anonymity are available (for example PGP, Cypherpunks remailers), but relatively few people use those systems. This is partly because relatively few people know about them, but mainly because they are difficult to use.

There is a lot of additional theory on strongly anonymous and strongly private communications [ref Cypherpunks literature here], but few implementations of any sort and no mass-market implementations (that I am aware of - FOD).

In the case of strong anonymity, most services thus far have been based on email. Email has a strong culture of being suspicious of "fake names", and this is another factor in the lack of mainstream take-up of strong anonymity. However, with services such as IRC and ICQ, "fake names" are the norm -- these cultures are now becoming mainstream and they may be more receptive to strong anonymity.

At the same time, an individual's ability to communicate with true privacy and anonymity is under threat from various quarters, including for example:

o Government Access to Keys. Proposals designed to allow governmental access to private communications (including encrypted private communications between individuals) have been circulating for some time in many jurisdictions including the United States and Europe. These proposals are typically sold on the back of "hard cases", such as a claimed need for law enforcement to be able to intercept the communications of paedophiles and terrorists.

o Physical border inspections.
Users who cross borders may be subject to customs inspection of any computer equipment they may be carrying; for example, if their laptops contain (or appear to contain) encrypted information, they may be required to provide the key(s). On the other hand, if their information is not encrypted, then users' confidential information (private diaries, letters, business plans) is exposed. There are rumours that some countries (such as the UK) already perform such inspections.

o Centralised Services. There is a growing trend to move security-critical services off users' own PCs and onto the web. Prime examples include online backup and webmail. Even non-technical people often instinctively react to this by becoming "anonymous" (e.g. lying about their names and location). However, this is still a very hospitable climate for attack by governments and other parties, and in any case, the level of anonymity is weak. Also, in so far as such services benefit privacy/anonymity (e.g. webmail), they are easily monitored, blocked, or shut down.

o Strongly Identifying Protocols. Commercial security products, some of which apparently offer strong encryption and privacy, are almost universally based upon "X.509 certificates" and trusted third parties known as "Certification Authorities" (CAs). This trend is worrying for a number of reasons:

    o X.509 is typically used as a strongly identifying protocol, and the analogy is often made between X.509 certificates and a driving licence or passport. Although X.509 can facilitate private communications, it effectively prevents anonymous private communications since by design it requires you to show your "passport" to strangers. (X.509 does not have to be used this way, but people are often encouraged to use it that way.)

    o Use of X.509, and in particular establishment of public CAs, is likely to be heavily regulated. Proposals have been made to "licence" CAs, for example.
Regulated management of individuals' encryption keys is not a healthy development for online privacy.

    o Extensions for governmental access to keys have been made to X.509, and X.509 is at the heart of some governmental strategies to mandate key access.

    o If public CAs take off then they will turn out to be single points of failure and an extremely attractive attack target. Anyone (individual, corporation or government) who can compromise a public CA is in an excellent position to attack the communications of all users served by that CA.

o Logical border inspections. Communications that cross cyberspace borders may be subject to traffic monitoring and/or filtering. This might occur at corporate firewalls or at well defined national "entry points" and "exit points".

o Non-repudiation. Some systems for private communications have as a side effect that it is difficult for a person to plausibly deny sending or receiving a particular message. This is sometimes welcome, but it can be a problem if the person thinks he or she is anonymous. For example, a warm body might be linked to a message just by being found in physical possession of related cryptographic keys.

Design goals

These goals are listed to give an idea of the features TCQ is intended to provide. It is likely that TCQ will meet these goals only in a piecemeal fashion as it evolves, doing the easy bits first and leaving the hard bits till later. We will see. - FOD

1. Casual Anonymity. TCQ should offer a messaging service based on disposable cryptographic identities. This level of anonymity is weak, but it is easiest to implement and it is a start.

2. Relative Anonymity. It can be useful to have multiple identities (or "nyms") and for some identities to be less anonymous than others. A nym might be anonymous, identified to a friend, identified to a select group, or generally identified to everyone. For example, this allows for applications where the sender is anonymous but the receiver is not.
TCQ should allow people to selectively reveal their identities (including partial information about themselves) while remaining fully anonymous to others. TCQ should also allow multiple identities to be maintained, with varying degrees of anonymity, and allow these identities to be kept at arm's length from each other. Lastly, it should be possible for a TCQ sender to selectively reveal "also known as" relationships between one nym and another. (lots of requirements here -- might be worth splitting this up).

3. Strong Anonymity. TCQ should be extensible to use, or natively offer, stronger anonymity services (mixes, remailer-type functionality, etc.).

4. Privacy. It should not be possible for unintended recipients to read TCQ messages (but see the caveats below).

5. Integrity. It should not be possible for a third party to tamper with TCQ messages (again, see the caveats below).

6. Repudiability. It should be possible to plausibly deny having sent a TCQ message, or at least to know when this will not be the case.

7. Ease of use.

    o TCQ should be easy to install and use for average users of ICQ, webmail and IRC, and should not require special cryptographic expertise.

    o TCQ should also be easy to use for advanced users, and allow them to know what is going on "under the hood".

    o It should be easy for average users to become advanced users, and they should be guided to documentation of issues that affect their privacy.

    o It should be easy for developers to extend TCQ without being intimately acquainted with all of its workings (ideally, it should offer well-defined extension points wherever possible).

8. Decentralisation. TCQ should not depend upon well-known centralised servers, since they are easily monitored and shut down. (This is unfortunate, since many ICQ-style features are very difficult to provide without central servers; TCQ may have to piggyback on existing public resources such as IRC and webmail to provide these.)

9. No Special Resources.
TCQ should require only resources that are available to the average dial-up ISP user. (Users should not be required to set up a server on the Internet, for example, though they should be able to use any public resources that exist, and private resources if they have them.)

10. Secure Message Store. TCQ should provide encrypted and tamperproof storage for received messages, and should provide a feature to securely erase messages from this store. (It is arguable whether "securely erase" is always useful -- an attacker will usually be able to record encrypted messages in transit anyhow. However there may be draft messages in the store that were never sent anywhere.)

11. Mobility. TCQ users will be typical dial-up users. They will not necessarily have fixed or easily discovered IP addresses. They may also change ISPs, move from location to location, or switch computers. While travelling they may cross borders and potentially they will be subject to inspections of their computing equipment.

12. Channel Independence. It should be possible to extend TCQ (with code) so that TCQ messages can be sent over whatever channels and drop-points are available (e.g. direct connections, email, webmail, IRC, ICQ, ftp sites, newsgroups, mailing lists, remailers, eternity services, etc.). Initially it should be possible to use at least IRC, email and webmail, since these are easiest to implement, most people have easy access to these, and IRC and webmail are relatively anonymous to begin with. (The expectation of anonymity is also highest on these services.) After that, use (and ultimately provision) of anonymous remailer style services is the natural next step.

13. Programmatic Access. It should be possible for programs (robots) to send messages via TCQ, and to automatically handle received messages. Command line and API access, coupled with some form of scriptability and the ability to call out to scripts to handle, pre-process or send messages, are desirable for this.

14. Group communication. Programs such as IRC and ICQ offer the ability to have N-way chats. If it is to attract users of these programs, TCQ needs to offer that too. (This is difficult to do generically and securely, however.)

15. File Transfer. Messages should not be restricted to text messages, and messages should not be limited in size.

16. Stealth, firewall negotiation. TCQ should be able to avail of stealthy channels, and should not make its protocols easy to firewall. Protocols that are easy to firewall are usually easy to filter and monitor, and always easy to shut down.

17. Persistent pseudonyms, friend location. With IRC and ICQ, it is straightforward to determine if a friend is online and communicate with them again (although this is not done securely). Although it has no central server, TCQ should allow this in a similarly straightforward fashion, otherwise few will use it.

18. Key Management Independence. TCQ should not presuppose much more than the availability of a key pair per pseudonym. This allows different key management systems to be used, including web-of-trust, reputation capital systems, manual key exchange etc.

19. No patent problems. TCQ should not require the use of patented techniques or patented algorithms such as RSA and IDEA, as this would affect its use in some countries.

20. Sugar. Popular chat clients (e.g. mIRC, Vplaces) provide various "nice to have" features such as the ability to play sound files during conversations and the ability to have avatars (graphical representations of participants). TCQ should provide similar features.

Anonymity and Privacy Caveats

This is just an attempt on my part to work out what is and is not achievable in terms of privacy and integrity when purely anonymous participants are involved and relationships are formed wholly online (i.e. no out-of-band contact or face-to-face meetings). I just wrote it down here to clarify my own thinking and to capture it somewhere.
Comments on this reasoning are very welcome. - FOD.

Anonymous Receivers

When talking to an anonymous receiver, privacy of communications does not apply in the usual sense, regardless of how much encryption is present. This is because an anonymous receiver could be anyone (by definition). In such a scenario, the sender should assume the worst, namely that the receiver is not someone the sender wishes to communicate confidential information to. For example, the receiver could be some entity out to determine the sender's identity or exact location. In addition, the receiver could pass the sender's messages on to others. Thus, not only could you be talking to anyone, you could be talking to everyone. In a group where all relationships are anonymous and formed wholly online, no amount of strong cryptography or "reputation capital" changes this (see Man in the middle attacks below).

So why bother encrypting to an anonymous receiver? The only reason seems to be that in an anonymous online relationship, privacy comes from sender anonymity and sender anonymity might fail. If it does, then message encryption is a second line of defence against eavesdroppers, but only if the anonymous receiver is trustworthy after all. There seem to be plenty of threat models in which encryption to anonymous receivers is worthless.

An example may clarify this: Nym1 wishes to admit some embarrassing but non-identifying fact about themselves to Nym2. The reason Nym1 is willing to make the admission is that Nym1 believes itself to be speaking anonymously, not that it believes it is speaking privately. After all, Nym2 could be anyone/everyone. However, if Nym1's anonymity fails somehow, then encrypting to Nym2 can be beneficial if there is some significant chance that Nym2 is not revealing Nym1's messages, and Nym2 does not know Nym1 or is otherwise not an adversary of Nym1.
Similarly, if there is a significant chance that Nym2 does know Nym1, or is an adversary, then encryption to Nym2 is pointless. In both cases, encryption is secondary and anonymity is the point.

Anonymous Senders

The flip side of the above, from the receiver perspective, is that an anonymous sender could be anyone. The receiver cannot assume that it is the only person receiving the sender's messages, even if they are encrypted. The sender could be broadcasting them. Nor can the receiver assume that the messages are in any sense "from" any particular sender, even if integrity protection is applied. They could be from anyone.

It seems at first sight that given a sequence of messages from the same anonymous sender (same key pair) then each message is from the same individual, and that reputations can be built based on that fact. In the wholly-online/wholly-anonymous case, this is not quite true however (see below).

Man in the middle attacks

In a group based on public keys where all relationships are anonymous and formed wholly online, all communications within the group are always subject to (active) man-in-the-middle attacks. There is a possible defence against this using scale and publicity, but approaches such as web-of-trust and reputation capital do not work for this.

This is how it seems to me, anyway, assuming a public key system is used, and I try to prove it below -- are there techniques other than public key techniques that do any better? - FOD

This can be proved by informal induction; the first relationship (between Alice and Bob) is necessarily formed by exchange of naked public keys. No third party can vouch for the keys; there is no third party. No reputation capital can attach to the keys; Alice and Bob have just met for the first time. Since the public keys are naked, an active attacker can substitute its own keys for those of Alice and Bob during the key exchange.
Thereafter the active attacker can eavesdrop on the conversation between Alice and Bob, and can also inject, delete, reorder, delay or modify messages at any time.

As further participants join the population, any new relationships formed can be similarly attacked. Alice and Bob cannot reliably certify any new participants since they can both be impersonated, and in any case, by definition they do not know any new anonymous participants and can certify nothing about them. For the same reason, new participants cannot certify Alice and Bob either. Therefore, in principle at least, the attacker can impersonate or eavesdrop on any member of the population, no matter how large the population grows and no matter how many messages are exchanged. Such an elaborate spoof is arguably not feasible once some size of population is reached, but this is not terrifically comforting. (Publicity on out-of-band channels (e.g. face-to-face meetings) could be used to make these systems more reliable, but these are excluded by construction -- relationships are formed wholly online).

Reputation capital does not help, since it does not matter how much capital Alice accrues if she can be impersonated. The attacker can damage or abuse Alice's reputation at any time, for example by injecting or modifying a message and making it appear to come from Alice. What reputation capital can do, however, is eventually defeat an attacker who for some reason is only able to attack some of Alice's exchanges. In that case, Alice will appear to have two public keys, one real and one fake (supplied by the attacker). It is then up to Alice to boost the reputation of the real key and damage that of the fake (assuming she gets to know of it).

Initial Design Decisions

This is still very incomplete, just a few notes for now - FOD

Programming Language, Platform

o TCQ will be implemented in Java/Swing, targeting Java 1.1/1.2 and Windows initially.
Rationale: Java code is mobile and can be digitally signed, which leaves room for the entire TCQ program and its data to be dumped to cyberspace while travelling, and picked up securely upon arrival (even on a different computer). Thus a mobile TCQ user could cross borders with nothing more than a signature verification key and a memorised passphrase, and still restore a secure environment in the new location.

Java is also cross-platform, so although most people use Windows, a relatively easy port to other platforms (e.g., Macintosh, UNIX) should be possible, and developers using other platforms can provide extensions without much in the way of special tools. It is possible to call out to scripts from Java, and Java can be driven by script (e.g. TCL script, or E (?)). Java also has good support for cryptography, and an internationally available crypto library (Cryptix).

Some problems with Java are that certain aspects of it are difficult to secure, and Java GUIs are frustrating to port in practice. Secure random number generation and protection of sensitive information such as keys are hard problems in Java. However, these problems are not significantly easier in other languages, and porting a GUI is much more difficult in them. If necessary, native code could be used to provide any features Java can't handle.

Generic Delivery Channels

o TCQ should presuppose only generic delivery channels for messages, allowing new delivery channels to be added easily.

Rationale: To be usable over as many concrete delivery channels as possible (IRC, sockets, webmail, ftp sites, remailers, etc. -- see above), TCQ should not assume too much about the channel. In particular this means that security features should be applied to the messages themselves, not the channels. If channels can be assumed insecure, then people adding code for new delivery channels do not need to implement any security features unless it is to strengthen anonymity.
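
As a rough illustration of protecting the message rather than the channel, here is a minimal sketch (in Python purely for brevity; TCQ itself is planned in Java). Everything named here is hypothetical: seal, unseal and InMemoryChannel are invented for the example, and a shared-key HMAC stands in for whatever signature scheme TCQ would really use. No encryption is shown, only integrity protection.

```python
import hashlib
import hmac
from collections import deque

TAG_LEN = 32  # length in bytes of an HMAC-SHA256 tag


def seal(key: bytes, payload: bytes) -> bytes:
    """Protect the message itself: prepend an HMAC-SHA256 tag to the payload."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def unseal(key: bytes, sealed: bytes) -> bytes:
    """Check the tag and return the payload; raise ValueError if it was altered."""
    tag, payload = sealed[:TAG_LEN], sealed[TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("message failed integrity check")
    return payload


class InMemoryChannel:
    """Hypothetical channel: moves opaque bytes and offers no security at all."""

    def __init__(self):
        self._queue = deque()

    def send(self, data: bytes) -> None:
        self._queue.append(data)

    def receive(self) -> bytes:
        return self._queue.popleft()
```

A sender would call channel.send(seal(key, payload)) and a receiver unseal(key, channel.receive()). The channel class contains no security code, so a new channel type (IRC, webmail, an ftp drop-point) would only need its own send and receive.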
Some properties that TCQ will need to know about each channel type are:

o Whether it is offline or online. That is, with some channels the other party will be present at that moment and the communication can be real-time. This is important mainly because on an online channel an ephemeral key exchange can take place in order to establish perfect forward secrecy for that particular conversation. A second set of temporary keys can also be established while online, for later use in offline communications. For communication that takes place entirely offline, one-sided ephemeral key exchanges could be used as a form of key change (this would allow the related nym to persist across key changes and also improve forward secrecy). Whether a channel is on or offline can be modelled as just binary (on/offline) or alternatively as some estimate of the channel lag.

o Its anonymity characteristics. Some channel types will be more identifying than others. For example, a direct socket channel is relatively traceable, whereas a channel through a remailer-type network is much harder to trace. Channels via drop-points such as public news groups are somewhere in the middle, but nearer the "traceable" end of the spectrum.

Message Format

o TCQ messages will consist of an XML header and MIME-typed content.

Rationale: (This is all based on more of a hunch than a firm conviction.) XML is readable, easy to extend and easy to process with scripting languages. It's a nice framework for prototyping protocols, and unrecognised detail can easily be skipped. There are also canonical formats for XML, which lend themselves to hashing, digital signature, etc. It's good for assertions (certificates, RDF possibilities, W3C digsig?) too. There's also a pretty simple mapping from protocols such as SPKI and SDSI to XML, if necessary. MIME is a well-established standard for messaging and content identification, and the Java Activation Framework uses it for drag-and-drop, etc.
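
To make the "XML header plus MIME-typed content" idea concrete, here is a small sketch, again in Python for brevity and under loudly stated assumptions: the element names (tcq-header, from, content-type) and the blank-line framing between header and content are invented for illustration and are not a TCQ wire format. It shows two of the points made above: unrecognised header detail is skipped, and a canonical XML form gives a stable input for hashing.

```python
import hashlib
import xml.etree.ElementTree as ET

KNOWN_FIELDS = ("from", "content-type")  # hypothetical header elements


def build_message(sender_nym: str, content_type: str, body: bytes) -> bytes:
    """Frame a message as an XML header, a blank line, then the raw content."""
    header = ET.Element("tcq-header")
    ET.SubElement(header, "from").text = sender_nym
    ET.SubElement(header, "content-type").text = content_type
    header_bytes = ET.tostring(header, encoding="unicode").encode("utf-8")
    return header_bytes + b"\n\n" + body


def parse_message(raw: bytes):
    """Return (fields, body); header elements we do not recognise are ignored."""
    # Assumes the serialised header itself contains no blank line.
    header_bytes, body = raw.split(b"\n\n", 1)
    header = ET.fromstring(header_bytes)
    fields = {name: header.findtext(name) for name in KNOWN_FIELDS}
    return fields, body


def header_digest(header_xml: str) -> str:
    """SHA-256 over the canonical (C14N) form, so equivalent headers hash alike."""
    canonical = ET.canonicalize(header_xml)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A receiver that gets a header with an extra, unknown element still parses the fields it understands, and two headers that differ only in attribute order or whitespace between attributes produce the same header_digest, which is the property a signature over the header would rely on.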
Something like S/MIME or PGP/MIME could be used instead of XML+MIME, but these do not work so well in the online case and S/MIME uses X.509. In any case, these formats can still be used if desired, since they can be embedded as MIME-types in the body of a TCQ message.

Generic "Friend Location"

o TCQ should presuppose only generic methods of locating friends.

Rationale: locating friends (i.e. finding if they are online, determining their current IP address or preferred drop-point) is difficult to do without a central server network.

One approach is to piggyback on some server network that already exists, such as an IRC or ICQ network, or netphone directories. For example, if two people log onto an IRC network using one of a set of prearranged handles, they can readily determine each other's online status and IP address.

Another approach is to post a current IP address in a prearranged location, such as a web or FTP site. There are also some proposals for "dynamic DNS" which could be useful, since this is a general problem for mobile IP.

Yet another approach is for the requirement to have no servers to be relaxed to the extent of allowing "nym servers" to be used just for the purposes of resolving previously exchanged nonces to current IP addresses, current drop-points, etc. Such servers could also double as drop-points themselves or as the basis of a remailer network, if there were enough of them.

A variation on the last approach is to have sets of users collaborating to provide location services and message forwarding services for one another. This is probably the most interesting approach, but it is also the most difficult, it is vulnerable to disruption, and firewalls are a problem.

By using a generic approach, simple initial solutions can be replaced with other solutions later.
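
The "generic approach" to friend location might look something like the following sketch (Python for brevity; all class and function names are hypothetical, and plain dictionaries stand in for the real web/FTP site and nym server, which would involve network access). The point is only that each lookup method sits behind the same small interface, so one can be swapped for another later.

```python
from typing import Dict, Iterable, Optional, Protocol


class FriendLocator(Protocol):
    """Anything that can map a friend's nym to a current address or drop-point."""

    def locate(self, friend: str) -> Optional[str]:
        ...


class PostedAddressLocator:
    """Stand-in for an address posted at a prearranged web or FTP location."""

    def __init__(self, posted: Dict[str, str]):
        self._posted = posted  # friend -> address they last posted

    def locate(self, friend: str) -> Optional[str]:
        return self._posted.get(friend)


class NymServerLocator:
    """Stand-in for a nym server that resolves previously exchanged nonces."""

    def __init__(self, nonces: Dict[str, str], server: Dict[str, str]):
        self._nonces = nonces  # friend -> nonce agreed in advance
        self._server = server  # nonce -> current address or drop-point

    def locate(self, friend: str) -> Optional[str]:
        nonce = self._nonces.get(friend)
        return self._server.get(nonce) if nonce is not None else None


def locate_friend(friend: str, locators: Iterable[FriendLocator]) -> Optional[str]:
    """Try each locator in order and return the first answer, or None."""
    for locator in locators:
        address = locator.locate(friend)
        if address is not None:
            return address
    return None
```

A simple first version of TCQ could ship with only one locator, and the collaborative scheme described above could be added later as another class with the same locate method.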