Recipient Anonymity
A group of servers collects messages of equal length for anonymous recipients. All servers exchange messages so that each has copies of all messages. A recipient wishes to retrieve a message from the servers without any server knowing which message he is receiving. The recipient selects a group of n servers. From each server, S_1...S_n-1, he requests a random selection of messages, with a 50% probability that any particular message will be selected. The server returns the xor of all messages requested. He sends the final server a request which is the xor of all the previous requests and the one single message that he wants. The xor of all the responses is the desired message. It is impossible to determine which message was received unless all servers collude. A variant of this scheme where there is only one server and a number of anonymizing proxies will also work, however in this case there must be sufficient delay to obscure the time correlations or the server will know which message was received, and collusion with any one of the proxies could reveal the recipient. It is also possible to eliminate the need for real-time communication and operate the system on a store-and-forward network. The recipient generates a one time pad and sends it to a remailer/messageserver along with a reply-block which is a nested series of encrypted message requests and next-hop addresses. Each remailer along the way xors the requested messages onto the existing message data before forwarding it to the next hop. When the recipient gets his message back, he decrypts it using the one time pad. To account for the situation where first and last remailer collude, it is desireable to have some of the intervening remailers apply a stream cipher to the message using a supplied key. The server-to-server broadcast eliminates the need for cover traffic, but the anonymizing proxy system does not. However, there is no reason that both schemes could not be used concurrently in the same network, and in fact they would look the same to the end-user. Except for the additional server-to-server communications necessary to broadcast new messages into the system, the bandwidth utilization is comparable to sender-anonymous remailers; it scales linearly as the number of parties involved in the delivery of a particular message. The failure rate is also the product of the failure rate for each server; if one server delivers corrupt data, the message is unreadable, and the recipient must identify the bad server and eliminate it from future requests.
Except for the additional server-to-server communications necessary to broadcast new messages into the system, the bandwidth utilization is comparable to sender-anonymous remailers; it scales linearly as the number of parties involved in the delivery of a particular message.
How are messages selected? It seems that the recipient needs to know the message IDs of all messages in the system, and must have a way to identify his messages.
3umoelle@informatik.uni-hamburg.de (Ulf =?iso-8859-1?Q?M=F6ller?=) wrote:
How are messages selected? It seems that the recipient needs to know the message IDs of all messages in the system, and must have a way to identify his messages.
Yes. Message-IDs on usenet average around 30-35 bytes each, so for a typical remailer one might have to download 1000 Message-IDs, which would take about 32K, not much bigger than the message itself. You could also use MD5 hashes of the messages, in which case a list of 1000 message-IDs would take only 16K. (In the unlikely event of a hash collision you could download those two messages seperately. Unless the number of messages was huge (millions), you could probably get away with using only a 32 or 64-bit hash function.)
ghio@temp0099.myriad.ml.org (Matthew Ghio) wrote:
You could also use MD5 hashes of the messages, in which case a list of 1000 message-IDs would take only 16K. (In the unlikely event of a hash collision you could download those two messages seperately. Unless the number of messages was huge (millions), you could probably get away with using only a 32 or 64-bit hash function.)
Okay. Let's suppose that there are 10,000 messages (more realistic for a large remailer i think) And I am going to spread it over five servers, And I use a 32-bit hash function (one in four billion chance I get someone else's message) First I download the list of Message IDs/Hashes. (40,000 bytes) Then I download the recipient list. (another 40,000 bytes) I find a message for me. Let's suppose the messages are 20K each. I send each server a list of the messages I want (10,000 bits, which is 1,250 bytes each, so 6,250 bytes total) Finally, I get back five 20K messages from each of the five servers. So that's a total of 80K to download the IDs/recipients lists, 6.25K to upload the requests, and 100K to download the message pieces, to read my 20K email. I guess that could work.
Anonymous wrote:
Okay. Let's suppose that there are 10,000 messages (more realistic for a large remailer i think) And I am going to spread it over five servers, And I use a 32-bit hash function (one in four billion chance I get someone else's message)
[...]
Finally, I get back five 20K messages from each of the five servers.
You only need to download the XOR of the five messages. OTOH, you should not leak the information that there is exactly one message for you.
So that's a total of 80K to download the IDs/recipients lists, 6.25K to upload the requests, and 100K to download the message pieces, to read my 20K email. I guess that could work.
It's way better than 200,000K for the complete pool...
participants (4)
-
3umoelle@informatik.uni-hamburg.de
-
ghio@temp0097.myriad.ml.org
-
ghio@temp0099.myriad.ml.org
-
nobody@REPLAY.COM