This is a long, complicated, and information dense message. You've
been warned.
I've been working out details of what would be required of an
anonymous return envelope. To make sure I've thought of everything,
I've filled in a matrix with the types of information that might need
to be passed between the various participants during the sending of a
message. The person who created the envelope is the ultimate
recipient of the message (recv). The envelope has somehow been
transmitted to a person (send) who wishes to send a message to recv.
The message will be transmitted via several remailers, collectively
referred to as hops. This results in nine potential transmission
channels, several of which need not be possible for various reasons.
Clearly, the sender does not need a special channel to communicate
back to herself, and likewise the receiver does not need special
provisions to communicate to herself either. The sender should be
unable to receive information from the various hops, as that would
compromise the path that the message takes. The various hops already
communicate directly with their following neighbor through headers,
and we want to prohibit communication back towards the sender. The
remaining five cases are listed below:
| from
| send recv hops
-------------+------------------------
send | - pneed (ack)
to recv | msg - pdue
hops | post addr -
where:
msg is the message being delivered
post is postage paid by the sender
addr is addressing info from the receiver
pneed is info to help sender provide postage
pdue is info on missing postage
(ack) is info that is disallowed
pneed is cleartext on the outside of the envelope. This leaves us
with the message, and three parts of the envelope: the delivery
address, postage paid, and postage due.
Note that information other than what I'm describing here could be
sent along these channels; I am simply using postage as a concrete
example of information that might need to be transmitted.
And now, let's trace a message through to its delivery. Being stuck
with ascii, the notation is not wonderful. Groups of letters
represent sets of similar things. Case is significant. Lower case
letters r and v-z are public keys. Upper case letters preceded by &
are the machines that know the associated secret keys (from the C
address of operator). So machine &Y can decrypt something encrypted
with public key y. Upper case letters A-F are conventional keys.
Keys A-C are generated by the sender, keys D-F by the receiver. The
symbols P, S, Q, $, and # are followed by lower case letters
indicating who the item is associated with. Q and # are conventional
keys, P and S form a public key-secret key pair, and $ is a digicash
stamp.
NOTATION:
x(...) contents encrypted with public key x
&X mail address for remailer using public key x
A(...) contents encrypted with conventional key A
Px public key for delivering postage to &X
Sx secret key for delivering postage to &X
Qx conventional key for postage due from &X
$x a postage stamp for &X to cash
Amt_x an amount of postage to deliver to &X
Due_x postage still due to &X, plus a unique ID
#x conventional key held by &X while postage is due
pad random padding (see below)
&R mail address of the final recipient
Pr, Qr, $r
stuff associated with &R
M the actual message to be delivered to &R
junk padding created by &R as a diversion
ABOUT PADDING:
K(stuff, pad) can be transformed into stuff by decrypting with key K.
Since stuff has a length associated with it inside the encryption, an
external viewer cannot tell the length of stuff. It is also possible
to turn K(stuff) into K(stuff, pad) without knowing K. The encryption
packet contains an external length as well as the internal length.
The external length must be adjusted to cover the added padding, which
is just a random bitstream appended to the cyphertext. Once this
padding has been performed, it is impossible to determine the length
of stuff without decrypting with K. In this manner, a portion of a
message can be either lengthened or shortened at every step along the
way, as long as a decryption is performed at each step. This is the
motivation for the keys A..F in the exchange that follows.
PGP should be augmented with a function to pad a message, and should
explicitly accept padded data. I have tested PGP2.1 on Unix and it
accepts padded data that I manually added.
OK, here we go... The envelope provided by the receiver to the sender
looks like this:
Addr: &X, x, x(...)
Pneed: [Px, Amt_x], [Py, Amt_y], [Pz, Amt_z], [Pr, Amt_r]
Everything except the encrypted segment x(...) is considered public
knowledge. The keys Px, etc... pose a slight problem: One of the hops
can identify which envelope a message is associated with by comparing
the postage key sealed inside the addressing info with this public
string of keys. It's not clear how serious of an issue this is.
The sender decides to send the message through hosts &V and &W before
using the provided envelope. She sends the following message to &V:
Addr: v(A), v(Sv, Qv, B, &W, w(B), w(Sw, Qw, C, &X, x(C), x(...)), pad)
Post: A(Pv($v, Pw($w, Px($x, Py($y, Pz($z, Pr($r)))))), pad)
Pdue: A(pad)
Message: A(M, pad)
She has created keys A-C, Pv, Sv, Pw, Sw, Qv, and Qw. She obtains the
specified postage stamps and wraps them in the various postage keys.
The keys and addresses get wrapped in public keys for the address
field, and all of the other elements of the message are sealed with
key A. The address field consists of two public key encrypted
segments because the sender must create key C, but cannot seal it into
the packet that the recipient has provided for host &X. If C were
public knowledge, host &X could be monitored, and the plaintext of M
revealed to an external watcher. As it is, M still occurs in
plaintext within each remailer, so it should be protected by the
recipient's public key (i.e. M = r(the real message) ).
&V decrypts the v() encryptions to find all of the keys necessary for
it to process the message. The padding is removed from the address
field. The key A unlocks the message M, allowing the stripping of the
padding, which is replaced with new padding before being encrypted
with key B. It notes that the Pdue field is empty. Sv allows it to
extract its postage stamp $v, and strip the padding. The message it
sends to &W looks like this:
Addr: w(B), w(Sw, Qw, C, &X, x(C), x(...), pad)
Post: B(Pw($w, Px($x, Py($y, Pz($z, Pr($r))))), pad)
Pdue: B(pad)
Message: B(M, pad)
&W does likewise, and sends the following to &X (we have revealed the
encrypted part of the original envelope at this point):
Addr: x(C), x(Sx, Qx, D, &Y, y(D), y(Sy, Qy, E, &Z,
z(E), z(Sz, Qz, F, &R, r(junk), r(junk))), pad)
Post: C(Px($x, Py($y, Pz($z, Pr($r)))), pad)
Pdue: C(pad)
Message: C(M, pad)
Postage rates have gone up since the envelope was first issued, so &X,
&W, and &Z will need to use the Pdue field. It works like this:
Addr: y(D), y(Sy, Qy, E, &Z,
z(E), z(Sz, Qz, F, &R, r(junk), r(junk)), pad)
Post: D(Py($y, Pz($z, Pr($r))), pad)
Pdue: D(Qx(Due_x), pad)
Message: D(#x(M), pad)
&Y then sends the following to &Z:
Addr: z(E), z(Sz, Qz, F, &R, r(junk), r(junk), pad)
Post: E(Pz($z, Pr($r)), pad)
Pdue: E(Qy(Due_y, Qx(Due_x)), pad)
Message: E(#y(#x(M)), pad)
&Z sends the following to &R:
Addr: r(junk), r(junk, pad)
Post: F(Pr($r), pad)
Pdue: F(Qz(Due_z, Qy(Due_y, Qx(Due_x)), pad)
Message: F(#z(#y(#x(M))), pad)
Now, &R (the receiver, who created the envelope in the first place)
knows F, Sr, Qx, Qy, Qz, and thus finds out Due_x, Due_y, Due_z,
#z(#y(#x(M))) [the message, with postage due], and gets a stamp $r.
&R then generates a message that is designed to deliver #x, #y, and
#z, and sends it to &X:
Addr: x(C), x(Sx, Qx, D, &Y, y(D), y(Sy, Qy, E, &Z,
z(E), z(Sz, Qz, F, &R, r(junk), r(junk))), pad)
Post: C(Px($x, Due_x, Py($y, Due_y, Pz($z, Due_z, Pr(junk)))), pad)
Pdue: C(pad)
Message: C(pad)
&X unwraps it and sends #x along:
Addr: y(D), y(Sy, Qy, E, &Z, z(E), z(Sz, Qz, F, &R, r(junk), r(junk)), pad)
Post: D(Py($y, Due_y, Pz($z, Due_z, Pr(junk))), pad)
Pdue: D(Qx(#x), pad)
Message: D(pad)
And again:
Addr: z(E), z(Sz, Qz, F, &R, r(junk), r(junk), pad)
Post: E(Pz($z, Due_z, Pr(junk)), pad)
Pdue: E(Qy(#y, Qx(#x)), pad)
Message: E(pad)
And back to &R:
Addr: r(junk), r(junk, pad)
Post: F(Pr(junk), pad)
Pdue: F(Qz(#z, Qy(#y, Qx(#x))), pad)
Message: F(pad)
So &R now knows #x, #y, and #z, and so can recover M.
To keep &Z from knowing it is the tail of the path, extra postage
stamps are required of the sender. These are cashable by the
receiver. The sender thus has no way of knowing the length of the
path, but only has an idea of the upper bound on it. If the sender
does not include sufficient postage on the steps she prepended to the
path, the receiver will not be able to read the message, as there is
no way for the receiver to find out Qv and Qw. Perhaps these could be
affixed to the innermost stamp, along with &V and &W, but this is
probably not a good idea.
Since remailers wouldn't add extra encryption to the header fields of
a postage due message (it would make paying the postage due a lengthy
process), the postage due concept could be circumvented by placing the
message in the Post or Pdue headers disguised as postage info. To
discourage this, remailers would only allow postage due deliveries for
a fixed period after a rate increase, and would still require the
older rate be paid.
Another use for postage due would be to disguise the use of an
expensive remailer. Such a remailer would forward with postage due
when paid the prevailing rate.
Well, I've beaten this thing bloody now and can't find any more flaws.
I admit it's a bit of a monster, but most of it goes away if you don't
require postage. I think the system needs to be designed with postage
in mind from the start, however. Anyway, it's time for you people to
start ripping it apart. Perhaps we can have a discussion of this at
the physical meeting this week if Eric Hughes can fit it into the
schedule.
-eric messick