In article <+cmu.andrew.internet.cypherpunks+IlYDbnW00UfAI10=1K@andrew.cmu.edu> Raph writes:
Now for the mathematical model. Signatures can bind keys to e-mail addresses, or act as assertions that the signed public key is trusted to transitively sign other keys. Let's assume that each signature has a certain probability p of being good, and a 1-p probability of being bogus, and that all probabilities are independent. These are probably bad assumptions in the real world, but that's the difference between theory and practice.
I'm not happy with the independence assumption. Let's say I create a keypair, put "president@whitehouse.gov" in the name field, and try to get people to sign it as valid. (I don't know what you're asserting when you sign a key, but I'd say you're at least binding the key to the name and address attached to it.) Each signature has an /a priori/ probability p of correctly indicating validity, but these probilities are not independent at all: this key isn't valid, period. If one certifying signature is incorrect, all others on the same key must be, and vice versa -- about as correlated as they come.
Now we can actually evaluate the probability of a given key being good. Consider a Monte Carlo process in which each edge in the graph is present with probability 1-p. For each run, we determine whether the recipient's public key (actually the binding between public key and recipient's e-mail address) is reachable from our trust root. The probability over a large number of runs is (given our assumptions) the probability of the key being good.
There are two separate problems: 1) key reachability -- do I think I can trust this key? 2) key validity -- is this key really okay? The graph reachability problem asks whether there exists a valid path. This is what you want for the key reachability problem. But the key validity problem should be asking whether all paths are valid; a single invalid path to me (posing as the Prez, remember) means that I get to read your mail to Bill (big deal, eh?). So you'd need to turn it around, and ask whether there exists an invalid path. From your use of "1-p" for the probability, you may have been thinking along these lines already. So an edge (u,v) in G indicates that u trusts v. With probability q = 1-p, Mallet is able to fool v. That is, Pr[(u,v) in G'] = q. Then we ask whether there's a path from s to t in G' -- that is, from you to the key you pulled off the net. If one exists, you lose. To limit transitivity, constrain the path length. This limits key reachability too, but I think we agree that it's essential in the real world. (It should also make the math simpler!) The model generalizes to non-binary conceptions of trust, but I don't think these can rehabilitate transitivity. Hmm, there are some possible approaches, though. These probabilities q are somewhat dependent: if I'm smart about whom I trust, all of the q_(me, v) values will be somewhat lower, and vice versa. I think they're mostly independent, though. But this is an improvised model; poke holes in it. -- . Eli Brandt usual disclaimers . . eli+@cs.cmu.edu PGP key on request . . violation of 18 U.S.C. 1462: "fuck".