RATINGS: proposal
Eric:

I was about to send this to the cpunx list, but got your message first. I'll send this to you first, and maybe we can hash out something better before 'going public' with it... here it is.

I had not thought about the possibility of rating multiple messages in one "rating message." My scheme doesn't address this, although simple changes could accomplish it. I'm not sure that I understand why a mail message could be rated multiple times by the same rater, unless you mean that one might define "axes of rating", like "content", "spelling", "novelty", etc. I think that such a scheme is good, but is starting to place more load on the rater. I had hoped that I could use a slider widget, and have the user generate somewhat reasonable ratings just by setting the slider to a value between 0 and 100 and hitting a "rate" button. This would automagically put in motion the scheme outlined below.

As an MUA implementor, here's my first cut of a proposal for a rating system that would hopefully meet the goals that Eric outlined and be quickly implementable.

1) Mail to cypherpunks-ratings will be gatewayed back to all members of the list if it has the following lines in its body [Headers are ignored...] Lines in brackets are optional:

    [whitespace]
    Target-Message-Id: <messageId>
    Rating: <integer>
    [Comment: <newline terminated string>]
    [Subtopic: <one word string>]
    [Rating-originator: <newline terminated identity string: preferably PGP public key ident>]
    [PGP signature]

Body lines which don't have the syntax mentioned above are ignored.

The comment field is intended to allow the person doing the rating to give info about his rating or the rated message to the rated message reader, when the message is read. I.e.
when I read a message from the cypherpunk list from joe blow which has been rated by iansmith@cc.gatech.edu with a comment of "this message is fantastic" I can get something like:

    From: joe blow
    Subject: foobar
    Rating-Comment: (iansmith@cc.gatech.edu) this message is fantastic!
    Rating: 100

Similarly the subtopic field is to provide an automatic version of what is now being done by hand with ADMIN: and RATING: in subject fields or the like. This would allow a mail user agent to provide some automatic content help, provided that a displayed mail message meets enough of its criteria for this (have enough ratings been received? do 75% of the ratings agree on the subtopic for the mail message? etc).

The PGP information is intended to facilitate "rating reputations" so that MUAs could be configured to "trust" ratings from people with good reputations for rating in ways that meet the user's idea of "goodness."

Now, this strategy doesn't say anything about what the numeric rating scale means. I favor a 0-100 scale, with 0 being a "don't read this, it's crap" and 100 being "must read." Although ideally this should be a curve with 50 being very common and 100's and 0's being very uncommon, it seems unlikely to me that such a distribution would occur in practice, as the common case (50) is the MOST unlikely to motivate someone to issue a rating message. I'm not sure what to do about this problem.

When the mail message is echoed out to the ratings mailing list, it would bear the additional header line:

    X-Mail-Rating: cypherpunks

This would allow the MUA of the recipient to easily recognize the "ratings mail" messages and not display them to the user. The MUA should be building a database or otherwise storing the ratings of mail messages, so that when it displays a message it can bring up the appropriate ratings information. This allows clever user interfaces to be constructed based on ratings information, especially reputation-based arrangements.
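A minimal sketch of how an MUA might pull a rating out of a message body in the syntax proposed earlier in this message. The field names come from the proposal; the `Rating` class, `parse_rating_body`, and the regular expression are illustrative names only. It assumes the first occurrence of each field wins, and it does not check the PGP signature at all.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Field names are taken from the proposal; everything else here is hypothetical.
FIELD_RE = re.compile(
    r"^\s*(Target-Message-Id|Rating|Comment|Subtopic|Rating-originator):\s*(.*?)\s*$",
    re.IGNORECASE,
)


@dataclass
class Rating:
    target_message_id: str
    rating: int
    comment: Optional[str] = None
    subtopic: Optional[str] = None
    originator: Optional[str] = None


def parse_rating_body(body: str) -> Optional[Rating]:
    """Return a Rating, or None if the two mandatory fields are missing or bad."""
    fields = {}
    for line in body.splitlines():
        match = FIELD_RE.match(line)
        if match:
            # Body lines that do not fit the syntax are simply ignored.
            # Assumption: the first occurrence of a field wins.
            fields.setdefault(match.group(1).lower(), match.group(2))
    try:
        target = fields["target-message-id"]
        score = int(fields["rating"])
    except (KeyError, ValueError):
        return None
    if not target:
        return None
    return Rating(
        target_message_id=target,
        rating=score,
        comment=fields.get("comment"),
        subtopic=fields.get("subtopic"),
        originator=fields.get("rating-originator"),
    )
```

The sketch does not enforce a 0-100 range, since the proposal deliberately leaves the meaning of the scale open.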
Consider our message from Joe Blow above and the message display below:

    From: joe blow
    Subject: foobar
    PGP-Signed-Message-Ratings: 7
    PGP-Signed-Rating-Average: 67.5
    PGP-Signed-Common-Subtopic: DIGICASH
    Highest-Rator-Comment: Good discussion of Chaum's methods
    Lowest-Rator-Comment: Stupid rehash of Chaum's approach

Now back to the MUA implementation: This database (or other storage) which I have been discussing is somewhat irritating in its complexity and storage requirements, but mostly unavoidable because of the ordering properties of the problem. It is quite likely that one will receive ratings *before* the mail message itself arrives (in fact, this may be exactly what you want!) so the MUA will be forced to "sit on" ratings for messages it hasn't seen yet. To make the database/storage problem a bit more manageable, MUAs can delete ratings information when they delete a mail message. However, it seems like a good idea to keep this information lying around for potential searches of an archive of a mailing list. I.e. "Give me messages which have the rating subtopic of DIGICASH and average ratings over 50"

Major problems with this scheme:

1) Heavy dependence on the Message-Id: field of messages, and not all messages bear one of these. I don't see that this is avoidable, since we must uniquely identify mail messages and the Message-Id is designed to do this.

2) This scheme rewards people who wait on the mail message ratings to come in and then read the mailing list. This could be problematic if we get into a situation where people who should be doing the rating aren't reading messages (and rating them) because they are waiting on others to do so. I'm not really sure how to address this problem; timeliness cannot be used as a factor when considering ratings, because of the speed differences in different people's mail transport systems (it's unfair to penalize those that have long mail delays or are on vacation).
I think that perhaps some sort of "carrot" should be used to encourage people to rate messages, but I'm not sure what this carrot would be.

what do ya think?

ian
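The ratings storage Ian describes (hold ratings for messages that have not arrived yet, summarize them for display, and answer archive queries such as "subtopic DIGICASH and average rating over 50") could be sketched roughly as below. This is an in-memory illustration only: `RatingStore` and its method names are hypothetical, "common subtopic" is assumed to mean the most frequent one, and signatures and reputations are left out.

```python
from collections import Counter, defaultdict
from statistics import mean


class RatingStore:
    """Keeps ratings keyed by Message-Id, whether or not the message has arrived."""

    def __init__(self):
        self._ratings = defaultdict(list)  # message-id -> [(score, subtopic, comment)]
        self._seen = set()                 # message-ids of mail actually received

    def add_rating(self, message_id, score, subtopic=None, comment=None):
        self._ratings[message_id].append((score, subtopic, comment))

    def message_arrived(self, message_id):
        self._seen.add(message_id)

    def pending(self):
        """Message-Ids with ratings the MUA is 'sitting on' until the mail shows up."""
        return sorted(mid for mid in self._ratings if mid not in self._seen)

    def summary(self, message_id):
        rows = self._ratings.get(message_id)
        if not rows:
            return None
        subtopics = Counter(sub for _, sub, _ in rows if sub)
        return {
            "count": len(rows),
            "average": mean(score for score, _, _ in rows),
            # Assumption: "common subtopic" means the most frequent one.
            "common_subtopic": subtopics.most_common(1)[0][0] if subtopics else None,
            "highest_comment": max(rows, key=lambda row: row[0])[2],
            "lowest_comment": min(rows, key=lambda row: row[0])[2],
        }

    def search(self, subtopic, min_average):
        """Archive query: messages with this subtopic and an average above min_average."""
        hits = []
        for mid in self._ratings:
            info = self.summary(mid)
            if info["common_subtopic"] == subtopic and info["average"] > min_average:
                hits.append(mid)
        return sorted(hits)
```

A real MUA would want this on disk, and would have to decide whether deleting a message also deletes its ratings, which is the trade-off raised above.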
Forgive my ignorance, but isn't this a lot of overkill? I mean, one could simply set up a filter for subjects/people you don't want to see or press the 'D' key.

Or is there a larger picture that I'm still failing to grasp (very probable.)

____        Robert A. Hayden          <=> hayden@krypton.mankato.msus.edu
\  /__          -=-=-=-=-             <=>          -=-=-=-=-
 \/  /   Finger for Geek Code Info    <=> In the United States, they
   \/  Finger for PGP 2.3a Public Key <=> first came for us in Colorado...
-=-=-=-=-=-=-=-
(GEEK CODE 1.0.1)  GAT d- -p+(---) c++(++++) l++ u++ e+/* m++(*)@ s-/++ n-(---) h+(*) f+ g+ w++ t++ r++ y+(*)
> I'm not sure that I understand why a mail message could be rated multiple times by the same rater, unless you mean that one might define "axes of rating", like "content", "spelling", "novelty", etc.
This is exactly the reason. We've already discussed saliency. There are a few more criteria I can think of immediately (including the one we know):

-- salience. What is the article about?

-- clarity. In the age of information overload, clarity and brevity are the soul of politeness. Consider this. When you post to cypherpunks, several hundred people may read your message. If you can spend one minute making your words clear, you will save hours in aggregate for all involved. But in fact, if it's not clear enough, I don't want to read it at all, saving even more time in aggregate. Example of a characteristically low clarity rating: L. Detweiler

-- novelty. Repeated arguments have as their primary quality that they are ... repeated. Do I want the same rehash over and over? How many times do I want to hear about hidden trapdoors in DES? Zero. Example of a characteristically low novelty rating: Sternlight

These two examples are not hypothetical.

-- fact/query/opinion. What is the balance between verifiable claims of fact, question or request for help or information, and mere assertion? People who wish to help newbies should be able to do so, and those who wish to ignore them should be able to do that.

-- readware. A fellow at Bell Labs is working with 'readware', which is a computer analog of the smudged edges of a reference book in the place where it's opened to most. A simple readware scheme could deliver the number of lines that were read before the article was deleted. This information is pretty easy to collect, and requires almost no user intervention.
> I think that such a scheme is good, but is starting to place more load on the rater.
A rater need not publish a full rating, nor even rate each article. No one is obliged to rate an article, and anybody should be able to.
[proposes an email-header based syntax for a rating]
> The PGP information is intended to facilitate "rating reputations" so that MUAs could be configured to "trust" ratings from people with good reputations for rating in ways that meet the user's idea of "goodness."
Certainly the ratings format should allow for digital signatures. The identity of the rater is certainly relevant to a decision process.

One of the immediate reasons for this is that one might easily want one's ratings to be private, and yet participate publicly. Here is a use for pseudonyms that an ordinary person can understand. If you don't want someone to know that you think badly of them, don't tell them. But you can tell the world under a pseudonym. It's like an anonymous referee.
> [on 0-100 scale] the common case (50) is the MOST unlikely to motivate someone to issue a rating message. I'm not sure what to do about this problem.
The Central Limit Theorem comes to the rescue. It says that if you add together enough independent instances of random variables with the same distribution (and finite variance), the sum approaches a Gaussian distribution (a bell curve).

[ An aside. This is the secret reason that statistical mechanics works. Add up enough atoms, and you _can_ assume a Gaussian. My physics professor did not tell me this. Grr. ]

Get enough raters, and the ratings can be first-approximated to good accuracy by the mean and variance. High variance means it's controversial, sometimes a positive characteristic in its own right. And if you get a bimodal distribution, so much the more.
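As a rough illustration of summarizing a set of ratings by mean and variance, here is a minimal sketch. The function name `describe_ratings` and the `controversy_threshold` value are assumptions made for the example; nothing in the thread fixes where "controversial" begins.

```python
from statistics import mean, pvariance


def describe_ratings(scores, controversy_threshold=900.0):
    """Summarize ratings by mean and population variance.

    controversy_threshold is an arbitrary illustrative cut-off
    (a standard deviation of 30 points on a 0-100 scale).
    """
    if not scores:
        raise ValueError("need at least one rating")
    avg = mean(scores)
    var = pvariance(scores, mu=avg)
    return {
        "mean": avg,
        "variance": var,
        "controversial": var >= controversy_threshold,
    }
```

A split vote such as `[0, 100, 0, 100]` has the same mean as `[50, 50, 50]` but a far larger variance, which is the distinction being drawn above. Detecting a genuinely bimodal distribution would need more than these two numbers.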
> X-Mail-Rating: cypherpunks
Certainly a list identifier for mail handling would be useful, but that's not part of a rating syntax.
> 1) Heavy dependence on Message-Id: field of messages and not all messages bear one of these.
You check. Every single one from toad.com does. Message-Id is a required field. If mail doesn't have it, the mailer is misconfigured. What most mailers do is that if they don't see a Message-Id, they add their own; this is what toad.com does.
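The "add one if it's missing" behaviour can be sketched with Python's standard `email` package. This is only an illustration of the idea, not how toad.com's mailer is actually configured; `ensure_message_id` is a hypothetical helper and `example.org` is a placeholder domain.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def ensure_message_id(msg, domain="example.org"):
    """Return the message's Message-ID, generating one if it is missing or empty."""
    existing = str(msg.get("Message-ID", "")).strip()
    if not existing:
        # Drop any empty header first; deleting a missing header is harmless.
        del msg["Message-ID"]
        msg["Message-ID"] = make_msgid(domain=domain)
    return str(msg["Message-ID"])
```

Header lookup is case-insensitive, so a message carrying `Message-Id:` is left untouched.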
> 2) This scheme rewards people who wait on the mail message ratings to come in then read the mailing list.
That is the idea. Some people want to read everything, some don't. Those who read early will tend to get their own words read more often, and this may be reason enough to rate. A good reputation for rating may also translate into a good reputation for writing.
> (it's unfair to penalize those that have long mail delays or are on vacation).
It's also completely unavoidable. Live with it.

Eric
participants (3): hughes@ah.com, iansmith@weasel.cc.gatech.edu, Robert A. Hayden