Re: what fields to hash with hashcash (Re: A Trial Balloon to Ban Email?)
On Wed, May 14, 2003 at 10:02:42AM -0400, Sunder wrote:
And what happens when there's a network outage and a message gets stuck in the queue for a day on another server? You know, a backup MX server when yours is hosed?
Do you not accept the mail because the current day doesn't match what's in the message? Or do you accept mails from a day ago? a week ago? a year ago? 1922?
I was suggesting 30 days. You could up that if you want -- the database won't be that big at, say, 32 bytes per received mail.

The day is matched against the day in the token; as Bill said, the tokens contain the date and the email address. In fact they look like this:

0:030514:foo@bar.com:482d3c37d5b5c112

where the first field is a version number, the 2nd field is the date (year, month, day), the 3rd field is the resource name (for email, the recipient's email address) and the last field is random junk to make it hash to leading zeros.

If you hash that with sha1:

% echo -n 0:030514:foo@bar.com:482d3c37d5b5c112 | sha1
00000bea531c1edbcee4fbb69e094026cd83ed75

you can see that this one has 20 leading 0s (in binary -- 5 x 4-bit hex digits).
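The token layout and the leading-zero test described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code from the hashcash distribution: the function names `leading_zero_bits` and `mint` are made up here, and the 64-bit random suffix is produced with Python's `secrets` module rather than whatever the real tool uses.

```python
import hashlib
import secrets


def leading_zero_bits(token: str) -> int:
    """Count the leading zero bits of SHA-1(token)."""
    digest = hashlib.sha1(token.encode("ascii")).digest()
    value = int.from_bytes(digest, "big")
    # A SHA-1 digest is 160 bits; bit_length() ignores the leading zeros.
    return 160 - value.bit_length()


def mint(resource: str, date: str, bits: int = 20) -> str:
    """Try random 64-bit suffixes until the token has `bits` leading zero bits.

    `date` is the YYMMDD string from the token format in the message.
    Expected cost is about 2**bits SHA-1 evaluations.
    """
    while True:
        token = f"0:{date}:{resource}:{secrets.token_hex(8)}"
        if leading_zero_bits(token) >= bits:
            return token


if __name__ == "__main__":
    # 12 bits keeps the demo fast; the thread talks about 20.
    t = mint("foo@bar.com", "030514", bits=12)
    print(t, leading_zero_bits(t))
```

If the digest quoted in the message is accurate, `leading_zero_bits` applied to the example token would return 20, since five zero hex digits are followed by `b` (binary 1011).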
2nd, why wouldn't the spammer just adjust and send an email to each recipient with a random, but properly hashed token to match the target address + today's date? More work for sure, but if enough targets start adopting it, the spammer will adapt. The token doesn't have to contain an actual valid coin, and you'll only find out when you try to cash it.
If the token is random (ie the spammer put no computational work into it), it won't have the required number of bits of collision and the recipient will reject it. eg I'll just type this one in myself:

0:030514:sunder@sunder.net:0123456789abcdef

Then my MTA or my mail client (or any MTA in the path that does checking) will check:

% echo -n 0:030514:sunder@sunder.net:0123456789abcdef | sha1
aeeaf7971f7e30e2485062b17b43189e5361383f

and see that there are insufficient leading 0 bits (none in this case).

The tokens are only valid for a given recipient: if a spammer sends you a token addressed to me, you'll reject it because it doesn't have an address you receive mail for in it. If a spammer sends you the same valid token twice, you'll reject it because you keep a little database of received tokens.

There is software here (windows GUI, windows cmd line, unix cmd line):

http://www.cypherspace.org/hashcash/

Adam
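The recipient-side checks in this message (enough leading zero bits, an address you actually receive mail for, a date inside the acceptance window, and no reuse) might be combined roughly as follows. This is a sketch under stated assumptions: `check_token` and its parameters are hypothetical names, the "little database" is modelled as an in-memory Python set, and rejecting future-dated tokens is my own simplification that the thread does not discuss.

```python
import hashlib
from datetime import date, datetime


def leading_zero_bits(token: str) -> int:
    """Count the leading zero bits of SHA-1(token)."""
    digest = hashlib.sha1(token.encode("ascii")).digest()
    return 160 - int.from_bytes(digest, "big").bit_length()


def check_token(token, my_addresses, seen, today, min_bits=20, max_age_days=30):
    """Return (accepted, reason) for a version-0 token.

    my_addresses: set of addresses this recipient accepts mail for.
    seen: set of tokens already accepted (stand-in for the token database).
    """
    fields = token.split(":")
    if len(fields) != 4 or fields[0] != "0":
        return False, "malformed"
    _, stamp, resource, _ = fields
    if resource not in my_addresses:
        return False, "wrong recipient"
    try:
        minted = datetime.strptime(stamp, "%y%m%d").date()
    except ValueError:
        return False, "bad date"
    age = (today - minted).days
    # Assumption: future-dated tokens are rejected outright.
    if not 0 <= age <= max_age_days:
        return False, "expired"
    if leading_zero_bits(token) < min_bits:
        return False, "insufficient work"
    if token in seen:
        return False, "duplicate"
    seen.add(token)  # remember it so a replay is rejected
    return True, "ok"
```

The cheap string checks run before the hash and the database lookup, so most junk tokens are thrown out without touching the stored set. A real deployment would also expire stored tokens once they fall outside the acceptance window.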
At 7:56 AM -0700 5/14/03, Adam Back wrote:
I was suggesting 30 days. You could up that if you want -- the database won't be that big at say 32 bytes per received mail.
The day is matched against the day in the token, as Bill said the tokens contain the date and the email address, in fact they look like this:
0:030514:foo@bar.com:482d3c37d5b5c112
where the first field is a version number, 2nd field is date (year,month,day), 3rd field is resource name (for email the recipient's email address) and last field is random junk to make it hash to leading zeros.
if you hash that with sha1:
% echo -n 0:030514:foo@bar.com:482d3c37d5b5c112 | sha1
00000bea531c1edbcee4fbb69e094026cd83ed75
You can see that this one has 20 leading 0s (in binary -- 5 x 4-bit hex digits).
This approach seems like a good direction. However, it does limit me to one email per address per day. :-)

Cheers - Bill

-------------------------------------------------------------------------
Bill Frantz           | Due process for all | Periwinkle -- Consulting
(408)356-8506         | used to be the      | 16345 Englewood Ave.
frantz@pwpconsult.com | American way.       | Los Gatos, CA 95032, USA
I'm not sure what the comment alludes to as it includes a ;-), but you can find multiple collisions against the same email address on the same day, viz:

0:030514:frantz@pwpconsult.com:49916794a98728f2
0:030514:frantz@pwpconsult.com:ffbead9be92472a3

etc. In fact they are also this way because you want there to be a relatively low probability of there being an accidental collision in the tokens created by different users sending you mail.

The example implementation chooses the random string from a 2^64 space, however on average only 2^44 of those will be valid tokens (if you use 20 bit collisions), and so if you imagine someone receiving 256 mails in a day, they have a birthday probability of 2^-29 of having a mail falsely deleted because of an accidental collision.

I guess that is a fairly low probability compared to email reliability, but anyway the safety margin can be increased simply by increasing the random string search space.

Adam

On Wed, May 14, 2003 at 11:14:56AM -0700, Bill Frantz wrote:
This approach seems like a good direction. However, it does limit me to one email per address per day. :-)
At 7:56 AM -0700 5/14/03, Adam Back wrote:
The day is matched against the day in the token, as Bill said the tokens contain the date and the email address, in fact they look like this:
0:030514:foo@bar.com:482d3c37d5b5c112
where the first field is a version number, 2nd field is date (year,month,day), 3rd field is resource name (for email the recipient's email address) and last field is random junk to make it hash to leading zeros.
OK, it wasn't clear (at least to me) that you were starting from a random seed to generate the hash cash.

Note that ISPs could look for duplicate hashcash tokens on the input to their mail transfer agents, and save the disk and bandwidth for duplicates. It would be basically the same algorithm I proposed for collecting micropayments. (They couldn't check for the correct addressee because of facilities like .forward, but they could eliminate duplicates.)

Cheers - Bill

At 2:09 PM -0700 5/14/03, Adam Back wrote:
I'm not sure what the comment alludes to as it includes a ;-), but you can find multiple collisions against the same email address on the same day, viz:
0:030514:frantz@pwpconsult.com:49916794a98728f2
0:030514:frantz@pwpconsult.com:ffbead9be92472a3
etc. In fact they are also this way because you want there to be a relatively low probability of there being an accidental collision in the tokens created by different users sending you mail.
The example implementation chooses the random string from a 2^64 space, however on average only 2^44 of those will be valid tokens (if you use 20 bit collisions), and so if you imagine someone receiving 256 mails in a day, they have a birthday probability of 2^-29 of having a mail falsely deleted because of an accidental collision.
I guess that is a fairly low probability compared to email reliability, but anyway the safety margin can be increased simply by increasing the random string search space.
Adam
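The 2^-29 figure can be reproduced with the usual birthday approximation (number of pairs divided by the size of the space). The sketch below assumes, as the message does, a 2^64 suffix space of which about 2^44 values yield valid 20-bit tokens, and 256 mails per day; `collision_probability` is a name chosen here for illustration.

```python
import math


def collision_probability(mails: int, valid_tokens_log2: int) -> float:
    """Birthday approximation: (number of mail pairs) / (number of valid tokens)."""
    pairs = mails * (mails - 1) / 2
    return pairs / 2 ** valid_tokens_log2


search_bits = 64     # size of the random-string space, per the message
collision_bits = 20  # leading zero bits required of a valid token
valid_log2 = search_bits - collision_bits  # about 2^44 valid tokens

p = collision_probability(256, valid_log2)
print(f"log2(p) = {math.log2(p):.2f}")  # close to -29

# Widening the random string to 128 bits pushes the estimate far lower.
p_wide = collision_probability(256, 128 - collision_bits)
print(f"log2(p) = {math.log2(p_wide):.2f}")
```

With 256 mails there are roughly 2^15 pairs, and 2^15 / 2^44 gives the 2^-29 quoted above; each extra bit of search space halves the estimate.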
On Wed, May 14, 2003 at 11:14:56AM -0700, Bill Frantz wrote:
This approach seems like a good direction. However, it does limit me to one email per address per day. :-)
At 7:56 AM -0700 5/14/03, Adam Back wrote:
The day is matched against the day in the token, as Bill said the tokens contain the date and the email address, in fact they look like this:
0:030514:foo@bar.com:482d3c37d5b5c112
where the first field is a version number, 2nd field is date (year,month,day), 3rd field is resource name (for email the recipient's email address) and last field is random junk to make it hash to leading zeros.
-------------------------------------------------------------------------
Bill Frantz           | Due process for all | Periwinkle -- Consulting
(408)356-8506         | used to be the      | 16345 Englewood Ave.
frantz@pwpconsult.com | American way.       | Los Gatos, CA 95032, USA
participants (2)
- Adam Back
- Bill Frantz