The spam load at lne.com is WAY too high. Last month for example there was an average of 96 spams per day submitted to cypherpunks@lne.com. "submitted" means that it was the first time we saw that particular mail. Since the CDR system sends everything to all nodes, that means we got and passed on about 4x that number. 800 spams a day is a waste.

So I have changed the way we deal with CDR mail. We will no longer pass to other CDRs mail that we would not send to subscribers. Eliminating outbound spam will cut our cypherpunks spam load in half.

I am also going to filter the non-subscriber mail more effectively. Lately I have had to check about 50 spams a day to see if they are actually posts to the list. I find one real post every two or three days. Now I will save for human processing only the non-subscriber mail that is PGP signed or encrypted, or that looks like a reply. The majority of non-subscriber mails I've forwarded to the list have been replies. (I'll save everything for a while to make sure my filtering works).

As always, other CDR operators and anyone else interested in running a CDR are welcome to my scripts.

Eric
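A minimal sketch of the filtering rule described above, for illustration only. It is not Eric's actual script (his setup uses procmail and Majordomo); the function names, the three-way "deliver" / "hold" / "drop" result, and the tests for "PGP signed or encrypted" and "looks like a reply" are all assumptions.

```python
from email import message_from_string
from email.message import Message
from email.utils import parseaddr

# ASCII armor lines that mark inline PGP content (assumed to be a sufficient test).
PGP_MARKERS = (
    "-----BEGIN PGP SIGNED MESSAGE-----",
    "-----BEGIN PGP MESSAGE-----",
    "-----BEGIN PGP SIGNATURE-----",
)
# MIME types used by PGP/MIME signed and encrypted mail.
PGP_MIME_TYPES = {"application/pgp-signature", "application/pgp-encrypted"}


def has_pgp(msg: Message) -> bool:
    """True if any part is PGP/MIME or any text part carries PGP armor."""
    for part in msg.walk():
        if part.get_content_type() in PGP_MIME_TYPES:
            return True
        if part.get_content_maintype() == "text":
            payload = part.get_payload(decode=True) or b""
            text = payload.decode("latin-1")
            if any(marker in text for marker in PGP_MARKERS):
                return True
    return False


def looks_like_reply(msg: Message) -> bool:
    """Hypothetical heuristic: threading headers or a 'Re:' subject."""
    if msg.get("In-Reply-To") or msg.get("References"):
        return True
    subject = str(msg.get("Subject", "")).strip().lower()
    return subject.startswith("re:")


def classify(msg: Message, subscribers: set) -> str:
    """Return 'deliver' for subscribers, 'hold' for human review, else 'drop'."""
    sender = parseaddr(str(msg.get("From", "")))[1].lower()
    if sender in subscribers:
        return "deliver"
    if has_pgp(msg) or looks_like_reply(msg):
        return "hold"
    return "drop"
```

During a trial period like the one mentioned above, "drop" could be mapped to a separate folder rather than discarded, so that false positives can still be found.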
At 09:57 AM 05/12/2003 -0700, Eric Murray wrote:
I am also going to filter the non-subscriber mail more effectively. Lately I have had to check about 50 spams a day to see if they are actually posts to the list. I find one real post every two or three days. Now I will save for human processing only the non-subscriber mail that is PGP signed or encrypted, or that looks like a reply. The majority of non-subscriber mails I've forwarded to the list have been replies. (I'll save everything for a while to make sure my filtering works).
Could you at least bouncegram the mail that you're not saving, or else have SMTP use a reject message that says what it's doing? That way the occasional non-whitelisted non-subscriber human who sends mail to the list will get some indication that the mail's been rejected and what to do about it.
On Mon, May 12, 2003 at 10:39:04AM -0700, Bill Stewart wrote:
At 09:57 AM 05/12/2003 -0700, Eric Murray wrote:
I am also going to filter the non-subscriber mail more effectively. Lately I have had to check about 50 spams a day to see if they are actually posts to the list. I find one real post every two or three days. Now I will save for human processing only the non-subscriber mail that is PGP signed or encrypted, or that looks like a reply. The majority of non-subscriber mails I've forwarded to the list have been replies. (I'll save everything for a while to make sure my filtering works).
Could you at least bouncegram the mail that you're not saving, or else have SMTP use a reject message that says what it's doing? That way the occasional non-whitelisted non-subscriber human who sends mail to the list will get some indication that the mail's been rejected and what to do about it.
I'd like to. But bouncing the mail will mean that I have to send 100 bounces and process 100 bounces back, since few spammers use real mail addresses and most of those are already over quota or rejecting mail. So it won't save me much traffic and I'll have to add a hack for dumping the bounce bounces...

I don't think I can do an SMTP error message since it's getting handled in procmail, after the body is accepted.

If I can't filter better than 50% of the non-subscriber spam without filtering some real non-subscriber posts, I'll probably try the bounce message thing.

OTOH, the way I review non-subscriber mails after this change lets me see the sender name and subject without looking at the mail like I did before, so it may turn out that it's fast enough for me to review that I don't need to do any filtering at all.

Oh, and before Peter asks, next time I mess with it, I'll make messages with the appropriate X-approved header go through without human intervention.

Eric
On Mon, 12 May 2003, Eric Murray wrote:
I don't think I can do an SMTP error message since it's getting handled in procmail, after the body is accepted.
You can fake it. Having lne.com forward all mail to, e.g., internal.lne.com, with only internal.lne.com doing the rejections, is a valid setup. You can use procmail to feed the body of the rejected mail to some script that will "forge" the bounce message and send it back.
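A rough sketch of the kind of script procmail could pipe a rejected message to, assuming the raw message arrives as text. The function name, the MAILER-DAEMON@lne.com sender, and the notice wording are made up for illustration. The checks at the top are one way to avoid the "bounce bounces" problem raised earlier in the thread: the script never answers mail that is itself a bounce or an auto-reply.

```python
from email import message_from_string
from email.message import EmailMessage
from email.utils import parseaddr

LIST_ADDR = "cypherpunks@lne.com"           # list address from the thread
BOUNCE_FROM = "MAILER-DAEMON@lne.com"       # assumed sender for the notice
NOTICE = (
    "Your message to {list_addr} was not delivered.\n"
    "Mail from non-subscribers is only held for review when it is PGP\n"
    "signed or encrypted, or when it looks like a reply to a list post.\n"
    "Subscribe to the list or sign your message, then send it again.\n"
)


def build_bounce(raw: str):
    """Return a bounce notice for the rejected mail, or None if none should be sent."""
    original = message_from_string(raw)

    # Do not answer bounces (null return path) or automated mail.
    if str(original.get("Return-Path", "")).strip() == "<>":
        return None
    if str(original.get("Auto-Submitted", "no")).strip().lower() != "no":
        return None

    return_path = parseaddr(str(original.get("Return-Path", "")))[1]
    sender = parseaddr(str(original.get("From", "")))[1]
    recipient = return_path or sender
    if not recipient or recipient.lower().startswith("mailer-daemon@"):
        return None

    # Collapse folded or odd whitespace so the subject is a single header line.
    subject = " ".join(str(original.get("Subject", "(no subject)")).split())

    bounce = EmailMessage()
    bounce["From"] = f"Mail Delivery System <{BOUNCE_FROM}>"
    bounce["To"] = recipient
    bounce["Subject"] = f"Undelivered mail: {subject}"
    bounce["Auto-Submitted"] = "auto-replied"   # marks the notice as automated
    body = NOTICE.format(list_addr=LIST_ADDR)
    body += "\nOriginal Message-ID: " + str(original.get("Message-ID", "(none)")) + "\n"
    bounce.set_content(body)
    return bounce
```

A wrapper could read the message from standard input, call `build_bounce`, and hand any non-None result to the local MTA. Because most spam carries forged sender addresses, this still generates the outbound traffic Eric describes; the sketch only suppresses replies to bounces and auto-replies.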
On Mon, May 12, 2003 at 11:05:41AM -0700, Eric Murray wrote, quoting Bill Stewart:
Could you at least bouncegram the mail that you're not saving, or else have SMTP use a reject message that says what it's doing? That way the occasional non-whitelisted non-subscriber human who sends mail to the list will get some indication that the mail's been rejected and what to do about it.
I'd like to. But bouncing the mail will mean that I have to send 100 bounces and process 100 bounces back, since few spammers use real mail addresses and most of those are already over quota or rejecting mail. So it won't save me much traffic and I'll have to add a hack for dumping the bounce bounces...
First we should thank Eric for doing an absolutely spectacular job of running his node. I've been running moderated and unmoderated lists since 1994, and, take it from me, dealing with administrative stuff is a thankless task. You're far more likely to deal with people who are angry than thankful.

I like Bill's suggestion, and perhaps it's possible to use disposable email addresses (cp-20030512@lne.com?) for replies?

But I don't understand why an extra 100 messages a day or even 1000 messages a day would be a big deal for a server with a fast Net connection. If it's DSL or cable modems, that makes sense. But if it's on a server with bandwidth charges, and that's the issue, does it make sense to take up a collection to offset the cost? (I'm thinking of asking for the same thing with Politech, since server hosting is getting expensive.)

-Declan
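One way the disposable-address idea could work, sketched under assumptions: the bounce notice is sent from a dated address such as cp-20030512@lne.com, and mail to that address is accepted only for a limited time. The `cp-YYYYMMDD` format comes from the example above; the function names and the 14-day window are invented for illustration.

```python
from datetime import date, timedelta


def dated_address(day: date, prefix: str = "cp", domain: str = "lne.com") -> str:
    """Build a disposable address like cp-20030512@lne.com for the given day."""
    return f"{prefix}-{day:%Y%m%d}@{domain}"


def is_current(address: str, today: date, max_age_days: int = 14,
               prefix: str = "cp", domain: str = "lne.com") -> bool:
    """True if the address is a well-formed dated address that has not expired."""
    local, _, dom = address.lower().partition("@")
    tag, _, stamp = local.partition("-")
    if dom != domain or tag != prefix or len(stamp) != 8 or not stamp.isdigit():
        return False
    try:
        issued = date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:]))
    except ValueError:
        # Digits that do not form a real calendar date.
        return False
    age = today - issued
    return timedelta(0) <= age <= timedelta(days=max_age_days)
```

Replies sent to an expired address can then be discarded without review, which limits how long a harvested address stays useful to spammers.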
On Mon, 12 May 2003, Declan McCullagh wrote:
But I don't understand why an extra 100 messages a day or even 1000 messages a day would be a big deal for a server with a fast Net connection. If it's DSL or cable modems, that makes sense.
Even 128kb/s ISDN does not appreciably bog. At current loads I use less than 10% of my bandwidth on average. I do see spikes that hit 100% but they last for minutes.
But if it's on a server with bandwidth charges, and that's the issue, does it make sense to take up a collection to offset the cost? (I'm thinking of asking for the same thing with Politech, since server hosting is getting expensive.)
An SDSL commercial feed is around $250 a month and you can host your own name server and such. Why should users pay for that? You're just trying to get a free ride.

--
We are all interested in the future for that is where you and I
are going to spend the rest of our lives.
    Criswell, "Plan 9 from Outer Space"
ravage@ssz.com    jchoate@open-forge.org
www.ssz.com       www.open-forge.org
On Mon, May 12, 2003 at 09:12:54PM -0400, Declan McCullagh wrote:
But I don't understand why an extra 100 messages a day or even 1000 messages a day would be a big deal for a server with a fast Net connection. If it's DSL or cable modems, that makes sense.
Heh. I'd love DSL. It's a full-time dialup. That's all I can get out here in the woods. But there are advantages to running it here: since it's on a machine I own, I can hack up Majordomo all I want, and I don't have to worry about the server's owner getting upset over the list contents (or the ISP either, since I just get straight IP).

If someone with a real server with real bandwidth wanted to take over or let me have free rein on their box to run it there, that'd be fine with me.

Eric
On Mon, 12 May 2003, Eric Murray wrote:
If someone with a real server with real bandwidth wanted to take over or let me have free rein on their box to run it there, that'd be fine with me.
Once we get the 9P stuff up and running, bandwidth issues will go away. 9P has this cool feature called 'lazy update', which means that when you mount a namespace you move the data you want to local storage only when you access it. The rest of the time you just have a dir-like structure that handles namespace access and sits somewhere on real hardware, but you couldn't care less where (or even how many 'wheres' there are, or the size of the chunks the file is broken into ;).

In a very real sense the CDR would stop residing anywhere in particular. It will also significantly blur the distinction between operator and subscriber. The author can protect the reader from MITM attacks via normal hashing by providing their key in a read-only namespace.

Another cool feature we'll be able to support 'out of the box' is shared multimedia. Say several of us export our sound cards via a /dev/sb namespace. I take my microphone output and simply pipe it to multiple /dev/sb/* entries and voila, you all hear it. One app here would be for a site to run a text-to-speech converter. Then anyone who wanted to listen to the submissions to the list would mount that dev and pipe it to their local sound board. This highlights a cool feature of Plan 9 in that you put services in only one location in the mesh/blanket/grid and 9P takes care of sharing it out to everyone.
On Mon, 12 May 2003, Eric Murray wrote:
The spam load at lne.com is WAY too high.
:) All the bozos subscribing to the list don't help either...
Last month for example there was an average of 96 spams per day submitted to cypherpunks@lne.com. "submitted" means that it was the first time we saw that particular mail. Since the CDR system sends everything to all nodes, that means we got and passed on about 4x that number. 800 spams a day is a waste.
So I have changed the way we deal with CDR mail. We will no longer pass to other CDRs mail that we would not send to subscribers. Eliminating outbound spam will cut our cypherpunks spam load in half. I am also going to filter the non-subscriber mail more effectively.
Does this mean inbound to lne.com or to the backbone? It seems unclear. In other words, you will filter inbound mail that isn't from a subscriber to your node's list? How will this be applied to mail on the backbone from other nodes?
Lately I have had to check about 50 spams a day to see if they are actually posts to the list. I find one real post every two or three days.
???? Man, you're dropping a lot of real traffic then. I'd say that I go through a couple hundred a day and there are about 15-20 posts a day. Of course there are days when there is almost zero non-spam mail.
Now I will save for human processing only the non-subscriber mail that is PGP signed or encrypted, or that looks like a reply.
So, if it's not PGP signed (i.e., has the appropriate ASCII string in it) and it comes from somebody who isn't on your subscriber list, then it's headed for the bit bucket?
The majority of non-subscriber mails I've forwarded to the list have been replies. (I'll save everything for a while to make sure my filtering works).
So this means no anonymous email through lne.com then? Pity.
As always, other CDR operators and anyone else interested in running a CDR are welcome to my scripts.
I'm still not clear on how you're going to filter the backbone traffic. I'd certainly like to put the scripts on the SSZ CDR page for reference. I'll move lne.com from 'non-moderated' to 'moderated' then? Thanks for the explanation and heads up.
Participants (5): Bill Stewart, Declan McCullagh, Eric Murray, Jim Choate, Thomas Shaddack