On Fri, Aug 31, 2001 at 07:25:27PM -0700, georgemw@speakeasy.net wrote:
On 31 Aug 2001, at 17:07, Eric Murray wrote:
The various CDRs should deal with this-- they keep track of which Message-IDs they have posted, and don't post messages that they have already posted. They do pass those messages on to the other CDRs that they peer with though, so by posting two messages you're adding to the number of messages that get sent on the "backbone"...
Is it possible to set things up so that duplicate messages are filtered out based on a message digest rather than a message ID?
Just about anything is possible.
I'm dumb as a post and have no clue how message IDs are generated in the first place, but it seems to this simpleton that any hash function on the message body would be guaranteed to catch duplicate messages.
Message-ID is generated by the originating MTA (that's the ISP's sendmail or whatever program they're using to send mail). It's supposed to be unique-- it's often something like 200109012124.f81LObL18955@slack.lne.com where there's a date component (200109012124) and the sending machine's name, so that it's unique in a global namespace.

Of course there's nothing keeping someone from sending mail with whatever message-id they want, assuming that they can control their sending MTA.

The reason that it's used for identifying already posted messages is that procmail, which the CDR system is based on, has a nice built-in hook for keeping a message-id database and identifying duplicates.

Eric
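
For illustration only, here is a rough Python sketch of the two approaches discussed in this thread: remembering Message-IDs, and remembering a digest of the message body. It is not how procmail or the CDRs actually do it. The names `DuplicateFilter`, `body_digest` and `is_duplicate` are made up for this example, and the choice of SHA-256 and of whitespace normalization are assumptions.

```python
import hashlib
from email import message_from_string


def body_digest(raw):
    """Hash the body (everything after the first blank line) of a raw message."""
    # Normalize line endings, then split headers from body.
    _, _, body = raw.replace("\r\n", "\n").partition("\n\n")
    # Assumption: trailing whitespace differences should not defeat the match.
    normalized = "\n".join(line.rstrip() for line in body.strip().splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class DuplicateFilter:
    """Hypothetical filter that remembers both Message-IDs and body digests."""

    def __init__(self):
        self.seen_ids = set()
        self.seen_digests = set()

    def is_duplicate(self, raw):
        msg_id = message_from_string(raw).get("Message-ID")
        if msg_id is not None:
            msg_id = msg_id.strip()
        digest = body_digest(raw)

        # A message counts as a duplicate if either key was seen before.
        duplicate = (
            (msg_id is not None and msg_id in self.seen_ids)
            or digest in self.seen_digests
        )

        # Record both keys so later copies are caught either way.
        if msg_id is not None:
            self.seen_ids.add(msg_id)
        self.seen_digests.add(digest)
        return duplicate
```

A body digest like this would catch the same text posted twice under different Message-IDs, but only when the bodies are identical after normalization. If a relay appends a footer or rewraps lines, the digest changes and the copy slips through, so this is a complement to the Message-ID check rather than a sure replacement for it.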