Hello,

On Wed, 3 Sep 2003, Thomas Shaddack wrote:
> Spammers recently adopted the tactic of using randomly generated words, e.g. "wryqf", in both the subject and the body of the message. These "pseudowords" are random, which makes them different from real words, which are made of syllables.
>
> Could the pseudowords be easily detected by their characteristics, e.g. presence of syllables, vowel-consonant sequences/ratio, something like that? This could shift the balance of force in spam detection again, until the adversary is forced to generate the random words from syllables instead of characters. Presence of pseudowords could then be added as one of the spam characteristics.
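
For concreteness, the vowel/consonant idea above could be sketched roughly as follows. This is only an illustration, not anyone's deployed filter: the function names, the thresholds (0.2 vowel ratio, consonant runs longer than 4), the minimum word length, and the decision to treat "y" as a consonant are all assumptions picked for the example.

```python
import re

# Assumption: only a, e, i, o, u count as vowels; 'y' is treated as a consonant.
VOWELS = set("aeiou")


def longest_consonant_run(word):
    """Return the length of the longest run of consecutive non-vowels."""
    longest = current = 0
    for ch in word:
        if ch in VOWELS:
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest


def looks_like_pseudoword(word, min_len=4, min_vowel_ratio=0.2, max_run=4):
    """Crude test: too few vowels, or too many consonants in a row.

    All thresholds are illustrative guesses, not tuned values.
    """
    word = word.lower()
    # Skip short tokens and anything that is not plain a-z letters.
    if len(word) < min_len or not re.fullmatch(r"[a-z]+", word):
        return False
    vowel_ratio = sum(ch in VOWELS for ch in word) / len(word)
    return vowel_ratio < min_vowel_ratio or longest_consonant_run(word) > max_run


def pseudoword_fraction(text):
    """Fraction of alphabetic tokens in text that look like pseudowords."""
    words = re.findall(r"[A-Za-z]+", text)
    if not words:
        return 0.0
    return sum(looks_like_pseudoword(w) for w in words) / len(words)
```

With these thresholds the sketch flags "wryqf" but passes "message". It is easy to fool in both directions: a vowel-rich random string such as "dyeiluykxoer" is not flagged, while a real word like "strengths" is.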

I have, for a year or so now, been wondering about all the odd character strings I am finding in the subjects and bodies of my spam, and I too thought about keying on these for detection.

However, I immediately abandoned the idea, as a quick glance over the content of my legitimate email - to and from developers, technical mailing lists, etc. - revealed that almost all of my legitimate email also contains seemingly random bits of gibberish and pseudowords.

Try to write the logic that distinguishes this:

    if_gre in the tree passes the mbuf to netisr_dispatch(), which in turn calls if_handoff(), which does something similar.

(hackers@freebsd.org)

from this:

    dyeiluykxoer dyeiluykcqkutknig dyeiluykkrpmhrku dyeiluykngeqx dyeiluykoybim dyeiluykbihlyrelg dyeiluyktwucinmdyeiluykwenmttwvm

(actual spam)

I must reiterate that, given the relentless efficiency of spam-spiders, merely publishing a shadow email address on every web document where your real email address appears, and deleting all email sent to both accounts, is my current favorite anti-spam mechanism. Simple to DIY, and requires no centralization.

-----
John Kozubik - john@kozubik.com - http://www.kozubik.com