Interesting idea, but it seems like it would be easy enough to foil. Why not just put the "inner" characters of each word in a canonical order when scanning? (Searching via Google or another strict keyword-based search engine is another matter.) Then you can cheaply match on a single form regardless of how the word has been permuted.

I think the existing techniques I've seen in the binaries newsgroups on Usenet, and in some of the spam I've been getting lately, are already more effective. They just inject noise characters and use creative phonetic spellings. Maybe reducing the "words" to an improved Soundex-like hash would be a more effective way to deal with that kind of obfuscation. Anyone know of any work in this area? (Spell-checker research would probably yield the most results.)

Adam Lydick

On Mon, 2003-09-22 at 15:39, Thomas Shaddack wrote:
This could be the next generation of l33t sp3ak for cases where communication is monitored by automated keyword tools. It could foil both keyword alerting and keyword searches over intercepted and stored material (unless the search also looked for all possible permutations of each word).
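For illustration, here is a minimal Python sketch of the canonical-ordering countermeasure suggested in the reply: keep the first and last letters of each word and sort the letters in between, so every inner-letter permutation collapses to one key. The function names `canonical_key` and `find_keywords` are hypothetical, and the sketch assumes plain ASCII words and a scrambler that leaves the first and last letters in place. It does nothing about injected noise characters or phonetic spellings.

```python
import re

def canonical_key(word):
    """Lowercase the word, keep its first and last letters,
    and sort the inner letters into a fixed order."""
    w = word.lower()
    if len(w) <= 3:
        # Words this short have at most one inner letter; nothing to permute.
        return w
    return w[0] + "".join(sorted(w[1:-1])) + w[-1]

def find_keywords(text, keywords):
    """Return the watched keywords whose canonical key matches a token
    in the text, in the order the tokens appear."""
    # Hypothetical watch list, indexed by canonical key for O(1) lookup.
    keys = {canonical_key(k): k for k in keywords}
    hits = []
    for token in re.findall(r"[A-Za-z]+", text):
        match = keys.get(canonical_key(token))
        if match is not None:
            hits.append(match)
    return hits

if __name__ == "__main__":
    watched = ["keyword", "monitored"]
    print(find_keywords("the kyeowrd saerch was mnoitroed", watched))
```

Distinct words can share a key ("calm" and "clam" both reduce to "calm"), so a scanner built this way would trade some false positives for the cheap single-form match.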