Textual analysis
John Kelsey
kelsey.j at ix.netcom.com
Sun Dec 14 07:36:02 PST 2003
At 09:44 AM 12/13/03 -0600, Harmon Seaver wrote:
...
> And what is my supposed "three-space paragraph lead-ins?" The concept of
>textual analysis to prove ID has always amused me. A competent writer can easily
>change writing styles from moment to moment. I well recall a university English
>lit prof almost accusing me of plagiarism when I wrote a piece mimicking Faulkner
>and doing so well enough that the prof actually started looking thru his works
>trying to find it.
Textual analysis correctly identified the author of _Primary Colors_,
though that was from a pretty small field of people with the right level of
inside knowledge. Does anyone know whether there have been real randomized
trials of any of the textual analysis software or techniques? E.g., is
this an identification technique like DNA, or is it an identification
technique like retrieving repressed memories under hypnosis (or,
equivalently, consulting a Ouija board)?
It's not obvious to me how you'd change your writing style to defeat these
textual analysis schemes--would it really be as simple as changing the
average length of sentences and getting rid of the big words, or would
there still be ways to determine your identity from that text? I'm
thinking especially of long discussions of technical topics--if I wrote a
five page essay on what to look at when trying to cryptanalyze a new block
cipher, I think it would be hard to keep readers who knew me from having a
pretty good guess about the author, even if I tried changing terms, being
more mathematical and less conversational, etc. (Though this is more of a
problem with humans familiar with my writing style, rather than with
automated analysis.)
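As a rough illustration of the surface features mentioned above (average
sentence length and the share of "big words"), here is a minimal Python
sketch. It is not any real textual analysis tool; the function name
style_features, the 7-letter cutoff for a "big" word, and the short
function-word list are all assumptions made up for the example. The
function-word rate is included only to show a third measurable feature
that editing sentence length and vocabulary does not directly target.

```python
import re

# Hypothetical, deliberately tiny list of common function words (an assumption
# for this sketch, not taken from any particular analysis tool).
FUNCTION_WORDS = {"the", "of", "and", "to", "a", "in", "that", "it"}


def style_features(text, long_word_len=7):
    """Compute three crude style features for a piece of text.

    Returns a dict with:
      avg_sentence_len   -- words per sentence (sentences split on . ! ?)
      long_word_fraction -- share of words with at least long_word_len letters
      function_word_rate -- share of words found in FUNCTION_WORDS
    """
    # Naive sentence split: runs of ., ! or ? end a sentence.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Naive word split: runs of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return {
            "avg_sentence_len": 0.0,
            "long_word_fraction": 0.0,
            "function_word_rate": 0.0,
        }
    long_words = sum(len(w) >= long_word_len for w in words)
    function_words = sum(w.lower() in FUNCTION_WORDS for w in words)
    return {
        "avg_sentence_len": len(words) / len(sentences),
        "long_word_fraction": long_words / len(words),
        "function_word_rate": function_words / len(words),
    }


if __name__ == "__main__":
    sample = "The cat sat. The dog ran away quickly!"
    for name, value in style_features(sample).items():
        print(f"{name}: {value:.3f}")
```

Running the same function over an original text and a deliberately
"disguised" rewrite would show which of these numbers actually moved. It
says nothing about whether any of them identifies an author reliably,
which is the open question above.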
>Harmon Seaver
>CyberShamanix
>http://www.cybershamanix.com
--John Kelsey, kelsey.j at ix.netcom.com
PGP: FA48 3237 9AD5 30AC EEDD BBC8 2A80 6948 4CAA F259