[liberationtech] On the Feasibility of Internet-Scale Author Identification
----- Forwarded message from Steve Weis <steveweis@gmail.com> -----
On Tue, 2012-02-21 at 15:02 +0100, Eugen Leitl wrote:
For a given blog post, the researchers were able to positively identify an individual author from among 100,000 possibilities 20% of the time. However, their method does not work if authors deliberately obfuscate their writing style.
Is it just me, or is this not impressive? 100k is a tiny subset of authors, and a 20% hit rate means they're wrong 80% of the time. And then, if I "deliberately obfuscate my writing style", I can get out of that 20%?
I guess it's a proof of concept kind of thing. Stating there is such a thing as a unique writing style, even an identifiable one. Now further research can be done to actually make it work.
participants (3): Eugen Leitl, lodewijk andré de la porte, Ted Smith