fingerprinting traffic at ISP for big content
John Case
case at sdf.lonestar.org
Tue Jun 8 12:11:17 PDT 2010
Recent events related to "big content" pursuing individual file sharers
based on ISP logs are _very interesting_.
My first thought is that this usage is tracked via filename: you are
guilty until proven otherwise if BitTorrent traffic shows a filename
that matches [Hh][Uu][Rr][Tt].[Ll][Oo][Cc][Kk][Ee][Rr].
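To show how little such a filename match involves, here is a minimal Python sketch. The function name and the sample filenames are made up for illustration; nothing here is known to be what any ISP or law firm actually runs.

```python
import re

# Same pattern as above, written with IGNORECASE instead of [Hh][Uu]...
# The unescaped "." matches any single character (dot, space, underscore).
PATTERN = re.compile(r"hurt.locker", re.IGNORECASE)


def looks_incriminating(filename: str) -> bool:
    """Return True if the pattern appears anywhere in the filename."""
    return PATTERN.search(filename) is not None


if __name__ == "__main__":
    # Hypothetical filenames, chosen only to exercise the pattern.
    samples = [
        "The.Hurt.Locker.2008.avi",
        "hurt_locker_review.txt",
        "ubuntu-10.04.iso",
    ]
    for name in samples:
        print(name, looks_incriminating(name))
```

Note that the second sample is a text file and still matches: the name says nothing about the contents.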
But this seems shaky to me. Certainly BitTorrent, already demonized,
combined with an incriminating filename can shake loose a quick
settlement, but in the long run it seems unsustainable. Maybe they
extend it to HTTP, FTP, etc., but they've still got an unknown file,
interesting only in its name.
On the other end of things, the ISPs cannot be saving all of the data.
But what about fingerprinting it all? Think of a traffic backbone at
Comcast, for instance, carrying say 2 gigabits/s of aggregate traffic.
The hashes themselves of all files (or, let's say, all files 100 MB in
size or larger) won't take up much storage, but computing them is a
non-trivial amount of CPU.
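To put rough numbers on the storage side, here is a back-of-envelope sketch in Python. The assumptions are mine, not from any real deployment: decimal units, a 20-byte SHA-1 digest per file, and the link fully saturated by files of exactly the 100 MB minimum size (the worst case for file count).

```python
# Back-of-envelope numbers for the 2 gigabit/s backbone example.
# All constants below are illustrative assumptions.
LINK_BITS_PER_S = 2_000_000_000   # 2 gigabits/s aggregate traffic
MIN_FILE_BYTES = 100_000_000      # only files of 100 MB or larger
DIGEST_BYTES = 20                 # size of one SHA-1 digest
SECONDS_PER_DAY = 86_400

bytes_per_s = LINK_BITS_PER_S // 8
bytes_per_day = bytes_per_s * SECONDS_PER_DAY
# Upper bound on completed files: every byte belongs to a minimum-size file.
max_files_per_day = bytes_per_day // MIN_FILE_BYTES
digest_bytes_per_day = max_files_per_day * DIGEST_BYTES

print(f"data to hash: {bytes_per_s / 1e6:.0f} MB/s, "
      f"{bytes_per_day / 1e12:.1f} TB/day")
print(f"files >= 100 MB: at most {max_files_per_day:,} per day")
print(f"digest storage: {digest_bytes_per_day / 1e6:.2f} MB/day")
```

Under these assumptions that is 250 MB/s to hash (21.6 TB/day), at most 216,000 files per day, and about 4.32 MB/day of raw digests. Real records would also need at least a subscriber identifier and a timestamp, but the storage stays small either way.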
a) Does anyone know what method is being used for these pursuits? I'm
assuming a low-tech "parse filenames in unencrypted BT traffic", but I
haven't heard any details...
b) Once lawyers and ISPs collude to fully exploit this "revenue source",
what is a reasonable course of action? Can they hash all files at that
rate of data transfer? I'm wondering if that investment would be worth
the settlements it produces...
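One way to get a feel for the hashing question in (b) is to measure raw hash throughput on whatever machine is at hand. The sketch below is only that: it times incremental SHA-1 over a buffer of zero bytes on a single core. The function name and buffer sizes are arbitrary choices of mine, and SHA-1 is an assumed choice of hash.

```python
import hashlib
import time


def sha1_throughput_mb_per_s(total_mb: int = 64, chunk_kb: int = 64) -> float:
    """Hash total_mb of zero bytes in chunks; return MB/s on one core."""
    chunk = bytes(chunk_kb * 1024)
    n_chunks = (total_mb * 1024 * 1024) // len(chunk)
    h = hashlib.sha1()
    start = time.perf_counter()
    for _ in range(n_chunks):
        h.update(chunk)  # incremental update, as one would do per flow
    elapsed = time.perf_counter() - start
    return (n_chunks * len(chunk)) / 1e6 / elapsed


if __name__ == "__main__":
    rate = sha1_throughput_mb_per_s()
    needed = 2_000_000_000 / 8 / 1e6  # 250 MB/s for the 2 gigabit/s example
    print(f"SHA-1 on this machine: {rate:.0f} MB/s per core")
    print(f"cores needed for hashing alone: {needed / rate:.2f}")
```

This measures only the hash itself. It ignores flow reassembly, protocol parsing, encrypted transfers, and the fact that BitTorrent moves a file as out-of-order pieces from many peers, all of which are probably the harder and more expensive part.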