fingerprinting traffic at ISP for big content

Wed Jun 9 10:47:38 PDT 2010

On Tue, 8 Jun 2010, John Case wrote:

> Recent events related to "big content" pursuing individual file sharers based
> on ISP logs are _very interesting_.
> 
> My first thought is that this usage is tracked via filename - you are guilty
> until proven otherwise if bittorrent traffic indicates a filename that matches
> [Hh][Uu][Rr][Tt].[Ll][Oo][Cc][Kk][Ee][Rr].

This is a very common tactic: I received RIAA complaints of this form 
almost daily at the abuse desk.
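
A minimal sketch of what filename-based matching amounts to, assuming the
complaint generator simply searches observed names for the quoted pattern.
The function name and sample filenames are hypothetical, chosen only for
illustration:

```python
import re

# Case-insensitive equivalent of [Hh][Uu][Rr][Tt].[Ll][Oo][Cc][Kk][Ee][Rr].
# The unescaped "." matches any single character, as in the quoted pattern.
PATTERN = re.compile(r"hurt.locker", re.IGNORECASE)

def flag_filenames(names):
    """Return the names that contain the pattern anywhere."""
    return [n for n in names if PATTERN.search(n)]

# Hypothetical sample names.
sample = [
    "Hurt.Locker.2009.avi",
    "hurt_locker.mkv",
    "ubuntu-10.04.iso",
    "HURTLOCKER.avi",   # no separator character, so the pattern misses it
]
print(flag_filenames(sample))
```

Nothing here looks at file contents: a match says only that the name looks
like the title.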

> But this seems shaky to me ... certainly bittorrent, already demonized,
> combined with an incriminating filename can shake loose a quick settlement,
> but in the long run this seems unsustainable. 

Unfortunately, if they go to trial on these *losers*, the victim usually 
finds it cheaper to pay the settlement offer (usually around $300) than to 
pay a lawyer to get the thing thrown out of court.  We have a local lawyer 
who handles the vast majority of these cases for the victims, and even he 
recommends settlement as the preferred step.

> Maybe they extend it to HTTP and FTP, etc., but you've still got an 
> unknown file, interesting only in its name.
> 
> On the other end of things, the ISPs cannot be saving all of the data.
> 
> But what about fingerprinting it all ?  Let's think of a traffic backbone ...
> at comcast, for instance ... say 2 gigabits/s aggregate traffic ... the hashes

2 Gb/s? Are you serious? Not since the 90s. Today I would expect something 
more along the lines of 100 Gb/s to 500 Gb/s aggregated across a mid-size 
to large backbone.
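
To put those rates in terms of the hashing load the quoted proposal implies,
here is a back-of-the-envelope sketch. The per-core hash throughput of
300 MB/s is an assumed round figure, not a measurement, and the function
name is hypothetical:

```python
import math

def hashing_cores(link_gbps, hash_mb_per_s_per_core=300.0, utilization=1.0):
    """Estimate cores needed to hash every byte crossing a link.

    hash_mb_per_s_per_core is an ASSUMED figure, not a measurement.
    """
    bytes_per_s = link_gbps * 1e9 / 8 * utilization   # bits/s -> bytes/s
    return math.ceil(bytes_per_s / (hash_mb_per_s_per_core * 1e6))

for gbps in (2, 100, 500):
    print(f"{gbps:>3} Gb/s = {gbps / 8:g} GB/s -> ~{hashing_cores(gbps)} cores")
```

Under that assumption, 2 Gb/s fits on a single core while 100 Gb/s and
500 Gb/s need roughly 42 and 209 cores for the hashing alone, before any
cost of reassembling flows into files.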

> themselves of all files (or, let's say, all files 100 MB in size or larger)
> won't take up much storage, but this is a non-trivial amount of CPU.
> 
> a) Does anyone know what method is being used for these pursuits ?  I'm
> assuming a low tech "parse filenames in unencrypted BT traffic" but I haven't
> heard any details...

Most ISPs don't even bother with deep inspection, as it is unnecessary: they 
already have immunity, so whatever happens is the victim's problem, not the 
ISP's. The exceptions would be on hosted networks where deep inspection may 
be done for firewalling or statistical purposes (we did both: an inline 
splitter tap for statistics and an inline router for firewalling). 
Interestingly, these high-volume lines are only rarely anywhere near 
capacity, and any reasonably current hardware can deal with deep 
inspection easily.
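
For the fingerprinting idea itself, a minimal sketch of hashing a file
incrementally as its bytes arrive. It assumes the monitor has already
reassembled the file's bytes in order, which is the hard part for
BitTorrent, where pieces arrive out of order from many peers. The
`known_hashes` set is a hypothetical list of reference fingerprints:

```python
import hashlib

def fingerprint(chunks):
    """SHA-1 over an in-order sequence of byte chunks, fed incrementally."""
    h = hashlib.sha1()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()

# Stand-in for a reassembled transfer, split into 4 KiB chunks.
payload = b"example file contents " * 1000
pieces = [payload[i:i + 4096] for i in range(0, len(payload), 4096)]

# Hypothetical set of reference fingerprints for known files.
known_hashes = {hashlib.sha1(payload).hexdigest()}

print(fingerprint(pieces) in known_hashes)
```

Only the running hash state has to be kept per transfer, which is why the
storage cost is small; a single changed byte in the file yields a
different fingerprint.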

> b) Once lawyers and ISPs collude to fully exploit this "revenue source", what
> is a reasonable course of action ?  Can they hash all files at that rate of
> data transfer ?  I'm wondering if that investment would be worth the
> settlements it produces...

The ISPs are completely uninterested in litigation based on content: doing 
this would remove their current immunity as a simple carrier.  What they 
*are* interested in (at least some of them) is blocking or slowing 
down traffic which directly interferes with their own offerings, for 
example, VoIP.

//Alif

-- 
"Never belong to any party, always oppose privileged classes and public
plunderers, never lack sympathy with the poor, always remain devoted to
the public welfare, never be satisfied with merely printing news, always
be drastically independent, never be afraid to attack wrong, whether by
predatory plutocracy or predatory poverty."

Joseph Pulitzer, 1907 Speech