Recent events related to "big content" pursuing individual file sharers based on ISP logs are _very interesting_.

My first thought is that this usage is tracked via filename - you are guilty until proven otherwise if bittorrent traffic indicates a filename that matches [Hh][Uu][Rr][Tt].[Ll][Oo][Cc][Kk][Ee][Rr].

But this seems shaky to me ... certainly bittorrent, already demonized, combined with an incriminating filename can shake loose a quick settlement, but in the long run this seems unsustainable.

Maybe they extend it to HTTP and FTP, etc., but you've still got an unknown file, interesting only in its name.

On the other end of things, the ISPs cannot be saving all of the data.

But what about fingerprinting it all? Let's think of a traffic backbone ... at comcast, for instance ... say 2 gigabits/s aggregate traffic ... the hashes themselves of all files (or, let's say, all files 100 MB in size or larger) won't take up much storage, but this is a non-trivial amount of CPU.

a) Does anyone know what method is being used for these pursuits? I'm assuming a low tech "parse filenames in unencrypted BT traffic" but I haven't heard any details...

b) Once lawyers and ISPs collude to fully exploit this "revenue source", what is a reasonable course of action? Can they hash all files at that rate of data transfer? I'm wondering if that investment would be worth the settlements it produces...
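The storage half of that estimate can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope bound: it assumes a 20-byte SHA-1 digest per file, assumes the link carries nothing but files of the minimum size, and ignores the cost of reassembling flows into files. The function names are hypothetical, and the link rate is a parameter so other figures can be plugged in. For the CPU half it does not assume any throughput figure; it just measures single-core SHA-1 speed on whatever machine runs it.

```python
import hashlib
import time

SHA1_DIGEST_BYTES = 20  # size of one SHA-1 digest


def hash_log_estimate(link_gbit_per_s, min_file_mb=100,
                      digest_bytes=SHA1_DIGEST_BYTES):
    """Upper bound on digest storage per day for a saturated link.

    Assumes every byte on the link belongs to a file of exactly
    min_file_mb megabytes (the worst case for the number of digests).
    Returns (bytes_per_s, max_files_per_s, digest_bytes_per_day).
    """
    bytes_per_s = link_gbit_per_s * 1e9 / 8
    max_files_per_s = bytes_per_s / (min_file_mb * 1e6)
    digest_bytes_per_day = max_files_per_s * 86_400 * digest_bytes
    return bytes_per_s, max_files_per_s, digest_bytes_per_day


def measure_sha1_throughput(buffer_mb=16, rounds=4):
    """Measure single-core SHA-1 throughput in bytes per second."""
    buf = bytes(buffer_mb * 1_000_000)
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.sha1(buf).digest()
    elapsed = time.perf_counter() - start
    return buffer_mb * 1_000_000 * rounds / elapsed


if __name__ == "__main__":
    rate, files, per_day = hash_log_estimate(2)
    print(f"link rate: {rate / 1e6:.0f} MB/s")
    print(f"files/s (upper bound): {files:.1f}")
    print(f"digest storage/day: {per_day / 1e6:.2f} MB")
    per_core = measure_sha1_throughput()
    print(f"cores needed just for hashing: {rate / per_core:.2f}")
```

Under these assumptions the digests for a 2 gigabit/s link come to a few megabytes per day, so storage really is negligible; the open question is the hashing and flow-reassembly cost, which scales linearly with the link rate.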
On Tue, 8 Jun 2010, John Case wrote:
Recent events related to "big content" pursuing individual file sharers based on ISP logs are _very interesting_.
My first thought is that this usage is tracked via filename - you are guilty until proven otherwise if bittorrent traffic indicates a filename that matches [Hh][Uu][Rr][Tt].[Ll][Oo][Cc][Kk][Ee][Rr].
This is a very common tactic: I received RIAA complaints of this form almost daily at the abuse desk.
But this seems shaky to me ... certainly bittorrent, already demonized, combined with an incriminating filename can shake loose a quick settlement, but in the long run this seems unsustainable.
Unfortunately, if they go to trial on these *losers*, the victim usually finds it cheaper to pay the settlement offer (usually around $300) than to pay a lawyer to get the thing thrown out of court. We have a local lawyer that does the vast majority of these cases for the victims, and even he recommends settlement as the preferred step.
Maybe they extend it to HTTP and FTP, etc., but you've still got an unknown file, interesting only in its name.
On the other end of things, the ISPs cannot be saving all of the data.
But what about fingerprinting it all ? Let's think of a traffic backbone ... at comcast, for instance ... say 2 gigabits/s aggregate traffic ... the hashes
2 Gbit/s? Are you serious? Not since the 90s. Today I would expect something more along the lines of 100 to 500 Gbit/s aggregated across a mid-to-large backbone.
themselves of all files (or, let's say, all files 100 MB in size or larger) won't take up much storage, but this is a non-trivial amount of CPU.
a) Does anyone know what method is being used for these pursuits ? I'm assuming a low tech "parse filenames in unencrypted BT traffic" but I haven't heard any details...
Most ISPs don't even bother with deep inspection, as it is unnecessary: they already have immunity, so whatever happens is the victim's problem, not the ISP's. The exceptions would be on hosted networks, where deep inspection may be done for firewalling or statistical purposes (we did both: an inline splitter tap for statistics and an inline router for firewalling). Interestingly, these high volume lines are only rarely anywhere near capacity, and any reasonably current hardware can deal with deep inspection easily.
b) Once lawyers and ISPs collude to fully exploit this "revenue source", what is a reasonable course of action ? Can they hash all files at that rate of data transfer ? I'm wondering if that investment would be worth the settlements it produces...
The ISPs are completely uninterested in litigation based on content: doing this would remove their current immunity as a simple carrier. What they *are* interested in (at least some of them) is blocking or slowing down traffic which directly interferes with their own offerings, for example, VOIP.

//Alif

--
"Never belong to any party, always oppose privileged classes and public plunderers, never lack sympathy with the poor, always remain devoted to the public welfare, never be satisfied with merely printing news, always be drastically independent, never be afraid to attack wrong, whether by predatory plutocracy or predatory poverty."

Joseph Pulitzer, 1907 Speech
John Case wrote:
Recent events related to "big content" pursuing individual file sharers based on ISP logs are _very interesting_.
My first thought is that this usage is tracked via filename - you are guilty until proven otherwise if bittorrent traffic indicates a filename that matches [Hh][Uu][Rr][Tt].[Ll][Oo][Cc][Kk][Ee][Rr].
It's complex. The surprising and short answer is: bit torrent traffic does not carry *any* file names. The torrent descriptor file contains the file layout, the per-piece hashes, and an overall hash (the info hash) that is used to reference the torrent in communications. In fact, no torrent client will talk to you unless you reference the info hash of a torrent it is currently holding "live", either for download or seeding.

Alternative distributed peer locating systems and "trackerless" cloud torrents have a secondary system for handling this information, but move the actual data using the torrent protocols.

You also get "private" trackers, which require a unique registered token per registered user before they will share peer information. Most of these prohibit alternative peer finding mechanisms, which is good, but conversely they also track which torrents users have uploaded and how much (in bytes). If those logs are kept, they are potentially a goldmine for anyone wishing to link interest in a given file to a list of people who have "distributed" it in part or whole.
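To make the point about names concrete, here is a minimal sketch of how the overall hash is derived and what a peer handshake carries. It assumes the original (v1) BitTorrent metainfo format, where the identifier is the SHA-1 of the bencoded `info` dictionary and the names, lengths and piece hashes all live inside that dictionary. The torrent contents, the file name `example.bin` and the peer id are made up for illustration, and the `bencode` helper is a small hand-written encoder, not a library API.

```python
import hashlib


def bencode(value):
    """Encode ints, bytes/str, lists and dicts in bencoding."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be emitted in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


# Hypothetical single-file torrent: one 5-byte file that fits in one piece.
payload = b"hello"
info = {
    "name": "example.bin",          # the file name lives only here
    "length": len(payload),
    "piece length": 262144,
    "pieces": hashlib.sha1(payload).digest(),  # concatenated per-piece SHA-1s
}

# The 20-byte identifier peers and trackers use for this torrent.
info_hash = hashlib.sha1(bencode(info)).digest()

# Peer handshake: length-prefixed protocol string, 8 reserved bytes,
# the info hash, then a 20-byte peer id (made up here).
peer_id = b"-XX0000-" + bytes(12)
handshake = bytes([19]) + b"BitTorrent protocol" + bytes(8) + info_hash + peer_id

if __name__ == "__main__":
    print("info hash:", info_hash.hex())
    print("handshake length:", len(handshake))
    print("name visible in handshake:", b"example.bin" in handshake)
```

Anyone watching the peer connection sees the 20-byte hash rather than a name; to get from the hash back to a name they need the descriptor itself, from a torrent site or from a peer that serves metadata.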
participants (3): Dave Howe, J.A. Terranson, John Case